microsoft word has an option, "text alternative," to add a description of a table or figure for visually impaired people, who use screen readers to read the document. adobe acrobat reader also has an accessibility pane to tag tables and add alternative text and descriptions of tables, which the nvda screen reader uses to read aloud. moreover, commonlook office, whose motto is "build accessibility into documents early," has add-ins for microsoft word or powerpoint to add enough accessibility content to the documents to make the resulting pdf accessible.

table 1. solutions and libraries for table extraction and processing.
1. tabula (https://tabula.technology/; open source: yes; image based: no). extracts data tables from pdf and saves them as csv or excel spreadsheets. it works on native pdf files and cannot extract scanned tables. it supports multiple platforms but does not support batch processing.
2. pdftables (https://pdftables.com/; open source: no; image based: no). extracts page, table, table row, and even table cell. it is a fully automated api. it supports multiple platforms and multiple programming languages.
3. docparser (https://docparser.com/; open source: no; image based: yes). extracts information from images and forms. it is a cloud-based application and supports batch processing. it parses the documents and offers more features but needs human intervention. it shows poor accuracy on handwritten application forms.
4. pdftron (https://www.pdftron.com/; open source: no; image based: no). supports multiple platforms and multiple programming languages.
5. camelot (open source: yes; image based: yes). a python library that extracts tables from images. it has built-in ocr.
6. excalibur (open source: yes; image based: yes). a web-based solution powered by camelot.
7. pypdf2 (open source: yes; image based: no). a python library that can do batch processing with multiple files.
8. pdfplumber (open source: yes; image based: yes). a python library built on pdfminer.
9. pdf table extractor (https://resourcegovernance.org/analysis-tools/tools/pdf-table-extractor; open source: yes; image based: no). a web-based tool built on tabula. it supports scraping of multi-page tables and comparison of cell values.
10. pdfminer (open source: yes; image based: no). a python library that extracts information such as location, fonts, and lines of the text. it focuses on analyzing text. it has a pdf parser. it figures out the semantic relationships among structured tables.

however, already-developed unstructured documents, without any accessibility features, still need some measures to make them understandable to visually impaired or blind users. keeping in mind the statistics of visually impaired people and the unstructured data of the future (the global datasphere will grow from 33 zb to 175 zb, and 80% of this worldwide data will be unstructured), visually impaired individuals cannot be ignored in their access to knowledge.68 therefore, we need mechanisms for making these unstructured documents understandable to as many people as possible by incorporating accessibility measures into document readers. the following section highlights some of the key issues in this domain.

issues and challenges in the existing systems
tables can be utilized in multiple scenarios, including information extraction, table search, ontology engineering, conversion to dbms, and document engineering.69 the situation becomes difficult when a blind or visually impaired person needs to understand the tables. the issues and challenges in dealing with pdf tables are categorized in the following sections.
table structure
tables in pdf documents need more focus on table structure detection because they do not follow a defined formal structure.70 several knowledge gaps are identified in the literature regarding table structure, such as the identification of functional areas of tables, for which silva argued for the use of multiple heuristics and machine learning algorithms in parallel or in sequence.71 the variety of structural layouts creates problems in their identification, which can be handled by defining more rules at the lexical and syntactic layers of table processing. this could also be fruitful for better semantic annotations.72 in addition, the variety of cell content or inconsistent cell content, along with implicit header cells, creates problems in understanding the tables, especially by machines.73 the vector representation of web tables may be applied to pdf tables for semantic annotations and identification of column types.74 along with that approach, a graph representation and a graph neural network (gnn) can also be used for better structure identification in multiple domains.75 new data sets need to be introduced for structure recognition in various domains, including business and finance, as they use a huge number of tables in their documents.76 from the discussion above, table structure inconsistencies, cell content inconsistencies, and the functional and logical processing of tables need more research effort to eliminate the stated problems. along with that, the inclusion of more data sets will also help in handling the diversity in the field.

table formats
the existing format of tables in pdf lacks the metadata needed for further processing; therefore, the conversion of pdf tables to other formats, especially open formats, will open new endeavors. some researchers have worked on converting tables to csv format, which retains the basic structure but lacks some cell formatting. researchers have also worked on the transformation of web tables to relational tables for easy manipulation.77 in contrast, xml can handle complex data and is more easily read by humans. therefore, a methodology is presented to work on tables in xml format, but it considers tables having text and numerical data only.78 json, another format, can also be used as an alternative to xml; it is smaller in size than xml and can handle complex and hierarchical data. the json format has less support than xml but is preferred for web applications due to its interoperability and lightweight features.
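to make the csv/xml/json comparison concrete, the sketch below shows one way an extracted pdf table could be exported to json using pdfplumber, one of the open-source libraries listed in table 1, together with python's standard json module. it is a minimal illustration written for this review, not part of any cited framework: the file name is hypothetical, and it assumes the first extracted row holds the column headers, which real-world tables with spanning or implicit headers often violate.

import json
import pdfplumber

def table_to_json(pdf_path, page_number=0):
    # open the pdf and pull the first table pdfplumber detects on the page
    with pdfplumber.open(pdf_path) as pdf:
        rows = pdf.pages[page_number].extract_table()
    if not rows:
        raise ValueError("no table detected on this page")
    # assumption: the first extracted row contains the column headers
    header, *body = rows
    records = [dict(zip(header, row)) for row in body]
    # json keeps each value paired with its header, which a flat csv dump loses
    return json.dumps(records, indent=2, ensure_ascii=False)

if __name__ == "__main__":
    print(table_to_json("report.pdf"))  # "report.pdf" is a hypothetical input file

because every cell stays attached to its column name, the resulting json records can be navigated or queried field by field, which is one reason the lightweight json route is attractive for downstream web applications.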
table interpretation
the variable representation patterns of table values, dense content, and natural language processing create problems in the correct interpretation of tables.79 anaphora resolution techniques and document-level discourse parsers are suggested to handle complex references among multiple domains.80 moreover, handling the locality features of a table and the annotation of its property features can lead to better interpretation of tables.81 the use of a knowledge base is suggested for understanding and annotating the relationships among tables and text to get more information about the extracted entities from tables and text.82 similarly, the extraction of data and its precision in medical and financial tables is an issue that needs the attention of researchers, as both fields have crucial and important data in their tables.83 for easy interpretation of tables, machine learning classifiers, based on table headings and captions, can be used to classify them into their respective domains.84 the relationship of tables in a specific domain or among multiple domains can be achieved by developing ontologies.85 this will enable the tables to be published on an lod cloud that will establish more relationships and infer insights from multiple domains.

table evaluation
most of the researchers working on pdf tables have tried to evaluate their work with popular data sets such as icdar 2013, icdar 2015, icdar 2017 pod, pubmed, unlv, and mormont. as we have pdf documents in multiple domains, new data sets should be introduced for structure recognition, especially in business and finance, as these domains use a large number of tables in their documents.86 an evaluation methodology was proposed for table detection, structure recognition, and its functional and semantic analysis.87 unfortunately, there are no standard metrics, parameters, or formal methodology for table processing evaluation.88 therefore, standard evaluation metrics should be defined for pdf tables in order to standardize the evaluation of algorithms and frameworks.

table presentation to blind and visually impaired users
the available tools and techniques for reading aloud documents to blind and visually impaired people either read the table caption only and ignore the content or treat the tables as text and read the rows line by line. this does not help these users to understand the semantics of the table and its content. besides the content of the table, its layout shows grouping and connections among the content, which is not presented to blind and visually impaired people by current solutions.89 therefore, tools and screen readers need to present tables in a nonvisual format or give a summarized view of tables by following the guidelines of w3c, instead of reading the table like text.90 the summarized view of tables can become part of bibliographic metadata and can contribute to cataloging from the perspective of linked and open data.91
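as a rough illustration of what such a "summarized view" might sound like, the sketch below turns an already-extracted table (a list of rows, as produced by tools like those in table 1) into a short spoken-style overview: caption, dimensions, and column names, rather than a cell-by-cell readout. this is a hypothetical example written for this review, not one of the cited tools, and it again assumes the first row holds the header cells.

def summarize_table(rows, caption="untitled table"):
    # rows: list of lists of cell strings; the first row is assumed to be the header.
    # returns a one-paragraph overview intended for a screen reader.
    if not rows:
        return f"{caption}: the table is empty."
    header, body = rows[0], rows[1:]
    columns = ", ".join(str(cell) for cell in header)
    return (
        f"{caption}: a table with {len(header)} columns ({columns}) "
        f"and {len(body)} data rows. say a column name to hear its values."
    )

# example: a small two-column table of formats and their relative sizes
print(summarize_table(
    [["format", "size"], ["xml", "larger"], ["json", "smaller"]],
    caption="table comparing open formats",
))

a real implementation would also need to convey grouping, spanning cells, and units, but even this minimal overview gives the listener the layout information that current row-by-row readouts omit.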
a study highlighted the accessibility of published pdf articles by four journal publishers and presented the findings in graphs to show the trend from 2009 to 2013, by taking parameters including meaningful title, alternate text for images, and logical reading order.92 the author further applied the same methodology to analyze the articles published in the next four years (2014 to 2018) and came to the conclusion that the accessibility of pdf documents had improved. however, the journal publishers, who should be more aware of disability and accessibility, did not consistently follow the pdf/ua accessibility requirements and wcag 2.0 when producing pdf versions of their articles.93 therefore, visually impaired individuals should be provided with a mechanism for understanding the digital content and underlying semantics at multiple levels of abstraction, like the general information about the document and its elements (including tables), its structure and content, navigation in the table, and querying the table to get more details and lessen cognitive overload.

accessibility of digital library collection
the accessibility of large-scale digital library collections can enhance content for sighted as well as visually impaired users. the traditional utilization of digital library collections needs to be broadened by making computation-ready collections meant to be used and consumed in multiple domains.94 an effort was made by researchers to digitize and archive a digital repository of images and convert them to pdf/a documents but, unfortunately, the researchers came up with limited semantics as they did not consider the elements within the documents themselves.95 the accessibility of these converted documents may be compromised by these limited semantics. the rich semantics of tables can be used in the bibliographic classification of a digital library's collection to increase the search width of the digital library.96 blind and visually impaired users can be assisted in using digital libraries, as they may need help at physical and cognitive levels. at the physical level, the blind may face difficulty in accessing information, identifying path and status, and efficiently evaluating information. at the cognitive level, they may face problems in understanding the multiple structures, programs, information, and features of the digital library, and the need to stick to some specific formats. therefore, the inclusion of help features will make the digital library friendly to blind and visually impaired people by incorporating meaningful descriptions for nontextual elements.97 the sight-centered nature of the digital library creates problems for blind and visually impaired users due to missing textual or verbal instructions. some researchers identified the inclusion of labels and meaningful descriptions for hyperlinks, instructions, structure, multimedia content, and nontext content to make digital libraries friendly to blind and visually impaired people.98 at the same time, others argue for improvement in usability by introducing help features in terms of usefulness, ease of use, and user satisfaction.99 the accessibility of digital libraries in general, and of their content in particular, may be improved by accommodating help features in the interface and meaningful descriptions for the contents' nontext elements, including tables.
conclusions and future research directions
this study discusses the accessibility of tables included in pdf documents in general as well as in the specific environment of digital libraries. existing frameworks, algorithms, and solutions for the processing and interpretation of pdf tables, specifically their presentation to blind and visually impaired people, are thoroughly discussed. a general workflow of table processing is also presented in figure 1. the available solutions for reading out pdf documents to blind and visually impaired people are analyzed for their output, specifically for their handling of tables. furthermore, a list of resources for table interpretation and presentation is discussed along with their different features. the issues and challenges in table structure, format, interpretation, evaluation, presentation to blind users, and accessibility of digital library collections are discussed. researchers working in the domains of accessibility, digital libraries, and pdf tables can extend and modify the current solutions and algorithms by following the future research directions given below.
• the structure of a table has implicit semantic information which a sighted reader can infer but a blind reader needs assistance to understand. the structure of a pdf table is extracted using multiple approaches like heuristics, ontologies, machine learning, and segmentation, whereas vectors are used for web tables.100 therefore, combinations of multiple approaches and the use of vectors for pdf tables may produce better results.
• the content of a table is usually numeric or very short text and needs proper interpretation. therefore, a knowledge base can be used to get more information about the extracted entities from tables and text in order to understand and annotate the relationships among tables and text.101 these knowledge bases can be predetermined or may be selected automatically according to the table content or domain.
• table interpretation can become easier if tables are classified according to their domains by using machine learning classifiers. the classification can be based on table headings and captions, as well as the title and author of the document.102
• ontologies are used to relate the tables in a specific domain or among multiple domains, and publishing them on an lod cloud will establish new relationships.103 this will help in inferring new insights from complex, long, and numerical tables.
• unstructured data and content can be made available for multiple uses and interpretations if they are converted to open formats like csv, json, and xml.104 among these, csv comes with repeated content and xml needs special parsers, whereas json is lightweight and easy to write and read.105 it has support from nosql databases like mongodb and apache couchdb, and from web application apis like twitter, youtube, and facebook. therefore, json might be a better option for the conversion of pdf tables, for its multiple interpretations and navigation within tables.
• the processes used for the evaluation of tables have no defined metrics.106 therefore, table evaluation processes should be defined with their respective metrics in order to standardize the research in this domain.
• the precision of the extracted content of a table is very crucial, especially in medical, financial, and experimental tables that have numeric data.
therefore, the preprocessing of tables or their conversion to other formats needs more attention to avoid any truncation or rounding off of the data.
• the presentation of tables to blind or visually impaired people can be in nonvisual or summarized form.107 the summaries may be presented nonvisually, including the structural layout as well as a brief introduction of the table, to minimize the cognitive overload on these individuals.
• to evaluate the accessibility of digital library interfaces, 16 heuristics were proposed to bring digital libraries within reach of users; however, more heuristics are needed to make generalized interfaces for all individuals.108
• the nontext elements of digital library collections should have meaningful descriptions for better understandability by blind and visually impaired individuals. the user-generated content about these nontext elements could be used for cataloging.109
• the rich semantics of tables can be exploited for cataloging and classification, which will be helpful in exploratory searching.
• as the michigan state university libraries has taken the initiative of assessing and improving the accessibility of digital library content by adopting the wcag guidelines, other libraries can also adopt the model for providing accessible content to their users, including blind and visually impaired individuals.
• the development of new data sets for tables in multiple domains can facilitate researchers in interpreting tables and establishing relationships across domains.
this review paper is an attempt to highlight the knowledge gap in processing pdf tables and their accessibility for blind and visually impaired individuals. an efficient and open-source solution for making pdf documents accessible to blind and visually impaired people needs to exploit heuristics, ontologies, machine learning, and deep learning by using open-source libraries and tools for understanding and interpreting the tabular content in order to reduce information overload.

endnotes
1 roya rastan, "automatic tabular data extraction and understanding" (phd diss., university of new south wales, 2017).
2 mark t. maybury, "communicative acts for explanation generation," international journal of man-machine studies 37, no. 2 (1992): 135–72.
3 patricia wright, "the comprehension of tabulated information: some similarities between reading prose and reading tables," nspi journal 19, no. 8 (1980): 25–29, https://doi.org/10.1002/pfi.4180190810.
4 jean-claude guédon et al., future of scholarly publishing and scholarly communication: report of the expert group to the european commission (brussels: european commission, directorate-general for research and innovation, 2019), https://doi.org/10.2777/836532.
5 world health organization, world report on vision, october 8, 2019, https://www.who.int/publications-detail/world-report-on-vision/.
6 mireia ribera turró, "are pdf documents accessible?" information technology and libraries 27, no. 3 (2008): 25–43, https://doi.org/10.6017/ital.v27i3.3246.
7 kyunghye yoon, laura hulscher, and rachel dols, "accessibility and diversity in library and information science: inclusive information architecture for library websites," library quarterly 86, no. 2 (2016): 213–29, https://doi.org/10.1086/685399.
8 iris xie et al., "using digital libraries non-visually: understanding the help-seeking situations of blind users," information research 20, no. 2 (2015): 673.
9 heidi m. schroeder, "implementing accessibility initiatives at the michigan state university libraries," reference services review 46, no. 3 (2018): 399–413, https://doi.org/10.1108/rsr-04-2018-0043.
10 joanne oud, "accessibility of vendor-created database tutorials for people with disabilities," information technology and libraries 35, no. 4 (2016): 7–18, https://doi.org/10.6017/ital.v35i4.9469.
11 rakesh babu and iris xie, "haze in the digital library: design issues hampering accessibility for blind users," electronic library 35, no. 5 (2017): 1052–65, https://doi.org/10.1108/el-10-2016-0209.
12 rachel wittmann et al., "from digital library to open datasets," information technology and libraries 38, no. 4 (2019): 49–61, https://doi.org/10.6017/ital.v38i4.11101.
13 xinxin wang, "tabular abstraction, editing, and formatting" (phd diss., university of waterloo, 1996).
14 rastan, "automatic tabular data extraction," 25.
15 azadeh nazemi, "non-visual representation of complex documents for use in digital talking books" (phd diss., curtin university, 2015).
16 rastan, "automatic tabular data extraction," 14.
17 max göbel et al., "icdar 2013 table competition," in 2013 12th international conference on document analysis and recognition (2013): 1449–53, https://doi.org/10.1109/icdar.2013.292.
18 burcu yildiz, katharina kaiser, and silvia miksch, "pdf2table: a method to extract table information from pdf files," in proceedings of the 2nd indian international conference on artificial intelligence (iicai, 2005): 1773–85; tamir hassan and robert baumgartner, "table recognition and understanding from pdf files," in ninth international conference on document analysis and recognition (icdar 2007) (2007): 1143–47, https://doi.org/10.1109/icdar.2007.4377094; alexey shigarov et al., "tabbypdf: web-based system for pdf table extraction," in international conference on information and software technologies (springer international publishing, 2018): 257–69, https://doi.org/10.1007/978-3-319-99972-2_20.
19 minghao li et al., "tablebank: table benchmark for image-based table detection and recognition," preprint, arxiv:1903.01949; sebastian schreiber et al., "deepdesrt: deep learning for detection and structure recognition of tables in document images," in 2017 14th iapr international conference on document analysis and recognition (icdar) (2017): 1162–67, https://doi.org/10.1109/icdar.2017.192.
20 zewen chi et al., "complicated table structure recognition," preprint, arxiv:1908.04729.
21 michael cafarella et al., "ten years of webtables," in proceedings of the vldb endowment 11, no. 12 (august 2018): 2140–49, https://doi.org/10.14778/3229863.3240492.
22 shah khusro, asima latif, and irfan ullah, "on methods and tools of table detection, extraction and annotation in pdf documents," journal of information science 41, no. 1 (2015): 41–57, https://doi.org/10.1177/0165551514551903.
23 hassan, "table recognition and understanding"; richard zanibbi, dorothea blostein, and james r. cordy, "a survey of table recognition," document analysis and recognition 7, no. 1 (2004): 1–16, https://doi.org/10.1007/s10032-004-0120-9; andreiwid sheffer corrêa and pär-ola zander, "unleashing tabular content to open data: a survey on pdf table extraction methods and tools," in proceedings of the 18th annual international conference on digital government research (june 2017): 54–63, https://doi.org/10.1145/3085228.3085278; christopher clark and santosh divvala, "looking beyond text: extracting figures, tables and captions from computer science papers" (paper, aaai workshops at the twenty-ninth aaai conference on artificial intelligence, austin, tx, january 25–26, 2015).
24 ermelinda oro and massimo ruffolo, "pdf–trex: an approach for recognizing and extracting tables from pdf documents," in 2009 10th international conference on document analysis and recognition (icdar) (2009): 906–10, https://doi.org/10.1109/icdar.2009.12.
25 vidhya govindaraju, ce zhang, and christopher ré, "understanding tables in context using standard nlp toolkits," in proceedings of the 51st annual meeting of the association for computational linguistics (sofia, bulgaria: association for computational linguistics, august 2013): 658–64.
26 nikola milosevic et al., "disentangling the structure of tables in scientific literature," in natural language processing and information systems, nldb 2016, lecture notes in computer science 9612 (springer, cham), https://doi.org/10.1007/978-3-319-41754-7_14.
27 rastan, "automatic tabular data extraction," 48.
28 alexey shigarov, andrey mikhailov, and andrey altaev, "configurable table structure recognition in untagged pdf documents," in proceedings of the 2016 acm symposium on document engineering (2016): 119–22, https://doi.org/10.1145/2960811.2967152.
29 shigarov et al., "tabbypdf," 262, 263, 265.
30 dae hyun kim et al., "facilitating document reading by linking text and tables," in proceedings of the 31st annual acm symposium on user interface software and technology (october 2018): 423–34, https://doi.org/10.1145/3242587.3242617.
31 hassan, "table recognition and understanding," 1145.
32 jing fang et al., "a table detection method for multipage pdf documents via visual separators and tabular structures," in 2011 international conference on document analysis and recognition (2011): 779–83, https://doi.org/10.1109/icdar.2011.304.
33 bahadar ali and shah khusro, "a divide-and-merge approach for deep segmentation of document tables," in proceedings of the 10th international conference on informatics and systems (may 2016): 43–49, https://doi.org/10.1145/2908446.2908473.
34 wenyuan xue et al., "table analysis and information extraction for medical laboratory reports," in 2018 ieee 16th intl conf on dependable, autonomic and secure computing, 16th intl conf on pervasive intelligence and computing, 4th intl conf on big data intelligence and computing and cyber science and technology congress (dasc/picom/datacom/cyberscitech) (2018): 193–99, https://doi.org/10.1109/dasc/picom/datacom/cyberscitec.2018.00043.
35 roya rastan, hye-young paik, and john shepherd, "texus: a unified framework for extracting and understanding tables in pdf documents," information processing & management 56, no. 3 (2019): 895–918, https://doi.org/10.1016/j.ipm.2019.01.008.
36 dafang he et al., "multi-scale multi-task fcn for semantic page segmentation and table detection," in 2017 14th iapr international conference on document analysis and recognition (icdar) (2017): 254–61, https://doi.org/10.1109/icdar.2017.50.
37 jing fang et al., "table header detection and classification," in proceedings of the twenty-sixth aaai conference on artificial intelligence (july 2012): 599–605.
38 he et al., "multi-scale multi-task," 255.
39 martha o. perez-arriaga, trilce estrada, and soraya abad-mota, "tao: system for table detection and extraction from pdf documents," florida artificial intelligence research society conference, north america (2016).
40 saman arif and faisal shafait, "table detection in document images using foreground and background features," in 2018 digital image computing: techniques and applications (dicta) (2018): 1–8, https://doi.org/10.1109/dicta.2018.8615795.
41 schreiber et al., "deepdesrt," 1163, 1164.
42 shoaib ahmed siddiqui et al., "decnt: deep deformable cnn for table detection," ieee access 6 (2018): 74151–61, https://doi.org/10.1109/access.2018.2880211.
43 chi et al., "complicated table structure recognition."
44 rahul anand, hye-young paik, and cheng wang, "integrating and querying similar tables from pdf documents using deep learning," 2019, preprint, arxiv:1901.04672.
45 jiaoyan chen et al., "colnet: embedding the semantics of web tables for column type prediction," in proceedings of the aaai conference on artificial intelligence 33, no. 1: 29–36, https://doi.org/10.1609/aaai.v33i01.330129.
46 ziqi zhang, "towards efficient and effective semantic table interpretation," in international semantic web conference (2014): 487–502, https://doi.org/10.1007/978-3-319-11964-9_31.
47 ivan ermilov, sören auer, and claus stadler, "user-driven semantic mapping of tabular data," in proceedings of the 9th international conference on semantic systems (september 2013): 105–12, https://doi.org/10.1145/2506182.2506196.
48 martha o. perez-arriaga, trilce estrada, and soraya abad-mota, "table interpretation and extraction of semantic relationships to synthesize digital documents," in proceedings of the 6th international conference on data science, technology and application—data (2017): 223–32, https://doi.org/10.5220/0006436902230232.
49 varish mulwad, "tabel—a domain-independent and extensible framework for inferring the semantics of tables" (phd diss., university of maryland, 2015).
50 syed tahseen raza rizvi et al., "ontology-based information extraction from technical documents," in proceedings of the 10th international conference on agents and artificial intelligence (icaart) (2018): 493–500, https://doi.org/10.5220/0006596604930500.
51 corrêa and zander, "unleashing tabular content to open data," 55.
52 irfan ullah et al., "an overview of the current state of linked and open data in cataloging," information technology and libraries 37, no. 4 (2018): 47–80, https://doi.org/10.6017/ital.v37i4.10432.
53 nosheen fayyaz, irfan ullah, and shah khusro, "on the current state of linked open data: issues, challenges, and future directions," international journal on semantic web and information systems (ijswis) 14, no. 4 (2018): 110–28, https://doi.org/10.4018/ijswis.2018100106.
54 govindaraju, zhang, and ré, "understanding tables in context using standard nlp toolkits," 660, 661.
55 perez-arriaga, estrada, and abad-mota, "table interpretation and extraction," 227.
56 kim et al., "facilitating document reading," 425, 426.
57 rastan, paik, and shepherd, "texus," 906.
58 nikola milosevic et al., "a framework for information extraction from tables in biomedical literature," international journal on document analysis and recognition (ijdar) 22, no. 1 (2019): 55–78, https://doi.org/10.1007/s10032-019-00317-0.
59 chi et al., "complicated table structure recognition."
60 wenhao yu et al., "tablepedia: automating pdf table reading in an experimental evidence exploration and analytic system," in the world wide web conference (may 2019): 3615–19, https://doi.org/10.1145/3308558.3314118.
61 anand, paik, and wang, "integrating and querying similar tables."
62 turró, "are pdf documents accessible?" 2, 4.
63 nazemi, "non-visual representation of complex documents," 110, 111, 112, 118.
64 juan cao, "generating natural language descriptions from tables," ieee access 8 (2020): 46206–16, https://doi.org/10.1109/access.2020.2979115.
65 maartje ter hoeve et al., "conversations with documents: an exploration of document-centered assistance," in proceedings of the 2020 conference on human information interaction and retrieval (march 2020): 43–52, https://doi.org/10.1145/3343413.3377971.
66 guédon et al., "future of scholarly publishing," 42.
67 w3c, "wcag 2.0."
68 world health organization, "world report on vision"; david reinsel, john gantz, and john rydning, "data age 2025: the digitization of the world, from edge to core," idc white paper, #us44413318 (framingham, ma: idc, november 2018), https://www.seagate.com/files/www-content/our-story/trends/files/idc-seagate-dataage-whitepaper.pdf/.
69 rastan, "automatic tabular data extraction," 18, 19.
70 arif and shafait, "table detection in document images," 1.
71 ana costa e silva, "parts that add up to a whole: a framework for the analysis of tables" (phd diss., edinburgh university, uk, 2010).
72 milosevic et al., "a framework for information extraction from tables," 60.
73 rastan, "automatic tabular data extraction," 14.
74 chen et al., "colnet," 31.
75 mulwad, "tabel," 23; chi et al., "complicated table structure recognition."
76 siddiqui et al., "decnt," 74160.
77 david w. embley, sharad seth, and george nagy, "transforming web tables to a relational database," 2014 22nd international conference on pattern recognition (2014): 2781–86, https://doi.org/10.1109/icpr.2014.479.
78 milosevic et al., "a framework for information extraction from tables," 56.
79 milosevic et al., "a framework for information extraction from tables," 55, 56.
80 kim et al., "facilitating document reading," 432.
81 chen et al., "colnet," 36.
82 asima latif et al., "a hybrid technique for annotating book tables," int. arab j. inf. technol. 15, no. 4 (2018): 777–83.
83 rastan, paik, and shepherd, "texus," 909.
84 milosevic et al., "a framework for information extraction from tables," 61, 62, 65, 66.
85 rizvi et al., "ontology-based information extraction," 496.
86 siddiqui et al., "decnt," 74160.
87 max göbel et al., "a methodology for evaluating algorithms for table understanding in pdf documents," in proceedings of the 2012 acm symposium on document engineering (september 2012): 45–48, https://doi.org/10.1145/2361354.2361365.
88 rastan, paik, and shepherd, "texus," 917.
89 david pinto et al., "table extraction using conditional random fields," in proceedings of the 26th annual international acm sigir conference on research and development in information retrieval (july 2003): 235–42, https://doi.org/10.1145/860435.860479.
90 nazemi, "non-visual representation of complex documents," 118–44; w3c, "wcag 2.0."
91 ullah et al., "current state of linked and open data in cataloging," 47, 48.
92 julius t. nganji, "the portable document format (pdf) accessibility practice of four journal publishers," library and information science research 37, no. 3 (2015): 254–62, https://doi.org/10.1016/j.lisr.2015.02.002.
93 julius t. nganji, "an assessment of the accessibility of pdf versions of selected journal articles published in a wcag 2.0 era (2014–2018)," learned publishing 31, no. 4 (2018): 391–401, https://doi.org/10.1002/leap.1197.
94 wittmann et al., "from digital library to open datasets," 49, 50.
95 yan han and xueheng wan, "digitization of text documents using pdf/a," information technology and libraries 37, no. 1 (2018): 52–64, https://doi.org/10.6017/ital.v37i1.9878.
96 asim ullah, shah khusro, and irfan ullah, "bibliographic classification in the digital age: current trends & future directions," information technology and libraries 36, no. 3 (2017): 48–77, https://doi.org/10.6017/ital.v36i3.8930.
97 xie et al., "using digital libraries non-visually," paper 673.
98 babu and xie, "haze in the digital library," 1057–59.
99 iris xie et al., "enhancing usability of digital libraries: designing help features to support blind and visually impaired users," information processing and management 57, no. 3 (2020): 102110, https://doi.org/10.1016/j.ipm.2019.102110.
100 chen et al., "colnet," 31, 32.
101 kim et al., "facilitating document reading," 432.
102 milosevic et al., "a framework for information extraction from tables," 61.
103 rizvi et al., "ontology-based information extraction," 496.
104 embley, seth, and nagy, "transforming web tables to a relational database," 2783; milosevic et al., "a framework for information extraction from tables," 60.
105 nicholas j. tierney and karthik ram, "a realistic guide to making data available alongside code to improve reproducibility," preprint, arxiv:2002.11626.
106 rastan, paik, and shepherd, "texus," 917.
107 nazemi, "non-visual representation of complex documents," 118–44; w3c, "wcag 2.0."
108 mexhid ferati and wondwossen m. beyene, "developing heuristics for evaluating the accessibility of digital library interfaces," in universal access in human–computer interaction, design and development approaches and methods, uahci 2017, lecture notes in computer science 10277 (springer, cham), https://doi.org/10.1007/978-3-319-58706-6_14.
109 ullah et al., "current state of linked and open data in cataloging," 64.

book review
free culture: how big media uses technology and the law to lock down culture and control creativity, by lawrence lessig. new york: penguin, 2004. 240p. $24.95 (isbn 1-59420-006-8).
this is the third book by stanford law professor larry lessig, and the third in which he furthers his basic theme: that the ancient regime of intellectual property owners is locked in a battle with the capabilities of new technology. lessig used his first book, code and other laws of cyberspace (basic books, 1999), to explain that the notion of cyberspace as free, open, and anarchic is simply a myth, and a dangerous one at that: the very architecture of our computers and how they communicate determine what one can and cannot do within that environment. if you can get control of that architecture, say by mandating filters on content, you can get substantial control over the culture of that communication space.
in his second book, the future of ideas: the fate of the commons in a connected world (random, 2001), lessig describes how the change from real property to virtual property actually means more opportunity for control, not less. the theme that he takes up in free culture is his concern that certain powerful interests in our society (read: hollywood) are using copyright law to lock down the very stuff of creativity: mainly, past creativity. lessig himself admits in his preface that his is not a new or unique argument. he cites richard stallman's writings in the mid-1980s that became the basis for the free software movement as containing many of the same concepts that lessig argues in his book. in this case, it serves as a kind of proof of concept (that new ideas build on past ideas) rather than a criticism of lack of originality. stallman's work is not, however, a substitute for lessig's; not only does lessig address popular culture where stallman addresses only computer code, but lessig has one key thing in his favor: he is a master storyteller and a darned good writer, not something one usually expects in an academic and an expert in constitutional law. his book opens with the first flight of the wright brothers and the death of a farmer's chickens, followed by buster keaton's film steamboat bill and disney's famous mouse. the next chapter traces the history of photography and how the law once considered that snapping a picture could require prior permission from the owners of any property caught in the viewfinder. later he tells how an improvement to a search engine led one college student to owe the recording industry association of america $15 million. throughout the book lessig illustrates copyright through the lives of real people and uses history, science, and the arts to make this law come to life for the reader. lessig explains that intellectual property differs from real property in the eye of the law. unlike real property, where the property owner has near total control over its uses, the only control offered to authors originally was the control over who could make copies of the work and distribute them. in addition, that right, the "copy right," lasted only a short time. the original length of copyright in the united states was fourteen years, with the right to renew for another fourteen years. so a total of twenty-eight years stood between an author's rights and the public domain, and those rights were limited to publishing copies. others could quote from a work, even derive other works from it (such as turning a novel into a play), all within a law that was designed to promote science and the arts. fast forward to the present day and we have a very different situation. not only has there been a change in the length of time that copyright applies to a work; a major change in copyright law in 1976 extended copyright to works that had not previously been covered. in the earliest u.s. copyright regimes of the late 18th century, only works that were registered with the copyright office were afforded the protection of copyright law, and only about five percent of works produced were so registered. the rest were in the public domain. later, actual registration with the copyright office was unnecessary but the author was required to place a copyright notice on a work (e.g., "© 2004, karen coyle") in order to claim copyright in it.
copyright holders had to renew works in order to make use of the full term of protection, and renewal rates were actually quite low. in 1976, all such requirements were removed, and the law was amended to state that any work in a fixed medium automatically receives copyright protection, and for the full term. that is true even if the author does not want that protection. so although many saw the great exchange of ideas and information on the internet as being a huge commons of knowledge, to be shared and shared alike, all of it has, in fact, always been covered by copyright law: every word out there belongs to someone. that change, combined with a much earlier change that gave a copyright holder control over derivative works, puts creators into a deadlock. they cannot safely build on the work of others without permission (thus lessig's argument that we are becoming a "permission culture"). yet, we have no mechanism (such as registration of works that would result in a database of creators) that would facilitate getting that permission. if you find a work on the internet and it has no named author or no contact information for the author, the law forbids you to reuse the work without permission, but there is nothing that would make getting that permission a manageable task. of course, even if you do know who the rights holder is, permission is not a given. for example, you hear a great song on the radio and want to use parts of that tune in your next rap performance. you would need to approach the major record label that holds the rights and ask permission, which might not be granted. you could go ahead and use the sample and, if challenged, claim "fair use." but being challenged means going to court in a world where a court case could cost you in the six digits, an amount of money that most creators do not have. lessig, of course, spends quite a bit of time in his book on the length of copyright, now life of the author plus seventy years. it was exactly this issue that he and eric eldred took to the supreme court in 2003. lessig argued before the court that if congress can seemingly arbitrarily increase the length of copyright, as it has eleven times since 1962, then there is effectively no limit to the copyright term. yet "for a limited time" was clearly mandated in the u.s. constitution. lessig lost his case. you might expect him to spend his efforts explaining how the supreme court was wrong and he was right, but that is not what he does. right or wrong, they are the supreme court, and his job was to convince them to decide in favor of his client. instead, lessig revises his estimation of what can be accomplished with constitutional arguments and spends a chapter outlining compromises that might, just might, be possible in the future. to the extent that eldred v. ashcroft had an effect on lessig's thinking, and there is evidence that the effect was profound, it will have an effect on all of us because lessig is one of the key actors in this arena. throughout the book, lessig points out the difference between copyright law and the actual market for works. there is a great irony in the fact that copyright law now protects works for a century or more while most books are in print for one year or less. it is this vast storehouse of out-of-print and unexploited works that makes a strong argument for some modification of our copyright law.
he also recognizes that there are different creative cultures in our society, with different views of the purpose of creation. here he cites academic movements like the public library of science as solutions for the sector of society that has a low or nonexistent commercial interest but a need to get its works as widely distributed as possible. for these creators, and for "sharers" everywhere, lessig promotes the creativecommons solution (at www.creativecommons.org), a simple licensing scheme that allows creators to attach a license to their work that lets others know how they can make use of it. in a sense, creativecommons is a way to opt out of the default copyright that is applied to all works. when i first received my copy of free culture, i did two things: i looked up libraries in the index, and i looked up the book online to see what other reviewers had said. online, i found a web site for the book (http://free-culture.org) that pointed to two very interesting sites: one that lists free, downloadable full-text copies of the book in over a dozen different formats; and one that allows you to listen to the chapters being read aloud by volunteers and admirers. (i did listen to a few chapters and generally they are as listenable as most nonfiction audio books. in the end, though, i read the hard copy of the book.) lessig is making a point by offering his work outside the usual confines of copyright law, but in fact the meaning of his gesture is more economic than legal. although he, and cory doctorow before him (down and out in the magic kingdom, tor books, 2003), brokered agreements with their publishers to publish simultaneously in print with free digital copies, few authors and publishers today will choose that option for fear of loss of revenue, not because of their belief in the sanctity of intellectual property. if there were sufficient proof that free online copies of works increased sales of hard copies, this would quickly become the norm, regardless of the state of copyright law. as for libraries, unfortunately, they do not fare well. he dedicates a short chapter to brewster kahle and his wayback machine as his example of the need to archive our culture for future access. i admit that i winced when lessig stated:
but kahle is not the only librarian. the internet archive is not the only archive. but kahle and the internet archive suggest what the future of libraries or archives could be. (114)
lessig also mentions libraries in his arguments about out-of-print and inaccessible works, but in this case he actually gets it wrong:
after it [a book] is out of print, it can be sold in used bookstores without the copyright owner getting anything and stored in libraries, where many get to read the book, also for free. (113)
since we know that lessig is very aware that books are sold and lent even while they are still in print, we have to assume that the elegance of the argument was preferred over precision. but he makes this error more than once in the book, leaving libraries to appear to be a home for leftovers and remaindered works. that is too bad. we know that lessig is aware of libraries; anyone active in the legal profession depends on them. he has spoken at library-related conferences and events. yet he does not see libraries as key players in the battle against overly powerful copyright interests. more to the point, libraries have not captured his imagination, or given him a good story to tell.
so here is a challenge for myself and my fellow librarians: whether it means chatting up lessig after one of his many public performances, becoming active in creativecommons, or stopping by palo alto to take a busy law professor to lunch, we need to make sure that we get on, and stay on, lessig's radar. we need him; he needs us. –karen coyle, digital libraries consultant, http://kcoyle.net

usability studies of faceted browsing: a literature review
jody condit fagan (faganjc@jmu.edu) is content interfaces coordinator, james madison university library, harrisonburg, virginia.

faceted browsing is a common feature of new library catalog interfaces. but to what extent does it improve user performance in searching within today's library catalog systems? this article reviews the literature for user studies involving faceted browsing and user studies of "next-generation" library catalogs that incorporate faceted browsing. both the results and the methods of these studies are analyzed by asking, what do we currently know about faceted browsing? how can we design better studies of faceted browsing in library catalogs? the article proposes methodological considerations for practicing librarians and provides examples of goals, tasks, and measurements for user studies of faceted browsing in library catalogs.

many libraries are now investigating possible new interfaces to their library catalogs. sometimes called "next-generation library catalogs" or "discovery tools," these new interfaces are often separate from existing integrated library systems. they seek to provide an improved experience for library patrons by offering a more modern look and feel, new features, and the potential to retrieve results from other major library systems such as article databases. one interesting feature these interfaces offer is called "faceted browsing." hearst defines facets as "a set of meaningful labels organized in such a way as to reflect the concepts relevant to a domain."1 labarre defines facets as representing "the categories, properties, attributes, characteristics, relations, functions or concepts that are central to the set of documents or entities being organized and which are of particular interest to the user group."2 faceted browsing offers the user relevant subcategories by which they can see an overview of results, then narrow their list. in library catalog interfaces, facets usually include authors, subjects, and formats, but may include any field that can be logically created from the marc record (see figure 1 for an example). using facets to structure information is not new to librarians and information scientists. as early as 1955, the classification research group stated a desire to see faceted classification as the basis for all information retrieval.3 in 1960, ranganathan introduced facet analysis to our profession.4 librarians like metadata because they know its power, and facets can showcase metadata in new interfaces. according to mcguinness, facets perform several functions in an interface:
■■ vocabulary control
■■ site navigation and support
■■ overview provision and expectation setting
■■ browsing support
■■ searching support
■■ disambiguation support5
these functions offer several potential advantages to the user: the functions use category systems that are coherent and complete, they are predictable, they show previews of where to go next, they show how to return to previous states, they suggest logical alternatives, and they help the user avoid empty result sets as searches are narrowed.6 disadvantages include the fact that categories of interest must be known in advance, important trends may not be shown, category structures may need to be built by hand, and automated assignment is only partly successful.7 library catalog records, of course, already supply "categories of interest" and a category structure. information science research has shown benefits to users from faceted search interfaces. but do these benefits hold true for systems as complex as library catalogs? this paper presents an extensive review of both information science and library literature related to faceted browsing.

■■ method
to find articles in the library and information science literature related to faceted browsing, the author searched the association for computing machinery (acm) digital library, scopus, and library and information science and technology abstracts (lista) databases. in scopus and the acm digital library, the most successful searches included the following:
■■ (facet* or cluster*) and (usability or user stud*)
■■ facet* and usability
in lista, the most successful searches included combining product names such as "aquabrowser" with "usability." the search "catalog and usability" was also used. the author also searched google and the next generation catalogs for libraries (ngc4lib) electronic discussion list in an attempt to find unpublished studies. search terms initially included the concept of "clustering"; however, this was quickly shown to be a clearly defined, separate topic. according to hearst, "clustering refers to the grouping of items according to some measure of similarity . . . typically computed using associations and commonalities among features where features are typically words and phrases."8 using library catalog keywords to generate word clouds would be an example of clustering, as opposed to using subject headings to group items. clustering has some advantages according to hearst. it is fully automated, it is easily applied to any text collection, it can reveal unexpected or new trends, and it can clarify or sharpen vague queries. disadvantages to clustering include possible imperfections in the clustering algorithm, similar items not always being grouped into one cluster, a lack of predictability, conflating many dimensions, difficulty labeling groups, and counterintuitive subhierarchies.9 in user studies comparing clustering with facets, pratt, hearst, and fagan showed that users find clustering difficult to interpret and prefer a predictable organization of category hierarchies.10

■■ results
the author grouped the literature into two categories: user studies of faceted browsing and user studies of library catalog interfaces that include faceted browsing as a feature. generally speaking, the information science literature consisted of empirical studies of interfaces created by the researchers. in some cases, the researchers' intent was to create and refine an interface intended for actual use; in others, the researchers created the interface only for the purposes of studying a specific aspect of user behavior. in the library literature, the studies found were generally qualitative usability studies of specific library catalog interface products. libraries had either implemented a new product, or they were thinking about doing so and performed a user study to inform their decision.

[figure 1. faceted results from jmu's vufind implementation]

results: empirical studies of faceted browsing
the following summaries present selected empirical research studies that had significant findings related to faceted browsing or interesting methods for such studies. it is not an exhaustive list. pratt, hearst, and fagan questioned whether faceted results were better than clustering or relevancy-ranked results.11 they studied fifteen breast-cancer patients and families. every subject used three tools: a faceted interface, a tool that clustered the search results, and a tool that ranked the search results according to relevance criteria. the subjects were given three simple queries related to breast cancer (e.g., "what are the ways to prevent breast cancer?"), asked to list answers to these before beginning, and to answer the same queries after using all the tools. in this study, subjects completed two timed tasks. first, subjects found as many answers as possible to the question in four minutes. second, the researchers measured the time subjects took to find answers to two specific questions (e.g., "can diet be used in the prevention of breast cancer?") that related to the original, general query. for the first task, when the subjects used the faceted interface, they found more answers than they did with the other two tools. the mean number of answers found using the faceted interface was 7.80, for the cluster tool it was 4.53, and for the ranking tool it was 5.60. this difference was significant (p < 0.05).12 for the second task, the researchers found no significant difference between the tools when comparing time on task. the researchers gave the subjects a user-satisfaction questionnaire at the end of the study. on thirteen of the fourteen quantitative questions, satisfaction scores for the faceted interface were much higher than they were for either the ranking tool or the cluster tool. this difference was statistically significant (p < 0.05). all fifteen users also affirmed that the faceted interface made sense, was helpful, was useful, and had clear labels, and said they would use the faceted interface again for another search.

yee et al. studied the use of faceted metadata for image searching and browsing using an interface they developed called flamenco.13 they collected data from thirty-two participants who were regular users of the internet, searching for information either every day or a few times a week. their subjects performed four tasks (two structured and two unstructured) on each of two interfaces. an example of an unstructured task from their study was "search for images of interest." an example of a structured task was to gather materials for an art history essay on a topic given by the researchers and to complete four related subtasks. the researchers designed the structured task so they knew exactly how many relevant results were in the system. they also gave a satisfaction survey. more participants were able to retrieve all relevant results with the faceted interface than with the baseline interface. during the structured tasks, participants received empty results with the baseline interface more than three times as often as with the faceted interface.14 the researchers found that participants constructed queries from multiple facets in the unstructured tasks 19 percent of the time and in the structured tasks 45 percent of the time.15 when given a post-test survey, participants identified the faceted interface as easier to use, more flexible, interesting, enjoyable, simple, and easy to browse. they also rated it as slightly more "overwhelming." when asked to choose between the two, twenty-nine participants chose the faceted interface, compared with two who chose the baseline (n = 31).

uddin and janacek asked nineteen users (staff and students at the asian institute of technology) to use a website search engine with both a traditional results list and a faceted results list.22 tasks were as follows: (1) look for scholarship information for a masters program, (2) look for staff recruitment information, and (3) look for research and associated faculty member information within your interested area.23 they found that users were faster when using the faceted system, significantly so for two of the three tasks. success in finding relevant results was higher with the faceted system. in the post-study questionnaire, participants rated the faceted system more highly, including significantly higher ratings for flexibility, interest, understanding of information content, and more search results relevancy. participants rated the most useful features to be the capability to switch from one facet to another, preview the result set, combine facets, and navigate via breadcrumbs.

capra et al. compared three interfaces in use by the bureau of labor statistics website, using a between-subjects study with twenty-eight people and a within-subjects study with twelve people.24 each set of participants performed three kinds of searches: simple lookup, complex lookup, and exploratory. the researchers used an interesting strategy to help control the variables in their study:
because the bls website is a highly specialized corpus devoted to economic data in the united states organized across very specific time periods (e.g., monthly releases of price or employment data), we decided to include the us as a geographic facet and a month or year as a temporal facet to provide context for all search tasks in our study. thus, the simple lookup tasks were constructed around a single economic facet but also included the spatial and temporal facets to provide context for the searchers. the complex lookup tasks involve additional facets including genre (e.g. press release) and/or region.25
capra et al. found that users preferred the familiarity afforded by the traditional website interface (hyperlinks + keyword search) but listed the facets on the two experimental interfaces as their best features. the researchers concluded, "if there is a predominant model of the information space, a well designed hierarchical organization might be preferred."26

zhang and marchionini analyzed results from fifteen undergraduate and graduate students in a usability study of an interface that used facets to categorize results (relation browser++).27 there were three types of tasks:
■■ type 1: simple look-up task (three tasks such as "check if the movie titled the matrix is in the library movie collection").
■■ type 2: data exploration and analysis tasks (six tasks
thirty-one of the thirty-two participants said the faceted interface helped them learn more, and twentyeight of them said it would be more useful for their usual tasks.16 the researchers concluded that even though their faceted interface was much slower than the other, it was strongly preferred by most study participants: “these results indicate that a category-based approach is a successful way to provide access to image collections.”17 in a related usability study on the flamenco interface, english et al. compared two image browsing interfaces in a nineteen-participant study.18 after an initial search, the “matrix view” interface showed a left column with facets, with the images in the result set placed in the main area of the screen. from this intermediary screen, the user could select multiple terms from facets in any order and have the items grouped under any facet. the “singletree” interface listed subcategories of the currently selected term at the top, with query previews underneath. the user could then only drill down to subcategories of the current category, and could not select terms from more than one facet. the researchers found that a majority of participants preferred the “power” and “flexibility” of matrix to the simplicity of singletree. they found it easier to refine and expand searches, shift between searches, and troubleshoot research problems. they did prefer singletree for locating a specific image, but matrix was preferred for browsing and exploring. participants started over only 0.2 percent of the time for the matrix compared to 4.5 percent for singletree.19 yet the faceted interface, matrix, was not “better” at everything. for specific image searching, participants found the correct image only 22.0 percent of the time in matrix compared to 66.0 percent in singletree.20 also, in matrix, some participants drilled down in the wrong hierarchy with wrong assumptions. one interesting finding was that in both interfaces, more participants chose to begin by browsing (12.7 percent) than by searching (5.0 percent).21 usability studies of faceted browsing: a literature review | fagan 61 of the first two studies: the first study comprised one faculty member, five graduate students, and two undergraduate students; the second comprised two faculty members, four graduate students, and two undergraduate students. the third study did not report results related to faceted browsing and is not discussed here. the first study had seven scenarios; the second study had nine. the scenarios were complex: for example, one scenario began, “you want to borrow shakespeare’s play, the tempest, from the library,” but contained the following subtasks as well: 1. find the tempest. 2. find multiple editions of this item. 3. find a recent version. 4. see if at least one of the editions is available in the library. 5. what is the call number of the book? 6. you’d like to print the details of this edition of the book so you can refer to it later. participants found the interface friendly, easy to use, and easy to learn. all the participants reported that faceted browsing was useful as a means of narrowing down the result lists, and they considered this tool one of the differentiating features between primo and their library opac or other interfaces. 
facets were clear, intuitive, and useful to all participants, including opening the “more” section.31 one specific result from the tests was that “online resources” and “available” limiters were moved from a separate location to the right with all other facets.32 in a study of aquabrowser by olson, twelve subjects— all graduate students in the humanities—participated in a comparative test in which they looked for additional sources for their dissertation.33 aquabrowser was created by medialab but is distributed by serials solutions in north america. this study also had three pilot subjects. no relevance judgments were made by the researchers. nine of the twelve subjects found relevant materials by using aquabrowser that they had not found before.34 olson’s subjects understood facets as a refinement tool (narrowing) and had a clear idea of which facets were useful and not useful for them. they gave overwhelmingly positive comments. only two felt the faceted interface was not an improvement. some participants wanted to limit to multiple languages or dates, and a few were confused about the location of facets in multiple places, for example, “music” under both format and topic. a team at yale university, led by bauer, recently conducted two tests on pilot vufind installations: a subject-based presentation of e-books for the cushing/ whitney medical library and a pilot test of vufind using undergraduate students with a sample of 400,000 records from the library system.35 vufind is open-source software developed at villanova university (http://vufind.org). that require users to understand and make sense of the information collection: “in which decade did steven spielberg direct the most movies?”). ■■ type 3: (one free exploration task: “find five favorite videos without any time constraints”). the tasks assigned for the two interfaces were different but comparable. for type 2 tasks, zhang and marchionini found that performance differences between the two interfaces were all statistically significant at the .05 level.28 no participants got wrong answers for any but one of the tasks using the faceted interface. with regard to satisfaction, on the exploratory tasks the researchers found statistically significant differences favoring the faceted interface on all three of the satisfaction questions. participants found the faceted interface not as aesthetically appealing nor as intuitive to use as the basic interface. two participants were confused by the constant changing and updating of the faceted interface. the above studies are examples of empirical investigations of experimental interfaces. hearst recently concluded that facets are a “proven technique for supporting exploration and discovery” and summarized areas for further research in this area, such as applying facets to large “subject-oriented category systems,” facets on mobile interfaces, adding smart features like “autocomplete” to facets, allowing keyword search terms to affect order of facets, and visualizations of facets.29 in the following section, user studies of next-generation library catalog interfaces will be presented. results: library literature understandably, most studies by practicing librarians focus on products their libraries are considering for eventual use. these studies all use real library catalog records, usually the entire catalog’s database. in most cases, these studies were not focused on investigating faceted browsing per se, but on the usability of the overall interface. 
in general, these studies used fewer participants than the information science studies above, followed less rigorous methods, and were not subjected to statistical tests. nevertheless, they provide many insights into the user experience with the extremely complex datasets underneath next-generation library catalog interfaces that feature faceted browsing. in this review article, only results specifically relating to faceted browsing will be presented. sadeh described a series of usability studies performed at the university of minnesota (um), a primo development partner.30 primo is the next-generation library catalog product sold by ex libris. the author also received additional information from the usability services lab at um via e-mail. three studies were conducted in august 2006, january 2007, and october 2007. eight users from various disciplines participated in each 62 information technology and libraries | june 2010 participants. the researchers measured task success, duration, and difficulty, but did not measure user satisfaction. their study consisted of four known-item tasks and six topic-searching tasks. the topic-searching tasks were geared toward the use of facets, for example, “can you show me how would you find the most recently published book about nuclear energy policy in the united states?”45 all five participants using endeca understood the idea of facets, and three used them. students tried to limit their searches at the outset rather than search and then refine results. an interesting finding was that use of the facets did not directly follow the order in which facets were listed. the most heavily used facet was library of congress classification (lcc), followed closely by topic, and then library, format, author, and genre.46 results showed a significantly shorter average task duration for endeca catalog users for most tasks.47 the researchers noted that none of the students understood that the lcc facet represented call-number ranges, but all of the students understood that these facets “could be used to learn about a topic from different aspects—science, medicine, education.”48 the authors could find no published studies relating to the use of facets in some next-generation library catalogs, including encore and worldcat local. although the university of washington did publish results of a worldcat local usability study in a recent issue of library technology reports, results from the second round of testing, which included an investigation of facets, were not yet ready.49 ■■ discussion summary of empirical evidence related to faceted browsing empirical studies in the information science literature support many positive findings related to faceted browsing and build a solid case for including facets in search interfaces: ■■ facets are useful for creating navigation structures.50 ■■ faceted categorization greatly facilitates efficient retrieval in database searching.51 ■■ facets help avoid dead ends.52 ■■ users are faster when using a faceted system.53 ■■ success in finding relevant results is higher with a faceted system.54 ■■ users find more results with a faceted system.55 ■■ users also seem to like facets, although they do not always immediately have a positive reaction. ■■ users prefer search results organized into predictable, multidimensional hierarchies.56 ■■ participants’ satisfaction is higher with a faceted system.57 the team drew test questions from user search logs in their current library system. 
some questions targeted specific problems, such as incomplete spellings and incomplete title information. bauer notes that some problems uncovered in the study may relate to the peculiarities of the yale implementation. the medical library study contained eight participants—a mix of medical and nursing students. facets, reported bauer, “worked well in several instances, although some participants did not think they were noticeable on the right side of the page.”36 the prompt for the faceted task in this study came after the user had done a search: “what if you wanted to look at a particular subset, say ‘xxx’ (determine by looking at the facets).”37 half of the participants used facets, half used “search within” to narrow the topic by adding keywords. sixty-two percent of the participants were successful at this task. the undergraduate study asked five participants faced with a results list, “what would you do now if you only wanted to see material written by john adams?”38 on this task, only one of the five was successful, even though the author’s name was on the screen. bauer noted that in general, “the use of the topic facet to narrow the search was not understood by most participants. . . . even when participants tried to use topic facets the length of the list and extraneous topics rendered them less than useful.”39 the five undergraduates were also asked, “could you find books in this set of results that are about health and illness in the united states population, or control of communicable diseases during the era of the depression?”40 again, only one of the five was successful. bauer notes that “the overly broad search results made this difficult for participants. again, topic facets were difficult to navigate and not particularly useful to this search.”41 bauer’s team noted that when the search was configured to return more hits, “topic facets become a confusingly large set of unrelated items. these imprecise search results, combined with poor topic facet sets, seemed to result in confusion for test participants.”42 participants were not aware that topics represented subsets, although learning occurred because the “narrow” header was helpful to some participants.43 other results found by bauer’s team were that participants were intrigued by facets, navigation tools are needed so that patrons may reorder large sets of topic facets, format and era facets were useful to participants, and call-number facets were not used by anyone. antelman, pace, and lynema studied north carolina state university’s (ncsu) next-generation library catalog, which is driven by software from endeca.44 their study used ten undergraduate students in a between-subjects design where five used the endeca catalog and five used the library’s traditional catalog. the researchers noted that their participants may have been experienced with the library’s old catalog, as log data shows most ncsu users enter one or two terms, which was not true of study usability studies of faceted browsing: a literature review | fagan 63 one product’s faceted system for a library catalog does not substitute for another, the size and scope of local collections may greatly affect results, and cataloging practices and metadata will affect results. still, it is important for practicing librarians to determine if new features such as facets truly improve the user’s experience. 
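before turning to methodological best practices, a minimal sketch of the kind of benchmark comparison the empirical studies above rely on: comparing task-completion times between a faceted and a traditional interface in a between-subjects design. the data are hypothetical, and a two-sample t-test is only one reasonable choice, not the method used in any study cited here; the sketch assumes scipy is available.

```python
# hypothetical task-completion times (seconds) from a between-subjects study:
# one group used the faceted interface, a separate group used the traditional one.
from statistics import mean
from scipy import stats  # assumes scipy is installed

faceted_times = [48, 52, 61, 45, 70, 55, 58, 49, 66, 53]
traditional_times = [75, 80, 69, 92, 71, 88, 77, 83, 95, 72]

t_stat, p_value = stats.ttest_ind(faceted_times, traditional_times)

print(f"mean (faceted):     {mean(faceted_times):.1f} s")
print(f"mean (traditional): {mean(traditional_times):.1f} s")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("difference is significant at the 0.05 level")
```

for a within-subjects design, where the same participants use both interfaces, a paired test (for example stats.ttest_rel) would be the analogous choice.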
methodological best practices after reading numerous empirical research studies (some of which critique their own methods) and library case studies, some suggestions for designing better studies of facets in library catalogs emerged. designing the study ■■ consider reusing protocols from previous studies. this provides not only a tested method but also a possible point of comparison. ■■ define clear goals for each study and focus on specific research questions. it’s tempting to just throw the user into the interface and see what happens, but this makes it difficult, if not impossible, to analyze the results in a useful way. for example, one of zhang and marchionini’s hypotheses specifically describes what rich interaction would look like: “typing in keywords and clicking visual bars to filter results would be used frequently and interchangeably by the users to finish complex search tasks, especially when large numbers of results are returned.”64 ■■ develop the study for one type of user. olson’s focus on graduate students in the dissertation process allowed the researchers to control for variables such as interest of and knowledge about the subject. ■■ pilot test the study with a student worker or colleague to iron out potential wrinkles. ■■ let users explore the system for a short time and possibly complete one highly structured task to help the user become used to the test environment, interface, and facilitator.65 unless you are truly interested in the very first experience users have with a system, the first use of a system is an artificial case. designing tasks ■■ make sure user performance on each task is measurable. will you measure the time spent on a task? if “success” is important, define what that would look like. for example, english et al. defined success for one of their tasks as when “the participant indicated (within the allotted time) that he/she had reached an appropriate set of images/specific image in the collection.”66 ■■ establish benchmarks for comparison. one can test for significant differences between interfaces, one can test for differences between research subjects and an expert user, and one can simply measure against ■■ users are more confident with a faceted system.58 ■■ users may prefer the familiarity afforded by traditional website interface (hyperlinks + keyword search).59 ■■ initial reactions to the faceted interface may be cautious, seeing it as different or unfamiliar.60 users interact with specific characteristics of faceted interfaces, and they go beyond just one click with facets when it is permitted. english et al. found that 7 percent of their participants expanded facets by removing a term, and that facets were used more than “keyword search within”: 27.6 percent versus 9 percent.61 yee et al. found that participants construct queries from multiple facets 19 percent of the time in unstructured tasks; in structured tasks they do so 45 percent of the time.62 the above studies did not use library catalogs; in most cases they used an experimental interface with record sets that were much smaller and less complicated than in a complete library collection. domains included websites, information from one website, image collections, video collections, and a journal article collection. summary of practical user studies related to faceted browsing this review also included studies from practicing librarians at live library implementations. 
these studies generally had smaller numbers of users, were more likely to focus on the entire interface rather than a few features, and chose more widely divergent methods. studies were usually linked to a specific product, and results varied widely between systems and studies. for this reason it is difficult to assemble a bulleted summary as with the previous section. the variety of results from these studies indicate that when faceted browsing is applied to a reallife situation, implementation details can greatly affect user performance and user preference. some, like labarre, are skeptical about whether facets are appropriate for library information. descriptions of library materials, says labarre, include analyses of intellectual content that go beyond the descriptive terms assigned to commercial items such as a laptop: now is the time to question the assumptions that are embedded in these commercial systems that were primarily designed to provide access to concrete items through descriptions in order to enhance profit.63 it is clear that an evaluation of commercial interfaces or experimental interfaces does not substitute for an opac evaluation. yet it is a challenge for libraries to find expertise and resources to conduct user studies. the systems they want to test are large and complex. collaborating with other libraries has its own challenges: an evaluation of 64 information technology and libraries | june 2010 groups of participants, each of which tests a different system. ■❏ a within-subjects design has one group of participants test both systems. it is hoped that if libraries use the suggestions above when designing future experiments, results across studies will be more comparable and useful. designing user studies of faceted browsing after examining both empirical research studies and case studies by practicing librarians, a key difference seems to be the specificity of research questions and designing tasks and measurements to test specific hypotheses. while describing a full user-study protocol for investigating faceted browsing in a library catalog is beyond the scope of this article, reviewing the literature and the study methods it describes provided insights into how hypotheses, tasks, and measurements could be written to provide more reliable and comparable evidence related to faceted browsing in library catalog systems. for example, one research question could surround the format facet: “compared with our current interface, does our new faceted interface improve the user’s ability to find different formats of materials?” hypotheses could include the following: 1. users will be more accurate when identifying the formats of items from their result set when using the faceted interface than when using the traditional interface. 2. users will be able to identify formats of items more quickly with the faceted interface than with the traditional interface. looking at these hypotheses, here is a prompt and some example tasks the participants would be asked to perform: “we will be asking you to find a variety of formats of materials. when we say formats of materials, we mean books, journal articles, videos, etc.” ■■ task 1: please use interface a to search on “interpersonal communication.” look at your results set. please list as many different formats of material as you can. ■■ task 2: how many items of each format are there? ■■ task 3: please use interface b to search on “family communication.” what formats of materials do you see in your results set? 
■■ task 4: how many items of each format are there?” we would choose the topics “interpersonal communication” and “family communication” because our local catalog has many material types for these topics and because these topics would be understood by most of our students. we would choose different topics to expectations or against previous iterations of the same study. for example, “75 percent of users completed the task within five minutes.” zhang and marchionini measured error rates, another possible benchmark.67 ■■ consider looking at your existing opac logs for zeroresults searches or other issues that might inspire interesting questions. ■■ target tasks to avoid distracters. for example, if your catalog has a glut of government documents, consider running the test with a limit set to exclude them unless you are specifically interested in their impact. for example, capra et al. decided to include the united states as a geographic facet and a month or year as a temporal facet to provide context for all search tasks in their study.68 ■■ for some tasks, give the subjects simple queries (e.g., “what are the ways to prevent breast cancer?”) as opposed to asking the subjects to come up with their own topic. this can help control for the potential challenges of formulating one’s own research question on the spot. as librarians know, formulating a good research question is its own challenge. ■■ if you are using any timed tasks, consider how the nature of your tasks could affect the result. for example, pratt, hearst, and fagan noted that the time that it took subjects to read and understand abstracts most heavily influenced the time for them to find an answer.69 english et al. found that the system’s processing time influenced their results.70 ■■ consider the implications of your local implementation carefully when designing your study. at yale, the team chose to point their vufind instance at just 400,000 of their records, drew questions from problems users were having (as shown in log files), and targeted questions to these problems.71 who to study? ■■ try to study a larger set of users. it is better to create a short test with many users than a long test with a few users. nielsen suggests that twenty users is sufficient.72 consider collaborating with another library if necessary. ■■ if you test a small number, such as the typical four to eight users for a usability test, be sure you emphasize that your results are not generalizable. ■■ use subjects who are already interested in the subject domain: for example, pratt, hearst, and fagan used breast cancer patients,73 and olson used graduate students currently writing their dissertations.74 ■■ consider focusing on advanced or scholarly users. la barre suggests that undergraduates may be overstudied.75 ■■ for comparative studies, consider having both between-subjects and within-subjects designs.76 ■❏ a between-subjects design involves creating two usability studies of faceted browsing: a literature review | fagan 65 these experimental studies. previous case-study investigations of library catalog interfaces with facets have proven inconclusive. by choosing more specific research questions, tasks, and measurements for user studies, libraries may be able to design more objective studies and compare results more effectively. references 1. marti a. hearst, “clustering versus faceted categories for information exploration,” communications of the acm 49, no. 4 (2006): 60. 2. 
kathryn la barre, “faceted navigation and browsing features in new opacs: robust support for scholarly information seeking?” knowledge organization 34, no. 2 (2007): 82. 3. vanda broughton, “the need for faceted classification as the basis of all methods of information retrieval,” aslib proceedings 58, no. 1/2 (2006): 49–71. 4. s. r. ranganathan, colon classification basic classification, 6th ed. (new york: asia, 1960). 5. deborah l. mcguinness, “ontologies come of age,” in spinning the semantic web: bringing the world wide web to its full potential, ed. dieter fensel et al. (cambridge, mass.: mit pr., 2003): 179–84. 6. hearst, “clustering versus faceted categories,” 60. 7. ibid., 61. 8. ibid., 59. 9. ibid.. 60. 10. wanda pratt, marti a. hearst, and lawrence m. fagan, “a knowledge-based approach to organizing retrieved documents,” proceedings of the sixteenth national conference on artificial intelligence, july 18–22, 1999, orlando, florida (menlo park, calif.: aaai pr., 1999): 80–85. 11. ibid. 12. ibid., 5. 13. ka-ping yee et al., “faceted metadata for image search and browsing,” 2003, http://flamenco.berkeley.edu/papers/ flamenco-chi03.pdf (accessed oct. 6, 2008). 14. ibid., 6. 15. ibid., 7. 16. ibid. 17. ibid., 8. 18. jennifer english et al., “flexible search and navigation,” 2002, http://flamenco.berkeley.edu/papers/flamenco02.pdf (accessed apr. 22, 2010). 19. ibid., 7. 20. ibid., 6. 21. ibid., 7. 22. mohammed nasir uddin and paul janecek, “performance and usability testing of multidimensional taxonomy in web site search and navigation,” performance measurement and metrics 8, no. 1 (2007): 18–33. 23. ibid., 25. 24. robert capra et al., “effects of structure and interaction style on distinct search tasks,” proceedings of the 7th acm-ieee-cs joint conference on digital libraries (new york: acm, 2007): 442–51. 25. ibid., 446. 26. ibid., 450. help minimize learning effects. to further address this, we would plan to have half our users start first with the traditional interface and half to start first with the faceted interface. this way we can test for differences resulting from learning. the above tasks would allow us to measure several pieces of evidence to support or reject our hypotheses. for tasks 1 and 3, we would measure the number of formats correctly identified by users compared with the number found by an expert searcher. for tasks 2 and 4, we would compare the number of items correctly identified with the total items found in each category by an expert searcher. we could also time the user to determine which interface helped them work more quickly. in addition to measuring the number of formats identified and the number of items identified in each format, we would be able to measure the time it takes users to identify the number of formats and the number of items in each format. to measure user satisfaction, we would ask participants to complete the system usability scale (sus) after each interface and, at the very end of the study, complete a questionnaire comparing the two interfaces. even just selecting the format facet, we would have plenty to investigate. other hypotheses and tasks could be developed for other facet types, such as time period or publication date, or facets related to the responsible parties, such as author or director: hypothesis: users can find more materials written in a certain time period using the faceted interface. 
task: find ten items of any type (books, journals, movies) written in the 1950s that you think would have information about television advertising. hypothesis: users can find movies directed by a specific person more quickly using the faceted interface. task: in the next two minutes, find as many movies as you can that were directed by orson welles. for the first task above, an expert searcher could complete the same task, and their time could be used as a point of comparison. for the second, the total number of movies in the library catalog that were directed by welles is an objective quantity. in both cases, one could compare the user’s performance on the two interfaces. ■■ conclusion reviewing user studies about faceted browsing revealed empirical evidence that faceted browsing improves user performance. yet this evidence does not necessarily point directly to user success in faceted library catalogs, which have much more complex databases than those used in 66 information technology and libraries | june 2010 53. uddin and janecek, “performance and usability testing”; zhang and marchionini, evaluation and evolution; hao chen and susan dumais, bringing order to the web: automatically categorizing search results (new york: acm, 2000): 145–52. 54. uddin and janecek, “performance and usability testing.” 55. ibid.; pratt, hearst, and fagan, “a knowledge-based approach”; hsinchun chen et al., “internet browsing and searching: user evaluations of category map and concept space techniques,” journal of the american society for information science 49, no. 7 (1998): 582–603. 56. vanda broughton, “the need for faceted classification as the basis of all methods of information retrieval,” aslib proceedings 58, no. 1/2 (2006): 49–71; pratt, hearst, and fagan, “a knowledge-based approach,” 80–85.; chen et al., “internet browsing and searching,” 582–603; yee et al., “faceted metadata for image search and browsing”; english et al., “flexible search and navigation using faceted metadata.” 57. uddin and janecek, “performance and usability testing”; zhang and marchionini, evaluation and evolution; hideo joho and joemon m. jose, slicing and dicing the information space using local contexts (new york: acm, 2006): 66–74.; yee et al., “faceted metadata for image search and browsing.” 58. yee et al., “faceted metadata for image search and browsing”; chen and dumais, bringing order to the web. 59. capra et al., “effects of structure and interaction style.” 60. yee et al., “faceted metadata for image search and browsing”; capra et al., “effects of structure and interaction style”; zhang and marchionini, evaluation and evolution. 61. english et al., “flexible search and navigation,” 7. 62. yee et al., “faceted metadata for image search and browsing,” 7. 63. la barre, “faceted navigation and browsing,” 85. 64. zhang and marchionini, evaluation and evolution, 183. 65. english et al., “flexible search and navigation.” 66. ibid., 6. 67. zhang and marchionini, evaluation and evolution. 68. capra et al., “effects of structure and interaction style.” 69. pratt, hearst, and fagan, “a knowledge-based approach.” 70. english et al., “flexible search and navigation.” 71. bauer, “yale university library vufind test—undergraduates.” 72. jakob nielsen, “quantitative studies: how many users to test?” online posting, alertbox, june 26, 2006 http://www.useit .com/alertbox/quantitative_testing.html (accessed apr. 7, 2010). 73. pratt, hearst, and fagan, “a knowledge-based approach.” 74. tod a. 
olson used graduate students currently writing their dissertations. olson, “utility of a faceted catalog for scholarly research,” library hi tech 25, no. 4 (2007): 550–61. 75. la barre, “faceted navigation and browsing.” 76. capra et al., “effects of structure and interaction style.” 27. junliang zhang and gary marchionini, evaluation and evolution of a browse and search interface: relation browser++ (atlanta, ga.: digital government society of north america, 2005): 179–88. 28. ibid., 183. 29. marti a. hearst, “uis for faceted navigation: recent advances and remaining open problems,” 2008, http://people. ischool.berkeley.edu/~hearst/papers/hcir08.pdf (accessed apr. 27, 2010). 30. tamar sadeh, “user experience in the library: a case study,” new library world 109, no. 1/2 (jan. 2008): 7–24. 31. ibid., 22. 32. jerilyn veldof, e-mail from university of minnesota usability services lab, 2008. 33. tod a. olson, “utility of a faceted catalog for scholarly research,” library hi tech 25, no. 4 (2007): 550–61. 34. ibid., 555. 35. kathleen bauer, “yale university library vufind test— undergraduates,” may 20, 2008, http://www.library.yale.edu/ usability/studies/summary_undergraduate.doc (accessed apr. 27, 2010); kathleen bauer and alice peterson-hart, “usability test of vufind as a subject-based display of ebooks,” aug. 21, 2008, http://www.library.yale.edu/usability/studies/summary _medical.doc (accessed apr. 27, 2010). 36. bauer and peterson-hart, “usability test of vufind as a subject-based display of ebooks,” 1. 37. ibid., 2. 38. ibid., 3. 39. ibid. 40. ibid., 4. 41. ibid. 42. ibid., 5. 43. ibid., 8. 44. kristin antelman, andrew k. pace, and emily lynema, “toward a twenty-first century library catalog,” information technology & libraries 25, no. 3 (2006): 128–39. 45. ibid., 139. 46. ibid., 133. 47. ibid., 135. 48. ibid., 136. 49. jennifer l. ward, steve shadle, and pam mofield, “user experience, feedback, and testing,” library technology reports 44, no. 6 (aug. 2008): 22. 50. english et al., “flexible search and navigation.” 51. peter ingwersen and irene wormell, “ranganathan in the perspective of advanced information retrieval,” libri 42 (1992): 184–201; winfried godert, “facet classification in online retrieval,” international classification 18, no. 2 (1991): 98–109.; w. godert, “klassificationssysteme und online-katalog [classification systems and the online catalogue],” zeitschrift für bibliothekswesen und bibliographie 34, no. 3 (1987): 185–95. 52. yee et al., “faceted metadata for image search and browsing”; english et al., “flexible search and navigation.” microsoft word ital_march_gerrity.docx editor’s comments bob gerrity information technology and libraries | march 2013 1 with this issue, information technology and libraries (ital) begins its second year as an open-‐ access, e-‐only publication. there have been a couple of technical hiccups related to the publication of back issues of ital previously only available in print: the publication system we’re using (open journal system) treats the back issues as new content and automatically sends notifications to readers who have signed up to be notified when new content is available. we’re working to correct that glitch, but hope that the benefit of having the full ital archive online will outweigh the inconvenience of the extra e-‐mail notifications. overall though, ital continues to chug along and the wheels aren’t in danger of falling off any time soon. 
thanks go to mary taylor, the lita board, and the lita publications committee for supporting the move to the new model for ital. readership this year appears to be healthy—the total download count for the thirty-three articles published in 2012 was 42,166, with 48,160 abstract views. unfortunately we don't have statistics about online use from previous years to compare with. the overall number of article downloads for 2012, for new and archival content, was 74,924. we continue to add to the online archive: this month the first issues from march 1969 and march 1981 were added. if you haven't taken the opportunity to look, the back issues offer an interesting reminder of the technology challenges our predecessors faced. in this month's issue, ital editorial board member patrick "tod" colegrove reflects on the emergence of the makerspace phenomenon in libraries, providing an overview of the makerspace landscape. lita member danielle becker and lauren yannotta describe the user-centered website redesign process used at the hunter college libraries. kathleen weessies and daniel dotson describe gis lite and provide examples of its use at the michigan state university libraries. vandana singh presents guidelines for adopting an open-source integrated library system, based on findings from interviews with staff at libraries that have adopted open-source systems. danijela boberić krstićev from the university of novi sad describes a software methodology enabling sharing of information between different library systems, using the z39.50 and sru protocols. beginning with the june issue of ital, articles will be published individually as soon as they are ready. ital issues will still close on a quarterly basis, in march, june, september, and december. by publishing articles individually as they are ready, we hope to make ital content more timely and reduce the overall length of time for our peer-review and publication processes. suggestions and feedback are welcome, at the e-mail address below. bob gerrity (r.gerrity@uq.edu.au) is university librarian, university of queensland, australia. lynne weber and peg lawrence authentication and access: accommodating public users in an academic world in cook and shelton's managing public computing, which confirmed the lack of applicable guidelines on academic websites, had more up-to-date information but was not available to the researchers at the time the project was initiated.2 in the course of research, the authors developed the following questions: ■■ how many arl libraries require affiliated users to log into public computer workstations within the library? ■■ how many arl libraries provide the means to authenticate guest users and allow them to log on to the same computers used by affiliates? ■■ how many arl libraries offer open-access computers for guests to use? do these libraries provide both open-access computers and the means for guest user authentication? ■■ how do federal depository library program libraries balance their policy requiring computer authentication with the obligation to provide public access to government information? ■■ do computers provided for guest use (open access or guest login) provide different software or capabilities than those provided to affiliated users? ■■ how many arl libraries have written policies for the use of open-access computers? if a policy exists, what is it? ■■ how many arl libraries have written policies for authenticating guest users?
if a policy exists, what is it? ■■ literature review since the 1950s there has been considerable discussion within library literature about academic libraries serving “external,” “secondary,” or “outside” users. the subject has been approached from the viewpoint of access to the library facility and collections, reference assistance, interlibrary loan (ill) service, borrowing privileges, and (more recently) access to computers and internet privileges, including the use of proprietary databases. deale emphasized the importance of public relations to the academic library.3 while he touched on creating bonds both on and off campus, he described the positive effect of “privilege cards” to community members.4 josey described the variety of services that savannah state college offered to the community.5 he concluded his essay with these words: why cannot these tried methods of lending books to citizens of the community, story hours for children . . . , a library lecture series or other forum, a great books discussion group and the use of the library staff in the fall of 2004, the academic computing center, a division of the information technology services department (its) at minnesota state university, mankato took over responsibility for the computers in the public areas of memorial library. for the first time, affiliated memorial library users were required to authenticate using a campus username and password, a change that effectively eliminated computer access for anyone not part of the university community. this posed a dilemma for the librarians. because of its federal depository status, the library had a responsibility to provide general access to both print and online government publications for the general public. furthermore, the library had a long tradition of providing guest access to most library resources, and there was reluctance to abandon the practice. therefore the librarians worked with its to retain a small group of six computers that did not require authentication and were clearly marked for community use, along with several standup, open-access computers on each floor used primarily for searching the library catalog. the additional need to provide computer access to high school students visiting the library for research and instruction led to more discussions with its and resulted in a means of generating temporary usernames and passwords through a web form. these user accommodations were implemented in the library without creating a written policy governing the use of open-access computers. o ver time, library staff realized that guidelines for guests using the computers were needed because of misuse of the open-access computers. we were charged with the task of drafting these guidelines. in typical librarian fashion, we searched websites, including those of association of research libraries (arl) members for existing computer access policies in academic libraries. we obtained very little information through this search, so we turned to arl publications for assistance. library public access workstation authentication by lori driscoll, was of greater benefit and offered much of the needed information, but it was dated.1 a research result described lynne webber (lnweber@mnsu.edu) is access services librarian and peg lawrence (peg.lawrence@mnsu.edu) is systems librarian, minnesota state university, mankato. 
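the web form mentioned above for generating temporary usernames and passwords is not described in detail in the article; purely as an illustration, here is one way such short-lived guest credentials could be produced. this is a hypothetical sketch using only the python standard library, not the system actually used at minnesota state university, mankato.

```python
# hypothetical sketch: issue a temporary guest account with an expiry time.
import secrets
import string
from datetime import datetime, timedelta

def issue_guest_account(valid_hours=4):
    """return a throwaway username/password pair and its expiry timestamp."""
    username = "guest-" + secrets.token_hex(3)  # e.g. guest-a1b2c3
    alphabet = string.ascii_letters + string.digits
    password = "".join(secrets.choice(alphabet) for _ in range(10))
    expires = datetime.now() + timedelta(hours=valid_hours)
    return {"username": username, "password": password, "expires": expires}

account = issue_guest_account()
print(account["username"], account["password"], "valid until", account["expires"])
```

the design point such a mechanism addresses is the one the article raises: guests get workstation access that is accountable and time-limited, without giving them permanent campus credentials.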
authentication and access | weber and lawrence 129 providing service to the unaffiliated, his survey revealed 100 percent of responding libraries offered free in-house collection use for the general public, and many others offered additional services.16 brenda johnson described a one-day program in 1984 sponsored by rutgers university libraries forum titled “a case study in closing the university library to the public.” the participating librarians spent the day familiarizing themselves with the “facts” of the theoretical case and concluded that public access should be restricted but not completely eliminated. a few months later, consideration of closing rutgers’ library to the public became a real debate. although there were strong opposing viewpoints, the recommendation was to retain the open-door policy.17 jansen discussed the division between those who wanted to provide the finest service to primary users and those who viewed the library’s mission as including all who requested assistance. jansen suggested specific ways to balance the needs of affiliates and the public and referred to the dilemma the university of california, berkeley, library that had been closed to unaffiliated users.18 bobp and richey determined that california undergraduate libraries were emphasizing service to primary users at a time when it was no longer practical to offer the same level of service to primary and secondary users. they presented three courses of action: adherence to the status quo, adoption of a policy restricting access, or implementation of tiered service.19 throughout the 1990s, the debate over the public’s right to use academic libraries continued, with increasing focus on computer use in public and private academic libraries. new authorization and authentication requirements increased the control of internal computers, but the question remained of libraries providing access to government information and responding to community members who expected to use the libraries supported by their taxes. morgan, who described himself as one who had spent his career encouraging equal access to information, concluded that it would be necessary to use authentication, authorization, and access control to continue offering information services readily available in the past.20 martin acknowledged that library use was changing as a result of the internet and that the public viewed the academic librarian as one who could deal with the explosion of information and offer service to the public.21 johnson described unaffiliated users as a group who wanted all the privileges of the affiliates; she discussed the obligation of the institution to develop policies managing these guest users.22 still and kassabian considered the dual responsibilities of the academic library to offer internet access to public users and to control internet material received and sent by primary and public users. further, they weighed as consultants be employed toward the building of good relations between town and gown.6 later, however, deale indicated that the generosity common in the 1950s to outsiders was becoming unsustainable.7 deale used beloit college, with an “open door policy” extending more than 100 years, as an example of a school that had found it necessary to refuse out-of-library circulation to minors except through ill by the 1960s.8 also in 1964, waggoner related the increasing difficulty of accommodating public use of the academic library. 
he encouraged a balance of responsibility to the public with the institution’s foremost obligation to the students and faculty.9 in october 1965, the ad hoc committee on community use of academic libraries was formed by the college library section of the association of college and research libraries (acrl). this committee distributed a 13-question survey to 1,100 colleges and universities throughout the united states. the high rate of response (71 percent) was considered noteworthy, and the findings were explored in “community use of academic libraries: a symposium,” published in 1967.10 the concluding article by josey (the symposium’s moderator) summarized the lenient attitudes of academic libraries toward public users revealed through survey and symposium reports. in the same article, josey followed up with his own arguments in favor of the public’s right to use academic libraries because of the state and federal support provided to those institutions.11 similarly, in 1976 tolliver reported the results of a survey of 28 wisconsin libraries (public academic, private academic, and public), which indicated that respondents made a great effort to serve all patrons seeking service.12 tolliver continued in a different vein from josey, however, by reporting the current annual fiscal support for libraries in wisconsin and commenting upon financial stewardship. tolliver concluded by asking, “how effective are our library systems and cooperative affiliations in meeting the information needs of the citizens of wisconsin?”13 much of the literature in the years following focused on serving unaffiliated users at a time when public and academic libraries suffered the strain of overuse and underfunding. the need for prioritization of primary users was discussed. 
in 1979, russell asked, “who are our legitimate clientele?” and countered the argument for publicly supported libraries serving the entire public by saying the public “cannot freely use the university lawn mowers, motor pool vehicles, computer center, or athletic facilities.”14 ten years later, russell, robison, and prather prefaced their report on a survey of policies and services for outside users at 12 consortia institutions by saying, “the issue of external users is of mounting concern to an institution whose income is student credit hour generated.”15 despite russell’s concerns about the strain of 130 information technology and libraries | september 2010 be aware of the issues and of the effects that licensing, networking, and collection development decisions have on access.”35 in “unaffiliated users’ access to academic libraries: a survey,” courtney reported and analyzed data from her own comprehensive survey sent to 814 academic libraries in winter 2001.36 of the 527 libraries responding to the survey, 72 libraries (13.6 percent) required all users to authenticate to use computers within the library, while 56 (12.4 percent) indicated that they planned to require authentication in the next twelve months.37 courtney followed this with data from surveyed libraries that had canceled “most” of their indexes and abstracts (179 libraries, or 33.9 percent) and libraries that had cancelled “most” periodicals (46 libraries or 8.7 percent).38 she concluded that the extent to which the authentication requirement restricted unaffiliated users was not clear, and she asked, “as greater numbers of resources shift to electronic-only formats, is it desirable that they disappear from the view of the community user or the visiting scholar?”39 courtney’s “authentication and library public access computers: a call for discussion” described a follow-up with the academic libraries participating in her 2001 survey who had self-identified as using authentication or planning to employ authentication within the next twelve months. her conclusion was the existence of ambivalence toward authentication among the libraries, since more than half of the respondents provided some sort of public access. she encouraged librarians to carefully consider the library’s commitment to service before entering into blanket license agreements with vendors or agreeing to campus computer restrictions.40 several editions of the arl spec kit series showing trends of authentication and authorization for all users of arl libraries have been an invaluable resource in this investigation. an examination of earlier spec kits indicated that the definitions of “user authentication” and “authorization” have changed over the years. user authentication, by plum and bleiler indicated that 98 percent of surveyed libraries authenticated users in some way, but at that time authentication would have been more precisely defined as authorization or permission to access personal records, such as circulation, e-mail, course registration, and file space. as such, neither authentication nor authorization was related to basic computer access.41 by contrast, it is common for current library users authenticate to have any access to a public workstation. 
driscoll’s library public access workstation authentication sought information on how and why users were authenticated on public-access computers, who was driving the change, how it affected the ability of federal depository libraries to provide public information, and how it affected library services in general.42 but at the time of driscoll’s survey, only 11 percent of surveyed libraries required authentication on all computers and 22 percent required it only on selected terminals. cook and shelton’s managing public computing the reconciliation of material restrictions against “principles of freedom of speech, academic freedom, and the ala’s condemnation of censorship.”23 lynch discussed institutional use of authentication and authorization and the growing difficulty of verifying bona fide users of academic library subscription databases and other electronic resources. he cautioned that future technical design choices must reflect basic library values of free speech, personal confidentiality, and trust between academic institution and publisher.24 barsun specifically examined the webpages of one hundred arl libraries in search of information pertinent to unaffiliated users. she included a historic overview of the changing attitudes of academics toward service to the unaffiliated population and described the difficult balance of college community needs with those of outsiders in 2000 (the survey year).25 barsun observed a consistent lack of information on library websites regarding library guest use of proprietary databases.26 carlson discussed academic librarians’ concerns about “internet-related crimes and hacking” leading to reconsideration of open computer use, and he described the need to compromise patron privacy by requiring authentication.27 in a chapter on the relationship of it security to academic values, oblinger said, “one possible interpretation of intellectual freedom is that individuals have the right to open and unfiltered access to the internet.”28 this statement was followed later with “equal access to information can also be seen as a logical extension of fairness.”29 a short article in library and information update alerted the authors to a uk project investigating improved online access to resources for library visitors not affiliated with the host institution.30 salotti described higher education access to e-resources in visited institutions (haervi) and its development of a toolkit to assist with the complexities of offering electronic resources to guest users.31 salotti summarized existing resources for sharing within the united kingdom and emphasized that “no single solution is likely to suit all universities and colleges, so we hope that the toolkit will offer a number of options.”32 launched by the society of college, national and university libraries (sconul), and universities and colleges information systems association (ucisa), haervi has created a best-practice guide.33 by far the most useful articles for this investigation have been those by nancy courtney. “barbarians at the gates: a half-century of unaffiliated users in academic libraries,” a literature review on the topic of visitors in academic libraries, included a summary of trends in attitude and practice toward visiting users since the 1950s.34 the article concluded with a warning: “the shift from printed to electronic formats . . . 
combined with the integration of library resources with campus computer networks and the internet poses a distinct threat to the public’s access to information even onsite. it is incumbent upon academic librarians to authentication and access | weber and lawrence 131 introductory letter with the invitation to participate and a forward containing definitions of terms used within the survey is in appendix a. in total, 61 (52 percent) of the 117 arl libraries invited to participate in the survey responded. this is comparable with the response rate for similar surveys reported by plum and bleiler (52 of 121, or 43 percent), driscoll (67 of 124, or 54 percent), and cook and shelton (69 of 123, or 56 percent).45 1. what is the name of your academic institution? the names of the 61 responding libraries are listed in appendix b. 2. is your institution public or private? see figure 1. respondents’ explanations of “other” are listed below. ■❏ state-related ■❏ trust instrument of the u.s. people; quasigovernment ■❏ private state-aided ■❏ federal government research library ■❏ both—private foundation, public support 3. are affiliated users required to authenticate in order to access computers in the public area of your library? see figure 2. 4. if you answered “yes” to the previous question, does your library provide the means for guest users to authenticate? see figure 3. respondents’ explanations of “other” are listed below. all described open-access computers. ■❏ “we have a few “open” terminals” ■❏ “4 computers don’t require authentication” ■❏ “some workstations do not require authentication” ■❏ “open-access pcs for guests (limited number and function)” ■❏ “no—but we maintain several open pcs for guests” ■❏ “some workstations do not require login” 5. is your library a federal depository library? see figure 4. this question caused some confusion for the canadian survey respondents because canada has its own depository services program corresponding to the u.s. federal depository program. consequently, 57 of the 61 respondents identified themselves as federal depository (including three canadian libraries), although 5 of the 61 are more accurately members of the canadian depository services program. only two responding libraries were neither a member of the u.s. federal depository program nor of the canadian depository services program. 6. if you answered “yes” to the previous question, and computer authentication is required, what provisions have been made to accommodate use of online government documents by the general public in the library? please check all that touched on every aspect of managing public computing, including public computer use, policy, and security.43 even in 2007, only 25 percent of surveyed libraries required authentication on all computers, but 46 percent required authentication on some computers, showing the trend toward an ever increasing number of libraries requiring public workstation authentication. most of the responding libraries had a computer-use policy, with 48 percent following an institution-wide policy developed by the university or central it department.44 ■■ method we constructed a survey designed to obtain current data about authentication in arl libraries and to provide insight into how guest access is granted at various academic institutions. it should be noted that the object of the survey was access to computers located in the public areas of the library for use by patrons, not access to staff computers. 
we constructed a simple, fourteen-question survey using the zoomerang online tool (http://www .zoomerang.com/). a list of the deans, directors, and chief operating officers from the 123 arl libraries was compiled from an internet search. we eliminated the few library administrators whose addresses could not be readily found and sent the survey to 117 individuals with the request that it be forwarded to the appropriate respondent. the recipients were informed that the goal of the project was “determination of computer authentication and current computer access practices within arl libraries” and that the intention was “to reflect practices at the main or central library” on the respondent’s campus. recipients were further informed that the names of the participating libraries and the responses would be reported in the findings, but that there would be no link between responses given and the name of the participating library. the survey introduction included the name and contact information of the institutional review board administrator for minnesota state university, mankato. potential respondents were advised that the e-mail served as informed consent for the study. the survey was administered over approximately three weeks. we sent reminders three, five, and seven days after the survey was launched to those who had not already responded. ■■ survey questions, responses, and findings we administered the survey, titled “authentication and access: academic computers 2.0,” in late april 2008. following is a copy of the fourteen-question survey with responses, interpretative data, and comments. the 132 information technology and libraries | september 2010 ■❏ “some computers are open access and require no authentication” ■❏ “some workstations do not require login” 7. if your library has open-access computers, how many do you provide? (supply number). see figure 6. a total of 61 institutions responded to this question, and 50 reported open-access computers. the number of open-access computers ranged from 2 to 3,000. as expected, the highest numbers were reported by libraries that did not require authentication for affiliates. the mean number of open-access computers was 161.2, the median was 23, the mode was 30, and the range was 2,998. 8. please indicate which online resources and services are available to authenticated users. please check all that apply. see figure 7. ■❏ online catalog ■❏ government documents ■❏ internet browser apply. see figure 5. ■❏ temporary user id and password ■❏ open access computers (unlimited access) ■❏ open access computers (access limited to government documents) ■❏ other of the 57 libraries that responded “yes” to question 5, 30 required authentication for affiliates. these institutions offered the general public access to online government documents various ways. explanations of “other” are listed below. three of these responses indicate, by survey definition, that open-access computers were provided. ■❏ “catalog-only workstations” ■❏ “4 computers don’t require authentication” ■❏ “generic login and password” ■❏ “librarians login each guest individually” ■❏ “provision made for under-18 guests needing gov doc” ■❏ “staff in gov info also login user for quick use” ■❏ “restricted guest access on all public devices” figure 3. institutions with the means to authenticate guests figure 4. libraries with federal depository and/or canadian depository services status figure 2. institutions requiring authentication figure 1. 
categories of responding institutions authentication and access | weber and lawrence 133 11. does your library have a written policy for use of open access computers in the public area of the library? question 7 indicates that 50 of the 61 responding libraries did offer the public two or more open-access computers. out of the 50, 28 responded that they had a written policy governing the use of computers. conversely, open-access computers were reported at 22 libraries that had no reported written policy. 12. if you answered “yes” to the previous question, please give the link to the policy and/or summarize the policy. twenty-eight libraries gave a url, a url plus a summary explanation, or a summary explanation with no url. 13. does your library have a written policy for authenticating guest users? out of the 32 libraries that required their users to authenticate (see question 3), 23 also had the means to allow their guests to authenticate (see question 4). fifteen of those libraries said they had a policy. 14. if you answered “yes” to the previous question, please give the link to the policy and/or summarize the policy. eleven ■❏ licensed electronic resources ■❏ personal e-mail access ■❏ microsoft office software 9. please indicate which online resources and services are available to authenticated guest users. please check all that apply. see figure 8. ■❏ online catalog ■❏ government documents ■❏ internet browser ■❏ licensed electronic resources ■❏ personal e-mail access ■❏ microsoft office software 10. please indicate which online resources and services are available on open-access computers. please check all that apply. see figure 9. ■❏ online catalog ■❏ government documents ■❏ internet browser ■❏ licensed electronic resources ■❏ personal e-mail access ■❏ microsoft office software figure 5. provisions for the online use of government documents where authentication is required figure 6. number of open-access computers offered figure 7. electronic resources for authenticated affiliated users (n = 32) number of libraries number of librariesnumber of libraries number of libraries figure 8. resources for authenticating guest users (n = 23) 134 information technology and libraries | september 2010 ■■ respondents and authentication figure 10 compares authentication practices of public, private, and other institutions described in response to question 2. responses from public institutions outnumbered those from private institutions, but within each group a similar percentage of libraries required their affiliated users to authenticate. therefore no statistically significant difference was found between authenticating affiliates in public and private institutions. of the 61 respondents, 32 (52 percent) required their affiliated users to authenticate (see question 3) and 23 of the 32 also had the means to authenticate guests (see question 4). the remaining 9 offered open-access computers. fourteen libraries had both the means to authenticate guests and had open-access computers (see questions 4 and 7). 
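the distribution of open-access computers reported under question 7 is heavily skewed: a handful of very large installations pull the mean (161.2) far above the median (23). the following is a minimal python sketch, using an invented list of counts rather than the survey's unpublished per-library figures, showing how such summary statistics are computed and why the mean and median diverge so sharply.

    from statistics import mean, median, mode

    # hypothetical open-access computer counts, invented for illustration;
    # the survey's per-library figures are not published in the article
    counts = [2, 4, 8, 12, 23, 23, 30, 30, 30, 45, 60, 150, 400, 3000]

    print("mean  :", round(mean(counts), 1))   # pulled upward by the largest sites
    print("median:", median(counts))           # middle value, robust to outliers
    print("mode  :", mode(counts))             # most frequently reported count
    print("range :", max(counts) - min(counts))

the same skew explains why the article reports mean, median, mode, and range together rather than relying on the mean alone.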
when we compare the results of the 2007 study by cook and shelton with the results of the current study (completed in 2008), the results are somewhat contradictory (see table 1).46 the differences in survey data seem to indicate that authentication requirements are decreasing; however, the literature review—specifically cook and shelton and the 2003 courtney article—clearly indicate that authentication is on the rise.47 this dichotomy may be explained, in part, by the fact that of the more than 60 arl libraries responding to both surveys, there was an overlap of only 34 libraries. the 30 u.s. federal depository or canadian depository services libraries that required their affiliated users to authenticate (see questions 3 and 5) provided guest access ranging from usernames and passwords, to open-access computers, to computers restricted to libraries gave the url to their policy; 4 summarized their policies. ■■ research questions answered the study resulted in answers to the questions we posed at the outset: ■■ thirty-two (52 percent) of the responding arl libraries required affiliated users to login to public computer workstations in the library. ■■ twenty-three (72 percent) of the 32 arl libraries requiring affiliated users to login to public computers provided the means for guest users to login to public computer workstations in the library. ■■ fifty (82 percent) of 61 responding arl libraries provided open-access computers for guest users; 14 (28 percent) of those 50 libraries provided both open-access computers and the means for guest authentication. ■■ without exception, all u.s. federal depository or canadian depository services libraries that required their users to authenticate offered guest users some form of access to online information. ■■ survey results indicated some differences between software provided to various users on differently accessed computers. office software was less frequently provided on open-access computers. ■■ twenty-eight responding arl libraries had written policies relating to the use of open-access computers. ■■ fifteen responding arl libraries had written policies relating to the authorization of guests. figure 9. electronic resources on open access computers (n = 50) figure 10. comparison of library type and authentication requirement number of libraries authentication and access | weber and lawrence 135 ■■ one library had guidelines for use posted next to the workstations but did not give specifics. ■■ fourteen of those requiring their users to authenticate had both open-access computers and guest authentication to offer to visitors of their libraries. other policy information was obtained by an examination of the 28 websites listed by respondents: ■■ ten of the sites specifically stated that the open-access computers were for academic use only. ■■ five of the sites specified time limits for use of openaccess computers, ranging from 30 to 90 minutes. ■■ four stated that time limits would be enforced when others were waiting to use computers. ■■ one library used a sign-in sheet to monitor time limits. ■■ one library mentioned a reservation system to monitor time limits. ■■ two libraries prohibited online gambling. ■■ six libraries prohibited viewing sexually explicit materials. ■■ guest-authentication policies of the 23 libraries that had the means to authenticate their guests, 15 had a policy for guests obtaining a username and password to authenticate, and 6 outlined their requirements of showing identification and issuing access. 
the other 9 had open-access computers that guests might use. the following are some of the varied approaches to guest authentication: ■■ duration of the access (when mentioned) ranged from 30 days to 12 months. ■■ one library had a form of sponsored access where current faculty or staff could grant a temporary username and password to a visitor. ■■ one library had an online vouching system that allowed the visitor to issue his or her own username and password online. ■■ one library allowed guests to register themselves by swiping an id or credit card. ■■ one library had open-access computers for local resources and only required authentication to leave the library domain. ■■ one library had the librarians log the users in as guests. ■■ one library described the privacy protection of collected personal information. ■■ no library mentioned charging a fee for allowing computer access. government documents, to librarians logging in for guests (see question 6). numbers of open-access computers ranged widely from 2 to more than 3,000 (see question 7). eleven (19 percent) of the responding u.s. federal depository or canadian depository services libraries that did not provide open-access computers issued a temporary id (nine libraries), provided open access limited to government documents (one library), or required librarian login for each guest (one library). all libraries with u.s. federal depository or canadian depository services status provided a means of public access to information to fulfill their obligation to offer government documents to guests. figure 11 shows a comparison of resources available to authenticated users and authenticated guests and offered on open-access computers. as might be expected, almost all institutions provided access to online catalogs, government documents, and internet browsers. fewer allowed access to licensed electronic resources and e-mail. access to office software showed the most dramatic drop in availability, especially on open-access computers. ■■ open-access computer policies as mentioned earlier, 28 libraries had written policies for their open-access computers (see question 11), and 28 libraries gave a url, a url plus a summary explanation, or a summary explanation with no url (see question 12). in most instances, the library policy included their campus’s acceptable-use policy. seven libraries cited their campus’s acceptable-use policy and nothing else. nearly all libraries applied the same acceptable-use policy to all users on all computers and made no distinction between policies for use of open-access computers or computers requiring authentication. following are some of the varied aspects of summarized policies pertaining to open-access computers: ■■ eight libraries stated that the computers were for academic use and that users might be asked to give up their workstation if others were waiting.

table 1. comparison of findings from cook and shelton (2007) and the current survey (2008)
authentication requirements    2007 (n = 69)    2008 (n = 61)
some required                  28 (46%)         23 (38%)
required for all               15 (25%)          9 (15%)
not required                   18 (30%)         29 (48%)

■■ further study although the survey answered many of our questions, other questions arose. while the number of libraries requiring affiliated users to log on to their public computers is increasing, this study does not explain why this is the case.
reasons could include reactions to the september 11 disaster, the usa patriot act, general security concerns, or the convenience of the personalized desktop and services for each authenticated user. perhaps a future investigation could focus on reasons for more frequent requirement of authentication. other subjects that arose in the examination of institutional policies were guest fees for services, age limits for younger users, computer time limits for guests, and collaboration between academic and public libraries. ■■ policy developed as a result of the survey findings as a result of what was learned in the survey, we drafted guidelines governing the use of open-access computers by visitors and other non-university users. the guidelines can be found at http://lib.mnsu.edu/about/libvisitors .html#access. these guidelines inform guests that openaccess computers are available to support their research, study, and professional activities. the computers also are governed by the campus policy and the state university system acceptable-use policy. guideline provisions enable staff to ask users to relinquish a computer when others are waiting or if the computer is not being used for academic purposes. while this library has the ability to generate temporary usernames and passwords, and does so for local schools coming to the library for research, no guidelines have yet been put in place for this function. figure 11. online resources available to authenticated affiliated users, guest users, open-access users authentication and access | weber and lawrence 137 these practices depend on institutional missions and goals and are limited by reasonable considerations. in the past, accommodation at some level was generally offered to the community, but the complications of affiliate authentication, guest registration, and vendor-license restrictions may effectively discourage or prevent outside users from accessing principal resources. on the other hand, open-access computers facilitate access to electronic resources. those librarians who wish to provide the same level of commitment to guest users as in the past as well as protect the rights of all should advocate to campus policy-makers at every level to allow appropriate guest access to computers to fulfill the library’s mission. in this way, the needs and rights of guest users can be balanced with the responsibilities of using campus computers. in addition, librarians should consider ensuring that the licenses of all electronic resources accommodate walk-in users and developing guidelines to prevent incorporation of electronic materials that restrict such use. this is essential if the library tradition of freedom of access to information is to continue. finally, in regard to external or guest users, academic librarians are pulled in two directions; they are torn between serving primary users and fulfilling the principles of intellectual freedom and free, universal access to information along with their obligations as federal depository libraries. at the same time, academic librarians frequently struggle with the goals of the campus administration responsible for providing secure, reliable networks, sometimes at the expense of the needs of the outside community. the data gathered in this study, indicating that 82 percent of responding libraries continue to provide at least some open-access computers, is encouraging news for guest users. 
balancing public access and privacy with institutional security, while a current concern, may be resolved in the way of so many earlier preoccupations of the electronic age. given the pervasiveness of the problem, however, fair and equitable treatment of all library users may continue to be a central concern for academic libraries for years to come. references 1. lori driscoll, library public access workstation authentication, spec kit 277 (washington, d.c.: association of research libraries, 2003). 2. martin cook and mark shelton, managing public computing, spec kit 302 (washington, d.c.: association of research libraries, 2007): 16. 3. h. vail deale, “public relations of academic libraries,” library trends 7 (oct. 1958): 269–77. 4. ibid., 275. 5. e. j. josey, “the college library and the community,” faculty research edition, savannah state college bulletin (dec. 1962): 61–66. ■■ conclusions while we were able to gather more than 50 years of literature pertaining to unaffiliated users in academic libraries, it soon became apparent that the scope of consideration changed radically through the years. in the early years, there was discussion about the obligation to provide service and access for the community balanced with the challenge to serve two clienteles. despite lengthy debate, there was little exception to offering the community some level of service within academic libraries. early preoccupation with physical access, material loans, ill, basic reference, and other services later became a discussion of the right to use computers, electronic resources, and other services without imposing undue difficulty to the guest. current discussions related to guest users reflect obvious changes in public computer administration over the years. authentication presently is used at a more fundamental level than in earlier years. in many libraries, users must be authorized to use the computer in any way whatsoever. as more and more institutions require authentication for their primary users, accommodation must be made if guests are to continue being served. in addition, as courtney’s 2003 research indicates, an ever increasing number of electronic databases, indexes, and journals replace print resources in library collections. this multiplies the roadblocks for guest users and exacerbates the issue.48 unless special provisions are made for computer access, community users are left without access to a major part of the library’s collections. because 104 of the 123 arl libraries (85 percent) are federal depository or canadian depository services libraries, the researchers hypothesized that most libraries responding to the survey would offer open-access computers for the use of nonaffiliated patrons. this study has shown that federal depository libraries have remained true to their mission and obligation of providing public access to government-generated documents. every federal depository respondent indicated that some means was in place to continue providing visitor and guest access to the majority of their electronic resources— whether through open-access computers, temporary or guest logins, or even librarians logging on for users. while access to government resources is required for the libraries housing government-document collections, libraries can use considerably more discretion when considering what other resources guest patrons may use. 
despite the commitment of libraries to the dissemination of government documents, the increasing use of authentication may ultimately diminish the libraries’ ability and desire to accommodate the information needs of the public. this survey has provided insight into the various ways academic libraries serve guest users. not all academic libraries provide public access to all library resources. 138 information technology and libraries | september 2010 identify yourself,” chronicle of higher education 50, no. 42 (june 25, 2004): a39, http://search.ebscohost.com/login.aspx?direct =true&db=aph&an=13670316&site=ehost-live (accessed mar. 2, 2009). 28. diana oblinger, “it security and academic values,” in luker and petersen, computer & network security in higher education, 4, http://net.educause.edu/ir/library/pdf/pub7008e .pdf (accessed july 14, 2008). 29. ibid., 5. 30. “access for non-affiliated users,” library & information update 7, no. 4 (2008): 10. 31. paul salotti, “introduction to haervi-he access to e-resources in visited institutions,” sconul focus no. 39 (dec. 2006): 22–23, http://www.sconul.ac.uk/publications/ newsletter/39/8.pdf (accessed july 14, 2008). 32. ibid., 23. 33. universities and colleges information systems association (ucisa), haervi: he access to e-resources in visited institutions, (oxford: ucisa, 2007), http://www.ucisa.ac.uk/ publications/~/media/files/members/activities/haervi/ haerviguide%20pdf (accessed july 14, 2008). 34. nancy courtney, “barbarians at the gates: a half-century of unaffiliated users in academic libraries,” journal of academic librarianship 27, no. 6 (nov. 2001): 473–78, http://search.ebsco host.com/login.aspx?direct=true&db=aph&an=5602739&site= ehost-live (accessed july 14, 2008). 35. ibid., 478. 36. nancy courtney, “unaffiliated users’ access to academic libraries: a survey,” journal of academic librarianship 29, no. 1 (jan. 2003): 3–7, http://search.ebscohost.com/login.aspx?dire ct=true&db=aph&an=9406155&site=ehost-live (accessed july 14, 2008). 37. ibid., 5. 38. ibid., 6. 39. ibid., 7. 40. nancy courtney, “authentication and library public access computers: a call for discussion,” college & research libraries news 65, no. 5 (may 2004): 269–70, 277, www.ala .org/ala/mgrps/divs/acrl/publications/crlnews/2004/may/ authentication.cfm (accessed july 14, 2008). 41. terry plum and richard bleiler, user authentication, spec kit 267 (washington, d.c.: association of research libraries, 2001): 9. 42. lori driscoll, library public access workstation authentication, spec kit 277 (washington, d.c.: association of research libraries, 2003): 11. 43. cook and shelton, managing public computing. 44. ibid., 15. 45. plum and bleiler, user authentication, 9; driscoll, library public access workstation authentication, 11; cook and shelton, managing public computing, 11. 46. cook and shelton, managing public computing, 15. 47. ibid.; courtney, unaffiliated users, 5–7. 48. courtney, unaffiliated users, 6–7. 6. ibid., 66. 7. h. vail deale, “campus vs. community,” library journal 89 (apr. 15, 1964): 1695–97. 8. ibid., 1696. 9. john waggoner, “the role of the private university library,” north carolina libraries 22 (winter 1964): 55–57. 10. e. j. josey, “community use of academic libraries: a symposium,” college & research libraries 28, no. 3 (may 1967): 184–85. 11. e. j. josey, “implications for college libraries,” in “community use of academic libraries,” 198–202. 12. don l. tolliver, “citizens may use any tax-supported library?” wisconsin library bulletin (nov./dec. 
1976): 253. 13. ibid., 254. 14. ralph e. russell, “services for whom: a search for identity,” tennessee librarian: quarterly journal of the tennessee library association 31, no. 4 (fall 1979): 37, 39. 15. ralph e. russell, carolyn l. robison, and james e. prather, “external user access to academic libraries,” the southeastern librarian 39 (winter 1989): 135. 16. ibid., 136. 17. brenda l. johnson, “a case study in closing the university library to the public,” college & research library news 45, no. 8 (sept. 1984): 404–7. 18. lloyd m. jansen, “welcome or not, here they come: unaffiliated users of academic libraries,” reference services review 21, no. 1 (spring 1993): 7–14. 19. mary ellen bobp and debora richey, “serving secondary users: can it continue?” college & undergraduate libraries 1, no. 2 (1994): 1–15. 20. eric lease morgan, “access control in libraries,” computers in libraries 18, no. 3 (mar. 1, 1998): 38–40, http://search .ebscohost.com/login.aspx?direct=true&db=aph&an=306709& site=ehost-live (accessed aug. 1, 2008). 21. susan k. martin, “a new kind of audience,” journal of academic librarianship 24, no. 6 (nov. 1998): 469, library, information science & technology abstracts, http://search.ebsco host.com/login.aspx?direct=true&db=aph&an=1521445&site= ehost-live (accessed aug. 8, 2008). 22. peggy johnson, “serving unaffiliated users in publicly funded academic libraries,” technicalities 18, no. 1 (jan. 1998): 8–11. 23. julie still and vibiana kassabian, “the mole’s dilemma: ethical aspects of public internet access in academic libraries,” internet reference services quarterly 4, no. 3 (1999): 9. 24. clifford lynch, “authentication and trust in a networked world,” educom review 34, no. 4 (jul./aug. 1999), http://search .ebscohost.com/login.aspx?direct=true&db=aph&an=2041418 &site=ehost-live (accessed july 16, 2008). 25. rita barsun, “library web pages and policies toward ‘outsiders’: is the information there?” public services quarterly 1, no. 4 (2003): 11–27. 26. ibid., 24. 27. scott carlson, “to use that library computer, please authentication and access | weber and lawrence 139 appendix a. the survey introduction, invitation to participate, and forward dear arl member library, as part of a professional research project, we are attempting to determine computer authentication and current computer access practices within arl libraries. we have developed a very brief survey to obtain this information which we ask one representative from your institution to complete before april 25, 2008. the survey is intended to reflect practices at the main or central library on your campus. names of libraries responding to the survey may be listed but no identifying information will be linked to your responses in the analysis or publication of results. if you have any questions about your rights as a research participant, please contact anne blackhurst, minnesota state university, mankato irb administrator. anne blackhurst, irb administrator minnesota state university, mankato college of graduate studies & research 115 alumni foundation mankato, mn 56001 (507)389-2321 anne.blackhurst@mnsu.edu you may preview the survey by scrolling to the text below this message. if, after previewing you believe it should be handled by another member of your library team, please forward this message appropriately. 
alternatively, you may print the survey, answer it manually and mail it to: systems/ access services survey library services minnesota state university, mankato ml 3097—po box 8419 mankato, mn 56001-8419 (usa) we ask you or your representative to take 5 minutes to answer 14 questions about computer authentication practices in your main library. participation is voluntary, but follow-up reminders will be sent. this e-mail serves as your informed consent for this study. your participation in this study includes the completion of an online survey. your name and identity will not be linked in any way to the research reports. clicking the link to take the survey shows that you understand you are participating in the project and you give consent to our group to use the information you provide. you have the right to refuse to complete the survey and can discontinue it at any time. to take part in the survey, please click the link at the bottom of this e-mail. thank you in advance for your contribution to our project. if you have questions, please direct your inquiries to the contacts given below. thank you for responding to our invitation to participate in the survey. this survey is intended to determine current academic library practices for computer authentication and open access. your participation is greatly appreciated. below are the definitions of terms used within this survey: ■■ “authentication”: a username and password are required to verify the identity and status of the user in order to log on to computer workstations in the library. ■■ “affiliated user”: a library user who is eligible for campus privileges. ■■ “non-affiliated user”: a library user who is not a member of the institutional community (an alumnus may be a nonaffiliated user). this may be used interchangeably with “guest user.” ■■ “guest user”: visitor, walk-in user, nonaffiliated user. ■■ “open access computer”: computer workstation that does not require authentication by user. 140 information technology and libraries | september 2010 appendix b. responding institutions 1. university at albany state university of new york 2. university of alabama 3. university of alberta 4. university of arizona 5. arizona state university 6. boston college 7. university of british columbia 8. university at buffalo, state university of ny 9. case western reserve university 10. university of california berkeley 11. university of california, davis 12. university of california, irvine 13. university of chicago 14. university of colorado at boulder 15. university of connecticut 16. columbia university 17. dartmouth college 18. university of delaware 19. university of florida 20. florida state university 21. university of georgia 22. georgia tech 23. university of guelph 24. howard university 25. university of illinois at urbana-champaign 26. indiana university bloomington 27. iowa state university 28. johns hopkins university 29. university of kansas 30. university of louisville 31. louisiana state university 32. mcgill university 33. university of maryland 34. university of massachusetts amherst 35. university of michigan 36. michigan state university 37. university of minnesota 38. university of missouri 39. massachusetts institute of technology 40. national agricultural library 41. university of nebraska-lincoln 42. new york public library 43. northwestern university 44. ohio state university 45. oklahoma state university 46. university of oregon 47. university of pennsylvania 48. university of pittsburgh 49. purdue university 50. 
rice university 51. smithsonian institution 52. university of southern california 53. southern illinois university carbondale 54. syracuse university 55. temple university 56. university of tennessee 57. texas a&m university 58. texas tech university 59. tulane university 60. university of toronto 61. vanderbilt university

editorial: how do you know whence they will come? dan marmion, information technology and libraries 19, no. 1 (mar. 2000).
as i write this, i am putting my affairs in order at western michigan university, in preparation for a move to a new position at the university of notre dame libraries beginning in april. at each university my responsibilities include overseeing both the online catalog and the libraries' web presence. i mention this only because i find it interesting, and indicative of an issue with which the library profession in general is grappling, that librarians in both institutions are engaged in discussions regarding the relationship between the two. in talking to librarians at those places and others, from some i hear sentiment for making one or the other the "primary" access point. thus i've heard arguments that "the online catalog represents our collection, so we should use it as our main access mechanism." other librarians state that "the online catalog is fine for searching for books in our collection, but there is so much more to find and so many more options for finding it, that we should use our web pages to link everything together." my hunch is that probably we can all agree that there are things that an online catalog can do better than a web site, and things that a web site can do better than the online catalog. as far as that goes, have we ever had a primary access point (thanks to karen coyle for this thought)? but that's not what i want to talk about today. the debate over a primary access point contains an invalid implicit assumption and asks the wrong question. the implicit assumption is that we can and should control how our patrons come into our systems. the question we should be asking ourselves is not "what is our primary access method?" but rather "how can we ensure that our users, local and remote, will find an avenue that enables them to meet their informational needs?" since at this time i'm more familiar with wmu than notre dame, i'll draw some examples from the former. we have "subject guides to resources" on our web site. these consist of pages put together by subject specialists that point to recommended sources, both print and electronic, local and remote, on given subjects. students can use them to begin researching topics in a large number of subject areas. the catch is that the students have to be browsing around the web site. if they happen to start out in the online catalog they will never encounter these gateways, because the only reference to them is on the web site. on the other hand, a student who stays strictly with the web site is quite possibly going to miss a valuable resource in our library if he/she doesn't consult the online catalog, because we obviously can't list everything we own on the web site. (also, obviously, the web site doesn't provide the patron with status information.) this is why we have to ask ourselves the correct question mentioned above. what is the solution?
unfortunately i'm not any smarter than everyone else, so i don't have the answer (although i do know some folks who can help us with it: check out www.lita.org/committe/toptech/mainpage.htm). my guess is that we'll have to work it out as a profession, possibly in collaboration with our online system vendors, and that the solution will be neither quick nor simple nor easy. there are some ad hoc moves we can make, of course, such as putting links to the gateways into the catalog and stressing on our web pages that the patron really needs to do a catalog search. the bottom line is that we have a dilemma: we can't control how people come into our electronic systems, so we can't have a "primary access point." if we try, we do harm to those who, for whatever reason, reach us via some other avenue. we need to make sure that we provide equal opportunity for all. dan marmion (dmarmion@nd.edu) is associate director of information systems and access at notre dame university, notre dame, indiana.

technical communications

isad/solinet to sponsor institute
"networks and networking ii; the present and potential" is the theme of an isad institute to be held at the braniff place hotel on february 27-28, 1975, in new orleans. the sponsors are the information science and automation division of ala and the southeastern library network (solinet). this second institute on networking will be an extension of the previous one held in new orleans a year ago. the ground covered in that previous institute will be the point of departure for "networks ii." the purpose of the previous institute was to review the options available in networking, to provide a framework for identifying problems, and to suggest evaluation strategies to aid in choosing alternative systems. while the topics covered in the previous institute will be briefly reviewed in this one, some speakers will take different approaches to the subject of networking, while other speakers will discuss totally new aspects.
in addition to the papers given and the resultant questions and answers from the floor, a period of round table discussions will be held during which the speakers can be questioned on a person-to-person basis. a new feature to isad institutes now being planned will be the presence of vendors' exhibits. arrangements are being made with the many vendors and manufacturers whose services are applicable to networking to exhibit their products and systems. it is hoped that many of them will be interested in responding to this opportunity. the program will include:
"a systems approach to selection of alternatives"-resource sharing-components-communications options-planning strategy. joseph a. rosenthal, university of california, berkeley.
"state of the nation"-review of current developments and an evaluation. brett butler, butler associates.
"the library of congress, marc, and future developments." henriette d. avram, library of congress.
"data bases, standards and data conversions"-existing data bases-characteristics-standardization-problems. john f. knapp, richard abel & co.
"user products"-possibilities for product creation-the role of user products. maurice freedman, new york public library.
"on-line technology"-hardware and software considerations-library requirements-standards-cost considerations of alternatives. philip long, state university of new york, albany.
"publishers' view of networks"-copyright-effect on publishers-effect on authorship-impact on jobbers-facsimile transmission. carol nemeyer, association of american publishers.
"national library of canada"-current and anticipated developments-cooperative plans in canada-international cooperation. rodney duchesne, national library of canada.
"administrative, legal, financial, organizational and political considerations"-actual and potential problems-organizational options-financial commitment-governance. fred kilgour, oclc.
registration will be $75.00 to members of ala and staff members of solinet institutions, $90.00 to nonmembers, and $10.00 to library school students. for hotel reservation information and registration blanks, contact donald p. hammer, isad, american library association, 50 e. huron st., chicago, il 60611; 312-944-6780.

regional projects and activities

indiana cooperative library services authority
the first official meeting of the board of directors of the indiana cooperative library services authority (incolsa) was held june 4, 1974, at the indiana state library in indianapolis. a direct outgrowth of the cooperative bibliographic center for indiana libraries (cobicil) feasibility study project sponsored by the indiana state library and directed by mrs. barbara evans markuson, incolsa has been organized as an independent not-for-profit organization "to encourage the development and improvement of all types of library service." to date, contracts have been signed by sixty-one public, thirteen academic, fourteen schools, and five special libraries, a total of ninety-three libraries. incolsa is being funded initially by a three-year establishment grant from the u.s. office of education, library services and construction act (lsca) title i funds. officers are: president: harold baker, head of library systems development, indiana state university; vice-president: dr.
michael buckland, assistant director for technical services, purdue university libraries; secretary: mary hartzler, head of catalog division, indiana state library; treasurer: mary bishop, director of the crawfordsville book processing center; three directors-at-large: phil hamilton, director of the kokomo public library; edward a. howard, director of the evansville-vanderburgh county public library; and sena kautz, director of media services, duneland school corporation.

stanford's ballots on-line files publicly available through spires
september 16, 1974. the stanford university libraries automated technical processing system, ballots (bibliographic automation of large library operations using a timesharing system), has been in operation for twenty-two months and supports the acquisition and cataloging of nearly 90 percent of all materials processed. important components of the ballots operations are several on-line files accessible through an unusually powerful set of indexes. currently available are: a file of library of congress marc data starting from january 1, 1972 (with a gap from may to august 1972); an in-process file of individual items being purchased by stanford; an on-line catalog (the catalog data file) of all items cataloged through the system, whether copy was derived from library of congress marc data, was input from non-marc cataloging copy, or resulted from stanford's own original cataloging efforts; and a file of see, see also, and explanatory references (the reference file) to the catalog data file. in addition, during september and october 1974, the 85,000 bibliographic and holdings records (already in machine-readable form on magnetic tape) representing the entire j. henry meyer memorial undergraduate library were converted to on-line meyer catalog data and meyer reference files in ballots. these files are publicly available through spires (stanford public information retrieval system) to any person with a terminal that can dial up the stanford center for information processing's academic computer services computer (an ibm 360 model 67) and who has a valid computer account. the marc file can be searched through the following index points: lc card number; personal name; corporate/conference name; title. the in-process, catalog data, and reference files for stanford and for meyer can also be searched as spires public subfiles through the following index points: ballots unique record identification number; personal name; corporate/conference name; title; subject heading (catalog data and reference file records only); call number (catalog data and reference file records only); lc card number. the title and corporate/conference name indexes are word indexes; this means that each word is indexed individually. search requests may draw on more than one index at a time by using the logical operators "and," "or," and "and not" to combine index values sought. if you plan to use spires to search these files, or if you would like more information, a publication called guide to ballots files may be ordered by writing to: editor, library computing services, s.c.i.p.-willow, stanford university, stanford, ca 94305. this document contains complete information about the ballots files and data elements, how to open an account number, and how to use spires to search ballots files. a list of ballots publications and prices is also available on request. as additional libraries create on-line files using ballots in a network environment, these files will also be available.
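the spires description above notes that a search may draw on more than one index point at a time, combined with the logical operators "and," "or," and "and not." the following is a minimal python sketch of the set logic such a combination implies; the index names, values, and record numbers are invented for illustration and are not actual ballots records or spires query syntax.

    # hypothetical inverted indexes mapping index values to record numbers;
    # invented stand-ins, not actual ballots/spires data or syntax
    indexes = {
        "personal name": {"avram, h.": {103}, "buckland, m.": {101, 102}},
        "title word": {"networks": {101, 104}, "cataloging": {102, 103}},
        "lc card number": {"74-12345": {104}},
    }

    def lookup(index_name, value):
        # return the set of record numbers posted under one index value
        return indexes[index_name].get(value, set())

    # "or" is set union, "and" is intersection, "and not" is difference
    hits = (lookup("title word", "networks") | lookup("title word", "cataloging")) \
        - lookup("personal name", "avram, h.")
    print(sorted(hits))  # records matching either title word but not that author

word indexes such as the title index fit this model naturally, since each word of a title is posted separately and can be combined with values drawn from any other index.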
these additions will be announced in jola technical communications.

data base news

interchange of aip and ei data bases
a national science foundation grant (gn-42062) for $128,700 has been awarded to the american institute of physics (aip), in cooperation with engineering index (ei), for a project entitled "interchange of data bases." the grant became effective on may 1, 1974, for a period of fifteen months. the project is intended to develop methods by which ei and aip can reduce their input costs by eliminating duplication of intellectual effort and processing. through sharing of the resources of the two organizations and an interchange of their respective data bases, aip and ei expect to improve the utilization of these computer-readable data bases. the basic requirement for the development of the interchange capability for computer-readable data bases is the establishment of a compatible set of data elements. each organization has unique data elements in its data base. it will therefore be necessary to determine which of the data elements are absolutely essential to each organization's services, which elements can be modified, and what other elements must be added. after the list of data elements has been established, it will be possible to write the specifications and programs for format conversions from aip to ei tape format and vice versa. simultaneously, there will be the development of language conversion facilities between ei's indexing vocabulary and aip's physics and astronomy classification scheme (pacs). it is also planned to investigate the possibility of establishing a computer program which can convert aip's indexing to ei's terms and vice versa. with the accomplishment of the above tasks, it will be possible to create new services and repackage existing services to satisfy the information demands in areas of mutual interest to engineers and physicists, such as acoustics and optics.

eric data base users conference
the educational resources information center (eric) held an eric data base users conference in conjunction with the 37th annual meeting of the american society for information science (asis) in atlanta, georgia, october 13-17, 1974. the eric data base users conference provided a forum for present and potential eric users to discuss common problems and concerns as well as interact with other components of the eric network: central eric, the eric processing and reference facility, eric clearinghouse personnel, and information dissemination centers. although attendees have in the past been primarily oriented toward machine use of the eric files, all patterns of usage were represented at this conference, from manual users of printed indexes to operators of national on-line retrieval systems. a number of invited papers were presented dealing with subjects such as:
• the current state and future directions of educational information dissemination. sam rosenfeld (nie), lee burchinal (nsf).
• what services, systems, and data bases are available? marvin gechman (information general), harvey marron (nie).
• the roles of libraries and industry, respectively, in disseminating educational information. richard de gennaro (university of pennsylvania), paul zurkowski (information industry association).
several organizations (national library of canada, university of georgia, wisconsin state department of education) were invited to participate in "show and tell" sessions to describe in detail how they are using the eric system and data base. a status report covering eric on-line services for educators was presented by dr. carlos cuadra (system development corporation) and dr. roger summit (lockheed). interactive discussion groups covered a number of subjects including:
• computer techniques-programming methods, use of utilities, file maintenance, search system selection, installation, and operation.
• serving the end user of educational information.
• introduction to the eric system-what tools, systems, and services are available and how are they used?
• beginning and advanced sessions on computer searching the eric files.
online terminals were used to demonstrate and explain use of machine capabilities.

commercial services and developments

scope data inc. ala train compatible terminal printers
scope data inc. currently is offering a high-speed, nonimpact terminal printer for use in various interactive printing applications. capability can be included in the series 200 printer as an extra-cost feature to print the eight-bit ascii character set or the ala character set with 176 characters. for further information contact alan g. smith, director of marketing, scope data inc., 3728 silver star rd., orlando, fl 32808.

institute for scientific information puts life sciences data base on-line through system development corporation
the institute for scientific information (isi) has announced that it will collaborate with system development corporation (sdc) to provide on-line, interactive, computer searches of the life sciences journal literature. scheduled to be fully operational by july 1, 1974, the isi-sdc service is called scisearch® and is designed to give quick, easy, and economical access to a large life sciences literature file. stressing ease of access, the sdc retrieval program, orbit, permits subscribers to conduct extremely rapid literature searches through two-way communications terminals located in their own facilities. after examining the preliminary results of their inquiries, searchers are able to further refine their questions to make them broader or narrower. this dialog between the searcher and the computer (located in sdc's headquarters in santa monica, california) is conducted with simple english-language statements. because this system is tied in to a nationwide communications network, most subscribers will be able to link their terminals to the computer through the equivalent of a local phone call. covering every editorial item from about 1,100 of the world's most important life sciences journals, the service will initially offer a searchable file of over 400,000 items published between april 1972 and the present. each month approximately 16,000 new items will be added until the average size of the file totals about one-half million items and represents two-and-one-half years of coverage. to assure subscribers maximum retrieval effectiveness when dealing with this massive amount of information, the data base can be searched in several ways. included are searches by keywords, word stems, word phrases, authors, and organizations. one of the search techniques utilized-citation searching-is an exclusive feature of the isi data base.
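citation searching, noted above as an exclusive feature of the isi file, amounts to keeping an index from each cited reference to the items that cite it. a minimal python sketch, with invented accession numbers and references rather than isi's actual file layout, illustrates the idea.

    from collections import defaultdict

    # hypothetical source items and the references each one cites;
    # invented for illustration, not isi's actual record structure
    items = {
        "74-000101": ["smith 1969", "jones 1971"],
        "74-000102": ["jones 1971"],
        "74-000103": ["smith 1969", "doe 1973"],
    }

    # build the citation index: cited reference -> accession numbers citing it
    citation_index = defaultdict(set)
    for accession, cited_refs in items.items():
        for ref in cited_refs:
            citation_index[ref].add(accession)

    # a citation search: which items in the file cite jones 1971?
    print(sorted(citation_index["jones 1971"]))  # ['74-000101', '74-000102']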
for every item retrieved through a search, subscribers can receive a complete bibliographic description that includes all authors, journal citation, full title, a language indicator, a code for the type of item (article, note, review, etc.), an isi accession number, and all the cited references contained in the retrieved article. the accession number is used to order full-text copies of relevant items through isi's original article tear sheet service (oats®). this ability to provide copies of every item in the data base distinguishes the isi service from many others.

current library of congress catalog on-line for reference searches
information dynamics corporation (idc) has agreed to collaborate with system development corporation (sdc) to provide reference librarians, researchers, and scholars with on-line interactive computer searches of all library materials being cataloged by the library of congress. scheduled to be fully operational as of october 1, 1974, the sdc-idc service is called sdc-idc/libcon and is designed to give quick, easy, and economical access to a large portion of the world's scholarly library materials. as in the isi service described above, the data base can be searched in several ways. included are compound logic searches by keywords, word stems, word phrases, authors, organizations, and subject headings for most english materials. one of the search techniques utilized-string searching-is an exclusive feature of sdc's orbit system. keyword searching of cataloged items including all foreign materials processed by the library of congress is an exclusive feature of the idc data base not currently available in other online marc files. for individual items retrieved through a search, subscribers can receive a bibliographic description that includes authors, full title, an idc accession number, the lc classification number, and publisher information.

standards

the isad committee on technical standards for library automation invites your participation in the standards game
editor's note: the tesla reactor ballot will be provided in forthcoming issues. to use, photocopy the ballot form, fill out, and mail to: john c. kountz, associate for library automation, office of the chancellor, the california state university and colleges, 5670 wilshire blvd., suite 900, los angeles, ca 90036.

the procedure
this procedure is geared to handle both reactive (originating from the outside) and initiative (originating from within ala) standards proposals to provide recommendations to ala's representatives to existing, recognized standards organizations. to enter the procedure for an initiative standards proposal you must complete an "initiative standards proposal" using the outline which follows:

initiative standard proposal outline
the following outline is designed to facilitate review by both the committee and the membership of initiative standards proposals and to expedite the handling of the initiative standard proposal through the procedure. since the outline will be used for the review process, it is to be followed explicitly. where an initiative standard requirement does not require the use of a specific outline entry, the entry heading is to be used followed by the words "not applicable" (e.g., where no standards exist which relate to the proposal, this is indicated by: vi. existing standards. not applicable).
note that the parenthetical statements following most of the outline entry descriptions relate to the ansi standards proposal section headings to facilitate the translation from this outline to the ansi format. all initiative standards proposals are to be typed, double spaced on 8½ x 11 inch white paper (typing on one side only). each page is to be numbered consecutively in the upper right-hand corner. the initiator's last name followed by the key word from the title is to appear one line below each page number. i. title of initiative standard proposal (title). ii. initiator information (forward). a. name b. title c. organization d. address e. city, state, zip f. telephone: area code, number, extension iii. technical area. describe the area of library technology as understood by initiator. be as precise as possible since in large measure the information given here will help determine which ala official representative might best handle this proposal once it has been reviewed and which ala organizational component might best be engaged in the review process. iv. purpose. state the purpose of the standard proposal (scope and qualifications). v. description. briefly describe the standard proposal (specification of the standard). vi. relationship to other standards. if existing standards have been identified which relate to, or are felt to influence, this standard proposal, cite them here (expository remarks). vii. background. describe the research or historical review performed relating to this standard proposal (if applicable, provide a bibliography) and your findings (justification). viii. specifications (optional). specify the standard proposal using record layouts, mechanical drawings, and such related documentation aids as required in addition to text exposition where applicable (specifications of the standard). kindly note that the outline is designed to enable standards proposals to be written following a generalized format which will facilitate their review. in addition, the outline permits the presentation of background and descriptive information which, while important during any evaluation, is a prerequisite to the development of a standard. the tesla reactor ballot itself asks for an identification number for the standing requirement; reactor information (name, title, organization, address, city, state, zip, and telephone with area code, number, and extension); a for/against vote on the need for this standard; a for/against vote on the specification as presented in the requirement; whether the reactor can participate in the development of the standard (yes/no); and the reason for the position, using the format of the proposal, with additional pages if required. the reactor ballot is to be used by members to voice their recommendations relative to initiative standards proposals. the reactor ballot permits both "for" and "against" votes to be explained, permitting the capture of additional information which is necessary to document and communicate formal standards proposals to standards organizations outside of the american library association. as you, the members, use the outline to present your standards proposals, tesla will publish them in jola-tc and solicit membership reaction via the reactor ballot. throughout the process tesla will insure that standards proposals are drawn to the attention of the applicable american library association division or committee. thus, internal review usually will proceed concurrently with membership review.
from the review and the reactor ballot tesla will prepare a "majority recommendation" and a "minority report" on each standards proposal. the majority recommendation and minority report so developed will then be transmitted to the originator, and to the official american library association representative on the appropriate standards organization where it should prove a source of guidance as official votes are cast. in addition, the status of each standards proposal will be reported by tesla in jola-tc via the standards scoreboard. the committee (tesla) itself will be nonpartisan with regard to the proposals handled by it. however, the committee does reserve the right to reject proposals which after review are not found to relate to library automation. input to the editor: we have been asked by the members of the ala interdivisional committee on representation in machine readable form of bibliographic information, (marbi) to respond to your editorial in the june 1974 issue of the journal of library automation. this editorial dealt with the council of library resources' [sic] involvement in a wide range of projects, ranging from the sponsorship of a group which is attempting to develop a subset of marc for use in inter-library exchange technical communications 321 of bibliographic data ( cembi), to management of a project which has as its goal the creation of a national serials data base, (conser), and, more recently, to the convening of a conference of library and a&i organizations to discuss the outlook for comprehensive national bibliographic control. you raised several legitimate questions: 1) has sufficient publicity been given to these activities of the council so that all, not just a few, libraries are aware of what is happening and have an opportunity to exert an influence on developments? and, 2) is the council bypassing existing channels of operation and communication? you also suggest that proposals from groups such as cembi be channeled through an official ala committee such as marbi for intensive review and evaluation. it should be pointed out that marbi is not charged with the development of standards. it acts to monitor and review proposals affecting the format and content of machine readable bibliographic data, where that data has implications for national or international use. this applies to proposals emanating from cembi and conser as well as from other concerned groups. all indications to date are that the council is fully aware of marbi's role and will not bypass marbi. a number of members of marbi are also members of cembi and marbi is represented on the conser project. also reassuring is the fact that, unless we allow lc to fall by the wayside in its role as the primary creator and distributor of machine readable data, any standards for format or content developed by a council-sponsored group will eventually be reflected in the marc records distributed by lc. the library of congress has issued a statement, published in the june 1974 issue of jola, to the effect that it will not implement any changes in the marc distribution system which are not acceptable to marbi. marbi and lc have worked out a procedure whereby all proposed changes to marc are submitted to marbi. they are then published in ]ola and distributed to mem322 journal of library automation vol. 7/4 december 1974 hers of the marc users discussion group for comments. comments are collected and evaluated by marbi and a report submitted to lc, with its recommendations. 
the marbi review process does not guarantee perfection and there is no assurance that everyone will be satisfied. compromise and expediency are the name of the game in this extremely complicated and uncharted area of standards for machine readable bibliographic data. however, the council has undoubtedly learned from the isbd(m) experience that it cannot make decisions which affect libraries without the greatest possible involvement of librarians. it is the feeling of the marbi committee members that the council intends to work with marbi in future projects which fall into marbi's area of concern. velma veneziano marbi past chairperson ruth tighe chairperson editor's note: it is gratifying to note that marbi's response reflects the opinions expressed in the june 1974 editorial. the library community will doubtless be pleased to learn of clr's intention to work closely with marbi.-skm to the editor: as briefly discussed with you, your editorial in the june 1974 issue of jola is both admirable and disturbing (to me, at least). the problem of national leadership in the area of library automation is a critical problem indeed. being in the "boondocks" and far removed from the scene of action, i can only express to you my perception as events and activities filter through to me. i can remember as far back as 1957 when adi had a series of meetings in washington, d.c., trying to establish a national program for bibliographic automation. i have been through eighteen years of meetings, committees, conferences, etc., concerned with trying to develop a national plan for bibliographic automation and information storage and retrieval systems. i have worked with nsf, usoe, department of commerce, u.s. patent office, engineering and technical societies, dod agencies-the entire spectrum. i spent a good many years working in adi and asis, sla, and most recently ala. at no time were we able to make significant progress towards a national system. even the great airlie house conference did not produce any significant changes in the fragmented, competitive "non-system." it has only been in the recent past since clr has taken an aggressive posture that i am able to see the beginning of orderly development of a national automated bibliographic system. i certainly agree that any topic as critical as those being discussed by cembi should be in the public domain, but i also believe that the progress made by cembi would not have been possible without clr taking the initiative in getting these key agencies together. thank goodness someone quit talking and started doing something at the national level! i sincerely believe that in the absence of a national library and with the current lack of legally derived authority in this arena, clr provides a genuine service to the total library community in establishing cembi. hopefully, your very excellent article (in the same issue of jola) on "standards for library automation ..." will help to put the entire issue of bibliographic record standards into perspective. as a former chemist and corrosion engineer, i am fully aware of the absolute necessity for technical standards. i am also fully aware of the necessity of developing technical standards through the process you outlined in your article. hopefully, clr action with cembi will expedite this laborious process and help to push our profession forward into the twentieth century.
since we ourselves have not been able to do it through all these years, i am personally grateful that some group such as clr took the initiative and forced us to do what we should have done years ago. maryann duggan slice office director editor's note: positive action and progressive movement are, of course, desirable and are often lacking in large organizations. however, positive action without communication of this action to the affected population can only be detrimental. on issues of the complexity of those addressed by cembi and conser, review by the library community is always useful, even though action may be temporarily delayed.-skm to the editor: on page 233 of the september issue of jola there is a report from the information industry association's micropublishing committee chairman (henry powell). he states that ". . . the committee spelled out several areas of concern to micropublishers which will be the subject of committee action. . . ." one of the concerns of the committee is that a z39 standards committee has recommended "standards covering what micropublishers can say about their products." (emphasis mine.) as chairman of the z39 standards subcommittee which is developing the advertising standard referred to, i wish to point out that there is no intention on the part of the subcommittee to tell micropublishers what they can say nor what they may say about their products. the subcommittee, which is composed of representatives from three micropublishing concerns, two librarians, and myself, has from the beginning taken the view that the purpose of the standard would be to provide guidance for micropublishers and librarians alike. we are most anxious that no one feel that the subcommittee has any intention of attempting to use the standards mechanism to tell any micropublisher how he must design his advertisements. in addition it should be noted that no ansi standard is compulsory. carl m. spaulding program officer council on library resources editorial board thoughts: library analytics and patron privacy ken varnum information technology and libraries | december 2015 two significant trends are sweeping across the library landscape: assessment (and the corresponding collection and analysis of data) and privacy (of records of user interactions with our services). libraries, perhaps more than any other public service organization, are strongly motivated to assess their offerings with dual aims. the first might be thought of as an altruistic goal: understanding the needs of their particular clientele and improving library services to meet those needs. the second is perhaps more existential: helping justify the value libraries create to whatever sources of funding are necessary to impress. both are valid and important. it is hard to argue that improving services, focusing on actual needs, and maintaining funding are in any way improper goals. however, this desire is often seen as being in conflict with exploring too deeply the actions or needs of individual constituents, despite librarians' historical and deeply held belief that each constituent's precise information needs should be explored and provided for through personalized, tailored services. solid assessment cannot happen without solid data. libraries have historically relied on qualitative surveys of their users, asking users to evaluate the quality of the services they receive.
being able to know more details and ask directed questions of individuals who used services is possible in the traditional library setting through invitations to complete surveys after individual interactions such as a reference or circulation desk interaction, library program, visit to a physical location, or even a community-wide survey invitation. focus groups can be assembled as well, of course, once a library has identified a real-world group to study. however, those samples are more often convenience samples or—unless a library is able to successfully contact and receive responses from across the entire community—somewhat self-selected. assessment that leads to new or improved services relies much more heavily on broad-based understanding of the users of a system. libraries have been able to do limited quantitative studies of library usage—at its simplest, counting how many of this were checked out, how many of that was accessed, and how many users were involved. these metrics are useful, but also limited, particularly at the scale of a single library. knowing that a pool of resources is heavily used is helpful; even knowing that a suite of resources is frequently used collectively is beneficial. however, tying use of resources to specific information needs or information seekers, whether this is defined as individuals or ad hoc collections of users based on situational factors such as academic level, course enrollments, etc., requires a different kind of evidence. these more specific groupings rely on granular data that for many libraries—especially academic ones—are increasingly electronic. we are at a point in time when we have the potential to leverage wide swathes of user data. and this is where the second trend, privacy, comes to bear. protecting user privacy has been a guiding principle of librarianship in the united states (in particular) since the 1970s, as a strong reaction to u.s. government (through the fbi) requests to provide access to circulation logs for individuals under suspicion of espionage. this was in the early days of library automation, when large libraries with automated ils systems could prevent future disclosure through the straightforward strategy of purging transaction records as soon as the item was returned. this practice became standard operating procedure in libraries, and expanded into new information service domains as they evolved over the following forty years. with good intentions, libraries have ensured that they maintain no long-term history for most online services. as a profession, we have begun to realize that the straightforward (and arguably simplistic) approaches we have relied on for so long may no longer be appropriate or helpful. over the past year, these conversations found focus through a project coordinated by the national information standards organization thanks to a grant from the andrew w. mellon foundation.1 the range of issues discussed here was far-reaching and touched on virtually every aspect of privacy and assessment imaginable. ken varnum (varnum@umich.edu), a member of the ital editorial board, is senior program manager for discovery, delivery, and learning analytics at the university of michigan library, ann arbor, michigan.
the resulting draft document, consensus framework to support patron privacy in digital library and information systems,2 outlines 12 principles that libraries (and the information service vendors they partner with) should follow as they establish "practices and procedures to protect the digital privacy of the library user." this new consensus framework sets a series of guidelines for us to consider as we begin to move into this uncharted (for libraries) territory. if we are to record and make use of our users' online (and offline, for that matter) footprints to improve services, improve the user experience, and justify our value, this document gives us an outline of the issues to consider. it is time (and probably long past time) that we make conscious decisions about how we assess our online resources, in particular, and do so with a deeper knowledge of both the resources used and the people using them. at the exact moment in our technological history when we find ourselves able to provide automated services at scale to our users through the internet and simultaneously record and analyze the intricate details of those transactions, we need to think clearly about what questions we have, what data we need to answer them, and be explicit about how those data points are treated. it is important that we start this process now and change our blunt practices into more strategic data collection and analysis. where 40 years ago we opted to bluntly enforce user privacy by deleting the data, we should now take a more nuanced approach and store and analyze data in the service of improved services and tools for our user communities. we have the opportunity, through technology and a more nuanced understanding of privacy, to conduct a protracted reference interview with our virtual users over multiple interactions… and thereby improve our services. references 1. http://www.niso.org/topics/tl/patron_privacy/ 2. http://www.niso.org/apps/group_public/download.php/15863/niso%20consensus%20principles%20users%20digital%20privacy.pdf a computer output microfilm serials list for patron use william saffady: wayne state university, detroit, michigan. library literature generally assumes that com is better suited to staff rather than patron use applications. this paper describes a com serials holdings list intended for patron use. the application and conversion from paper to com are described. emphasis is placed on the selection of an appropriate microformat and easily operable viewing equipment as conditions of success for patron use. as a marriage of dynamic information-handling technologies, computer output microfilm (com) is a systems tool of potentially great significance to librarians. several libraries have reported successful com applications initiated within the last few years.
the two most recent-fischer's description of four com-generated reports used by the los angeles public libraries and bolef's account of a com book catalog at the washington university school of medicine library-stress the time, space, and cost savings so frequently reported in analyses of the advantages of com.1, 2 this article describes the substitution of microfilm for paper as the computer output medium in one of the most common library automation applications, a serials holdings list intended for use by library patrons. it is interesting that, at a time when librarians are insisting on the importance of patron acceptance of technological innovation, the recent literature reports com applications intended solely for staff use. bolef, in fact, lists staff rather than patron use among the characteristics of potentially successful library com applications. the report that follows suggests, however, that careful attention to the selection of an appropriate microformat and viewing equipment can successfully extend the effectiveness of com to include patron-use library automation applications. the application the union list of serials in the wayne state university libraries is a computer-generated alphabetical listing, by title, of serials held by the wayne state university library system and some biomedical libraries in the detroit metropolitan area. sullivan describes it as "informative in purpose and conventional in method."3 as with many similar applications, serials holdings were automated in order to unify and disseminate hitherto separate, local records. the list is primarily a location device, giving for each title the location within the library system and information on the holdings at each location. it is updated monthly, the july 1974 issue totalling 1,431 pages. in paper form, twenty copies produced on an ibm 1403 line printer using four-ply carbon-interleaved forms were distributed for use throughout the library system. the list shares some of the characteristics that have marked other successful com applications.4 it consists of many pages and has a sizeable distribution. quick retrieval of information is essential. use is for reference rather than reading. there is no need to annotate the list and no need for paper copies, although the latter requirement would not rule out the use of com for this particular application. patrons simply consult the list to determine whether the library's holdings include a particular serial and then proceed to the indicated location. it is interesting that serials holdings lists, long recognized as an excellent introductory library automation application, should also prove an excellent first application for com. complexities of format and viewing equipment selection aside, the conversion of output from paper to microfilm presented no problems. since the wayne state university computing and data processing center does not have com capability, the university libraries, after careful consideration of several vendors, contracted with the mark larwood company, a microfilm service bureau equipped with a gould beta com 700l recorder. the beta com is a crt-type com recorder with an uppercase and lowercase character set, forms-overlay capability, proportional spacing, underlining, superscripts, subscripts, italics, and a universal camera capable of producing 16, 35, 70, and 105mm microformats at several reduction ratios.
a decisive factor in the selection of this particular vendor was the beta com's dedicated pdp-8/l minicomputer that enables the com recorder to accept an ibm 1403 print tape, thereby greatly simplifying conversion and eliminating the expense of reprogramming. microformat selection as ballou notes, discussions of com have tended to concentrate more on the computer than on micrographics, but for a patron-use com application the selection of an appropriate microformat is of the greatest importance.5 however, there has been an unfortunate emphasis placed, both in the literature of micrographics and by vendors, on microfiche, the format now dominating the industry, especially in com applications. such emphasis ignores the fundamental rule of systems design, that form follows function. each of the microformats has strengths and weaknesses that must be analyzed with reference to the application at hand. for a patron-use, com-generated serials holdings list, ease of use with a minimum of patron film handling is a paramount consideration. microfiche is clearly unsuitable for a list of over 1,400 pages. even at 42x reduction, the patron would be forced to choose from among seven fiches, each containing 208 pages. the difficulties of handling and loading, combined with library staff involvement in a program of user instruction, make fiche an unattractive choice. instead, the relatively large size of the holdings list suggests that one of the 16mm roll formats offers the best prospects of containing present size and future growth within a single microform. the disadvantages of the conventional 16mm open spool-the necessity of threading film onto a take-up reel before viewing-can be minimized by using a magazine-type film housing. the popular cartridge format eliminates much film handling, but cartridge readers are very expensive, necessitating a considerable investment where many readers are required. even with the cartridge, it is still possible for a patron to unwind the film from the take-up reel, necessitating rethreading before viewing. fortunately, microfilm cassettes overcome this difficulty. unlike the cartridge format, 16mm cassettes feature self-contained supply and take-up reels. the film cannot be completely unwound from the take-up reel and the cassette can be removed from the viewer at any time without rewinding. patron film handling is virtually eliminated. the cassette format has proven very popular with british libraries, where it has been used with satisfactory results in com applications.6 viewing equipment success in format choice is contingent on the selection of appropriate viewing equipment. as larkworthy and brown point out, the best viewer for patron-use com applications is one that can easily be operated by the least mechanically inclined person.7 fortunately, cassette viewers, while limited in number, tend to be very easy to operate. the viewer chosen for use with the union list of serials, the memorex 1644 autoviewer, features a simple control panel, fixed 24x reduction, easily operated focus and scan knobs, motorized film drive for high-speed searching, and a manual hand control for more precise image positioning. the screen measures eleven by fourteen inches in size, with sufficient brightness for comfortable ambient light viewing. other cassette viewers examined, however satisfactory they might be in other respects, failed to meet the peculiar requirements of this particular application.
discussion since its introduction in april 1974, the com-generated union list of serials in the wayne state university libraries has enjoyed a satisfactory reception. patrons have learned to consult the com list with little difficulty. the selection of an appropriate microformat and easily operated viewing equipment have kept staff involvement in patron instruction to a minimum. there appears to be no reason for limiting potential library com applications to those used primarily or solely by staff members. given the severity of the current paper shortage, the consequent rise in paper prices, and serious questions about the availability of paper at any price, com merits serious consideration as an alternative output medium for the widest range of library automation applications. references 1. mary l. fischer, "the use of com at the los angeles public library," the journal of micrographics 6:205-10 (may 1973). 2. doris bolef, "computer-output microfilm," special libraries 65:169-75 (april 1974). 3. howard a. sullivan, "metropolitan detroit's network: wayne state university library's serials automation project," medical library association bulletin 56:269-71 (july 1968). 4. see, for example, auerbach on computer output microfilm (princeton: auerbach publishers, 1972), p.1-10. 5. hubbard w. ballou, "microform technology," in carlos cuadra, ed., annual review of information science and technology, v.8 (washington, d.c.: american society for information science, 1973), p.139. 6. d. r. g. buckle and thomas french, "the application of microform to manual and machine readable catalogues," program 6:187-203 (july 1972). 7. graham larkworthy and cyril brown, "library catalogs on microfilm," library association record 73:231-32 (dec. 1971). article a 21st century technical infrastructure for digital preservation nathan tallman information technology and libraries | december 2021 https://doi.org/10.6017/ital.v40i4.13355 nathan tallman (ntt7@psu.edu) is digital preservation librarian, pennsylvania state university. © 2021. abstract digital preservation systems and practices are rooted in research and development efforts from the late 1990s and early 2000s when the cultural heritage sector started to tackle these challenges in isolation. since then, the commercial sector has sought to solve similar challenges, using different technical strategies such as software defined storage and function-as-a-service. while commercial sector solutions are not necessarily created with long-term preservation in mind, they are well aligned with the digital preservation use case. the cultural heritage sector can benefit from adapting these modern approaches to increase sustainability and leverage technological advancements widely in use across fortune 500 companies. introduction most digital preservation systems and practices are rooted in research and development efforts from the late 1990s and early 2000s when the cultural heritage sector started to tackle these challenges in isolation. since then, the commercial sector has sought to solve similar challenges, using different technical strategies. while commercial sector solutions are not necessarily created with long-term preservation in mind, they are well aligned with the digital preservation use case because of similar features.
the cultural heritage sector can benefit from adapting these modern approaches to increase sustainability and leverage technological advancements widely in use across fortune 500 companies. in order to understand the benefits, this article will examine the principles of sustainability and how they apply to digital preservation. typical preservation activities that use technology will be described, followed by how these activities occur in a 20th-century technical infrastructure model. after a discussion on advancements in the it industry since the conceptualization of the 20thcentury model, a theoretical 21st-century model is presented that attempts to show how the cultural heritage sector can employ industry advancements and the beneficial impact on sustainability. galleries, libraries, archives, and museums cannot afford to ignore the sustainability of managing and preserving digital content and neither can distributed digital preservation or commercial service providers.1 budgets lag behind economic inflation while the cost of and amount of materials to purchase rises, coupled with the need to hire more employees to do this work. if digital preservation programs are going to scale up to enterprise levels and operate in perpetuity, it is imperative to update technical approaches, adopt industry advancements, and embrace cloud technology. information technology and libraries december 2021 a 21st century technical infrastructure | tallman 2 sustainability for digital preservation programs to succeed, they must be sustainable per the triple bottom line or they risk subverting their mission. the triple bottom line definition of sustainability identifies three pillars: people (labor), planet (environmental), and profit (economic).2 while there are typically few people with digital preservation in their job title within an organization, it’s a collaborative domain with roles and responsibilities distributed throughout organizations, reflecting the digital object lifecycle. it’s important that the underlying technical infrastructure can easily be supported and is not so complicated that it is hard to recruit systems administration staff. digital preservation consumes many technical resources and data centers have a substantial environmental impact. as ben goldman points out in “it’s not easy being green(e),” data centers consume an immense amount of power and require extravagant cooling systems that use precious fresh water resources.3 because there is no point in preserving digital content if there will be no future generation of users, responsible digital preservation programs will seek to reduce carbon outputs and the number of rare-earth elements in our technical infrastructure.4 while cultural heritage organizations rarely seek to make a profit, economic sustainability is vital to organizational health and costs for digital preservation must be controlled. modern technological infrastructures discussed here will help to increase sustainability by using widespread technologies and strategies for which support can be easily obtained, by reducing energy consumption, by minimizing reliance on hardware using rare-earth elements, and by leveraging advances in infrastructure components such as storage to perform digital preservation activities. basic digital preservation activities this paper will examine technical preservation activities and the author acknowledges that basic digital preservation activities are likely to include risk management and other non-technical concepts. 
while there is no formal, agreed-upon definition of what constitutes a set of basic digital preservation activities, bit-level digital preservation is a common baseline. bit-level digital preservation seeks to preserve the digital object as it was received, ensuring that you can get out an exact copy of what you put in, no matter how long ago the ingest occurred; however, with no guarantees as to the renderability of said digital object. two basic digital preservation activities are key to this strategy: fixity and replication. fixity fixity checking, or the “practice of algorithmically reviewing digital content to ensure that it has not changed over time,” is a foundational digital preservation strategy for verifying integrity that aligns with rosenthal et al.’s “audit” strategy.5 fixity is how preservationists demonstrate mathematically that the content has not changed since it was received. not all fixity is the same, however; fixity can be broken up into three types: transactional fixity, authentication fixity, and fixity-at-rest.6 transactional fixity transactional fixity is checked after some sort of digital preservation event7, such as ingest or replication. depending on the event, it’s desirable to use a non-cryptographic algorithm, such as crc32 or md5, when files move within a trusted system. when it’s only necessary to prove that a file hasn’t immediately changed, such as copying between filesystems, cryptographic algorithms are unnecessarily complex and are too expensive, in terms of compute consumption. information technology and libraries december 2021 a 21st century technical infrastructure | tallman 3 authentication fixity authentication fixity proves that a file hasn’t changed over a long period of time, particularly since ingest. although one could use a chain of transactional fixity checks to cumulatively prove there has been no change, it’s often desirable to conduct one fixity check that can be independently verified. unbroken cryptographic algorithms, such as one from the sha-2 and sha-3 families, are well suited to this use case and worth the complexity and compute expense, particularly since this type of fixity check doesn’t have to be run as often. fixity-at-rest fixity-at-rest is when fixity is monitored while content is stored on disk. while some organizations may choose to only conduct fixity checks when files move or migrate, this strategy can miss bit loss due to media degradation, software or human error, or malfeasance that is only discovered when the file is retrieved.8 a common approach for monitoring fixity-at-rest is to systematically conduct fixity checks on all or a sample of files at regular intervals. these types of fixity checks may or may not use cryptographic algorithms, depending on their availability.9 replication replication is another cornerstone of achieving bit-level digital preservation. the national digital stewardship alliance’s 2019 levels of digital preservation, a popular community standard, recommends maintaining at least two copies in separate locations, while noting three copies in geographic locations with different disaster threats is stronger.10 all of these copies must be compared to ensure fixity is maintained. an important concept to consider when thinking about replication is the independence of each copy. 
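to make these two basic activities concrete, the sketch below (python; the file paths are hypothetical and local directories stand in for real storage locations) records a sha-256 authentication checksum at ingest, copies the original to several replica locations, and confirms each copy with a cheaper md5 transactional check. it deliberately copies from the original rather than from another replica, anticipating the independence principle discussed next; it is an illustration of the concepts above, not a description of any particular system.

```python
import hashlib
import shutil
from pathlib import Path

def checksum(path: Path, algorithm: str = "md5", chunk_size: int = 1 << 20) -> str:
    """stream a file through the named hashlib algorithm and return the hex digest."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def ingest_and_replicate(original: Path, replica_dirs: list[Path]) -> dict:
    """record an authentication checksum, then copy the original to each replica
    location independently and confirm each copy with a transactional check."""
    record = {"authentication_sha256": checksum(original, "sha256"), "replicas": []}
    source_md5 = checksum(original, "md5")
    for target_dir in replica_dirs:
        target_dir.mkdir(parents=True, exist_ok=True)
        copy = target_dir / original.name
        shutil.copy2(original, copy)  # always copy from the original, never from another replica
        if checksum(copy, "md5") != source_md5:
            raise RuntimeError(f"transactional fixity failed for {copy}")
        record["replicas"].append(str(copy))
    return record

if __name__ == "__main__":
    # hypothetical paths for illustration only
    print(ingest_and_replicate(Path("masters/report.pdf"),
                               [Path("/mnt/site-a"), Path("/mnt/site-b"), Path("/mnt/site-c")]))
```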
according to schaefer et al.’s user guide for the preservation storage criteria, “the copies should exist independently of one another in order to mitigate the risk of having one event or incident which can destroy enough copies to cause loss of data.”11 in other words, a replica should not depend on another replica, but instead depend on the original file. advanced digital preservation activities when considering more robust digital preservation strategies beyond bit-level preservation, additional activities must be considered to ensure that the information contained within digital files can be understood. implementation of these activities may vary by digital object, depending on the digital preservation goal and appraised value of the content. this paper only describes a handful of the many advanced digital preservation activities as illustrative examples; the ideas in this paper could be applied to most advanced activities. metadata extraction digital files often contain various types of embedded metadata that can be used to help describe both its intellectual content and technical characteristics. this metadata can be extracted and used to populate basic descriptive metadata fields, such as title or author. extracted technical metadata is useful for broader preservation planning, but also for validating technical characteristics in derivative files. for example, if generating an access file for digitized motion picture film, it’s necessary to know the color encoding, aspect ratio, and frame rate. if these details are ignored, the access derivative may appear significantly different than the original file and give a false impression to users. information technology and libraries december 2021 a 21st century technical infrastructure | tallman 4 file format conversions file format conversions help to ensure the renderability of digital content. there are two types of file format conversions to consider: normalization and migration. normalization generally refers to proactively converting file formats upon ingest to retain informational content, e.g., converting a wordperfect document to plain text or pdf when only the informational content is desired. migration may occur at any time: upon ingest, upon access, or any time while an object is in storage. migration occurs when file formats are converted to a newer version of the same format, e.g., microsoft access 2003 (mdb) to microsoft access 2016 (accdb) or to a more stable and open format that retains features, e.g., microsoft access 2016 (accdb) to sqlite. versioning versioning, or the retention of past states of a digital object with the ability to restore previous states, is complex to implement and not always necessary. an organization might choose to apply versioning to subsets of digital content, such as within an institutional repository, but not for born-analog, or digitized material. additionally, an organization may choose to version metadata only, ignoring changes to the bitstream, such as for born-analog digital objects. figure 1. the infrastructure architecture for a typical 20th-century stack. information technology and libraries december 2021 a 21st century technical infrastructure | tallman 5 the 20th century technical infrastructure the technical infrastructure that enables digital preservation can come in many forms. while technology has advanced over the past thirty years, the cultural heritage sector, particularly where digital preservation is concerned, has been slow to adapt. 
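the technical characteristics mentioned above for digitized motion picture film (color encoding, aspect ratio, frame rate) can usually be read straight out of the file. a minimal sketch, assuming the ffprobe utility from the ffmpeg project is installed and using a hypothetical file name:

```python
import json
import subprocess

def video_characteristics(path: str) -> dict:
    """ask ffprobe for stream-level technical metadata and keep the fields
    needed to generate a faithful access derivative."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-print_format", "json", "-show_streams", path],
        capture_output=True, check=True, text=True,
    )
    streams = json.loads(result.stdout)["streams"]
    video = next(s for s in streams if s.get("codec_type") == "video")
    return {
        "pixel_format": video.get("pix_fmt"),               # color encoding
        "aspect_ratio": video.get("display_aspect_ratio"),  # e.g. "4:3"
        "frame_rate": video.get("r_frame_rate"),            # e.g. "24000/1001"
    }

if __name__ == "__main__":
    print(video_characteristics("scans/film_reel_0001.mov"))  # hypothetical digitized film file
```

a derivative-generation service in any of the stacks described below could consume a record like this to avoid giving users a false impression of the original.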
below are descriptions of three common components of a typical server stack (technical infrastructure), though the author acknowledges that some organizations have already moved past this model. figure 1 is a diagram of the typical 20th-century stack. storage storage, at the core of digital preservation, has benefitted from rapid technological advancement since computers first started storing information on punch cards and magnetic media. twentiethcentury servers often use three main types of storage: file, block, and object. file storage file storage is what most people are familiar with. a filesystem interfaces with the underlying storage technology (block or object) and physical media (hard disk drives, solid state drives, tapebased media, or optical media) to present users with a hierarchy of directories and subdirectories to store data. this data can easily be accessed by users or applications using file paths, while the filesystem negotiates the actual bit-locations on the physical media. the choice of filesystem can impact data integrity (fixity), although choice may be limited by operating system. in the 20th century, journaling filesystems offered the most data protection as the filesystem keeps track of all changes; in the event of a disk failure, it’s possible to recover more data if a journaling filesystem is used. block storage block storage uses blocks of memory on physical media (disk, tape, etc.) that are managed through a filesystem to present volumes of storage to the server. all interactions between server and storage are handled by the filesystem via file paths, though the data is stored on scattered blocks on the media. block storage directly attached to a server is often the most performant option, the data does not travel outside the server. network attached storage, in which an external file system is mounted to the server as if it were locally attached block storage, requires data to travel through cables and networks before it gets to the server, which decreases performance. object storage object storage, which still uses tape and disk media, is an abstraction on top of a filesystem. instead of using a filesystem to interact directly with storage media, the storage media is managed by software. the software pools storage media and interactions happen through an api, with files being organized into “buckets” instead of using a filesystem with paths. object storage is webnative and the basis for commercial cloud storage. software-defined storage, which is discussed in more detail later in this article, also allows users to create block storage volumes that can be directly mounted to virtual servers as part of a filesystem or to create network shares that present the underlying storage to users via a filesystem.12 both block and object storage can be used for high-performance storage, hot storage (online), cold storage (nearline), and offline storage. generally, tape and slower performing hard disks are used for offline and nearline storage; faster performing hard disks are used for online storage. 
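stepping back to the object storage model described above, the sketch below shows what interacting with an object store typically looks like through an s3-compatible api, here using the boto3 library; the endpoint, bucket, key, and metadata are hypothetical, and credentials are assumed to come from the environment.

```python
import boto3

# hypothetical s3-compatible endpoint (a public cloud region or an on-premises object store)
s3 = boto3.client("s3", endpoint_url="https://objects.example.edu")

bucket, key = "preservation-masters", "collection-42/report.pdf"

# store the file as an object in a bucket: no filesystem path, just bucket + key
with open("masters/report.pdf", "rb") as body:
    s3.put_object(Bucket=bucket, Key=key, Body=body,
                  Metadata={"ingest-sha256": "sha256-digest-recorded-at-ingest"})

# retrieve it later through the same web-native api
obj = s3.get_object(Bucket=bucket, Key=key)
print(obj["ContentLength"], obj["Metadata"])
```

the physical media underneath (hard disk, solid state, or tape) is invisible at this interface, which is part of what makes object storage attractive for replication and automation.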
solid-state drives (ssds) using non-volatile memory express (nvme) protocols are best suited for high-performance storage.13 in the 2019 storage infrastructure survey by the national digital stewardship alliance, 60% of those aware of their organizational infrastructure reported a reliance on hardware-based filesystems (file and block storage) while about 15% used software-based filesystems (object storage), with 14% reporting a hybrid approach.14 this indicates that the cultural heritage sector continues to rely more on file and block storage and is not yet fully embracing object storage. the survey did not probe into why this might be. servers: physical and virtual twentieth-century technical infrastructures relied primarily upon physical servers. physical servers, also called bare metal, dominated the server landscape up through roughly 2005. virtual servers arrived on the scene after "vmware introduced a new kind of virtualization technology which … [ran] on the x86 system" in 1999.15 server virtualization facilitated a fresh wave of innovation by making it easier and more inexpensive to create, manage, and destroy servers as necessary. dedicating physical servers to one or a limited number of applications requires more resources and expends a higher carbon cost; virtual servers can be highly configured for their precise needs and this configuration can be changed using software, rather than changing parts on a physical server, resulting in less waste. cultural heritage organizations have been slow to fully adopt virtual servers. the 2019 ndsa storage infrastructure survey reports that 81% of respondents continue to rely on physical servers with 63% of respondents using virtual servers. fewer than 10% reported using containers, an even more efficient virtualization technology.16 containers are an evolution of virtual servers that act like highly optimized, self-contained servers doing a specific activity.17 applications and microservices in the 20th century, applications often required dedicated servers. business logic was handled by applications or microservices that ran on top of the server and storage, the highest level in the stack. there are advantages to handling the business logic at this high level: it's completely in the control of the developer and can be finely tuned to the application's needs. unfortunately, this is also an expensive place to handle all business logic, as the application needs to be maintained over time and there's overhead involved in working at this level of the stack. microservices, in this server model, are generally specific commands that can be invoked as needed. while called microservices because they can be applied individually, they still run in this expensive part of the stack and have the same downsides as applications. in digital preservation systems using this type of architecture, basic and advanced digital preservation activities occur within this application layer. fixity can be a costly activity.
garnett, winter, and simpson, in their paper "checksums on modern filesystems, or: on the virtuous consumption of cpu cycles," point out that "calculating full checksums in software is not efficient" and "increases wear and tear on the disks themselves, actually accelerating degradation."18 fixity, when done this way, is a linear process that requires every file to be read from disk so a checksum can be calculated; when performing fixity over large amounts of content, this is very inefficient and time consuming. preservation activities in the 20th-century stack in this model of infrastructure, many cultural heritage institutions are relying on practices created when the field of digital preservation was emerging. basic activities basic preservation activities take a generalized approach and mostly occur in the costly application and microservices layer. this follows the general approach of application development from the commercial sector in the 20th century. fixity although there are differences in frequency, most organizations do not currently make distinctions between transactional fixity, authentication fixity, or fixity-at-rest. common current practices use the same method (md5, sha-256, sha-512) for all fixity checks.19 this inefficient approach takes place in the application and microservices layer and uses more compute power than necessary, increasing the environmental impact. replication in most 20th-century stacks, replications are handled in the application layer, where it is most costly in terms of computational power and labor to maintain, having a negative impact on sustainability. some are using 20th-century microservices as well. advanced digital preservation activities like basic preservation activities, advanced ones chiefly take place in the application and microservices layer if they occur at all. metadata extraction and file format conversion metadata extraction and file format conversion tends to occur only upon ingest as a one-time event. archivematica, the popular open-source digital preservation system, uses 20th-century microservices for each and they only occur during the transfer (ingest) process.20 other systems often include this in the business logic of the application layer. versioning version control is a feature that many organizations choose not to implement. the 2019 ndsa storage infrastructure survey shows that fewer than half (40 of 83) of respondents used any type of version control.21 version control is hard to implement in a custom system, and alternative approaches vary. fedora, a digital preservation backend repository, introduced support for versioning in the application layer around 2004.22 advances in the commercial sector since the conceptualization of the 20th-century stack, there have been significant advancements made in the general it industry. virtualization technology developed in the 1990s led to the proliferation of cloud computing and infrastructure that transformed the it industry in the early 2000s, leading to the "long-held dream of computing as a utility" or commodity.23 clouds can be public, where anyone is able to provision and use services, or private, where services are only available to a group of authorized users. public clouds are run in commercial data centers while private clouds are typically built in privately owned data centers, though it's possible to use commercial data centers to build private clouds.
hybrid clouds are also possible, typically combing private and public clouds, or combining on-premises infrastructure with a private or public cloud. information technology and libraries december 2021 a 21st century technical infrastructure | tallman 8 in 2009, researchers at uc berkley identified three strong reasons why cloud computing has been so widely adopted: the illusion of vertical scaling on demand, elimination of upfront cost, and the ability to pay for short-term resources.24 surveys from the ndsa and the beyond the repository grant project show a steady, but slow adoption of cloud infrastructure by the cultural heritage community.25 it is unclear whether early adopters have chosen independently or simply followed it changes in their parent organizations. any organization can build a private cloud and take advantage of the benefits described in this article. using the cloud does not mean that you must contract with commercial cloud providers. some organizations may choose to build a private cloud if there are concerns over data sovereignty, mistrust in public clouds, or for other reasons. the ontario council of university libraries in canada has built a private cloud for its members called the ontario library research cloud using openstack, a suite of open-source software for building clouds.26 software-defined storage while virtualization enables cloud computing, software-defined storage is the foundation for cloud storage. software-defined storage combines inexpensive hardware with software abstractions to create a flexible, scalable, storage solution that provides data integrity.27 software-defined storage can use the same pool of disks to present all three of the common types of storage: file, object, and block. file storage is what most users are familiar with. software defined file storage creates a network file share from which files can be accessed on local devices via a filesystem.28 object storage in this environment is like a web-native file share; files are stored in buckets, which can be further organized by folders. files are not accessed through a filesystem, but are instead accessed through uris, which makes object storage very amenable to web applications and avoids some of the pitfalls of relying on filesystems. block storage is mostly used to mount storage to virtual servers, storage that is directly attached to the server as if it was a physical disk or volume mounted to the server. block storage is more performant than either file or object storage; as such it’s typically used for things like the operating system and application code, but not for storing content. all storage can be managed through apis, adding to its suitability for automation, software development, and it operations.29 hardware diversity software defined storage also has features that make a compelling use case for digital preservation. first, software defined storage accommodates hardware diversity. because software defined storage is an abstraction, it’s possible to combine different types of storage media, from different manufacturers and production batches to ensure some technical diversity and avoid risk from catastrophic failure from a hardware monoculture. fixity and integrity second, like the use of raid in traditional filesystems, file integrity can be strengthened through the use of erasure coding.30 erasure coding splits files into chunks and spreads them across multiple disks or potentially nodes such that the file can be reconstructed if some of the disks or nodes fail. 
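the idea behind erasure coding can be shown with the simplest possible scheme: two data chunks plus one xor parity chunk, so that any single lost chunk can be rebuilt from the other two. this is only a toy sketch of the principle; production software-defined storage systems use reed–solomon style codes with configurable numbers of data and parity chunks, as noted next.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data: bytes) -> list[bytes]:
    """split data into two equal chunks and add one xor parity chunk (2 data + 1 parity)."""
    if len(data) % 2:
        data += b"\x00"                    # pad so the two data chunks are equal length
    half = len(data) // 2
    d0, d1 = data[:half], data[half:]
    return [d0, d1, xor_bytes(d0, d1)]     # any single lost chunk can be rebuilt

def recover_chunk(lost_index: int, chunks: list[bytes]) -> bytes:
    """rebuild one missing chunk by xor-ing the two survivors."""
    survivors = [c for i, c in enumerate(chunks) if i != lost_index]
    return xor_bytes(*survivors)

if __name__ == "__main__":
    chunks = encode(b"a small preservation master, pretend it is large")
    assert recover_chunk(1, chunks) == chunks[1]   # "disk 1" failed; its chunk is reconstructed
```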
erasure coding can be configured in different ways, depending on the amount of parity desired.31 replication third is replication of content. for cloud administrators, replication might be an alternative to erasure coding for ensuring data integrity; for digital preservationists, it's a distinct strategy and basic preservation activity. operating nodes in a software defined storage network can be in different availability zones; through object storage policies, content can be replicated as many times as necessary to provide mitigation of geographically based threats. it's even possible to replicate to object storage in a different software defined storage network, helping to achieve organizational diversity as well. figure 2. the infrastructure architecture for a theoretical 21st-century stack. an updated technical infrastructure for the 21st century a theoretical 21st-century stack for digital preservation has many of the same components as its 20th-century antecedent. however, these components are used in different ways, largely due to technological advancements. leveraging these advancements to handle digital preservation activities at lower levels of the stack reduces the complexity of the business logic in the application layer. figure 2 shows an updated architecture diagram for this 21st-century stack, which may be used by an individual organization, consortium, or service provider planning to build a digital preservation system. the storage layer is built on software-defined storage with digital content primarily being stored as objects; these objects are stored using the oxford common file layout (discussed further later). physical bare metal servers are used to power virtual machines that host applications such as a digital repository. physical servers also host container and function-as-a-service platforms to provide a suite of microservices for processing digital content. storage in this new stack, storage is primarily managed through software defined storage with data flowing over networks. there are currently two primary open-source options for running a software-defined storage service: gluster and ceph. both can be installed and run on-premises, in a private or public data center, or even contracted through infrastructure as a service (iaas). in his presentation at the 2018 designing storage architectures for digital collections meeting, hosted by the library of congress, glenn heinle recommended ceph over gluster where data integrity is the highest priority; however, others argue that gluster is better for long-term storage.32 this is likely because ceph is better able to recover from hardware failures.33 file storage reliance on file storage has become minimal in this theoretical stack, with data primarily stored as objects. however, file storage may still be used; when it is, it benefits from using a modern filesystem. several modern filesystems have emerged since 2000, most notably zfs and openzfs,34 with their innovative copy-on-write transactional model and methods for managing free space.35 both zfs and openzfs can also be configured to use raid-z, which maintains block-level fixity by calculating checksums for each block of data and verifying the checksum when accessed. this can be combined with simple software to touch every block on a regular basis to ensure fixity-at-rest.
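the "simple software" in question can be as small as a scheduled job that asks the pool to scrub itself and records the outcome as a preservation event. a minimal sketch, assuming an openzfs pool and an external scheduler such as cron; the pool name and log location are hypothetical.

```python
import datetime
import json
import subprocess

POOL = "tank"   # hypothetical openzfs pool holding preservation storage

def scrub_and_log(logfile: str = "scrub-events.jsonl") -> None:
    """start a pool scrub (which re-verifies every block checksum) and log the status.
    note: the scrub runs in the background; in practice the status would be polled
    again after it completes and the final result recorded."""
    subprocess.run(["zpool", "scrub", POOL], check=True)
    status = subprocess.run(["zpool", "status", POOL],
                            capture_output=True, text=True, check=True).stdout
    event = {
        "event_type": "fixity check (block level)",
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "detail": status,
    }
    with open(logfile, "a") as log:
        log.write(json.dumps(event) + "\n")

if __name__ == "__main__":
    scrub_and_log()   # schedule via cron or systemd, e.g. once a month
```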
although this is a different approach from file-level fixity checks, it accomplishes the same thing in a much more efficient way: preservation metadata could be recorded for each block that contains part of the file.36 zfs has also inspired similar modern filesystems such as btrfs, apple file system (apfs), refs, and reiser.37 however, even if this theoretical stack isn’t relying on file storage to persist data, software-defined storage is an abstraction that sits atop servers and disks (or tape) that do use filesystems.38 ironically, zfs is not the best option for the underlying disks, as its data integrity features come with more overhead, and data integrity can be achieved through different means with software-defined storage.39

block storage
block storage comes in two forms in this future stack. many virtual servers will leverage the block storage offerings of the software-defined storage service, attaching virtual disk blocks to virtual servers. however, the physical servers that support virtualization will still have some physically attached storage using ssds (through nvme) to support high-performance storage needs. this physically attached block storage is more performant than virtually attached block storage since the system has direct access to the disks and does not have to work through a virtually abstracted filesystem.

object storage
object storage has become the primary method of storing data in this theoretical stack. the flexibility of object storage, with its web-native apis and authentication, gives it an advantage as systems become less centralized and more integrations are needed. the natural scalability of object storage and the variety of private, public, and commercial offerings greatly simplify geographic and organizational redundancy when replicating data. with software-defined storage, it’s also possible to offer hot (live) and cold (nearline, offline) options, giving flexibility for how data is stored to better optimize the storage for various needs. hot storage may use either hard disk or solid-state drives, while cold storage would rely on tape or optical media. presently, options for running software-defined storage on tape and optical media are mostly proprietary.40 while this would be a concern if these systems held the only copy, the risk can be managed if the data is replicated to systems using other technology and media. while optical media has long been criticized for use as a preservation medium, when well managed, the risk may be overstated.41

oxford common file layout
the oxford common file layout (ocfl) is a “shared approach to filesystem layouts for institutional and preservation repositories.”42 ocfl is a specification for organizing digital objects in a way that supports preservation while being computationally efficient.
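as a rough illustration of what such a layout looks like in practice, the sketch below writes an ocfl-flavoured object: a namaste marker, per-version content directories, and an inventory that keys stored files by digest so that a later version only adds content whose digest is new (the forward delta idea discussed later in the article). this is a simplified illustration, not a conforming implementation; required details such as the inventory sidecar digest, version metadata, and validation are omitted, so consult the specification for the authoritative field names.

```python
# a deliberately simplified, ocfl-flavoured object writer. field names
# loosely follow the published inventory format, but this omits required
# details (sidecar digests, created/user/message metadata, validation).
import hashlib
import json
from pathlib import Path

def sha512(data: bytes) -> str:
    return hashlib.sha512(data).hexdigest()

def write_version(root: Path, files: dict, object_id: str) -> None:
    """add a new version of `files` ({logical path: bytes}) to the object at root."""
    root.mkdir(parents=True, exist_ok=True)
    (root / "0=ocfl_object_1.0").write_text("ocfl_object_1.0\n")   # namaste marker
    inv_path = root / "inventory.json"
    inventory = (json.loads(inv_path.read_text()) if inv_path.exists()
                 else {"id": object_id, "digestAlgorithm": "sha512",
                       "head": "v0", "manifest": {}, "versions": {}})
    version = f"v{int(inventory['head'][1:]) + 1}"
    state = {}
    for logical, data in files.items():
        digest = sha512(data)
        if digest not in inventory["manifest"]:          # store new content only
            content_path = f"{version}/content/{logical}"
            target = root / content_path
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(data)
            inventory["manifest"][digest] = [content_path]
        state.setdefault(digest, []).append(logical)     # logical view of this version
    inventory["versions"][version] = {"state": state}
    inventory["head"] = version
    inv_path.write_text(json.dumps(inventory, indent=2))

if __name__ == "__main__":
    obj = Path("objects/demo-object")
    write_version(obj, {"file-1.txt": b"first file", "file-2.txt": b"second file"},
                  "urn:example:demo-object")
    # only file-2.txt changed, so v2/content holds just that one file
    write_version(obj, {"file-1.txt": b"first file", "file-2.txt": b"second file, revised"},
                  "urn:example:demo-object")
```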
it has several advantages for use in digital preservation: a repository can be rebuilt from the files alone, it is both human and machine readable, it supports native error detection, it allows objects to be efficiently versioned, and it is designed to work with a variety of storage infrastructures.43 although some implementation details are still being worked out, ocfl can be used with object storage.44 ocfl is in production use, and client libraries are available for javascript, java, ruby, and rust.45 in this future stack, all storage operations are handled by an ocfl client, which then interacts with the underlying software-defined storage network as shown in figure 2.

servers
physical servers are used chiefly to support virtualization in this future stack. however, this stack moves beyond virtual servers and supports containers and serverless computing. virtual servers are chiefly used to support applications and databases, while containers are perfectly suited for microservices running preservation activities. serverless, or function-as-a-service, is the next evolution in virtualization. while a container may be idling all the time, spinning into action when a microservice is called, serverless functions are run on demand only. they can cost less when using commercial services such as aws lambda or aws fargate, where the customer is billed for usage only.46 though serverless functions can make use of containers, function-as-a-service platforms have emerged, such as apache openwhisk and openfaas, which don’t always require containers.

preservation activities in the 21st-century stack
this 21st-century stack performs the same preservation activities as its predecessor. however, it generally does this at lower levels of the stack, in the infrastructure layers as opposed to the application and microservice layers. this change reduces the computational load on the stack and simplifies the business logic.

basic activities
fixity and replication are achieved by leveraging a combination of microservices and software-defined storage. by optimizing the approach to fixity for each use case, instead of using the same computationally intensive method for all fixity, organizations can use less compute power. while fixity and replication still involve the microservice layer, it is a more targeted approach.

transactional fixity
transactional fixity is maintained through a function-as-a-service-based microservice. each time a file is moved, the microservice is triggered, which calculates an md5 checksum and compares it to a stored value that was created upon ingest. if there is a mismatch between the md5 values, a second microservice is called that fetches a valid file replica. while crc32 might be preferred (because it’s slightly less cpu-intensive), box has shown that crc32 values can differ depending on how they are calculated.47 a stored crc32 can only be reliably used to confirm fixity if the new calculation uses the exact same method, because crc32 is not a true specification like md5, and implementations may differ. crc32 is recommended only when it’s possible to calculate it in the same manner, such as within the same microservice. as this introduces technical complexity, some organizations may prefer to rely solely on md5 for transactional fixity.
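a minimal sketch of the transactional-fixity function described above, written as a generic python handler rather than for any particular function-as-a-service platform; the event fields, the checksum store, and the repair-service endpoint are illustrative assumptions, not a fixed api.

```python
# a sketch of a transactional-fixity function: on each file move the
# platform invokes handle() with the object id and new location, the
# function recomputes an md5 and compares it with the value stored at
# ingest, and a mismatch triggers a second (hypothetical) repair service
# that fetches a known-good replica.
import hashlib
import json
import urllib.request

CHECKSUM_STORE = "https://example.org/fixity"   # assumption: ingest-time md5 values
REPAIR_SERVICE = "https://example.org/repair"   # assumption: replica-fetching microservice

def md5_of(path: str) -> str:
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()

def handle(event: dict) -> dict:
    """event: {"object_id": ..., "path": ...} supplied by the faas platform."""
    object_id, path = event["object_id"], event["path"]
    with urllib.request.urlopen(f"{CHECKSUM_STORE}/{object_id}") as resp:
        stored = json.load(resp)["md5"]
    current = md5_of(path)
    if current != stored:
        # hand off to the repair microservice rather than fixing in place
        req = urllib.request.Request(
            REPAIR_SERVICE,
            data=json.dumps({"object_id": object_id, "path": path}).encode(),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req)
    return {"object_id": object_id, "match": current == stored}
```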
authentication fixity
authentication fixity is maintained in much the same way as in the 20th-century model, except using a cryptographically secure checksum algorithm, such as sha-512 (part of the sha-2 family). however, distinguishing between transactional and authentication fixity allows more precise use of algorithms, only requiring more computationally intensive cryptography when it’s truly needed. authentication fixity may require the use of a container-based microservice, versus a function-as-a-service, due to the increased computational need.

fixity-at-rest
fixity-at-rest, the most common type of fixity, is managed by the software-defined storage service and reported in preservation metadata. how this is achieved might look different depending on which software-defined storage service is used. the ceph community has developed a new technology called bluestore, which serves as a custom storage backend that directly interacts with disks, essentially replacing the need to use an underlying filesystem.48 bluestore calculates checksums for every file and verifies them when read. because this is all internal and managed by the same system, crc32 is the default algorithm, but multiple algorithms are supported. ceph will “scrub” data every week. scrubbing is the process of reading the file simply to verify the checksum, even if no user has accessed the file. because of the way ceph performs erasure coding, if a checksum is invalid, the file can be repaired. what remains to be done is writing a script that will read ceph’s internal metadata and record preservation events within the object’s preservation metadata for the fixity verification and any reparative actions. ryu and park propose a “markov failure and repair model” to optimize the frequency of data scrubbing and the number of replicas such that the least amount of power is consumed and scrubbing occurs at off-peak times.49 it appears that this optimization causes less media degradation from the process of regularly reading the file, though empirical studies are needed to confirm that there is overall less degradation than conducting fixity checks through an application. gluster has a similar scrubbing process for fixity-at-rest in its optional bitrot feature, although it uses sha-256 by default instead of crc32, which requires more computing power.50

replication
replication in this future stack is mostly handled by the software-defined storage service, but microservices may play a role in achieving independence of copies.51 object storage policies allow the automatic copying of data into another region or availability zone within the software-defined storage network. however, these copies are not replicas, or independent instances, because all copies are in a chain derived from the primary instance; if there is a problem anywhere in the chain, bad data will be copied. in addition to using object storage policies, microservices could be used to independently verify the fixity of downstream copies as well as trigger true replications to independent instantiations, such as an alternative storage service or a different storage area within the same software-defined storage network.
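a sketch of the kind of verification microservice suggested above, assuming both storage areas expose s3-compatible apis (as ceph’s object gateway does) and using the boto3 client; the endpoints, credentials, and bucket names are placeholders.

```python
# a sketch of a replica-verification microservice: independently re-hash an
# object in the primary store and in a downstream copy, and push a fresh
# copy to an independent store if the downstream copy is missing or does
# not match the primary.
import hashlib
from typing import Optional

import boto3
from botocore.exceptions import ClientError

primary = boto3.client("s3", endpoint_url="https://primary.example.org")          # placeholder
independent = boto3.client("s3", endpoint_url="https://independent.example.org")  # placeholder

def sha256_of(client, bucket: str, key: str) -> Optional[str]:
    """independently re-hash an object, or return None if it cannot be read."""
    try:
        body = client.get_object(Bucket=bucket, Key=key)["Body"]
    except ClientError:
        return None                       # a missing copy is treated like a mismatch
    digest = hashlib.sha256()
    for chunk in body.iter_chunks(1024 * 1024):
        digest.update(chunk)
    return digest.hexdigest()

def verify_and_replicate(key: str) -> bool:
    """verify a downstream copy against the primary; push a fresh copy if needed."""
    source = sha256_of(primary, "preservation", key)
    copy = sha256_of(independent, "preservation-replica", key)
    if source is None:
        raise RuntimeError(f"primary copy of {key} is unreadable")
    if copy != source:
        data = primary.get_object(Bucket="preservation", Key=key)["Body"].read()
        independent.put_object(Bucket="preservation-replica", Key=key, Body=data)
        return False                      # a repair or first-time replication was triggered
    return True                           # downstream copy independently verified
```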
bill branan suggested a similar approach in his cloud-native preservation presentation at ndsa digital preservation 2019.52

advanced digital preservation activities
advanced digital preservation activities in a 21st-century stack also make use of microservices for metadata extraction and file format conversion. versioning, however, relies upon features of the oxford common file layout, even though object storage often supports versioning natively.

metadata extraction
function-as-a-service microservices are well suited to metadata extraction, actuated upon ingest or as needed. since embedded metadata is machine-readable, this activity will not be resource-intensive or time-consuming. in addition to extracting metadata and storing it as discrete sidecar files, these microservices can be used to populate specific metadata fields used by the repository, including descriptive, administrative, and technical metadata. this approach is more efficient because the functions can be reused in multiple workflows rather than being tied to specific events like ingest.

file format conversion
file format conversions use a combination of function-as-a-service and container-based microservices, depending upon the original format. like metadata extraction, conversion may occur at ingest (normalization) or as needed (migration). function-as-a-service is well suited for small to medium files, such as converting wordperfect to opendocument format. function-as-a-service is also well suited for logical preservation, when only the informational content is necessary to preserve, such as converting a tif to a txt file through ocr. container-based microservices are better suited for converting large media files that may take more memory and time, such as migrating digital video from proprietary encodings to open codecs and container formats, since function-based services often have a time constraint.

versioning
although object storage typically supports versioning, it is inefficient because each version is an entirely new object. this means that unchanged data is duplicated, taking up more disk space. the oxford common file layout, which negotiates storage between the application and microservices layers and a software-defined storage service, supports forward delta versioning in which each new version only contains the changes. using the object inventories, it’s possible to rebuild any object to any version without duplicating bits.53 an additional benefit of using ocfl is that it inherently uses checksums, and any changes or corruption are detected when an interaction occurs with the object, creating a layered approach to maintaining fixity-at-rest.

sustainability in the 21st-century stack
the differences between our 20th- and 21st-century stacks result in a more sustainable approach to digital preservation, per the triple-bottom-line definition.54 by adopting commercial sector approaches, cultural heritage organizations can take advantage of more efficient data centers.

people (labor)
by shifting activities to lower levels in the stack and letting infrastructure components self-manage, fewer people are needed to develop and maintain the business logic that formerly handled the same action. the application and microservice layers use programming languages and libraries that can become outdated quickly, requiring development work to maintain functionality.
while there is still a need for specialized knowledge, fewer people are needed to do the work when these actions take place in more stable parts of the stack.

planet (environmental)
our new stack has a lower environmental impact for a variety of reasons. first, per kryder’s law (the storage parallel to moore’s law for computing), the areal density of storage media predictably increases annually, and our new stack uses the latest hard disk and tape technology.55 this results in needing less media, some of which doesn’t need power to run, decreasing the carbon impact. additionally, our new stack uses a mix of hot and cold storage, making it possible to implement automatic tiering to shift objects to less resource-intensive storage, like tape.56 second, as the stack becomes more serverless, fewer computational resources are needed. even though container and function-based microservices may incur some overhead in terms of cpu cycles, it is more efficient in terms of system idling to run these as microservices on one platform rather than doing the same action in the application or vm layer. this further decreases the carbon impact while also decreasing the dependency on rare-earth elements. relatedly, by leveraging software-defined storage to maintain fixity-at-rest, the compute load is greatly decreased; the cpu cost of calculating checksums in the storage layer is less than when this is done through applications or microservices.

profit (economic)
sustainability improvements for both people and planet may also result in a lower total cost of ownership for a digital preservation system. cost is a prime motivator when administrators and leaders make long-term decisions; decreasing the annual operating cost related to digital preservation is crucial to the viability of a program.

future and related work
the 21st-century stack proposed in this paper is not the only way to increase sustainability or the only way in which digital preservation stacks will change. the planet is running out of bandwidth and will need to expand into using 5g and low-earth orbit satellite communications. new, quantum-resistant algorithms will need to be introduced as quantum computing advances.57 blockchain technology introduces many possibilities. inherently, blockchain maintains fixity. the archangel project is exploring practical methods of recording provenance and proving authenticity by using a permissioned blockchain.58 blockchain is also the technology behind the interplanetary file system (ipfs), a content-addressed distributed storage network, which is in turn used by filecoin, a marketplace for ipfs storage. small data industries is building starling, a filecoin-based application designed for cultural heritage organizations to securely store digital content.59 it’s important to note that these blockchain-based projects use a proof-of-stake model instead of a proof-of-work model, which has a significantly lower environmental impact than other blockchain implementations like the cryptocurrency bitcoin.60 while some organizations, like stanford university, may already leverage software-defined storage, most in the cultural heritage sector do not.61 the metaarchive cooperative, a nonprofit consortium for digital preservation, has begun a noteworthy project to explore using software-defined storage in a distributed digital preservation network.
metaarchive, which currently uses lockss, is one of the few public digital preservation services that mitigates risk through organizational and administrative diversity. because members host and administer the lockss nodes that contain the replications, each copy is managed by a different set of organizational and administrative policies and often use different technology to do so. diversifying in this way protects against a single point of failure if only one organization managed the technical infrastructure. how this same diversity is achieved in a software-defined storage-based distributed digital preservation network will be a great contribution to the community. it would also be useful to study the reasons cultural heritage organizations have been so reluctant to adopt commercial sector technologies. identifying these hesitations would make it possible to create strategies that would encourage adoption of these approaches. it may simply be that when it comes to digital preservation, familiar and proven technologies provide a level of comfort. organizations may also be entrenched in custom developed solutions that are hard to move away from. conclusion digital preservation is a long-term commitment. while re-appraisal may take place, it’s inevitable that the amount of content stored in digital repositories will only increase over time. it is fiduciarily incumbent upon the cultural heritage community to examine our practices and look for better alternatives. exceptionalism ignores technological advancements made by the commercial industry, advancements that are very well suited to digital preservation. by adopting commercial industry data practices, such as software-defined storage, while simultaneously implementing innovations from within the cultural heritage community, like the oxford common file layout, it is possible to reduce the ongoing costs, resource consumption, and environmental impact of digital preservation. information technology and libraries december 2021 a 21st century technical infrastructure | tallman 16 endnotes 1 ben goldman, “it’s not easy being green(e): digital preservation in the age of climate change,” in archival values: essays in honor of mark a. greene, ed. mary a. caldera and christine weidman (chicago: american library association, 2018), 274–95, https://scholarsphere.psu.edu/concern/generic_works/bvq27zn11p. 2 “a simple explanation of the triple bottom line,” university of wisconsin sustainable management, october 2, 2019, https://perma.cc/2hwf-3mmq. 3 goldman, “it’s not easy being green(e).” 4 keith l. pendergrass et al., “toward environmentally sustainable digital preservation,” the american archivist 82, no. 1 (2019): 165–206, https://doi.org/10.17723/0360-9081-82.1.165. 5 sarah barsness et al., 2017 fixity survey report: an ndsa report (osf, april 24, 2018), https://doi.org/10.17605/osf.io/snjbv; david s. h. rosenthal et al., “requirements for digital preservation systems: a bottom-up approach,” d-lib magazine 11, no. 11 (2005), https://perma.cc/x2r7-r5xp. 6 matthew addis, which checksum algorithm should i use? (dpc technology watch guidance note, digital preservation coalition, december 11, 2020), https://doi.org/10.7207/twgn20-12. 7 premis editorial committee, premis data dictionary for preservation metadata, version 3.0 (library of congress, november 2015), https://perma.cc/l79v-gqv7. 8 some organizations may continue to use a strategy where fixity is only checked when a file is accessed, if the potential loss fits within a defined acceptable loss. 
while this strategy may not work for all organizations, recognizing that loss is inevitable and defining a level of acceptable loss is an effective and pragmatic approach to managing risk of bit decay. 9 barsness et al., 2017 fixity survey report. 10 ndsa levels of preservation revisions working group, “levels of digital preservation, 2019 lop matrix, v2.0 (osf, october 14, 2019), https://osf.io/2mkwx/. 11 sibyl schaefer et al., “user guide for the preservation storage criteria,” february 25, 2020, https://doi.org/10.17605/osf.io/sjc6u. 12 mark carlson et al., “software defined storage,” (white paper, storage network industry association, january 2015), https://perma.cc/aq4t-9yxq. 13 abutalib aghayev et al., “file systems unfit as distributed storage backends” (proceedings of the 27th acm symposium on operating systems principles—sosp ’19, huntsville, ontario, canada: association for computing machinery, 2019): 353–69, https://doi.org/10.1145/3341301.3359656. 14 ndsa storage infrastructure survey working group, 2019 storage infrastructure survey: results of the storage infrastructure survey (osf, 2020), https://doi.org/10.17605/osf.io/uwsg7. information technology and libraries december 2021 a 21st century technical infrastructure | tallman 17 15 joseph migga kizza, “virtualization technology and security,” in guide to computer network security, computer communications and networks (springer, cham, 2017), 457–75, https://doi.org/10.1007/978-3-319-55606-2_21. 16 ndsa storage infrastructure survey working group, 2019 storage infrastructure survey. 17 eric jonas et al., “cloud programming simplified: a berkeley view on serverless computing” (university of california, berkeley, february 10, 2019), https://perma.cc/yam2-tz8w. 18 alex garnett, mike winter, and justin simpson, “checksums on modern filesystems, or: on the virtuous consumption of cpu cycles,” in ipres 1028 conference [proceedings] (international conference on digital preservation, boston, mass., 2018), https://doi.org/10.17605/osf.io/y4z3e. 19 barsness et al., 2017 fixity survey report. 20 “import metadata,” documentation for archivematica 1.12.1, artefactual systems, inc., accessed may 21, 2021, https://perma.cc/ue3r-bdgz.; “ingest,” documentation for archivematica 1.12.1, artefactual systems, inc., accessed may 21, 2021, https://perma.cc/5sn5-gfx3. 21 ndsa storage infrastructure survey working group, 2019 storage infrastructure survey. 22 “fedora content versioning,” 2005, https://duraspace.org/archive/fedora/files/download/2.0/userdocs/server/features/version ing.html. 23 michael armbrust et al., above the clouds: a berkeley view of cloud computing, (technical report, eecs department, university of california, berkeley, february 10, 2009), https://perma.cc/qj5w-8s5y. 24 armbrust et al., above the clouds. 25 micah altman et al., “ndsa storage report: reflections on national digital stewardship alliance member approaches to preservation storage technologies,” d-lib magazine 19, no. 5/6 (may 2013), https://doi.org/10.1045/may2013-altman; michelle gallinger et al., “trends in digital preservation capacity and practice: results from the 2nd bi-annual national digital stewardship alliance storage survey,” d-lib magazine 23, no. 
7/8 (2017), https://doi.org/10.1045/july2017-gallinger; ndsa storage infrastructure survey working group, 2019 storage infrastructure survey; evviva weinraub et al., beyond the repository: integrating local preservation systems with national distribution services (northwestern university, 2018), https://doi.org/10.21985/n28m2z. 26 ontario council of university libraries, “ontario library research cloud,” accessed april 14, 2021, https://perma.cc/kmp9-fs8k; “open source cloud computing infrastructure,” openstack, accessed april 14, 2021, https://perma.cc/g9ge-92jd. 27 nathan tallman, “software defined storage,” (presentation for the ndsa infrastructure interest group, march 16, 2020), https://doi.org/10.26207/3nn2-zv13. information technology and libraries december 2021 a 21st century technical infrastructure | tallman 18 28 these network shares typically use the smb (server message block) or cifs (common internet file system) protocols to present file shares through a graphical user interface in operating systems such as windows or macos while the nfs (network file shares) protocol is more often used to mount storage in linux. 29 carlson et al., “software defined storage.” 30 raid, or the redundant array of independent disks, is technology that splits a file into multiple chunks and spreads them across multiple disks in a storage device, adding extra copies of the chunks so that the file can be recovered if an individual drive fails. 31 abhijith shenoy, “the pros and cons of erasure coding & replication vs raid in next-gen storage platforms” (software developer conference, storage networking industry association, 2015), https://perma.cc/yfs5-kxkk. 32 glenn heinle, “unlocking ceph” (presentation, designing storage architectures for digital collections, washington, dc: library of congress, 2019), https://perma.cc/z2r9-79ze; tamara scott, “big data storage wars: ceph vs gluster,” technologyadvice (blog), may 14, 2019, https://perma.cc/2yy2-bbxg. 33 giacinto donvito, giovanni marzulli, and domenico diacono, “testing of several distributed file-systems (hdfs, ceph and glusterfs) for supporting the hep experiments analysis,” journal of physics: conference series 513, no. 4 (june 2014): 042014, https://doi.org/10.1088/1742-6596/513/4/042014. 34 matthew ahrens, “openzfs: a community of open source zfs developers,” in asiabsdcon 2014 (asiabsdcon, tokyo, japan: bsd research, 2014), 27–32, https://perma.cc/xg79-pbu7. 35 brian hickmann and kynan shook, “zfs and raid-z: the über-fs?” (university of wisconsin– madison, december 2007), https://perma.cc/w5pd-enpp. 36 garnett, winter, and simpson, “checksums on modern filesystems.” 37 edward shishkin, “resier5 (format release 5.x.y),” marc mailing list archive, 2019, https://perma.cc/dn8y-v8kq. 38 “fujifilm launches ‘fujifilm software-defined tape,’” fujifilm europe, may 19, 2020, https://perma.cc/b3gn-plr9. 39 aghayev et al., “file systems unfit as distributed storage backends.” 40 ibm systems, “tape goes high speed,” 2016, https://perma.cc/fnv9-rtg9; “fujifilm launches ‘fujifilm software-defined tape’”; desire athow, “here’s what sony’s million gigabyte storage cabinet looks like,” techradar (blog), 2020, https://perma.cc/vhn4-layt. 41 david rosenthal, “optical media durability: update,” dshr’s blog, august 20, 2020, https://perma.cc/vkw9-83j3. 
information technology and libraries december 2021 a 21st century technical infrastructure | tallman 19 42 andrew hankinson et al., “the oxford common file layout: a common approach to digital preservation,” publications 7, no. 2 (june 2019): 39, https://doi.org/10.3390/publications7020039. 43 andrew hankinson et al., “oxford common file layout specification,” july 7, 2020, https://perma.cc/s73z-3n6k. 44 marco la rosa et al., “our thoughts on ocfl over s3 · issue #522 · ocfl/spec,” github, accessed march 12, 2021, https://perma.cc/pa3g-cb78. 45 hannah frost, “version 1.0 of the oxford common file layout (ocfl) released,” stanford libraries (blog), july 23, 2020, 1, https://perma.cc/5j5m-gyqw; andrew woods, “implementations | ocfl/spec,” github, february 10, 2021, https://github.com/ocfl/spec. 46 while serverless might be the ultimate microservice, requiring the least amount of overhead, costs may still be hard to predict. 47 ryan luecke, “crc32 checksums; the good, the bad, and the ugly,” box blog, october 12, 2011, https://perma.cc/mvp7-yvzv. 48 aghayev et al., “file systems unfit as distributed storage backends.” 49 junkil ryu and chanik park, “effects of data scrubbing on reliability in storage systems,” ieice transactions on information and systems e92-d, no. 9 (september 1, 2009): 1639–49, https://doi.org/10.1587/transinf.e92.d.1639. 50 raghavendra talur, “bitrot detection | gluster/glusterfs-specs,” github, august 15, 2015, https://github.com/gluster/glusterfsspecs/blob/fe4c5ecb4688f5fa19351829e5022bcb676cf686/done/glusterfs%203.7/bitrot.m d. 51 schaefer et al., “user guide for the preservation storage criteria.” 52 bill branan, “cloud-native preservation” (osf, october 22, 2019), https://osf.io/kmdyf/. 53 andrew hankinson et al., “implementation notes, oxford common file layout specification,” july 7, 2020, https://perma.cc/pvf8-sqfn. 54 although out of scope in terms of the stack, the policies and practices implemented by organizations can have a direct impact on digital preservation sustainability. for example, appraisal can be the most powerful tool available to an organization to control the amount of content being preserved. despite storage vendors proclamations that storage is cheap, digital preservation is not. it is not wise nor necessary to keep every digital file. organizations will benefit from applying flexible appraisal systems that reduce the amount of content needing preservation, but also establishing different classes of preservation so the most advanced activities are only applied as needed. additionally, organizations should consider allowing lossy compression to decrease disk usage, where appropriate; compression as an appraisal choice is very similar to choosing to sample a grouping of material rather than preserving the whole. for additional information see nathan tallman and lauren work, “approaching information technology and libraries december 2021 a 21st century technical infrastructure | tallman 20 appraisal: framing criteria for selecting digital content for preservation,” in ipres 1028 conference [proceedings] (international conference on digital preservation, boston, mass.: osf, 2018), https://doi.org/10.17605/osf.io/8y6dc. 55 david rosenthal, “cloud for preservation,” dshr’s blog, 2019, https://perma.cc/zls9-r857. 
56 pendergrass et al., “toward environmentally sustainable digital preservation.” 57 henry newman, “industry trends” (presentation, designing storage architectures for digital collections, washington, dc: library of congress, 2019), https://perma.cc/3mgk-n5u3. 58 t. bui et al., “archangel: trusted archives of digital public documents,” in proceedings acm document engineering 2018 (association for computing machinery, arxiv.org, 2018), https://doi.org/10.1145/3209280.3229120. 59 ben fino-radin and michelle lee, “[starling]” (presentation, designing storage architectures for digital collections, washington, dc: library of congress, 2019), https://perma.cc/7lguuew9. 60 for additional information on the differences between proof-of-stake and proof-of-work models, see peter fairley, “ethereum plans to cut its absurd energy consumption by 99 percent,” ieee spectrum (blog), january 2, 2019, https://perma.cc/gch7-t556. 61 julian morley, “storage cost modeling” (presentation, pasig, mexico city, mexico, 2019), https://doi.org/10.6084/m9.figshare.7795829.v1.

communications

dispelling five myths about e-books
james e. gall
james e. gall (james.gall@unco.edu) is assistant professor of educational technology at the university of northern colorado, greeley.

some considered 2000 the year of the e-book, and due to the dot-com bust, that could have been the format’s high-water mark. however, the first quarter of 2004 saw the greatest number of e-book purchases ever, with more than $3 million in sales. a 2002 consumer survey found that 67 percent of respondents wanted to read e-books; 62 percent wanted access to e-books through a library. unfortunately, the large amount of information written on e-books has begun to develop myths around their use, functionality, and cost. the author suggests that these myths may interfere with the role of libraries in helping to determine the future of the medium and access to it. rather than fixate on the pros and cons of current versions of e-book technology, it is important for librarians to stay engaged and help clarify the role of digital documents in the modern library.

although 2000 was unofficially proclaimed as the year of the electronic book, or e-book, due in part to the highly publicized release of a stephen king short story exclusively in electronic format, the dot-com bust would derail a number of high-profile e-book endeavors. with far less fanfare, the e-book industry has been slowly recovering. in 2004, e-books represented the fastest-growing segment of the publishing industry. during the first quarter of that year, more than four hundred thousand e-books were sold, a 46 percent increase over the previous year’s numbers.1 e-books continue to gain acceptance with some readers, although their place in history is still being determined: fad? great idea too soon? wrong approach at any time? the answers partly depend on the reader’s perspective. the main focus of this article is the role of e-book technologies in libraries. libraries have always served as repositories of the written word, regardless of the particular medium used to store the words. from the ancient scrolls of qumran to the hand-illuminated manuscripts of medieval europe to the familiar typeset codices of today, the library’s role has been to collect, organize, and share ideas via the written word. in today’s society, the written word is increasingly encountered in digital form. writers use word processors; readers see words displayed; and researchers can scan countless collections without leaving the confines of the office. for self-proclaimed book lovers, the digital world is not necessarily an ideal one.
emotional reactions are common when one imagines a world without a favorite writing pen or the musty-smelling, yellowed pages of a treasured volume from youth. one of the battle lines between the traditional bibliophile and the modern technologist is drawn over the concept of the e-book. some see this digital form of the written word as an evolutionary step beyond printed texts, which have sometimes been humorously dubbed tree-books. although a good deal of attention has been generated by the initial publicity regarding newer e-book technologies, the apparent failures of most of them have begun to establish myths around the concept. abram points out that the relative success of e-books in niche areas (such as reference works) is in direct contrast with public opinion of those purchasing novels and popular literature through traditional vendors.2 crawford paraphrases lewis carroll in describing this confusion: “when you cope with online content about e-books, you can believe six impossible things before breakfast.”3 incidentally, this article will attempt to dispel a mere five of the myths about e-books. the future of e-books and the critical role of libraries in this future are best served by uncovering these myths and seeking a balanced, reasoned view of their potential. a 2002 consumer survey on e-books found that 67 percent of respondents wanted to read an e-book, and 62 percent wanted that access to be from a library.4 underlying this position is the assumption that the ideas represented by the written word are of paramount importance to both writers and readers. it is also assumed that libraries will continue their critical role in collecting, organizing, and sharing information.

myth 1—e-books represent a new idea that has failed
many libraries have invested in various forms of e-book delivery with mixed results.5 sottong wisely warns of the premature adoption of e-book technology, which he dubs a false pretender as a replacement for printed texts.6 however, the last five years are but a small part of a longer history and, presumably, a still longer future. as is often the case with computer jargon, the term e-book has emerged and gained currency in a very short amount of time. however, the concept of providing written texts in an electronic format has existed for a long time, as demonstrated by bush’s description of the memex.7 the gutenberg project put theory into practice by converting traditional texts into digital files as early as 1971.8 even if the e-book merely represents the latest incarnation of the concept, it does so tenuously. books in their present form have a history of hundreds of years, or thousands if their parchment and papyrus ancestors are included. this history is rich with successes and failures of technology. for example, petroski presents an interesting historical examination of the problem of storing books when the one book–one desk model collapsed under the proliferation of available texts.9 similarly, a determination on the success or failure of e-books, or digital texts, based upon a relatively short period of time is fraught with difficulty. rather, it is important to look at recent developments as merely a next step.
the technology is clearly not ready for uncritical, widespread acceptance, but it is also deserving of more than a summary dismissal. � myth 2—e-books are easily defined the term e-book means different things depending on the context. at the simplest, it refers to any primarily textual material that is stored digitally to be delivered via electronic display. one of the confusing aspects of defining ebooks is that in the digital world, information and the media used to store, transfer, and view it are loosely coupled. an e-book in digital form can be stored on cd–rom or any number of other media and then passed on through computer networks or telephone lines. the device used to view an e-book could be a standard computer, a personal digital assistant (pda), or an e-book reader (the dedicated piece of equipment on which an e-book can be read; confusingly, also referred to as an e-book). technically, virtually any computing device with a display could be used as an e-book reader. from a practical point of view, our eyes might not tolerate reading great lengths of text on a wireless phone, and banks will not likely provide excerpts of chaucer during atm transactions. another important factor in defining e-books is the actual content. a conservative definition is that an e-book is an electronic copy or version of a printed text. this appears to be the predominant view of publishers. purists often maintain that a true e-book is one that is specifically written for that format and not available in traditional printed form.10 this was one of the categories of the shortlived (2000–2002) frankfurt e-book awards. of course, the multitude of textual materials that could be delivered via the technology exceeds these definitions. magazines, primary-source documents, online commentaries and reviews, and transcripts of audio or video presentations are just a short list of nonbook materials that are finding their way into e-book formats. one can note with some sense of irony that the technology behind the web was originally designed as a way for scientists to disseminate research reports.11 despite the web’s popularity, reading research reports makes up an exceedingly small percentage of its use today. although there is a continuing effort to reach a common standard for e-books (see www.openebook.org/), the current marketplace contains numerous noncompatible formats. this noncompatibility is the result of both design and competitive tradeoffs. in the case of the former, there is a distinct philosophical difference between formats that attempt to retain the original look and navigation of the printed page (such as adobe’s popular pdf files) versus those that retain the text’s structure but allow variability in its presentation (as best exemplified by the free-flowing nature of texts presented as html pages). this difference can also be seen in the functionality built around the format. traditional systems provide readers with familiar book characteristics such as a table of contents, bookmarks, and margin notes, a view that could be named bibliocentric. the alternative is one that takes more advantage of the new medium and could be labeled technocentric, and can most easily be seen in the extensive use of hyperlinking.12 the simplest use of hyperlinking provides an easy form of annotating texts and presenting related texts. 
on the other extreme, hyperlinks are used in the creation of nonlinear texts in which the followed links provide a unique context for building meaning on the part of the reader.13 it is interesting to note that a preliminary study of e-book features found that the most desirable features tended to reflect the functionality of traditional books and the least desirable features provided functionality not found there.14 competitive tradeoffs are a critical issue at the current point of e-book development. the current profit models of publishing entities and copyright concerns of authors seem naturally opposed to e-book formats in which texts were freely shared, duplicated, and distributed. for example, the open ebook forum is the most prominent organization devoted to the development of standards for e-book technologies. in late 2004, their web site listed seventy-six current members. although the american library association is a member, it is one of only six members representing library-oriented organizations. in comparison, thirty-five members (or 46 percent) are publishing organizations, and thirteen (or 17 percent) are technology companies.15 the number of traditional publishers versus technology companies on this list may suggest that a bibliocentric view of ebooks would be more favored. this also appears to confirm one media prediction that traditional publishers would continue to dominate efforts with this new medium.16 however, the limited representation of libraries in this endeavor is troubling (despite the disclaimer of using an admittedly rough metric for measuring impact). it is clear that many industry formats attempt to limit the ability to distribute materials by keying files so that they may only be viewed on one device or a specific installed version of the reader software. this creates technological problems for entities like libraries that attempt to provide access to information for various parties. the concept of fair use of copyrighted materials has to be reexamined under an entirely new set of assumptions. another irony is that the availability of free, public-domain materials in e-book format can be viewed as negative by the publishing industry. after investing considerable time and effort in developing e-book technology, publishers would prefer that users continue purchasing new e-book material rather than spend time reading the vast library of free historical material. many of these content issues are currently being played out in courts and the marketplace, particularly with regard to digital music and video.17 although one can humorously imagine the so-called problems associated with a population obsessed with downloading and reading great literature, the precedents set by these popular media will have a direct impact on the future of digital texts. despite the labor required to scan or key entire print books into digital formats, there have been some reports of this type of piracy.18 other models for the dissemination of digital intellectual property that are not determined by traditional material concerns of supply and demand will continually be attempted. for example, nelson predicted a hypertext-publishing scheme in which all material was available, but royalties were distributed according to actual access by end users.19 theoretically, such a system would provide a perfect balance between access and profitability. 
in nelson’s words “nothing will ever be misquoted or out of context, since the user can inquire as to the origins and native form of any quotation or other inclusion. royalties will be automatically paid by a user whenever he or she draws out a byte from a published document.”20 � myth 3—e-books and printed books are competing media many, if not most, published articles regarding e-books follow classic plot construction; the writer must present a protagonist and an antagonist. bibliophiles place the printed page as the hero and the e-book as the potential bane of civilization. proulx, one such author, was quoted as saying, “nobody is going to sit down and read a novel on a twitchy little screen—ever.”21 technologists cast the e-book as the electronic savior of text, replacing the tired tradition of the printed word in the same way the printed word replaced oral traditions. hawkins quotes an author who claims that e-books are “a meteor striking the scholarly publication world.” his slightly more restrained view was that e-books had the potential “to be the most far-reaching change since gutenberg’s invention.”22 grant places this metaphorical battle at the forefront by titling an article “e-books: friend or foe?”23 before deciding which side to take, consider whether this clash of media is an appropriate metaphor. this author has introduced samples of current ebook technology in graduate classes he has taught. when presented with the technology as part of the coursework, students quickly declare their allegiances. bibliophiles most often suggest that the technology will never replace the love of curling up with a good book. the technologists will ask how many pages can be stored in the device and then fantasize about the types of libraries they can carry and the various venues for reading that they will explore. however, after a few weeks in using the devices, both groups tend to move to a middle ground of practical use. at that point, the discussion turns to what materials are best left on the printed page (usually described as pleasure reading) and what would be useful in e-book format (reference works, course catalogs, how-to manuals). other instructors have reported similar patterns of use.24 at this point, the observation is largely anecdotal, but it does call into question the perceived need for a decisive referendum on the value of e-books. the issue is not whether e-books will replace the printed word. the concern of librarians and others involved in the infrastructure of the book should be on developing the proper role for e-books in a broader culture of information. unless this approach is taken, the true goal of libraries—disseminating information to the public—will suffer. the gap between bibiliophile and technologist approaches can already be seen in the materials available in e-book format. the publishing industry in general treats the e-book as just another format, releasing the same titles in hardcover, book-on-tape, and e-book at the same time. on the opposite end of the spectrum, technologists have adopted various e-book formats for creating and transferring numerous reference documents. given their preferences, it is easy to find e-book references on unix, html coding, and the like, but there is a scarcity of materials in philosophy, history, and the arts. librarians seem the most appropriate group for developing shared understanding. publishers and e-book hardware and software manufacturers need to be concerned with the bottom line. 
libraries, by design, are concerned with the preservation of information and its continued dissemination long after the need to sell a particular book has passed. the hobby of creating and transferring texts to digital form is idiosyncratic and unorganized when viewed from the highest levels. libraries not only contain expertise in all areas of human endeavor, but also have strategies for categorizing and maintaining information in productive ways. in short, libraries are the best line of defense for maintaining the value of the printed page and promoting the value of digital texts. dispelling five myths about e-books | gall 27 28 information technology and libraries | march 2005 � myth 4—e-books are expensive a common complaint about e-books is that they are expensive. on the surface, this seems clear. dedicated ebook readers seemed to bottom out at around $300, and a new bestseller in e-book format is priced about the same as the hardcover edition. add the immediate and longterm costs of rechargeable batteries and the electricity needed to power them, and the economic case against the e-book appears closed. what if we turn the same critical eye to the printed page? the manufacture and distribution of printed texts is highly developed and astounding. when gutenberg succeeded in putting the christian bible in the hands of the moneyed public, he surely could not have comprehended the billions of copies that would eventually be distributed. even with the wealth of printed material at hand, one must still consider the high cost of the system. the law of supply and demand rules books as a tangible product. the most profitable books are those that will reach the most readers. specialized texts have limited audiences and, therefore, will usually be priced higher. this produces problems for both groups. popular texts must be printed in high quantities and delivered to various outlets. unfortunately, the printed page does have maintenance costs. sellen and harper point out that the actual printing cost is insignificant compared with the cost of dealing with documents after printing. they cite one study that indicated that united states businesses spend about $1 billion per year designing and printing forms, but spend an additional $25 to $35 billion filing, storing, and retrieving them.25 books are no different; as any librarian knows, it costs money to maintain a collection and protect texts from the environment and the effects of age. in the retail arena, the competition is fiercer. books that do not sell are removed in favor of those that do. it is estimated that 10 percent of texts printed each year are turned to pulp, although, fortunately, many are recycled.26 the bbc reported that more than two million former romance novels were used in the construction of a new tollway.27 with more specialized texts, the problem is not wealth, but scarcity. if a text is not profitable, it will probably become out of print. this is often synonymous with inaccessible. from the publisher’s perspective, it is only cost-effective to commit to a printing when the demand is high enough. a library is a good source of outof-print texts, provided that it has been funded appropriately to acquire and maintain the particular works that are needed. e-books are not a panacea. other innovations, such as on-demand publishing, may be part of the answer in solving the economic issues regarding collections. however, e-books can help alleviate some of these issues. 
e-books are easily copied and distributed, which is a boon to the researcher and information consumer. in many cases, the goal is the access to information, not the possession of a book. it could also benefit the author and publisher if appropriate reimbursement systems are put into place. as previously described, nelson originally envisioned his online hypertext system, xanadu, with a mechanism for royalties based on access—a supply-anddemand system for ideas, not materials.28 the systems used to manage access to digital materials continue to increase in complexity and have spawned a whole new business of digital rights management (drm).29 examples include reciprocal (www.reciprocal.com), overdrive (www.overdrive.com), and netlibrary (www.net library.com). libraries are the specific target of netlibrary, which promotes an e-books-on-demand project that allows free access for short periods of time.30 the creation of a standard digital object identifier (doi) for published materials may also help online publishers and entities like libraries manage their digital collections more easily.31 online music systems, such as apple’s itunes (www. itunes.com), strike a workable balance between quickand-easy access to music and a workable, economic model for reimbursing artists. e-books also have appeal for special audiences who already require assistive technologies for accessing print collections.32 having discussed the hidden costs of printed texts, another important economic issue of e-books to examine is a current trend in usage. despite the availability of dedicated e-book readers, the largest growth in e-book usage is surely in nondedicated devices. e-book–reading software is available for personal computers, laptops, and pdas. according to one source, microsoft had sold four million pocketpc e-book-enabled devices, and had two million downloads of the ms reader for the personal computer; palm had sold approximately 20 million ebook-enabled devices; and adobe had more than 30 million acrobat readers downloaded.33 these numbers alone indicate some 24 million reader-capable pdas, and 32 million reader-capable pcs, for a total of 56 million devices. although it is difficult to find data on actual use, one online bookseller reported some data on e-book use from an audience survey.34 although 88 percent had purchased books online, only 16 percent had read an e-book (11 percent using a pc, 3 percent on a handheld device, and 2 percent on both). it is presumed that in most cases this equipment was purchased for other reasons, with ebook reading being a secondary function. as such, it would be unfair to include the full cost of this equipment in any calculation of the cost of providing information in an e-book format. if so, the cost of providing artificial lighting in any building where reading takes place would need to be calculated as part of the cost of the printed page. the potential user base for the e-book rises as more computers and pdas are sold, decreasing the need for special equipment. this does not mean that the dedicated e-book reader is obsolete. by most commercial accounts, the apple newton was a failure. its bulky size and awkward interface were the subject of much ridicule. however, it did introduce the concept of the pda. the success of the palm line of products owes much to the proof of concept provided by the newton. 
the makers of the portable gameboy videogame system are repositioning it for multimedia digital-content delivery, and plan to pilot a flash-memory download system for various content types, including e-books.35 innovative products such as e-paper are already developed in prototype form.36 they are likely to lead to another wave of dedicated e-book readers or provide e-book–reading potential embedded in other consumer applications. � myth 5—e-books are a passing fad it is trendy to list the failures of past media (such as radio, film, and television) in impacting education despite great initial promise.37 however, all those media are still with us after having found particular niches within our culture. if the e-book is viewed as just an alternative format, comparisons with past experiences of library collections containing videotapes, record albums, and such are not appropriate.38 however, if e-books are viewed as a tool or way to access information, the questions change. instead of asking how digital formats will replace print collections, we can ask how will an e-book version extend the reach of our current collection or provide our readers with resources previously unavailable or unaffordable. when trying to locate a research article, one is generally not concerned with whether the local library has a loose copy, bound copy, microform, microfiche, or even has to resort to interlibrary loan. as long as the content is accessible and can be cited, it can be used. electronic access to journal content is becoming more common. perhaps dry journal articles do not conjure up the same romantic visions of exploring the stacks that may hinder greater acceptance of e-books. a parallel can be drawn to the current work of filmrestoration experts. the medium of film has reached an age where some of the earliest influential works no longer exist or are in a condition of rapid deterioration. according to one film site, more than half of the films made before 1950 have already been lost due to decay of existing copies.39 the work of restoration involves finding what remains of a great work in various vaults and collections. often, the only usable film is a secondor third-generation copy. from digitized copies, cleaning, color correction, and other painstaking work, a restored and—it is hoped—complete work emerges. ironically, once this laborious process is completed, a near-extinct classic is suddenly available to millions in the form of a dvd disc at a local retailer. what if the same attitude was taken with the world’s collections of printed materials? jantz has described potential impacts of e-book technology on academic libraries.40 lareau conducted a study on using e-books to replace lost books at kent state university, but found that limited availability and high costs did not make it feasible at the time.41 project gutenberg (www.gutenberg.net) and the electronic text center at the university of virginia (http://etext.lib.virginia.edu) are two examples of scholars attempting to save and share book content in electronic forms, but more efforts are needed. unfortunately, the shift to digital content has also contributed to the sheer volume of content available. edwards has recently discussed issues in attempting to archive and preserve digital media.42 the web may be suffering from a glut of information, but the content is highly skewed toward the new and technology oriented. in a few years, we may find that nontechnology–related endeavors are no longer represented in our information landscape. 
conclusion. the e-book industry is currently dominated by commercial-content providers, such as franklin, and software companies, most notably adobe, palm, and microsoft. traditional print-based publishers have also maintained continued interest in the medium. it is assumed that these publishers had the capital to weather the ups and downs of the industry more so than new publishers dedicated solely to e-book delivery. although the contributions and efforts of these organizations are needed, the future of e-book content should not be left to their largesse. when the rocket e-book device was initially released, a small but loyal following of readers contributed thousands of titles to its online library. some of these titles were self-published vanity projects or brief reference documents, but many were public-domain classics, painstakingly scanned or keyed in by readers wishing to share their favorite reads. when gemstar purchased rocket, the software's ability to create non-purchased content was curtailed and the online library of free titles dismantled. apparently, both were viewed as limiting the profitability of the e-book vendor. however, gemstar recently gave notice that it was discontinuing its e-book reading devices, one would assume due to a lack of profitability. this can be seen as a cautionary tale for libraries, which often define success by the number of volumes available and accessed rather than units sold. committing to a technology that concurrently requires consumer success can be problematic. bibliophile and technologist alike must take responsibility for the future of our collective information resources. the bibliophile must ensure that all aspects of human knowledge and creativity are nurtured and allowed to survive in electronic forms. the technologist must ensure that accessibility and intellectual-property rights are addressed with every technological innovation. parry provides three concrete suggestions for public libraries in response to new media demands: continue to acknowledge and respond to customer demands, revisit the library's mission statement for currency, and promote or accelerate shared agreements with other institutions to alleviate the high costs of accumulating resources.43 the proper frame of mind for these activities is suggested by levy: we make a mistake, i believe, when we fixate on particular forms and technologies, taking them in and of themselves, to be the carriers of what we want to embrace or resist. . . . it isn't a question, it needn't be a question, of books or the web, of letters or e-mail, of digital libraries or the bricks-and-mortar variety, of paper or digital technologies. . . . these modes of operation are only in conflict when we insist that one or the other is the only way to operate.44 in the early 1930s, lomax dragged his primitive audio-recording equipment over the roads of the american south to capture the performances of numerous folk musicians.45 at the time, he certainly didn't imagine that at one point in history someone with a laptop computer sitting in a coffee shop with wireless access could download the performances of robert johnson from itunes. however, without his efforts, those unique voices in our history would have been lost.
it is hoped that the readers of the future will be thanking the library professionals of today for preserving our print collections and enabling their access digitally via our primitive, but evolving, e-book technologies. references. 1. open e-book forum, "press release: record e-book retail sales set in q1 2004," june 4, 2004. accessed dec. 27, 2004, www.openebook.org. 2. stephen abram, "e-books: rumors of our death are greatly exaggerated," information outlook 8, no. 2 (2004): 14–16. 3. walt crawford, "the white queen strikes again: an e-book update," econtent 25, no. 11 (2002): 46–47. 4. harold henke, "consumer survey on e-books." accessed dec. 27, 2004, www.openebook.org. 5. sue hutley, "follow the e-book road: e-books in australian public libraries," aplis 15, no. 1 (2002): 32–37; andrew k. pace, "e-books: round two," american libraries 35, no. 8 (2004): 74–75; michael rogers, "librarians, publishers, and vendors revisit e-books," library journal 129, no. 7 (2004): 23–24. 6. stephen sottong, "e-book technology: waiting for the 'false pretender,'" information technology and libraries 20, no. 2 (2001): 72–80. 7. vannevar bush, "as we may think," atlantic monthly 176, no. 1 (1945): 101–108. 8. michael s. hart, "history and philosophy of project gutenberg." accessed dec. 27, 2004, www.gutenberg.net/about.shtml. 9. henry petroski, the book on the bookshelf (new york: vintage, 2000). 10. steve ditlea, "the real e-books," technology review 103, no. 4 (2000): 70–73. 11. tim berners-lee, weaving the web: the original design and ultimate destiny of the world wide web by its inventor (new york: harpercollins, 1999). 12. james e. gall and annmari m. duffy, "e-books in a college course: a case study" (presented at the association for educational communications and technology conference, atlanta, ga., nov. 8–10, 2001). 13. george p. landow, hypertext 2.0: the convergence of contemporary critical theory and technology (baltimore, md.: johns hopkins univ. pr., 1997). 14. harold henke, "survey on electronic book features." accessed dec. 27, 2004, www.openebook.org. 15. open e-book forum, "press release: record e-book retail sales set in q1 2004." 16. lori enos, "report: e-book industry set to explode," e-commerce times, 20 dec. 2000. accessed dec. 27, 2004, www.ecommercetimes.com/story/6215.html. 17. luis a. ubinas, "the answer to video piracy," mckinsey quarterly no. 1. accessed dec. 27, 2004, www.mckinseyquarterly.com. 18. mark hoorebeek, "e-books, libraries, and peer-to-peer file-sharing," australian library journal 52, no. 2 (2003): 163–68. 19. theodor h. nelson, "managing immense storage," byte 13, no. 1 (1988): 225–38. 20. ibid., 238. 21. jacob weisberg, "the way we live now: the good e-book," new york times, 4 june 2000. accessed dec. 27, 2004, www.nytimes.com. 22. donald t. hawkins, "electronic books: a major publishing revolution. part 1: general considerations and issues," online 24, no. 4 (2000): 14–28. 23. steve grant, "e-books: friend or foe?" book report 21, no. 1 (2002): 50–54. 24. lori bell, "e-books go to college," library journal 127, no. 8 (2002): 44–46. 25. abigail j. sellen and richard h. harper, the myth of the paperless office (cambridge, mass.: mit pr., 2002). 26. stephen moss, "pulped fiction," sydney morning herald, 29 mar. 2002. accessed dec. 27, 2004, www.smh.com.au. 27. bbc news, "m6 toll built with pulped fiction," bbc news uk edition, 18 dec. 2003. accessed dec. 27, 2004, http://news.bbc.co.uk. 28. nelson, "managing immense storage." 29. michael a.
looney and mark sheehan, "digitizing education: a primer on e-books," educause 36, no. 4 (2001): 38–46. 30. brian kenney, "netlibrary, ebsco explore new models for e-books," library journal 128, no. 7 (2003). 31. stephen h. wildstrom, "a library to end all libraries," business week (july 23, 2001): 23. 32. terence cavanaugh, "e-books and accommodations: is this the future of print accommodation?" teaching exceptional children 35, no. 2 (2002): 56–61. 33. skip pratt, "e-books and e-publishing: ignore ms reader and palm os at your own peril," knowledge download, 2002. accessed dec. 27, 2004, www.knowledge-download.com/260802-e-book-article. 34. davina witt, "audience profile and demographics," mar./apr. 2003. accessed dec. 27, 2004, www.bookbrowse.com/media/audience.cfm. 35. geoff daily, "gameboy advance: not just playing with games," econtent 27, no. 5 (2004): 12–14. 36.
associated press, "flexible e-paper on its way," associated press, 7 may 2003. accessed dec. 27, 2004, www.wired.com/news. 37. richard mayer, multimedia learning (cambridge, uk: cambridge university press, 2000). 38. sottong, "e-book technology." 39. amc, "film facts: read about lost films." accessed june 19, 2003, www.amctv.com/article?cid=1052. 40. ronald jantz, "e-books and new library service models: an analysis of the impact of e-book technology on academic libraries," information technology and libraries 20, no. 2 (2001): 104–15. 41. susan lareau, the feasibility of the use of e-books for replacing lost or brittle books in the kent state university library, 2001, eric, ed 459862. accessed dec. 27, 2004, http://searcheric.org. 42. eli edwards, "ephemeral to enduring: the internet archive and its role in preserving digital media," information technology and libraries 23, no. 1 (2004): 3–8. 43. norm parry, format proliferation in public libraries, 2002, eric, ed 470035. accessed dec. 27, 2004, http://searcheric.org. 44. david m. levy, scrolling forward: making sense of documents in the digital age (new york: arcade pub., 2001). 45. about alan lomax. accessed dec. 27, 2004, www.alan-lomax.com/about.html.

(president's column continued from page 2) online." they have implemented several process improvements already and will complete their work by the 2005 ala annual conference. this past fall, michelle frisque, lita web manager, conducted a survey of our members about the lita web site. michelle and the web coordinating committee are already working on a new look and feel for the lita web site based on the survey comments, and the result promises to be phenomenal. on top of all of the current activities, the new vision statement, strategic planning, and the lita web site redesign, mary taylor and the lita board worked with a graphic designer to develop a new lita logo. after much deliberation, the new logo debuted at the 2004 lita national forum with great enthusiasm. many members commented that the new logo expresses the "energy" of lita and felt the change was terrific. with your help, lita had a very successful conference in orlando. although there were weather and transportation difficulties, the lita programs and discussions were of the highest quality, as always. the program and preconference offerings for the upcoming annual conference in chicago promise to be as strong as ever. don't forget, lita also offers regional institutes throughout the year. check the lita web site to see if there's a regional institute scheduled in your area. lita held another successful national forum in fall 2004 in st. louis, "ten years of connectivity: libraries, the world wide web, and the next decade." the three-day educational event included excellent preconferences, general sessions, and more than thirty concurrent sessions. i want to thank the wonderful 2004 lita national forum planning committee, chaired by diane bisom, the presenters, and the lita office staff who all made this event a great experience. the next lita national forum will be held at the san jose marriott, san jose, california, september 29–october 2, 2005. the theme will be "the ubiquitous web: personalization, portability, and online collaboration." thomas dowling, chair, and the 2005 lita national forum planning committee are preparing another "must attend" event. next year marks lita's fortieth anniversary. 2006 will be a year for lita to celebrate our history, future, and our many accomplishments. we are fortunate to have lynne lysiak leading the fortieth anniversary task force activities. i know we all will enjoy the festivities. i look forward to working with many of you as we continue to make lita a wonderful and vibrant association. i encourage you to send me your comments and suggestions to further the goals, services, and activities of lita.

technical communications. announcements. panel discussion on "government publications in machine-readable form." this meeting will be held on july 10 from 8:30 to 10:30 p.m. as a part of the american library association's 1974 new york conference. the meeting is cosponsored by the government documents round table's (godort) machine-readable data file committee, the federal librarians round table (flirt), the rasd information retrieval committee, and the rasd/rtsd/asla public documents committee. the moderator is gretchen dewitt of columbus public library and the panelists are peter watson of ucla, mary pensyl of mit, judith rowe of princeton, and billie salter of yale. mr. watson will discuss the general issues concerning the acquisition and use of bibliographic data files and provide a brief description of some of the files now publicly available; miss pensyl will describe the workings of the project now underway to make these files available to mit users. mrs. rowe will discuss the ways in which government-produced statistical files supplement the related printed reports and will indicate some of the types and sources of files now being released; miss salter will discuss a program for integrating these and other research files into yale's social science reference service. representatives of several federal agencies will display materials describing and documenting both bibliographic and statistical data files. the purpose of the program is to acquaint reference librarians, particularly those now handling printed documents, with the uses of both types of files, the advantages and disadvantages of these reference tools, and the techniques and policy changes necessary for their use in a library environment. the recent release of the draft proposal produced by the national commission on libraries and information science makes more timely than ever an open discussion of the place of bibliographic and numeric data files in a reference collection. all librarians must be acquainted with these growing resources in order to continue to provide full service to their patrons.
for further information, contact judith rowe, computer center, princeton university, 87 prospect ave., princeton, nj 08540. ninth annual educational media and technology conference to be hosted by university of wisconsin-stout, july 22-24, 1974. aect past president dr. jerry kemp, coordinator of instructional development services for san jose state university (california), and film consultant ralph j. amelio, media coordinator and english instructor at willowbrook high school, villa park, illinois, will headline the university of wisconsin-stout's 9th annual educational media and technology conference to be held in menomonie, wisconsin, on july 22-24, 1974. "educational technology: can we realize its potential?" will be the subject of kemp's presentation on monday evening, while amelio, speaking on tuesday, july 23, will challenge participants with the subject "visual literacy: what can you do?" seven concurrent workshops will be held on monday afternoon: library automation; sound for visuals; making the timesharing computer work for you; new developments in photography; what's new in graphics; selecting and evaluating educational media; and instructional development: how to make it work! individuals leading the three-hour workshops will include: alfred baker, vice-president of science press; john lord, technical service manager for the dukane corporation; william daehling, weber state college, ogden, utah; and several media specialists from learning resources, university of wisconsin-stout. about fifty exhibitors will show and demonstrate both hardware and software during the conference. six case studies will be given of exemplary media programs at the public school, vocational-technical, and college level. further information may be obtained by contacting dr. david p. bernard, dean of learning resources, university of wisconsin-stout, menomonie, wi 54751. report of recon project published. the library of congress has published in recon pilot project (vii, 49p.) the final report of a project sponsored by lc, the council on library resources, inc., and the u.s. office of education to determine the problems associated with centralized conversion of retrospective catalog records and distribution of these records from a central source. in the marc pilot project, begun in november 1966, the library of congress distributed machine-readable catalog records for english-language monographs, and the success of that project led to the implementation in march 1969 of the marc distribution service, in which over fifty subscribers have by now received more than 300,000 marc records representing the current english-language monograph cataloging at the library of congress. as coverage is extended to catalog records for foreign-language monographs and for other forms of material, libraries will be able to obtain machine records for a large number of their current titles. more research was needed, however, on the problems of obtaining machine-readable data for retrospective cataloging, and the council on library resources made it possible for lc to engage in november 1968 a task force to study the feasibility of converting retrospective catalog records. the final report of the recon (for retrospective conversion) working task force was published in june 1969.
one of the report's recommendations was that a pilot project test various conversion techniques, ideally covering the highest priority materials, english-language monograph records from 1960-68; and with funds from the sponsoring agencies lc initiated a two-year project in august 1969. the present report covers five major areas examined in that period: 1. testing of techniques postulated in the recon report in an operational environment by converting english-language monographs cataloged in 1968 and 1969 but not included in the marc distribution service. 2. development of format recognition, a computer program which can process unedited catalog records and supply all the necessary content designators required for the full marc record. 3. analysis of techniques for the conversion of older english-language materials and titles in foreign languages using the roman alphabet. 4. monitoring the state-of-the-art of input devices that would facilitate conversion of a large data base. 5. a study of microfilming techniques and their associated costs. recon pilot project is available for $1.50 from the superintendent of documents, u.s. government printing office, washington, dc 20402. stock no. 300000061. library of congress issues recon working task force report. national aspects of creating and using marc/recon records (v, 48p.) reports on studies conducted at the library of congress by the recon working task force under the chairmanship of henriette d. avram. they were made concurrently with a pilot project by the library to test the feasibility of the plan outlined in the task force's first report entitled conversion of retrospective records to machine-readable form (library of congress, 1969) and in recon pilot project (library of congress, 1972). both the pilot project and the new studies received financial support from the council on library resources, inc., and the u.s. office of education. the present volume describes four investigations: (1) the feasibility of determining a level or subset of the established marc content designators (tags, indicators, and subfield codes) that would still allow a library using it to be part of a future national network; (2) the practicality of the library of congress using other machine-readable data bases to build a national bibliographic store; (3) implications of a national union catalog in machine-readable form; and (4) alternative strategies for undertaking a large-scale conversion project. the appendices include an explanation of the problems of achieving a cooperatively produced bibliographic data base, a description of the characteristics of the present national union catalog, and an analysis of library of congress card orders for one year. although the findings and recommendations of this report are less optimistic than those of the original recon study, they reaffirm the need for coordinated activity in the conversion of retrospective catalog records and suggest ways in which a large-scale project might be undertaken. the report provides a basis for realistic planning in a critical area of library automation. national aspects of creating and using marc/recon records is available for $2.75 from the superintendent of documents, u.s. government printing office, washington, dc 20402. stock no. 300000062. isad official activities. tesla information. editor's note: use of the following guidelines and forms is described in the article by john kountz in this issue of jola.
the tesla reactor ballot will also appear in subsequent issues of technical communications for reader use, and the tesla standards scoreboard will be presented as cumulated results warrant its publication. to use, photocopy or otherwise duplicate the forms presented in jola-tc, fill out these copies, and mail them to the tesla chairman, mr. john c. kountz, associate for library automation, office of the chancellor, the california state university and colleges, 5670 wilshire blvd., suite 900, los angeles, ca 90036. the tesla reactor ballot collects reactor information (name, title, organization, address, city, state, zip, telephone), the identification number of the standard requirement, a for/against position, and the reason for the position (use additional pages if required). the tesla standards scoreboard records, for each representative and title/i.d. number, the dates of receipt, screen, division, rej/acpt, publish, and tally, plus a target date. initiative standard proposal outline. the following outline and forms are designed to facilitate review by both the isad committee on technical standards for library automation (tesla) and the membership of initiative standards requirements and to expedite the handling of the initiative standard proposal through the procedure. since the outline will be used for the review process, it is to be followed explicitly. where an initiative standard requirement does not require the use of a specific outline entry, the entry heading is to be used followed by the words "not applicable" (e.g., where no standards exist which relate to the proposal, this is indicated by: vi. existing standards. not applicable). note that the parenthetical statements following most of the outline entry descriptions relate to the ansi standards proposal section headings to facilitate the translation from this outline to the ansi format. all initiative standards proposals are to be typed, double spaced on 8½" x 11" white paper (typing on one side only). each page is to be numbered consecutively in the upper right-hand corner. the initiator's last name followed by the key word from the title is to appear one line below each page number.
i. title of initiative standard proposal (title).
ii. initiator information (forward). a. name b. title c. organization d. address e. city, state, zip f. telephone: area code, number, extension.
iii. technical area. describe the area of library technology as understood by initiator. be as precise as possible since in large measure the information given here will help determine which ala official representative might best handle this proposal once it has been reviewed and which ala organizational component might best be engaged in the review process.
iv. purpose. state the purpose of standard proposal (scope and qualifications).
v. description. briefly describe the standard proposal (specification of the standard).
vi. relationship of other standards. if existing standards have been identified which relate to, or are felt to influence, this standard proposal, cite them here (expository remarks).
vii. background. describe the research or historical review performed relating to this standard proposal (if applicable, provide a bibliography) and your findings (justification).
viii. specifications. specify the standard proposal using record layouts, mechanical drawings, and such related documentation aids as required in addition to text exposition where applicable (specification of the standard).
research and development. system development corporation awarded national science foundation grant to study interactive searching of large literature data bases. santa monica, california-the national science foundation has awarded system development corporation $98,500 for a study of man-machine system communication in on-line retrieval systems. the study will focus on interactive searching of very large literature data bases, which has become a major area of interest and activity in the field of information science. at least seven major systems of national or international scope are in operation within the federal government and private industry, and more systems are on the drawing boards or in experimental operation. the principal investigator for the project will be dr. carlos cuadra, manager of sdc's education and library systems department. the project manager, who will be responsible for the day-to-day operation of the fifteen-month effort, is judy wanger, an information systems analyst and project leader with extensive experience in the establishment and use of interactive bibliographic retrieval services. ms. wanger is currently responsible for user training and customer support on sdc's on-line information service. the study will use questionnaire and interview techniques to collect data related to: (1) the impact of on-line retrieval usage on the terminal user; (2) the impact of on-line service on the sponsoring institution; and (3) the impact of on-line service on the information-utilization habits of the information consumer. attention will also be given to reliability problems in the transmission chain from the user to the computer and back. the major elements in this chain include: the user; the terminal; the telephone instrument; local telephone lines and switchboards; long-haul communications; the communications-computer interface hardware; the computer itself; and various programs in the computer, including the retrieval program. reports on regional projects and activities. california state university and colleges system union list system. the library systems project of the california state university and colleges has recently completed a production union list system. this system, comprised of eight processing programs to be run in a very modest environment (currently a cdc 3300), is written in ansi cobol and is fully documented. included in the documentation package are user worksheets for bibliographic and holding data, copies of all reports, file layouts, program descriptions, etc. output from this system consists of files designed to drive graphic-quality photocomposition or com devices. the system is available for the price of duplicating the documentation package. and, for those so desiring, the master file containing some 25,000 titles and titles with references is also available for the cost of duplication. interested parties (bona fides only, please) should contact john c. kountz, associate for library automation, california state university and colleges, 5670 wilshire blvd., suite 900, los angeles, ca 90036, for further details. solinet membership meeting. the annual membership meeting of the southeastern library network (solinet) was held at the georgia institute of technology in atlanta, march 14. it was announced that charles h. stevens, executive director of the national commission on libraries and information science, has been named director of solinet effective july 1. john h.
gribbin, chairman of the board, will serve as interim director. it was also announced that solinet will be affiliated with the southern regional education board. sreb will provide office space, act as financial disbursing agent, and will be available at all times in an advisory capacity. negotiations are underway for a tie-in to the ohio college library center (oclc) and a proposed contract is in the hands of the oclc legal counsel. it is anticipated that a contract soon will be signed. in addition to the tie-in, solinet will proceed with the development of its own permanent computer center in atlanta. this center will eventually provide a variety of services and will be coordinated carefully with other developing networks, looking toward a national library network system. elected to fill three vacancies on the board of directors were james f. govan (university of north carolina), gustave a. harrar (university of florida), and robert h. simmons (west georgia college). they will assume office on july 1. anyone desiring information about solinet should write to 130 sixth st., nw, atlanta, ga 30313. reports-library projects and activities. new book catalog for junior college district of st. louis. the three community college libraries of the junior college district of st. louis have been using computerized union book catalogs since 1964. formerly maintained and produced by an outside contractor, the catalogs are now one product of a new catalog system recently designed and implemented by instructional resources and data processing staff of the district. known as "ir catalog," the system presently has a data base of approximately 65,000 records describing the print and nonprint collections of the district's three college instructional resource centers. in addition to photocomposed author, subject, and title indexes, the system also produces weekly cumulative printouts which supplement the phototypeset "base" catalog. other output includes three-by-five-inch shelflist cards (which include union holdings information), a motion picture film catalog, subject and cross-reference authority lists, and various statistical reports. hawaii state library system to automate processing. the state board of education in hawaii has approved a proposal for a computerized data processing system for the hawaii state library. the decision allows for the purchase of computer equipment for automating library operations. the state library centrally processes library materials for all public and school libraries in the state. teichior hirata, acting state education superintendent, told board members a computerized system will speed book selection, ordering, and processing, and will improve interlibrary loan and reference services. he also pointed out it would facilitate a general streamlining of all technical administrative operations. the system's total cost will be $187,000, of which $58,000 will be spent for computer software. the "biblios" system, designed and developed at orange county public library in california and marketed by information design, inc., was selected as the software package. the caltech science library catalog supplement. the use of catalog supplements during the necessary maturation period required to take full advantage of the national program for acquisitions and cataloging is obviously an idea whose time has come.
the program developed at the california institute of technology, however, differs in several important respects from that previously described by nixon and bell at u.c.l.a.1 for reasons based primarily on faculty pressure, the practice of holding books in anticipation of the cataloging copy has never been a practice at the institute. the solution, while hardly unique, is to assign the classification number (dewey) and depend on a temporary main entry card to suffice until the lc copy is available. while this procedure has the distinct advantage of not requiring the presence of the book to complete the cataloging process, it does, however, prevent the user from finding the newest books through a search of the subject added entry cards. the use of the computer-based systems is an obvious solution to this aspect of the program but raises several additional problems which formerly seemed to defy solutions. as has been pointed out by mason, library-based computer systems can rarely be justified in terms of cost effectiveness, and computer-based library catalogs are no exception.2 part of this problem arises from the natural inclination to repeat in machine language what has been standard practice in the library catalog. this reaction overlooks the very different nature of catalogs and catalog supplements. as catalogs serve as the basis for the permanent record and their cost can be prorated over several decades, the need for a careful description of the many facets of a book is quite properly justified. in the case of catalog supplements, however, where the record will serve quite likely for only a few months, any attempt at detailed description of the book cannot be justified. one solution to this dilemma that has been developed here at caltech is a brief listing supplement which allows searching for a given book by either the first author or editor's last name, a key word from the title, or the first word of a series entry. these elements form the basis of a simple kwoc index (see figure 1) which supplements the bibliographic listing (shown in figure 2). fig. 1. sample entries from the kwoc index. fig. 2. sample entries from the bibliographic listing. fig. 3. sample entries from the weekly list of newly added books (new books, chemistry/biology, august 6, 1973). all books received in the chemistry, physics, and biology libraries are represented in the catalog supplement. weekly lists of newly added books (shown in figure 3) are annotated to show the index terms prior to keypunching. the unit record consists of a "title" card or cards (which contain the full title, author/editor, call number, library designation, and series information) and an "author" card (which contains the index terms).
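the mechanics are simple enough to sketch. the fragment below (python, written purely for illustration rather than the batch programs actually used at caltech, and with invented field names) builds a kwoc listing of this kind from unit records that pair a short bibliographic entry with a few manually assigned index terms, producing one output line per term:

# a sketch of kwoc listing generation; field names and sample data are illustrative only.
records = [
    {"title": "chemisorption and catalysis", "author": "hepple",
     "call_no": "541.395 he 1970 ch", "series": "",
     "terms": ["hepple", "chemisorption", "catalysis"]},   # terms come from the "author" card
    {"title": "protein turnover", "author": "",
     "call_no": "612.39 pr 1972 bi", "series": "ciba foundation symposium, 9",
     "terms": ["protein", "ciba"]},
]

def kwoc_listing(records):
    """return (index term, short entry) pairs, one per term, sorted alphabetically."""
    lines = []
    for rec in records:
        # the short entry mirrors the "title" card: title, author/editor, call number, series
        entry = " ".join(p for p in (rec["title"], rec["author"], rec["call_no"], rec["series"]) if p)
        for term in rec["terms"]:
            lines.append((term, entry))
    return sorted(lines)

for term, entry in kwoc_listing(records):
    print(f"{term:<16} {entry}")

because each record carries only a brief title line and a handful of index terms, the cost of producing the supplement stays small, which is the authors' point about catalog supplements as opposed to full catalogs.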
edited material is added accessionally to the card file data base and batch processed on the campus ibm 370/155 computer. the catalog supplement is currently published on 8½-by-11-inch sheets as a result of reducing the computer printout on a xerox 7000 copier. lists are given a vello-bind and delivered to the respective libraries. weeding the catalog supplement is still unresolved. at the present time additions are less than 1,000 per year, so that it may be possible after five years to replace the subject sections of the respective divisional catalogs with the catalog supplement. the "library" at caltech consists of several divisional libraries, each with their own card catalog. these divisional card catalogs are supplemented by a union catalog, which serves all libraries on campus and, because of the strong interdisciplinary nature of the divisional libraries, is much the better source for subject searches. the project is so facile and the costs so minimal that this approach might be of value to many small libraries. it is particularly applicable to the problems recently discussed by patterson.3 books in series, even if they are distinct monographs, are often lost to the user from a subject approach. with this system each physical volume added to the library can be analyzed for possible inclusion in the catalog supplement. 1. roberta nixon and ray bell, "the u.c.l.a. library catalog supplement," library resources & technical services 17:59 (winter 1973). 2. ellsworth mason, "along the academic way," library journal 96:1671 (1971). 3. kelly patterson, "library think vs library user," rq 12:364 (summer 1973). dana l. roth, millikan library, california institute of technology. commercial activities. richard abel & company to sponsor workshops in library automation and management. one of the most effective forms of continuing education is state-of-the-art reporting. recognizing the need for more such communication, the international library service firm of richard abel & company plans to sponsor two workshops for the library and information science community. the first workshop will deal with the latest techniques in library automation. it will precede the 1974 american library association conference in new york city, july 7-13. the second will present advances in library management, and will be scheduled to precede the 1975 ala midwinter meeting, january 19-25. the workshops will include forums, lectures, and open discussions. they will be presented by recognized leaders in the fields of library automation, management, and consulting. each workshop will probably be one or two days long. there will be no charge to attend either of the workshops, but attendance will be limited, to provide a good discussion atmosphere. for the management workshop, attendance will be limited to librarians active in library management. similarly, the automation workshop is intended for librarians working in library automation. maintaining the theme of state-of-the-art reporting, the basic content of the workshops will consist of what is happening in library management and automation today. looking to the future, there will also be discussions and forecasts of what is to come. persons interested in further information or in participating in either workshop should contact the abel workshop director, richard abel & company, inc., p.o. box 4245, portland, or 97208.
idc introduces bibnet on-line services. the introduction of bibnet on-line systems, a centralized computer-based bibliographic data service for libraries, has been announced by information dynamics corporation. demonstrations are planned for the ala annual conference in new york, july 7-13. according to david p. waite, idc president, "during 1973, bibnet service modules were interconnected over thousands of miles and tested for on-line use with idc's centralized computer-based cataloging data files. this is the culmination of a program that began two years ago. it is patterned after advanced technological developments similar to those recently applied to airline reservation systems and other large scale nationwide computing networks used in industry." idc, a new england-based library systems supplier, will provide a computer-stored cataloging data base of more than 1.2 million library of congress and contributed entries. initially it will consist of all library of congress marc records (now numbering over 430,000 titles), plus another 800,000 partial lc catalog records containing full titles, main entries, lc card numbers, and other selected data elements. as a result, bibnet will provide on-line bibliographic searching for all 1,250,000 catalog records produced by the library of congress since 1969. to enable users to produce library cards from those non-marc records for which only partial entries are kept in the computer, idc will mail card sets from its headquarters and add the full records to the data base for future reference. subscribing libraries will have access to the data base using a minicomputer cathode ray tube (crt) terminal. using this technique of dispersed computing, each bibnet terminal has programmable computer power built-in. this in-house processing power, independent of the central computer, allows computer processes like library card production to be performed in the library. this also eliminates waiting for catalog cards to arrive in the mail. bibnet terminals communicate with the central computer over regular telephone lines, eliminating the high costs of dedicated communication lines. therefore, thousands of libraries throughout the united states and canada can avail themselves of on-line services at low cost. bibnet users will have several methods of extracting information from the idc data base. the computer can search for individual records by titles, main entry, isbn number, or keywords. here's how it works: the operator types in any one of the search items, or, if a complete title is not known, a keyword from the title may be used. the cataloging information is then displayed on the crt where the operator may verify the record. at the push of a button, the data is stored on a magnetic cassette tape which is later used for editing and production of catalog cards by the user library. the bibnet demonstration in new york will highlight one of many bibliographic service modules available from idc and stress the fact that these services can be utilized by individual libraries and organized groups of libraries. license for new information retrieval concept awarded to boeing by xynetics. an exclusive license for manufacture and marketing to the government sector of systems incorporating a completely new concept in information storage and retrieval has been awarded to the boeing company, seattle, washington, by xynetics, inc., canoga park, california, it was announced jointly by dr. r. v.
hanks, boeing program manager, and burton cohn, xynetics board chairman. the system is said to be the first image storage and retrieval system which offers response times and costs comparable to those of digital systems. the heart of the system is a device of proprietary design, the flat plane memory, which provides rapid access to massive amounts of data stored in high resolution photographic media. the photographic medium enables low cost storage of virtually any type of source material (documents, correspondence, drawings, multitone images, computer output, etc.) while eliminating the need for time-consuming, costly conversion of pre-existing information into a specialized (e.g., digital) format. by virtue of its extremely rapid random access capability, the data needs of as many as several thousand users can be served at remote video terminals from a single memory with near real time response (1-3 seconds, typically). the high speed, high accuracy, and high reliability of the flat plane memory is accomplished primarily through the use of the patented xynetics positioner, which generates direct linear motion at high speeds and with great precision and reliability instead of converting rotary motion. as a result, the positioners eliminate the gears, lead screws, and other mechanical devices previously utilized, and thus achieve the requisite speed, accuracy, and reliability. the xynetics positioners are already being used in automated drafting systems produced by the firm, and in a wide variety of other applications, including the apparel industry and integrated circuit test systems. the new approach could eliminate many of the problems associated with multiple reproductions and distribution of large data files. in addition to many government applications, the system is expected to have major applications in the commercial marketplace. appointments. charles h. stevens appointed solinet director. charles h. stevens, executive director, national commission on libraries and information science, has been appointed director of the southeastern library network (solinet), effective july 1. the announcement was made at a meeting of solinet in atlanta, march 14, by john h. gribbin, board chairman. composed of ninety-nine institutional members, solinet is headquartered in atlanta. a librarian of acknowledged national stature and an expert on the technical aspects of information retrieval systems, mr. stevens brings to solinet a valuable combination of experience and abilities. concerned with national problems of libraries and information services, he will develop a regional network and move toward a cohesive national program to meet the evolving needs of u.s. libraries. a forerunner in library automation, mr. stevens served for six years as associate director for library development, project intrex, at massachusetts institute of technology. from 1959-1965 he was director of library and publications at mit's lincoln laboratory, lexington, massachusetts. at purdue university, he was aeronautical engineering librarian and later director of documentation of the thermophysical properties research center. mr. stevens is a member of the council of the american library association, the american society for information science, the special libraries association, and other professional organizations. he is the author of approximately forty papers in the field, lectures widely, and consults on library activities for a number of universities. mr. stevens holds a b.a.
in english from principia college, elsah, illinois, and master's degrees in english and in library science from the university of north carolina. mr. stevens has done further study in engineering at brooklyn polytechnic institute. mr. stevens is married and has three sons. input to the editor: international scuttlebutt informs us that those in the bibliothecal stratosphere are attempting to formulate a communications format for bibliographical records acceptable on a worldwide basis. we on the local scene unite in wishing them "huzzah!" and "godspeed!" nomenclature must be provided, of course, to designate particular applications; and the following suggestions are offered as possible subspecies of the genus supermarc:
deutschmarc-for records distributed from bonn and/or wiesbaden
rheemarc-for south korean records, named in honor of the late president of that country
bismarc-for records of stage productions which have been produced by popular demand from the top balcony; especially pertinent for wagnerian operas
benchmarc-for records of generally unsuccessful football plays
minskmarc-for byelorussian records
sachermarc-for austrian records, usually representing extremely tasteful concoctions
trademarc-for records pertaining to manufactured products, especially patent medicines
goldmarc-for records representing hungarian musical compositions (v. karl goldmark, 1830-1915)
ectomarc, endomarc, mesomarc (from the italian, mezzomarc)-for skinny, fat, and medium-sized records, respectively
landmarc-for records of historic edifices; sometimes (erroneously) applied to records for local geographical regions
feuermarc-for records representing charred or burned documents
montmarc-1. for records representing works by or about parisian artists; 2. for records representing publications of the french academy
watermarc-for records representing documents contained in bottles washed up on the beach.
joseph a. rosenthal, university of california, berkeley.

privacy and user experience in 21st century library discovery. shayna pekala (shayna.pekala@georgetown.edu), discovery services librarian, georgetown university library, washington, dc. abstract: over the last decade, libraries have taken advantage of emerging technologies to provide new discovery tools to help users find information and resources more efficiently. in the wake of this technological shift in discovery, privacy has become an increasingly prominent and complex issue for libraries. the nature of the web, over which users interact with discovery tools, has substantially diminished the library's ability to control patron privacy. the emergence of a data economy has led to a new wave of online tracking and surveillance, in which multiple third parties collect and share user data during the discovery process, making it much more difficult, if not impossible, for libraries to protect patron privacy. in addition, users are increasingly starting their searches with web search engines, diminishing the library's control over privacy even further. while libraries have a legal and ethical responsibility to protect patron privacy, they are simultaneously challenged to meet evolving user needs for discovery. in a world where "search" is synonymous with google, users increasingly expect their library discovery experience to mimic their experience using web search engines.1 however, web search engines rely on a drastically different set of privacy standards, as they strive to create tailored, personalized search results based on user data.
libraries are seemingly forced to make a choice between delivering the discovery experience users expect and protecting user privacy. this paper explores the competing interests of privacy and user experience, and proposes possible strategies to address them in the future design of library discovery tools. introduction. on march 23, 2017, the internet erupted with outrage in response to the results of a senate vote to roll back federal communications commission (fcc) rules prohibiting internet service providers (isps), such as comcast, verizon, and at&t, from selling customer web browsing histories and other usage data without customer permission. less than a week after the senate vote, the house followed suit and similarly voted in favor of rolling back the fcc rules, which were set to go into effect at the end of 2017.2 the repeal became official on april 3, 2017, when the president signed it into law.3 this decision by u.s. lawmakers serves as a reminder that today's internet economy is a data economy, where personal data flows freely on the web, ready to be compiled and sold to the highest bidder. continuous online tracking and surveillance have become the new normal. isps are just one of the many players in the online tracking game. major web search engines, such as google, bing, and yahoo, also collect information about users' search histories, among other personal information.4 by selling this data to advertisers, data brokers, and/or government agencies, these search engine companies are able to make a profit while providing the search engines themselves for "free." in addition to profiting from user data, web search engines also use it to enhance the user experience of their products. collecting and analyzing user data enables systems to learn user preferences, providing personalized search results that make it easier to navigate the ever-increasing sea of online information. the collection and sharing of user data that occurs on the open web is deeply troubling for libraries, whose professional ethics embody the values of privacy and intellectual freedom. a user's search history contains information about a user's thought process, and the monitoring of these thoughts inhibits intellectual inquiry.5 libraries, however, would be remiss to dismiss the success of web search engines and their use of data altogether. mit's preliminary report on the future of libraries urges, "while the notion of 'tracking' any individual's consumption patterns for research and educational materials is anathema to the core values of libraries...the opportunity to leverage emerging technologies and new methodologies for discovery should not be discounted."6 this article examines the current landscape of library discovery, the competing interests of privacy and user experience at play, and proposes possible strategies to address them in the future design of library discovery tools. background. library discovery in the digital age. the advent of new technologies has drastically shaped the way libraries support information discovery. while users once relied on shelf-browsing and card catalogs to find library resources, libraries now provide access to a suite of online tools and interfaces that facilitate cross-collection searching and access to a wide range of materials.
in an online environment, many paths to discovery are possible, with the open web playing a newfound and significant role. today's library discovery tools fall into three categories: online catalogs (the patron interface of the integrated library system (ils)), discovery layers (a patron interface with enhanced functionality that is separate from an ils), and web-scale discovery tools (an enhanced patron interface that relies on a central index to bring together resources from the library catalog, subscription databases, and digital repositories).7 these tools are commonly integrated with a variety of external systems, including proxy servers, inter-library loan, subscription databases, individual publisher websites, and more. for the most part, libraries purchase discovery tools from third-party vendors. while some libraries use open source discovery layers, such as blacklight or vufind, there are currently no open source options for web-scale discovery tools.8 outside of the library, web search engines (e.g. google, bing, and yahoo), and targeted academic discovery products (e.g. google scholar, researchgate, and academia.edu) provide additional systems that enable discovery.9 in fact, web search engines, particularly google, play a significant role in the research process. both students and faculty use google in conjunction with library discovery tools. students typically use google at the beginning of the research process to get a better understanding of their topic and identify secondary search terms. faculty, on the other hand, use google to find out how other scholars are thinking about a topic.10 unsurprisingly, google and google scholar provide the majority of content access to major content platforms.11 the data economy and online privacy concerns. in an information discovery environment that is primarily online, new threats to patron privacy emerge. in today's economy, user data has become a global commodity. commercial businesses have recognized the value of data mining for marketing purposes. bjorn bloching et al. explain, "from cleverly aggregated data points, you can draw multiple conclusions that go right to the heart and mind of the customer."12 along the same lines, the ability to collect and analyze user data is extremely valuable to government agencies for surveillance purposes, creating an additional data-driven market.13 the increasing value of user data has drastically expanded the business of online tracking. in her book, dragnet nation, journalist julia angwin outlines a detailed taxonomy of trackers, including various types of government, commercial, and individual trackers.14 in the online information discovery process, multiple parties collect user data at different points. consider the following scenario: a user executes a basic keyword search in google to access an openly available online resource. in the fifteen seconds it takes the user to get to that resource, information about the user's search is collected by the internet service provider (isp), the web browser, the search engine, the website hosting the resource, and any third-party trackers embedded in the website. the search query, along with the user's internet protocol (ip) address, become part of the data collector's profile on the user.
in the future, the data collector can sell the user's profile to a data broker, where it will be merged with profiles from other data collectors to create an even more detailed portrait of the user.15 the data broker, in turn, can sell the complete dataset to the government, law enforcement, commercial businesses, and even criminals. this creates serious privacy concerns, particularly since users have no legal right over how their data is bought and sold.16 privacy protection in libraries. libraries have deeply-rooted values in privacy and strong motivations to protect it. intellectual freedom, the foundation on which libraries are built, necessarily requires privacy. in its interpretation of the library bill of rights, the american library association (ala) explains, "in a library (physical or virtual), the right to privacy is the right to open inquiry without having the subject of one's interest examined or scrutinized by others."17 many studies support this idea, having found that people who are indiscriminately and secretly monitored censor their behavior and speech.18 libraries have both legal and ethical obligations to protect patron privacy. while there is no federal legislation that protects privacy in libraries, forty-eight states have regulations regarding the confidentiality of library records, though the extent of these protections varies by state.19 because these statutes were drafted before the widespread use of the internet, they are phrased in a way that addresses circulation records and does not specifically include or exclude internet use records (records with information on sites accessed by patrons) from these protections. therefore, according to theresa chmara, libraries should not treat internet use records any differently than circulation records with respect to confidentiality.20 the library community has established many guiding documents that embody its ethical commitment to protecting patron privacy. the ala code of ethics states in its third principle, "we protect each library user's right to privacy and confidentiality with respect to information sought or received and resources consulted, borrowed, acquired or transmitted."21 the international federation of library associations and institutions (ifla) code of ethics has more specific language about data sharing, stating, "the relationship between the library and the user is one of confidentiality and librarians and other information workers will take appropriate measures to ensure that user data is not shared beyond the original transaction."22 the library community has also established practical guidelines for dealing with privacy issues in libraries, particularly those issues relating to digital privacy, including the ala privacy guidelines23 and the national information standards organization (niso) consensus principles on user's digital privacy in library, publisher, and software-provider systems.24 additionally, the library freedom project was launched in 2015 as an educational resource to teach librarians about privacy threats, rights, and tools, and in 2017, the library and information technology association (lita) released a set of seven privacy checklists25 to help libraries implement the ala privacy guidelines.
personalization of online systems while user data can be used for tracking and surveillance, it can also be used to improve the digital user experience of online systems through personalization. because the growth of the internet has made it increasingly difficult to navigate the continually growing sea of information online, researchers have put significant effort into designing interfaces, interaction methods, and systems that deliver adaptive and personalized experiences.26 ansgar koene et al. explain, “the basic concept behind personalization of on-line information services is to shield users from the risk of information overload, by pre-filtering search results based on a model of the user’s preferences… a perfect user model would…enable the service provider to perfectly predict the decision a user would make for any given choice.”27 the authors continue to describe three main flavors of personalization systems: 1. content-based systems, in which the system recommends items based on their similarity to items that the user expressed interest in; 2. collaborative-filtering systems, in which users are given recommendations for items that other users with similar tastes liked in the past; and 3. community-based systems, in which the system recommends items based on the preferences of the user’s friends.28 many popular consumer services, such as amazon.com, youtube, netflix, and google, have increased (and continue to increase) the level of personalization that they offer.29 one such service in the area of academic resource discovery is google scholar’s updates, which analyzes a user’s publication history in order to predict new publications of interest.30 libraries, in contrast, have favored privacy and have not pressed their developers and vendors to personalize their services, even though studies have shown that users expect library tools to mimic their experience using web search engines.31 some web-scale discovery services do, however, allow researchers to set personalization preferences, such as their field of study, and, according to roger schonfeld, it is likely that many researchers would benefit tremendously from increased personalization in discovery.32 in this vein, the american philosophical society library recently launched a new recommendation tool for archives and manuscripts that uses circulation data and user-supplied interests to drive recommendations.33 opportunities for user experience in library discovery a major challenge in today’s online discovery environment is that the user is inhibited by an overwhelming number of results. this leads users to rely on relevance rankings and to fail to examine search results in depth. creating fine-tuned relevance ranking algorithms based on user behavior is one remedy to this problem, but it relies on the use of personal user data.34 however, there may be opportunities to facilitate data-driven discovery while maintaining the user’s anonymity that would be suitable for library (and other) discovery tools.
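to make the first of these flavors concrete, the short python sketch below is an illustration written for this discussion, not code from koene et al. or from any discovery product; the catalog titles and descriptions are invented. it ranks unseen items by their similarity to items a user has already shown interest in:

from collections import Counter
import math

def bag_of_words(text):
    # crude tokenizer: lowercase word counts
    return Counter(text.lower().split())

def cosine(a, b):
    # cosine similarity between two bag-of-words vectors
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# hypothetical catalog: title -> short description
catalog = {
    "intro to data privacy": "privacy surveillance data brokers tracking",
    "library discovery systems": "discovery catalog index search relevance",
    "online tracking explained": "tracking cookies fingerprinting privacy advertisers",
    "metadata fundamentals": "metadata cataloging marc records description",
}

def recommend(liked_titles, k=2):
    # rank unseen items by similarity to the items the user liked
    liked = [bag_of_words(catalog[t]) for t in liked_titles]
    scored = [
        (max(cosine(bag_of_words(desc), lv) for lv in liked), title)
        for title, desc in catalog.items()
        if title not in liked_titles
    ]
    return [title for _, title in sorted(scored, reverse=True)[:k]]

print(recommend(["intro to data privacy"]))
# expected to surface "online tracking explained" first, since it shares the most vocabulary

the collaborative-filtering flavor replaces the item descriptions with other users' behavior, which is precisely where the privacy trade-off discussed in this article enters.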
irina trapido proposes that relevance ranking algorithms could be designed to leverage the popularity of a resource measured by its circulation statistics or by ranking popular or introductory materials higher than more specialized ones to help users make sense of large results sets.35 michael schofield proposes “context-driven design” as an intermediary solution, whereby the user opts in to have the system infer context from neutral device or browser information, such as the time of day, business hours, weather, events, holidays, etc.36 jason clark describes a search prototype he built that applies these principles, but he questions whether these types of enhancements actually add value to users.37 rachel vacek cautions that personalization is not guaranteed to be useful or meaningful, and continuous user testing is key.38 discussion there are several aspects to consider for the design of future library discovery tools. the integrated, complex nature of the web causes privacy to become compromised during the information discovery process. library discovery tools have been designed not to retain borrowing records, but have not yet evolved to mask user behavior, which is invaluable in today’s data economy. it is imperative that all types of library discovery tools have built-in functionality to privacy and user experience in 21st century library discovery | pekala https://doi.org/10.6017/ital.v36i2.9817 53 protect patron privacy beyond borrowing records, while also enabling the ethical use of patron data to improve user experience. even if library discovery tools were to evolve so that they themselves were absolutely private (where no data were ever collected or shared), other online parties (isps, web browsers, advertisers, data brokers, etc.) would still have access to user data through other means, such as cookies and fingerprinting. the operating reality is such that privacy is not immediately and completely controllable by libraries. laurie rinehart-thompson explains, “in the big picture, privacy is at the mercy of ethical and stewardship choices on the part of all information handlers.”39 while libraries alone cannot guarantee complete privacy for their patrons, they can and should mitigate privacy risks to the greatest extent possible. at the same time, ignoring altogether the benefits of using patron data to improve the discovery user experience may threaten the library’s viability in the age of google. roger schonfeld explains, “if systems exclude all personal data and use-related data, the resulting services will be onedimensional and sterile. i consider it essential for libraries to deliver dynamic and personalized services to remain viable in today's environment; expectations are set by sophisticated social networks and commercial destinations.”40 libraries must find ways to keep up with greater industry trends while adhering to professional ethics. recommendations while libraries have traditionally shied away from collecting data about patron transactions, these conservative tendencies run counter to the library’s mission to provide outstanding user experience and the need to evolve in a rapidly changing information industry. as the profession adopts new technologies, ethical dilemmas present themselves that are tied into their use. while several library organizations have issued guidance for libraries about the role of user data in these new technologies, this does not go far enough. 
the niso privacy principles, for instance, acknowledge that they are merely “a starting point.”41 examining the substance of these guidelines is important for confronting the privacy challenges facing library discovery in the 21st century, but there are additional steps libraries can take to more fully address the competing interests of privacy and user experience in library discovery and in library technologies more generally. holding third parties accountable libraries are increasingly at the mercy of third parties when it comes to the development and design of library discovery tools. unfortunately, these third parties do not have the same ethical obligations to protect patron privacy that librarians do. in addition, the existing guidance for protecting user data in library technologies is directed towards librarians, not third-party vendors. the library community must hold third parties accountable for the ethical design of library discovery tools. one strategy for doing this would be to develop a ranking or certification process for discovery tools based on a community set of standards. the development of hipaa-compliant records management systems in the medical field sets an example. because healthcare providers are required by law to guarantee the privacy of patient data,42 they must select electronic health records systems (ehrs) that have been certified by an office of the national coordinator for health information technology (onc)-authorized body.43 in order to be certified, the system must adhere to a set of criteria adopted by the department of health and human services,44 which includes privacy and security standards.45 another example is the consumer reports standard and testing program for consumer privacy and security, which is currently in development. consumer reports explains the reason for developing this new privacy standard: “if consumer reports and other public-interest organizations create a reasonable standard and let people know which products do the best job of meeting it, consumer pressure and choices can change the marketplace.”46 libraries could potentially adapt the consumer reports standards and rating system for library discovery tools and other library technologies. engaging in ux research & design libraries should not rely on third parties alone to address privacy and user experience requirements for library discovery tools. libraries are well-poised to become more involved in the design process itself by actively engaging in user experience research and design. the opportunities for “context-driven design” and personalization based on circulation and other anonymous data are promising for library discovery but require ample user testing to determine their usefulness. understanding which types of personalization features offer the most value while preserving privacy is key to accelerating the design of library discovery tools. the growth of user experience librarian jobs and the emergence of user experience teams and departments in libraries signal an increasing amount of user experience expertise in the field, which can be leveraged to investigate these important questions for library discovery. illuminating the black box librarians who adopt new discovery tools without fully understanding their underlying technologies and the data economy in which they operate do not serve their users well.
librarians have ethical obligations that should require them to thoroughly understand how and when user data is captured by library discovery tools and other web technologies, and how this information is compiled and shared at a higher level. not only do librarians need to understand the technical aspects of discovery technologies, they also need to understand the related user experience benefits and privacy concerns and the resulting ethical implications. as technology continues to evolve, librarians should be required to engage in continued learning in these areas. such technology literacy skills could be incorporated in the curriculum of library and information science degree programs, as well as in ongoing professional development opportunities. empowering library users because information discovery in an online environment introduces new privacy risks, communication about this topic between librarians and patrons is paramount. librarians should privacy and user experience in 21st century library discovery | pekala https://doi.org/10.6017/ital.v36i2.9817 55 proactively discuss with patrons the potential risks to their privacy when conducting research online, whether they are using the open web or library discovery tools. it is ultimately up to the patron to weigh their needs and preferences in order to decide which tools to use, but it is the librarian’s responsibility to empower patrons to be able to make these decisions in the first place. conclusion with the rollback of the fcc privacy rules that prohibit isps from selling customer search histories without customer permission, understanding digital privacy issues and taking action to protect patron privacy is more important than ever. while privacy and user experience are both necessary and important components of library discovery systems, their requirements are in direct conflict with each other. an absolutely private discovery experience would mean that no user data is ever collected during the search process, whereas a completely personalized discovery experience would mean that all user data is collected and utilized to inform the design and features of the system. it is essential for library discovery tools to have built-in functionality that protects patron privacy to the greatest extent possible and enables the ethical use of patron data to improve user experience. the library community must take action to address these requirements beyond establishing guidelines. holding third party providers to higher privacy standards is a starting point. in addition, librarians themselves need to engage in user experience research and design to discover and test the usefulness of possible intermediary solutions. librarians must also become more educated as a profession on digital privacy issues and their ethical implications in order to educate patrons about their fundamental rights to privacy and empower them to make decisions about which discovery tools to use. collectively, these strategies enable libraries to address user needs, uphold professional ethics, and drive the future of library discovery. references 1. irina trapido, “library discovery products: discovering user expectations through failure analysis,” information technologies and libraries 35, no. 3 (2016): 9-23, https://doi.org/10.6017/ital.v35i3.9190. 2. 
brian fung, “the house just voted to wipe away the fcc’s landmark internet privacy protections,” the washington post, march 28, 2017, https://www.washingtonpost.com/news/the-switch/wp/2017/03/28/the-house-justvoted-to-wipe-out-the-fccs-landmark-internet-privacy-protections. 3. jon brodkin, “president trump delivers final blow to web browsing privacy rules,” ars technica, april 3, 2017, https://arstechnica.com/tech-policy/2017/04/trumps-signaturemakes-it-official-isp-privacy-rules-are-dead/. 4. nathan freed wessler, “how private is your online search history?” aclu free future (blog), https://www.aclu.org/blog/how-private-your-online-search-history. 5. julia angwin, dragnet nation (new york: times books, 2014), 41-42. information technology and libraries | june 2017 56 6. mit libraries, institute-wide task force on the future of libraries (2016), 12, https://assets.pubpub.org/abhksylo/futurelibrariesreport.pdf. 7. trapido, “library discovery products,” 10. 8. marshall breeding, “the future of library resource discovery,” niso white papers, niso, baltimore, md, 2015, 4, http://www.niso.org/apps/group_public/download.php/14487/future_library_resource_dis covery.pdf. 9. christine wolff, alisa b. rod, and roger c. schonfeld, ithaka s+r us faculty survey 2015 (new york: ithaka s+r, 2016), 11, https://doi.org/10.18665/sr.277685. 10. deirdre costello, “students and faculty research differently” (presentation, computers in libraries, washington, d.c., march 28, 2017), http://conferences.infotoday.com/documents/221/a103_costello.pdf. 11. roger c. schonfeld, meeting researchers where they start: streamlining access to scholarly resources (new york: ithaka s+r, 2015), https://doi.org/10.18665/sr.241038. 12. björn bloching, lars luck, and thomas ramge, in data we trust: how customer data is revolutionizing our economy (london: bloomsbury publishing, 2012), 65. 13. angwin, 21-36. 14. ibid., 32-33. 15. natasha singer, “mapping, and sharing, the consumer genome,” new york times, june 16, 2012, http://www.nytimes.com/2012/06/17/technology/acxiom-the-quiet-giant-ofconsumer-database-marketing.html. 16. lois beckett, “everything we know about what data brokers know about you,” propublica, june 13, 2014, https://www.propublica.org/article/everything-we-know-about-what-databrokers-know-about-you. 17. “an interpretation of the library bill of rights,” american library association, amended july 1, 2014, http://www.ala.org/advocacy/intfreedom/librarybill/interpretations/privacy. 18. angwin, dragnet nation, 41-42. 19. anne klinefelter, “privacy and library public services: or, i know what you read last summer,” legal references services quarterly 26, no. 1-2 (2007): 258-260, https://doi.org/10.1300/j113v26n01_13. 20. theresa chmara, privacy and confidentiality issues: guide for libraries and their lawyers (chicago: ala editions, 2009), 27-28. 21. “code of ethics of the american library association,” american library association, privacy and user experience in 21st century library discovery | pekala https://doi.org/10.6017/ital.v36i2.9817 57 amended january 22, 2008, http://www.ala.org/advocacy/proethics/codeofethics/codeethics. 22. “ifla code of ethics for librarians and other information workers,” international federation of library associations and institutions, august 12, 2012, http://www.ifla.org/news/ifla-code-of-ethics-for-librarians-and-other-informationworkers-full-version. 23. “privacy & surveillance,” american library association, approved 2015-2016, http://www.ala.org/advocacy/privacyconfidentiality. 24. 
national information standards organization, niso consensus principles on users’ digital privacy in library, publisher, and softwareprovider systems (niso privacy principles), published on december 10, 2015, http://www.niso.org/apps/group_public/download.php/15863/niso%20consensus%20pr inciples%20on%20users%92%20digital%20privacy.pdf. 25. “library privacy checklists,” library and information technology association, accessed march 7, 2017, http://www.ala.org/lita/advocacy. 26. panagiotis germanakos and marios belk, “personalization in the digital era,” in humancentred web adaptation and personalization: from theory to practice, (switzerland: springer international publishing switzerland, 2016), 16. 27. ansgar koene et al., “privacy concerns arising from internet service personalization filters,” acm sigcas computers and society 45, no. 3 (2015): 167. 28. ibid., 168. 29. ibid. 30. james connor, “scholar updates: making new connections,” google scholar blog, https://scholar.googleblog.com/2012/08/scholar-updates-making-new-connections.html. 31. schonfeld, meeting researchers where they start, 2. 32. roger c. schonfeld, does discovery still happen in the library?: roles and strategies for a shifting reality (new york: ithaka s+r, 2014), 10, https://doi.org/10.18665/sr.24914. 33. abigail shelton, “american philosophical society announces launch of pal, an innovative recommendation tool for research libraries,” american philosophical society, april 3, 2017, https://www.amphilsoc.org/press/pal. 34. trapido, “library discovery products,” 17. 35. ibid. 36. michael schofield, “does the best library web design eliminate choice?” libux, september information technology and libraries | june 2017 58 11, 2015, http://libux.co/best-library-web-design-eliminate-choice/. 37. jason a. clark, “anticipatory design: improving search ux using query analysis and machine cues,” weave: journal of library user experience 1, no. 4 (2016), https://doi.org/10.3998/weave.12535642.0001.402. 38. rachel vacek, “customizing discovery at michigan” (presentation, electronic resources & libraries, austin, tx, april 4, 2017), https://www.slideshare.net/vacekrae/customizingdiscovery-at-the-university-of-michigan. 39. laurie a. rinehart-thompson, beth m. hjort, and bonnie s. cassidy, “redefining the health information management privacy and security role,” perspectives in health information management 6 (2009): 4.s 40. marshall breeding, “perspectives on patron privacy and security,” computers in libraries 35, no. 5 (2015): 13. 41. national information standards organization, niso consensus principles. 42. joel jpc rodrigues, et al., “analysis of the security and privacy requirements of cloud-based electronic health records systems,” journal of medical internet research 15, no. 8 (2013), https://www.ncbi.nlm.nih.gov/pmc/articles/pmc3757992/. 43. office of the national coordinator for health information technology, guide to privacy and security of electronic health information, april 2015, https://www.healthit.gov/sites/default/files/pdf/privacy/privacy-and-security-guide.pdf. 44. office of the national coordinator for health information technology, “health it certification program overview,” january 30, 2016, https://www.healthit.gov/sites/default/files/publichealthitcertificationprogramovervie w_v1.1.pdf. 45. 
office of the national coordinator for health information technology, “2015 edition health information technology (health it) certification criteria, base electronic health record (ehr) definition, and onc health it certification program modifications final rule,” october 2015, https://www.healthit.gov/sites/default/files/factsheet_draft_2015-10-06.pdf. 46. consumer reports, “consumer reports to begin evaluating products, services for privacy and data security,” consumer reports, march 6, 2017, http://www.consumerreports.org/privacy/consumer-reports-to-begin-evaluatingproducts-services-for-privacy-and-data-security/. 6 information technology and libraries | march 2010 sandra shores is [tk] sandra shores editorial board thoughts: issue introduction to student essays t he papers in this special issue, although covering diverse topics, have in common their authorship by people currently or recently engaged in graduate library studies. it has been many years since i was a library science student—twenty-five in fact. i remember remarking to a future colleague at the time that i found the interview for my first professional job easy, not because the interviewers failed to ask challenging questions, but because i had just graduated. i was passionate about my chosen profession, and my mind was filled from my time at library school with big ideas and the latest theories, techniques, and knowledge of our discipline. while i could enthusiastically respond to anything the interviewers asked, my colleague remarked she had been in her job so long that she felt she had lost her sense of the big questions. the busyness of her daily work life drew her focus away from contemplation of our purpose, principles, and values as librarians. i now feel at a similar point in my career as this colleague did twenty-five years ago, and for that reason i have been delighted to work with these student authors to help see their papers through to publication. the six papers represent the strongest work from a wide selection that students submitted to the lita/ ex libris student writing award competition. this year’s winner is michael silver, who looks forward to graduating in the spring from the mlis program at the university of alberta. silver entered the program with a strong library technology foundation, having provided it services to a regional library system for about ten years. he notes that “the ‘accidental systems librarian’ position is probably the norm in many small and medium sized libraries. as a result, there are a number of practices that libraries should adopt from the it world that many library staff have never been exposed to.”1 his paper, which details the implementation of an open-source monitoring system to ensure the availability of library systems and services, is a fine example of the blending of best practices from two professions. indeed, many of us who work in it in libraries have a library background and still have a great deal to learn from it professionals. silver is contemplating a phd program or else a return to a library systems position when he graduates. either way, the profession will benefit from his thoughtful, well-researched, and useful contributions to our field. todd vandenbark’s paper on library web design for persons with disabilities follows, providing a highly practical but also very readable guide for webmasters and others. 
vandenbark graduated last spring with a masters degree from the school of library and information science at indiana university and is already working as a web services librarian at the eccles health sciences library at the university of utah. like mr. silver, he entered the program with a number of years’ work experience in the it field, and his paper reflects the depth of his technical knowledge. vandenbark notes, however, that he has found “the enthusiasm and collegiality among library technology professionals to be a welcome change from other employment experiences,” a gratifying comment for readers of this journal. ilana tolkoff tackles the challenging concept of global interoperability in cataloguing. she was fascinated that a single database, oclc, has holdings from libraries all over the world. this is also such a recent phenomenon that our current cataloging standards still do not accommodate such global participation. i was interested to see what librarians were doing to reconcile this variety of languages, scripts, cultures, and independently developed cataloging standards. tolkoff also graduated this past spring and is hoping to find a position within a music library. marijke visser addresses the overwhelming question of how to organize and expose internet resources, looking at tagging and the social web as a solution. coming from a teaching background, visser has long been interested in literacy and life-long learning. she is concerned about “the amount of information found only online and what it means when people are unable . . . to find the best resources, the best article, the right website that answers a question or solves a critical problem.” she is excited by “the potential for creativity made possible by technology” and by the way librarians incorporate “collaborative tools and interactive applications into library service.” visser looks forward to graduating in may. mary kurtz examines the use of the dublin core metadata schema within dspace institutional repositories. as a volunteer, she used dspace to archive historical photographs and was responsible for classifying them using dublin core. she enjoyed exploring how other institutions use the same tools and would love to delve further into digital archives, “how they’re used, how they’re organized, who uses them and why.” kurtz graduated in the summer and is looking for the right job for her interests and talents in a location that suits herself and her family. finally, lauren mandel wraps up the issue exploring the use of a geographic information system to understand how patrons use library spaces. mandel has been an enthusiastic patron of libraries since she was a small child visiting her local county and city public libraries. she is currently a doctoral candidate at florida state university and sees an academic future for herself. mandel expresses infectious optimism about technology in libraries: sandra shores (sandra.shores@ualberta.ca) is guest editor of this issue and operations manager, information technology services, university of alberta libraries, edmonton, alberta, canada. editorial board thoughts | shores 7 looking ahead, it seems clear that the pace of change in today’s environment will only continue to accelerate; thus the need for us to quickly form and dissolve key sponsorships and partnerships that will result in the successful fostering and implementation of new ideas, the currency of a vibrant profession. 
the next challenge is to realize that many of the key sponsorship and partnerships that need to be formed are not just with traditional organizations in this profession. tomorrow’s sponsorships and partnership will be with those organizations that will benefit from the expertise of libraries and their suppliers while in return helping to develop or provide the new funding opportunities and means and places for disseminating access to their expertise and resources. likely organizations would be those in the fields of education, publishing, content creation and management, and social and community webbased software. to summarize, we at ex libris believe in sponsorships and partnerships. we believe they’re important and should be used in advancing our profession and organizations. from long experience we also have learned there are right ways and wrong ways to implement these tools, and i’ve shared thoughts on how to make them work for all the parties involved. again, i thank marc for his receptiveness to this discussion and my even deeper appreciation for trying to address the issues. it’s serves as an excellent example of what i discussed above. people forget, but paper, the scroll, the codex, and later the book were all major technological leaps, not to mention the printing press and moveable type. . . . there is so much potential for using technology to equalize access to information, regardless of how much money you have, what language you speak, or where you live. big ideas, enthusiasm, and hope for the profession, in addition to practical technology-focused information await the reader. enjoy the issue, and congratulations to the winner and all the finalists! note 1. all quotations are taken with permission from private e-mail correspondence. a partnership for creating successful partnerships continued from page 5 editorial board thoughts: the importance of staff change management in the face of the growing “cloud” mark dehmlow information technology and libraries | march 2016 3 the library vendor market likes to throw around the word “cloud” to make their offerings seem innovative and significant. in many ways, much of what the library it market refers to as “cloud,” especially saas (software as a service) offerings, are really just a fancier term for hosted services. the real gravitas behind the label cloud really emanated from grid-computing or large interconnected, and quickly deployable infrastructure like amazon’s aws or microsoft’s azure platforms. infrastructure at that scale and that level of geographic distribution was revolutionary when it emerged. still these offerings at their core are basically iaas (infrastructure as a service) bundled as a menu of services. so i think the most broadly applicable synonym for the “cloud” could be “it as a service” in various forms. outsourcing in this way isn’t entirely new to libraries. the function and structure of oclc has arguably been one of the earlier instantiations of “it as a service” for libraries vis-à-vis their marc record aggregation and distribution which oclc has been doing for decades. the more recent trend toward hosted it services has been relatively easy for non-it related units in our library. a service no different to most library staff based on where it is hosted. and with many services implementing apis for libraries, that distinction is becoming less significant for our application developers too. 
for many of our technology staff, who have built careers around systems administration, application development, systems integration, and application management, hosted services represent a threat to not only their livelihoods but in some ways also their philosophical perspectives that are grounded in open source and do-ityourself oriented beliefs. in many ways the “cloud” for the it segment of our profession is perhaps more synonymous with change, and with change requires effective management of that change, especially for the human element of our organizations. recently, our office of information technologies started an initiative to move 80% of their technology infrastructure into the cloud. they have proposed an inverted pyramid structure for determining where it solutions should reside — focusing first on hosted software as a service solutions for the largest segment of applications, followed by hosting those applications we would have typically installed locally onto a platform or infrastructure as a service provider, and then limiting only those applications that have specialized technical or legal needs to reside on premise. this is a big shift for our it staff, especially, but not limited to, our systems administrators. the iaas platform our university is migrating to is amazon web services and their infrastructure is mark dehmlow (mdehmlow@nd.edu), a member of lita and the ital editorial board, is the director, information technology program, hesburgh libraries, university of notre dame, south bend, indiana. editorial board thoughts: the importance of staff change management in the face of the growing “cloud” | dehmlow | doi: 10.6017/ital.v35i1.8965 4 largely accessible via a web dashboard, so that the myriad of tasks our systems administrators took days and weeks to do can now, in some adjusted way, be accomplished with a few clicks. this example is on one extreme end of the spectrum as far as it change goes, but simultaneously, we have looked at the vendor market to lease pre-packaged tools that support standard functions in academic libraries and can be locally branded and configured with our data — things like course guides, a-z journal lists, scheduling events, etc. the overarching goals of these efforts are cost savings and increased velocity and resiliency of infrastructure, but also and perhaps more important, is giving us flexibility in how we invest our staff time. if we are able to move high level tasks from staff to a platform, then we will be able to reallocate our staff’s time and considerable talent to take on the constant stream of new, high level technology needs. partnering with the university, we are aiming towards their defined goal of moving 80% of our technical infrastructure into the “cloud.” we have adopted their overall strategy of approach to systems infrastructure, at least in principle and are integrating into our own strategy significant consideration for the impact of these changes on our staff. our organization has recognized that people form not only habits around process, but also personal and emotional attachments to why we do things the way we do them, both from a philosophical as well as a pragmatic perspective. our approach to staff change is layered as well as long term. we know that getting from shock to acceptance is not an overnight process and that staff who adopt our overarching goals and strategy as their own will be more successful in the long term. to make this transition, we have developed several strategic approaches: 1. 
explaining the case: my experience is that staff can live through most changes as long as they understand why. helping them gain that understanding can take some time, but ultimately having that comprehension will help them fully understand our strategic goals as well as help them make decisions that are in alignment with the overall approach. i often find it is important to remember that, as managers, we have been a part of all of the change conversations and we have had time to assimilate ideas, discuss points of view, and process the implications of change. each of our staff needs to go through the same process and it is up to leadership to guide them through that process and ensure they get to participate in similar conversations. it is tempting to want to hit an initiative running, but there is significant value in seeding those discussions gradually over a somewhat gradual time period to more holistically integrate staff into the broader vision. it is important to explain the case for change multiple times and actively listen to staff thoughts and concerns and to remember to lay out the context for change, why it is important, and how we intend to accomplish things. then reassure, reassure, and reassure. the threats to staff may seem innocuous or unfounded to managers, but staff need to feel secure during a process to ultimately buy in. 2. consistency and persistence: staff acceptance doesn’t always come easy — nor should it necessarily. listening and integrating their perspectives into the planning and information technology and libraries | march 2016 5 implementation process can help demonstrate that they matter, but equally important is that they feel our approach is built on something solid. stability is reinforced through consistency in messaging. not only in individual consistency, but also team consistency, and upper management consistency — everyone should be able to support and explain messaging around a particular change. any time staff approach me and say, “it was much easier to do it this other way,” i talk about the efficiency we will garner through this change and how we will be able to train and repurpose staff in the future. the more they hear the message, the more ingrained it becomes, and the more normative it begins to feel. 3. training and investment: it futures require investment, not just in infrastructure, but also in skill development. we continue to invest significantly in providing some level of training on new technologies that we implement. that training will not only prove to staff that you are invested in their development as well as their job security, but it will also give them the tools they need to be successful in implementing new technologies. change is anxiety inducing because it exposes so many unknowns. providing training helps build confidence and competence for staff, reducing anxieties and providing some added engagement in the process. it also gives them exposure to the real world implementation of technologies where they can begin to see the benefits that you have been communicating for themselves. 4. envisioning the future: improvements and roles — one of the initial benefits we will be getting from recouping staff time is around shoring up our processes. we have generally had a more ad hoc approach to managing the day to day. it has been difficult to institute a strong technical change management process, in part, because of time. 
we will be able to remove that consideration from our excuses as we take advantage of the “cloud.” the net effect will be that we will do our work more thoughtfully and less ad hoc and use better defined processes that will meet group-developed expectations. in addition to doing things better, we do expect to do things differently. with fewer tasks at the operational level, we believe we will be able to transition staff into newly defined roles. some of these roles include devops engineers, a hybrid of application engineering (the dev) and systems administration (the ops), these staff will help design automation and continuous integration processes that allow developers to focus on their programming and less on the environment they are deploying their applications in; financial engineers who will take system requirements and calculate costs in somewhat complex technical cloud environments; systems architects who will be focused on understanding the smorgasbord of options that can be tied together to provide a service to meet expected response performance, disaster recovery, uptime, and other requirements; and business analysts who will focus on taking technical requirements and looking at all of the potential approaches to solve that need whether it be a hosted service, a locally developed solution, an implementation of an open source system, or some integration of all or some of the editorial board thoughts: the importance of staff change management in the face of the growing “cloud” | dehmlow | doi: 10.6017/ital.v35i1.8965 6 above. this list is by no means exhaustive, but i think it forms a good foundation on which to help staff develop their skill set along with our changing environment. i believe it is important to remind those of us who are managing it departments in libraries that in many ways the easiest parts of change are the logistics. the technology we work with is bounded by sets of guidelines that define how they are used and ensure that if they are implemented properly, they will work effectively. people on the other hand are not bounded as neatly by stringent rules. they are guided by diverse backgrounds, personalities, experiences, and feelings. they can be unpredictable, difficult to fully figure out, and behaviorally inconsistent. and yet, they are the great constant in our organizations and therefore require significant attention. our field needs “servant leaders” dedicated to supporting and developing staff, and not just being competent at implementing technologies. those managers who invest in staff, their well-being, development, and sense of engagement in their jobs, will find their organizations are able to tackle most anything. but those who ignore their staffs’ needs over pragmatic goals will likely find their organizations struggling to move quickly and instead spend too much energy overcoming resistance instead of energizing change. reproduced with permission of the copyright owner. further reproduction prohibited without permission. site license initiatives in the united kingdom: the psli and nesli experience borin, jacqueline information technology and libraries; mar 2000; 19, 1; proquest pg. 42 l site license initiatives in the united kingdom: the psli and nesli experience jacqueline borin this article examines the development of site licensing within the united kingdom higher education community. 
in particular, it looks at how the pressure to make better use of dwindling fiscal resources led to the conclusion that information technology and its exploitation was necessary in order to create an effective library service. these conclusions, reached in the follett report of 1993, led to the establishment of a pilot site license initiative and then a national electronic site license initiative. the focus of this article is these initiatives and the issues they faced, which included off-site access, definition of a site and, perhaps most importantly, the unbundling of print and electronic journals. increased competition for institution funding around the world has resulted in an erosion of library funding. in the united states, state universities are receiving a decreasing portion of their funds from the state while private universities are forced to limit tuition increases due to outside market forces. in the united kingdom the entitlement to free higher education is currently under attack and losing ground. today's economic pressures are requiring individual libraries to make better use of their fiscal resources while the emphasis moves from being a repository for information to providing access to information. jacqueline borin (jborin@csusm.edu) is coordinator of reference and electronic resources, library and information services, california state university, san marcos. as in the united states, the use of consortia for cost sharing in the united kingdom is becoming imperative as producers produce more electronic materials and make them available in full-text formats. consortia, while originally formed to cooperate on interlibrary loans and union catalogs, have recently taken on a new role, driven by financial expediency, in negotiating electronic licenses for their members, and the percentage of vendor contracts with consortia is rising. academic libraries cannot afford the prevalent pricing model that asks for the current print price plus an electronic surcharge plus projected inflation surcharges; therefore, group purchasing power allows higher education institutions to leverage the money they have and to provide resources that would otherwise be unavailable. advantages for the vendor include one negotiator and one technical person for the consortia as a whole. in addition, the use of consortia provides greater leverage in pushing for the need for stable archiving and for retaining the principles of fair use within the electronic environment as well as reminding publishers of the need for flexible and multiple economic models to deal with the diverse needs and funding structures of consortia. during the spring of 1998, while visiting academic libraries in the united kingdom, i looked at an existing initiative within the uk higher education community, the pilot site license initiative (psli), which had begun as a response to the follett report and to rising journal prices. at the time the three-year initiative was nearing its end and its successor, the national electronic site license initiative (nesli), was already the topic of much discussion. history the concept of site licensing in the united kingdom higher education community had already been established, since 1988, by the combined higher education software team (chest), based at the university of bath. chest has negotiated site licenses with software suppliers and some large database producers through two different methods.
either the supplier sells a national license to chest, which passes it on to the individual institution, or chest sells licenses to the institution on the supplier's behalf and passes the fees on to them (see figure 1). chest works closely with national information services and systems (niss). niss provides a focal point for the uk education and research communities to access information resources. niss's web service, the niss information gateway, provides a host for chest information such as ebsco masterfile and oclc netfirst. most chest agreements are institution-wide site licenses that allow for all noncommercial use of the product, normally for five years to allow for incorporation into the curriculum. once an institution signs up it is committed for the full term of the agreement. chest is not in the business of either evaluating products or differentiating among competing suppliers. evaluations and purchase decisions are left up to the individual institutions.2 chest does set up and support e-mail discussion lists for each agreement so that users can discuss features and problems of the product among themselves. they also send out electronic news bulletins to provide advance warning of forthcoming agreements and to assess the level of interest in future agreements. chest operates in a similar manner to many library consortia in the united states. the major difference is that it sells to higher education institutions as a whole, so the products it sells include not only databases but also, for example, software programs. this is also beginning to change in the united states. a recent article in the chronicle of higher education mentions that institutions will not stop with library databases, "in the future we'll be negotiating site licenses for software and all sorts of things . . . not just databases."3 although chest is substantially self-funding, it is strongly supported (as is niss) by the joint information systems committee (jisc) of the higher education funding councils of england (hefce). the majority of public funding for higher education in the united kingdom is funneled through the hefcs (one each for england, scotland, wales, and northern ireland). one of the jisc committees, the information services subcommittee (issc), which in 1997 became part of the committee for electronic information (cei), defined principles for the delivery of content.4 they were: • free at the point of use; • subscriptions not transaction based; • lowest common denominator; • universality; • commonality of interfaces; and • mass instruction. follett report in 1993 an investigation into how to deal with the pressures on library resources caused by the rapid expansion of student numbers and the worldwide explosion in academic knowledge and information was undertaken by the joint funding council's libraries review group, chaired by sir brian follett. this investigation resulted in the follett report. one of the key conclusions of the report was "the exploitation of it is essential to create the effective library service of the future." (figure 1. chest diagram, showing the flow of software, data, and training needs and materials among higher education and public research establishments, chest, and it product suppliers. © chest, university of bath, 1996.)
the review group recommended that as a starting point "a pilot initiative between a small number of institutions and a similar number of publishing houses should be sponsored by the funding councils to demonstrate in practical terms how material can be handled and distributed electronically."5 as a consequence £15 million was allocated to an electronic libraries program, managed by jisc on behalf of hefce. the electronic libraries program was to "engage the higher education community in developing and shaping the implementation of the electronic library."6 this project provided a body of electronic resources and services for uk higher education and influenced a cultural shift towards the acceptance and use of electronic resources instead of more traditional information storage and access methods. psli in may 1995 a pilot site license initiative subsidized by the funding councils was set up to: • test if the site license concept could provide wider access to journals for those in the academic community; • see if it would allow more flexibility in the use of scholarly material; • test the methods for dissemination of scholarly material to the higher education sector in a variety of formats; • test legal models for a national site license program; and • explore the possibility for increased value for money from scholarly journals.7 sixty-five publishers were invited by hefce to participate for three years commencing january 1, 1996. hefce was also responsible through jisc for the funding of the elib program, but no formal links were established between the elib project and the psli.8 the final selection of four companies included academic press ltd., blackwell publishers ltd., blackwell science ltd., and iop publishing ltd. the publishers agreed to offer print journals to higher education institutions for discounts of between 30 and 40 percent over the three-year period as well as electronic access as available. originally the electronic journals were supposed to be the subsidiary component of the agreement, but by the end of the agreement they had become the major focus. the psli achieved almost 100 percent take-up among the higher education institutions due to the anticipated savings through the program.9 hefce did not specify how the publishers were to deliver their content. iopp hosted the journals on their own server, for example, while academic press linked their ideal server to the journals online service at the university of bath. one of the key provisions of the site license was the unlimited rights of authorized users to make photocopies (including their use within course packs) of the journals. academic press and iopp provided full-text access to all their journals while blackwell and blackwell science only allowed reading of full text where a print subscription existed. an integral part of the psli was that the funding from hefce to the higher education institutions was top-sliced to support the discounted price offered to the institutions. several assessments of the initiative were made and a final evaluation of the pilot was concluded at the end of 1997.
initial surveys indicated subscription savings through the program (average savings were approximately £11,800 per annum) and the first report of the evaluation team showed a wide level of support for the project despite major problems with lack of communication in a timely manner.10 the team recommended an extension of the psli to include more publishers and more emphasis on electronic delivery. one concern that was raised was ease of access: students had to know which system a journal they required was on, and this was not easily discernible or user friendly. evaluations by focus groups showed users wanted one single access point to all electronic journals.11 also unresolved was the need for one consistent interface to the electronic journals and a solution to the archiving issue. at the end of the psli, hefce handed the next phase over to jisc. in the fall of 1997 jisc announced that a nesli would be set up and a new steering group was established. nesli was to be an electronic-only scheme and the invitation to tender went out at the end of 1997 with a decision to be made mid-1998. national electronic site license initiative nesli, a three-year jisc-funded program, began on january 1, 1999, although the "official" launch was held at the british library on june 15, 1999. it is an initiative to deliver a national electronic journal service to the united kingdom higher education and research community (approximately 180 institutions) and is a successor program to the pilot site license initiative (psli). in may 1998 jisc appointed a consortium of swets and zeitlinger and manchester computing (university of manchester) to act as a managing agent (swets and blackwell ltd. announced in june 1999 their intention to combine swets subscription service and blackwell's information services, the two subscription agency services). the managing agent represents the higher education institutions in negotiations with publishers, manages delivery of the electronic material through a single web interface, and oversees day-to-day operation of the program including the handling of subscriptions.12 the managing agent also encourages the widespread acceptance by publishers of a standard model site license, one of the objectives of this being to reduce the number and diversity of site definitions used by publishers. other important provisions of the model site license addressed the issues of walk-in use by clients and the need for publishers to provide access to material previously subscribed to when a subscription is cancelled. the subscription model is currently the prevalent option although they are also working towards a pay-per-view option.13 priority has been given to publishers who had been involved in the psli and to those publishers participating in swetsnet, the delivery mechanism for the nesli. swetsnet is an electronic journal aggregation service that offers access to and management of internet journals. its search engine allows searching and browsing through titles from all publishers with links to the full-text articles. nesli is not a mandatory initiative; the higher education institutions can choose whether to participate in proposals and can pursue their own arrangements individually or through their own consortiums if they wish. while psli was basically a print-based initiative limited to a small number of publishers and funded via top slicing, nesli is an electronic initiative aimed at involving many more publishers.
it is designed to be self-funding, although it did receive some start-up funding. although it is an electronic initiative, proposals that include print will be considered, as it is still not easy to separate print and electronic materials.14 the initiative addresses the most effective use, access, and purchase of electronic journals in the academic library community. its aims include: • access control for on-site and remote users; • cost; • definition of a site; • archiving; and • unbundling print from electronic. access to swetsnet, the delivery mechanism for journals included in nesli, has now been supplemented by the option of athens authentication. athens, an authentication system developed by niss, provides individuals affiliated with higher education institutions a single username and password for all electronic services they have permission to access. athens is linked to swetsnet to ensure access for off-site, remote, and distance learners who do not have a fixed ip address. this supplements swetsnet's ip address authentication, which does not allow for individual access to toc and sdi alerting. a help desk is available for all nesli users through the university of manchester. the definition of a site is being addressed by the nesli model site license, which tries to standardize site definitions (including access from places that authorized users work or study, including homes and residence halls); interlibrary loan (supplying an authorized user of another library a single paper copy of an electronic original of an individual document); walk-in users; access to subscribed material in perpetuity (it provides for an archive to be made of the licensed material with access to the archive permissible after termination of the license); and inclusion of material in course packs. jisc's nesli steering group approved the model nesli site license on may 11, 1999 for use by the nesli managing agent.15 the managing agent asks publishers to accept the model license with as few alterations as possible. during the term of the initiative the managing agent will be working on additional value-added services. these include links from key indexing and abstracting services, provision of access via z39.50, linking from library opacs, creation of catalog records, and assessing a model for e-journal delivery via subject clusters. in particular, they have begun to look at the technical issues concerned with providing marc records for all electronic journals included in nesli offers. additionally they will be looking at solutions for longer-term archiving of electronic journals to provide a comfort level for librarians purchasing electronic-only copies.16 two offers that have been made under the nesli umbrella so far are blackwell sciences for 130 electronic journals and johns hopkins university press for 46 electronic titles. most recently two additional vendors have been added to the list. elsevier has made a proposal to deliver full-text content via the publisher's sciencedirect platform that includes the full text of more than 1,000 elsevier science journals along with those of other publishers. a total of more than 3,800 journals would be included in the service.17 mcb university press, an independent niche publisher, is offering access to 114 full-text journals and secondary information in the area of management through its emerald intelligence + fulltext service.
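as described above, swetsnet recognizes on-campus users by ip address, while athens supplies a single username and password for off-site users without a fixed institutional address. the python sketch below illustrates only that general fallback pattern; it is not the actual athens or swetsnet implementation, and the address range and account shown are invented:

import ipaddress

CAMPUS_NETWORK = ipaddress.ip_network("192.0.2.0/24")   # hypothetical campus address range
CREDENTIALS = {"reader01": "s3cret-passphrase"}         # hypothetical account; a real system would store hashed passwords

def is_authorized(client_ip, username=None, password=None):
    # on-campus traffic is recognized by its ip address alone
    if ipaddress.ip_address(client_ip) in CAMPUS_NETWORK:
        return True
    # off-site users without a fixed campus ip fall back to a username and password
    return username is not None and CREDENTIALS.get(username) == password

print(is_authorized("192.0.2.45"))                                    # True: on campus
print(is_authorized("203.0.113.9", "reader01", "s3cret-passphrase"))  # True: remote user with a valid login
print(is_authorized("203.0.113.9"))                                   # False: remote user, no credentials

the appeal of the combined approach is that campus users never need to log in, while remote and distance learners are not shut out simply because their addresses change.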
similarly, here in the united states, california state university (csu) put out for competitive tender a contract for the building of a customized database of 1,200+ electronic journals, the journal access core collection (jacc), based on the print titles subscribed to by 15 or more of the 22 campuses. the journals will be made available via pharos, a new unified information access system for the csu. like ohiolink, a consortium of 74 ohio libraries, it will provide a common interface to electronic journals for students and faculty and will facilitate the development of distance learning programs.18 by unbundling the journals, libraries will no longer be required to pay for journals they do not want or need, leading to moderate price savings. additional savings can be realized through the lowering of overhead costs achieved by system-wide purchasing of core resources. other issues being addressed within the jacc rfp included archiving and perpetual access to journal articles the university system has paid for, availability of e-journals in multiple formats, interlibrary loan of electronic documents, currency of content, and cost value at the journal-title level.19 currently 500 core journals are being provided under the jacc by ebsco information services, and the csu plans on expanding those offerings.

conclusion
as we move into the next millennium, library consortia will continue to work together with vendors to further customize journal offerings. however, it is still far too early to say whether nesli will be successful or whether it will succeed in getting the publishing industry to accept the model site license. if it is to work within the higher education community, it will depend greatly on the flexibility and willingness of the publishers of scholarly journals. it has made a start by developing a license that sets a wider definition of a site and that deals realistically with the question of off-site access. by encouraging the unbundling of electronic and print subscriptions, nesli allows services to be tailored to the specific needs of the information community, but it remains to be seen how many publishers are prepared to accept unbundled deals at this stage. also, as technology stabilizes and libraries acquire increasingly larger electronic collections, we will not be able to rely on license negotiations as the only way to influence pricing, access, and distribution. an additional problem that remains unaddressed by either psli or nesli is the pressure on academics to publish in traditional journals and the corresponding rise in scholarly journal prices. nesli neither encourages nor hinders changes in scholarly communication, and therefore the question of restructuring the scholarly communication process remains.20

references and notes
1. barbara mcfadden and arnold hirshon, "hanging together to avoid hanging separately: opportunities for academic libraries and consortia," information technology and libraries 17, no. 1 (march 1998): 36. see also international coalition of library consortia, "statement of current perspective and preferred practices for the selection and purchase of electronic information," information technology and libraries 17, no. 1 (march 1998): 45.
2. martin s. white, "from psli to nesli: site licensing for electronic journals," new review of academic librarianship 3 (1997): 139-50. see also
chest: software, data, and information for education (1996).
3. thomas j. deloughry, "library consortia save members money on electronic materials," the chronicle of higher education (feb. 9, 1996): a21.
4. information services subcommittee, "principles for the delivery of content," accessed nov. 17, 1999, www.jisc.ac.uk/pub97/nl_97.html#issc.
5. joint funding council's libraries review group, the follett report (dec. 1993), accessed nov. 20, 1999, www.niss.ac.uk/education/hefc/follett/report/.
6. john kirriemuir, "background of the elib programme," accessed nov. 21, 1999, www.ukoln.ac.uk/services.elib/background/history.html.
7. psli evaluation team, "uk pilot site license initiative: a progress report," serials 10, no. 1 (1997): 17-20.
8. white, "from psli to nesli," 149.
9. tony kidd, "electronic journals: their introduction and exploitation in academic libraries in the uk," serials review 24, no. 1 (1998): 7-14.
10. jill taylor roe, "united we save, divided we spend: current purchasing trends in serials acquisitions in the uk academic sector," serials review 24, no. 1 (1998).
11. psli evaluation team, "uk pilot site license initiative," 17-20.
12. beverly friedgood, "the uk national site licensing initiative," serials 11, no. 1 (1998): 37-39.
13. university of manchester and swets & zeitlinger, nesli: national electronic site license initiative (1999), accessed nov. 21, 1999, www.nesli.ac.uk/.
14. nesli brochure, "further information for librarians," accessed nov. 21, 1999, www.nesli.ac.uk/nesli-librarians-leaflet.html.
15. a copy of the model site license is available on the nesli web site, accessed nov. 22, 1999, www.nesli.ac.uk/mode1license8.html.
16. albert prior, "nesli progress through collaboration," learned publishing 12, no. 1 (1999).
17. science direct, accessed nov. 24, 1999, www.sciencedirect.com.
18. declan butler, "the writing is on the web for science journals in print," nature 397 (jan. 21, 1998).
19. the journal access core collection request for proposal, accessed nov. 22, 1999, www.calstate.edu/tier3/cs+p/rfp_ifb/980160/980160.pdf.
20. frederick j. friend, "uk pilot site license initiative: is it guiding libraries away from disaster on the rocks of price rises?" serials 9, no. 2 (1996): 129-33.

a low-cost library database solution
mark england, lura joseph, and nem w. schlecht
two locally created databases are made available to the world via the web using an inexpensive but highly functional search engine created in-house. the technology consists of a microcomputer running unix to serve relational databases. cgi forms created using the programming language perl offer flexible interface designs for database users and database maintainers.
many libraries maintain indexes to local collections or resources and create databases or bibliographies concerning subjects of local or regional interest. these local resource indexes are of great value to researchers. the web provides an inexpensive means for broadly disseminating these indexes.
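the architecture the abstract describes (a unix microcomputer serving relational databases, queried through cgi forms written in perl) can be pictured with a short sketch. the script below is a python analogue of such a cgi search form, not the authors' code; the database file, table, and column names are hypothetical.

```python
#!/usr/bin/env python3
# minimal sketch of a cgi search script in the spirit of the perl/cgi approach
# described above; "index.db", the "articles" table, and its columns are
# hypothetical examples, not the actual ndsu databases.
import os
import sqlite3
from urllib.parse import parse_qs
from html import escape

def main():
    query = parse_qs(os.environ.get("QUERY_STRING", "")).get("q", [""])[0]
    print("Content-Type: text/html\n")
    print("<html><body><h1>local index search</h1>")
    if query:
        conn = sqlite3.connect("index.db")
        rows = conn.execute(
            "SELECT title, author, source FROM articles "
            "WHERE title LIKE ? OR author LIKE ?",
            (f"%{query}%", f"%{query}%"),
        ).fetchall()
        for title, author, source in rows:
            print(f"<p>{escape(title)} / {escape(author)} ({escape(source)})</p>")
        conn.close()
    print('<form method="get"><input name="q"><input type="submit"></form>')
    print("</body></html>")

if __name__ == "__main__":
    main()
```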
for example, kilcullen has described a nonsearchable, web-based newspaper index that uses microsoft access 97.1 jacso has written about the use of java applets to publish small directories and bibliographies.2 sturr has discussed the use of wais software to provide searchable online indexes.3 many of the web-based local databases and search interfaces currently used by libraries may:
• have problems with functionality;
• lack provisions for efficient searching;
• be based on unreliable software;
• be based on software and hardware that is expensive to purchase or implement;
• be difficult for patrons to use; and
• be difficult for staff to maintain.
after trying several alternatives, staff members at the north dakota state university libraries have implemented an inexpensive but highly functional and reliable solution. we are now providing searchable indexes on the web using a microcomputer running unix to serve relational databases. cgi forms created at the north dakota state university libraries using the programming language perl offer flexible interface designs for database users and database maintainers. this article describes how we have implemented this solution.
mark england (england@badlands.nodak.edu) is assistant director, lura joseph (ljoseph@badlands.nodak.edu) is physical sciences librarian, and nem w. schlecht (schlecht@plains.nodak.edu) is a systems administrator at the north dakota state university libraries, fargo, north dakota.

application of the variety-generator approach to searches of personal names in bibliographic data bases-part 2. optimization of key-sets, and evaluation of their retrieval efficiency
dirk w. fokker and michael f. lynch: postgraduate school of librarianship and information science, university of sheffield, england.
keys consisting of variable-length character strings from the front and rear of surnames, derived by analysis of author names in a particular data base, are used to provide approximate representations of author names. when combined in appropriate ratios, and used together with keys for each of the first two initials of personal names, they provide a high degree of discrimination in search. methods for optimization of key-sets are described, and the performance of key-sets varying in size between 150 and 300 is determined at file sizes of up to 50,000 name entries. the effects of varying the proportions of the queries present in the file are also examined. the results obtained with fixed-length keys are compared with those for variable-length keys, showing the latter to be greatly superior. implications of the work for a variety of types of information systems are discussed.

introduction
in part 1 of this series the development of variety generators, or sets of variable-length keys with high relative entropies of occurrence, from the initial and terminal character strings of authors' surnames was described.1 their purpose, used singly or in combination, is to provide a high and constant degree of discrimination among personal names so as to facilitate searches for them. in this paper the selection of optimal combinations of the keys and evaluation of their efficiency in search are described. the performance of combined key-sets of various compositions is determined at a range of file sizes and compared with fixed-length keys. in addition, the extent of statistical associations among keys from different positions in the names is determined.
balancing of key-sets
the relative entropies of distribution of the first and last letters of the surnames of authors in the file of 100,000 entries from the inspec data base differ significantly, the former being 0.92 and the latter 0.86. as a result, a larger key-set has to be produced from the back of the surnames to reach the same value of the relative entropy as that of a key-set of given size from the front of the surname. for instance, the value of 0.954 is reached by a key-set comprising 41 keys from the front of the name, but a set of 101 keys from the back is needed to attain this value. it seemed reasonable to assume that keys from the front and rear should be combined in different proportions in order to maximize the relative entropy of the combined system, and that their proportions should reflect the redundancies of each distribution (redundancy = 1 − hr). in order to test this, a series of combined key-sets of different total sizes was produced, in which the proportions of keys were varied around the ratio of the redundancies of the first and last character positions, i.e., (1 − 0.92):(1 − 0.86), or 8:14. the relative entropies of the name representations provided by combining these key-sets with keys for the first and second initials were determined by applying them to the 50,000-name file, and the entropy value was used to determine the optimal ratio of keys. in one case, the correlation between the value of the relative entropy and retrieval efficiency, as measured by the precision ratio, was also studied, and shown to be high. the sizes of the combined key-sets studied were 148 and 296, with an intermediate set of 254 keys. the values of 148 and 296 were chosen in view of the projected implementation in the serial-parallel file organization.2 this relates the size of the key-set to the number of blocks on one cylinder of a disc (the 30-mbyte disc cartridges available to us have 296 blocks per cylinder); otherwise the choice of key-set size is arbitrary, and can be varied at will. the minimum key-set size is 106, consisting of 26 letters each for the first and last letter of the surname, and 27 (26 letters and the space symbol) each for the first and second initials. the numbers of n-gram keys (n ≥ 2) required for the key-sets numbering 148, 254, and 296 in size are thus 42, 148, and 190. full details are given of the composition of the first and third of these sets. a slight refinement to key-set generation was employed to ensure as close an approximation to equifrequency as possible, especially with the smallest key-sets. precise application of a threshold frequency may occasionally result in arbitrary inclusion of either very high or very low frequency keys. thus, if almost all the occurrences of a longer key are accounted for by a shorter key (as with -mann and -ann), only the shorter n-gram is included.

optimal set of 148 keys
the number of n-gram keys (n ≥ 2) to be added to the minimum set of 106 keys is 42, the presumed optimum proportion being 8:14, which implies about 16 keys from the front of the name and 26 from the back. in order to examine the relationship between the ratio of keys from the front and rear of the surname and the relative entropy of the combined sets, the ratios were varied at intervals between 1:1 and 1:3, so that the numbers of n-grams varied from 21 and 21 to 11 and 31 respectively.
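the balancing argument rests on two quantities: the relative entropy hr of a key distribution (observed entropy divided by its maximum) and the redundancy 1 − hr, in whose ratio the extra n-gram keys are apportioned. a minimal sketch of both calculations follows; the key counts are invented for illustration, and taking hmax as the entropy of an equifrequent key-set is a simplifying assumption, not the paper's exact procedure.

```python
import math
from collections import Counter

def relative_entropy(freqs):
    """hr = h / hmax, with hmax = log2(number of keys) for an equifrequent set."""
    total = sum(freqs.values())
    h = -sum((f / total) * math.log2(f / total) for f in freqs.values() if f)
    return h / math.log2(len(freqs))

# hypothetical counts of front-of-surname and rear-of-surname keys in a name file
front = Counter({"a": 350, "b": 200, "ba": 180, "s": 490, "sc": 150})
rear = Counter({"er": 640, "son": 270, "a": 600, "n": 90, "ov": 260})

hr_front, hr_rear = relative_entropy(front), relative_entropy(rear)

# the paper splits the extra n-gram keys in the ratio of the redundancies 1 - hr,
# e.g., (1 - 0.92):(1 - 0.86) = 8:14 for the inspec file
extra = 42  # n-gram keys to add beyond the minimum 106 single-character keys
share_front = extra * (1 - hr_front) / ((1 - hr_front) + (1 - hr_rear))
print(round(share_front), extra - round(share_front))
```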
for each ratio the keys were applied to the 50,000 name entries, and the distribution of the resultant descriptions was determined. the ratios, the number of n-gram keys, and the relative entropies of the distributions are shown in table 1; the maximum value of the entropy is taken to be log2 50,000. in this case the balancing point, with the key-set including 16 n-gram keys from the front and 26 from the back, corresponds with the ratio of the redundancies of the first and last letters of the surnames.

table 1. relation between ratio of n-grams from front and rear of surname, entropy of combined key-sets, and retrieval efficiency for a series of sets of 148 keys
ratio of n-gram keys | n-gram keys (front) | n-gram keys (back) | number of different representations in 50,000 entries | relative entropy of system | precision (%) (file size = 25,000)
1:1 | 21 | 21 | 33,485 | 0.9450 | 71.5
3:4 | 18 | 24 | 33,501 | 0.9450 | 71.3
17:25 | 17 | 25 | 33,434 | 0.9447 | 70.9
8:13 | 16 | 26* | 33,454 | 0.9453 | 72.2
5:9 | 15 | 27 | 33,402 | 0.9450 | 72.0
1:2 | 14 | 28 | 33,378 | 0.9449 | 72.1
1:3 | 11 | 31 | 33,126 | 0.9437 | 71.5
total number of different name entries = 41,469. * key-set with highest relative entropy.

table 2 shows the composition of the optimal key-set of 148 keys, while table 3 gives the distribution of the name representations compiled from the combined key-set, and its corresponding relative entropy.

table 2. composition of balanced key-set of 148 keys
keys from front of surname (42), with probabilities p: a .035, b .020, ba .020, be .017, bo .014, br .014, c .036, ch .016, d .044, e .018, f .034, g .055, h .035, ha .021, i .013, j .017, k .041, ka .017, ko .017, l .033, le .014, m .050, ma .030, n .025, o .017, p .038, pa .014, q .001, r .032, ro .017, s .049, sa .016, sc .015, sh .016, st .016, t .040, u .005, v .025, w .040, x (negligible), y .011, z .013.
keys from rear of surname (52), with probabilities p: a .060, ra .010, va .015, b .003, c .005, d .030, e .068, f .006, g .012, ng .014, h .020, ch .017, i .044, ii .015, ki .015, j .001, k .033, l .013, el .012, ll .016, m .013, n .009, an .020, man .017, en .025, in .039, nn .010, on .018, son .027, o .028, ko .013, p .004, q .001, r .016, er .064, ler .013, ner .010, s .055, es .015, is .012, t .042, u .013, v .001, ev .018, ov .026, kov .012, nov .011, w .005, x .003, y .031, ey .012, z .013.
keys from first initial: 27 characters. keys from second initial: 27 characters.

optimal set of 296 keys
a similar procedure to that used for the optimal 148-key key-set was also applied in this instance. here the ratios of front and rear n-gram keys varied from 57 and 133 to 69 and 121 respectively. for each of the sets chosen, the distributions of the entries resulting from application of the combined key-sets to the file of 50,000 names were determined. these showed virtually no difference in terms of the relative entropy alone, although the total number of different entries differed slightly between key-sets, and the highest value was used to choose the optimal set, detailed in table 4. the range of combinations studied is shown in table 5, and the distribution of the entries for the optimal set is given in table 6.
table 3. frequencies of entries represented by optimal 148-key key-set in a file of 50,000 names
frequency f | number of entries with frequency f
1 | 24,363
2 | 5,622
3 | 1,850
4 | 757
5 | 372
6 | 193
7 | 103
8 | 68
9 | 32
10 | 24
11-15 | 54
16-20 | 11
21-30 | 4
33 | 1
total number of different entries = 33,454. maximum number of possible combinations = 1,592,136 (i.e., 42 × 52 × 27²). h = 14.7553; hmax = 15.6096 (log2 50,000); hr = 0.9453.

table 4. composition of balanced key-set of 296 keys
keys from front of surname (87): a, al, an, b, ba, bar, be, bo, br, bu, c, ca, ch, co, d, da, de, do, e, f, fr, g, ga, go, gr, gu, h, ha, he, ho, hu, i, j, jo, k, ka, ki, ko, kr, ku, l, la, le, ll, m, ma, mar, mc, me, mi, mo, mu, n, na, ni, o, p, pa, pe, po, pr, q, r, ra, re, ri, ro, s, sa, sc, se, sh, si, so, st, t, ta, u, v, va, w, wa, we, wi, x, y, z.
keys from rear of surname (155): a, ld, ng, vskii, el, lin, r, or, nt, sov, ca, nd, ang, ki, ll, tin, ar, s, rt, w, da, rd, ing, ski, all, nn, er, as, ert, x, ka, e, rg, wski, ell, on, ber, es, st, y, ma, de, h, li, m, son, der, nes, tt, ay, na, ee, ch, ni, am, lson, ger, is, ett, ey, ina, ge, ich, ri, n, nson, nger, ns, u, ley, ra, ke, vich, ti, an, rson, her, ins, v, ky, ta, le, gh, j, man, ton, ier, os, ev, ry, va, ne, sh, k, rman, o, ker, rs, ov, z, ova, re, th, ak, yan, ko, ler, ss, kov, tz, wa, se, ith, ck, en, nko, ller, ts, ikov, ya, te, i, ek, sen, no, mer, us, lov, b, f, ai, ik, in, to, ner, t, nov, c, ff, hi, l, ein, p, ser, dt, anov, d, g, ii, al, kin, q, ter, et, rov.
keys from first initial: 27 characters. keys from second initial: 27 characters.

table 5. relation between ratio of n-grams from front and rear of surname and entropy of combined key-sets for a series of sets of 296 keys (file size = 50,000)
ratio of n-gram keys | n-gram keys (front) | n-gram keys (back) | number of different representations | relative entropy of system
3:7 | 57 | 133 | 39,182 | 0.9679
61:129 | 61 | 129* | 39,191 | 0.9679
13:25 | 65 | 125 | 39,186 | 0.9679
69:121 | 69 | 121 | 39,179 | 0.9679
* key-set with highest number of different entries.

in this instance, the ratio of n-gram keys from the front and back of the surnames has been displaced from the ratio of the redundancies of the first and last characters of the surnames, i.e., 8:14 (1:1.7); here the ratio is roughly 1:2. this is undoubtedly due to the fact that the relative entropies of key-sets from the back of the surname increase less rapidly than those of key-sets from the front, and hence larger sets must be employed.

table 6. frequencies of entries represented by optimal key-set of 296 keys in a file of 50,000 names
frequency f | number of entries with frequency f
1 | 31,705
2 | 5,394
3 | 1,371
4 | 442
5 | 164
6 | 63
7 | 27
8 | 12
9 | 4
10 | 3
11 | 2
12 | 2
13 | 0
14 | 0
15 | 1
16 | 1
total number of different entries = 39,191. maximum number of possible combinations = 9,830,565 (i.e., 87 × 155 × 27²). h = 15.108; hmax = 15.6096 (log2 50,000); hr = 0.9679.

evaluation of retrieval effectiveness
the keys in the optimized key-sets represent name entries in an approximate manner only, so that when a search for a name is performed, additional entries represented by the same combination of keys are identified. while these may be eliminated in a subsequent character-by-character match of the candidate hits, the proportion of unwanted items should remain low if the method is to offer advantages. in evaluating the effectiveness of the key-sets in retrieval, the names in the search file were represented by concatenating the codes for the keys from the front and back of the surnames and the initials, and subjecting the query names to the same procedure. the matching procedure produced lists of candidate entries, of which the desired entries were a subset.
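the representation step just described (longest matching key from the front of the surname, longest from the rear, plus keys for the two initials) can be sketched as follows; the tiny key-sets and the single-letter fallback here are illustrative assumptions, not the published 148- or 296-key sets.

```python
# minimal sketch of turning "surname + initials" into the four-key code used in
# the evaluation; FRONT_KEYS and REAR_KEYS are toy examples.
FRONT_KEYS = {"a", "b", "br", "j", "jo", "l", "le", "m", "mu", "s", "sc", "w"}
REAR_KEYS = {"a", "e", "ee", "er", "ler", "n", "nes", "nson", "s", "son", "ith"}

def longest_match(fragmenter, surname, keys):
    # try the longest fragment first; fall back to the single letter if none matches
    candidates = [k for k in (fragmenter(surname, n) for n in range(len(surname), 0, -1)) if k in keys]
    return candidates[0] if candidates else surname[:1]

def represent(surname, initials):
    surname = surname.lower()
    front = longest_match(lambda s, n: s[:n], surname, FRONT_KEYS)
    rear = longest_match(lambda s, n: s[-n:], surname, REAR_KEYS)
    ii = (initials.lower().replace(".", "").replace(" ", "") + "  ")[:2]
    return (front, rear, ii[0], ii[1])

print(represent("Johnson", "A. V."))   # ('jo', 'nson', 'a', 'v')
print(represent("Mueller", "K."))      # ('mu', 'ler', 'k', ' ')
```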
the final determination was carried out manually. the tests were performed first with names sampled from the search file, so that correct items were retrieved for each query. since searches for name entries may be performed with varying probabilities that the authors' names are present in the file (especially in current-awareness searches), varying proportions of names of the same provenance, but known not to be present in the search file, were also added. in these cases candidate items were selected which included none of the desired entries. recall tests were also performed, and recall was shown to be complete. the measure used in determining the performance of the variety-generator search method is the precision ratio, defined as the ratio of correctly identified names to all names retrieved. it is presented both as the ratio of averages (i.e., the summation of items retrieved in the search and calculation of the average) and as the average of ratios (i.e., averaging the figures for individual searches). the latter gives higher figures, since many of the individual searches give 100 percent precision ratios. the precision ratio was found to be dependent on file size and to fall somewhat as the size of file increases. this is due to the fact that the key-sets provided only a limited, if very high, total number of possible combinations, while the total possible variety of personal names is virtually unlimited. the evaluation was performed with a sample of 700 names, selected by interval sampling. this number ensured a 99 percent confidence limit in the results. a comparison of the interval-sampled query names with randomly sampled names showed that no bias was introduced by interval sampling. a test to confirm that the retrieval effectiveness reached a peak at the maximum value of the relative entropy of a balanced key-set was performed first. this was carried out on a file of 25,000 names, using as queries names selected from the file and the optimal 148-key key-set. as shown in table 1, the values of the precision ratio (ratio of averages) and of the relative entropy both peak at the same ratio of n-gram keys from the front and back of the surnames. the performance of the optimal key-sets of 148, 254, and 296 keys with files of 10,000, 25,000, and 50,000 names is shown in table 7. calculated as the ratio of averages, the smallest key-set (148 keys) shows a precision ratio of 64 percent with a file of 50,000 names, which means that of every three names identified in the variety-generator search, two are those desired. with the largest key-set (296 keys), this rises to nine correctly identified names in every ten retrieved at this stage. on the other hand, calculated as the average of ratios, the precision ratios rise to 81 percent and 94 percent respectively. for smaller file sizes (typical, for instance, of current-awareness searches), the figures for all of these are correspondingly higher.

table 7. precision ratios obtained in variety-generator searches of personal names, queries sampled from search file (confidence level = 99 percent)
precision as ratio of averages (%):
file size | 148-key set | 254-key set | 296-key set
50,000 | 64 | 87 | 90
25,000 | 71 | 90 | 91
10,000 | 84 | 93 | 94
precision as average of ratios (%):
file size | 148-key set | 254-key set | 296-key set
50,000 | 81 | 91 | 94
25,000 | 87 | 95 | 96
10,000 | 93 | 97 | 97
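the two ways of reporting precision used in table 7 can be illustrated with a small, invented set of per-query retrieval counts; as in the study, the average of ratios comes out higher because many individual queries reach 100 percent precision.

```python
# ratio of averages vs. average of ratios, as reported in table 7.
# each tuple is (correct names retrieved, total names retrieved) for one query;
# the numbers are made up purely for illustration.
queries = [(1, 1), (1, 1), (1, 3), (2, 2), (1, 5)]

ratio_of_averages = sum(c for c, _ in queries) / sum(t for _, t in queries)
average_of_ratios = sum(c / t for c, t in queries) / len(queries)

print(f"ratio of averages: {ratio_of_averages:.0%}")   # 50%
print(f"average of ratios: {average_of_ratios:.0%}")   # 71%
```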
the effect of sampling from a larger file, so that increasing proportions of the names searched for are not present in the search file, is shown in table 8 for a file of 25,000 names. in this case, the proportion of correctly identified names in the total falls, so that overall performance is somewhat reduced. thus, depending both on file size and on the expected proportion of queries identifying hits, the key-set size can be adjusted to reach a desired level of performance.

table 8. effect of varying proportion of query names not present in search file of 25,000 names, using 296 keys (ratio of averages)
% of names not in search file | precision % (ratio of averages) | number of names retrieved | number of names correctly retrieved
21 | 90 | 766 | 691
42 | 85 | 595 | 505
61 | 83 | 449 | 371
74 | 76 | 319 | 242
84 | 68 | 228 | 154

in addition, tests to determine the applicability of a key-set optimized for one file of 50,000 names to another file of the same provenance and size were carried out. the three key-sets derived from the first file were applied to the second, query names were sampled from the latter, and the precision ratios determined. some reduction in performance was observed; expressed as the ratio of averages, the precision with the 296-key key-set fell from 90 to 83 percent, with the 254-key key-set from 87 to 82 percent, and with the 148-key key-set from 64 to 56 percent, figures which seem unlikely to prejudice the net performance in any marked way. nonetheless, monitoring of performance and of data base name characteristics over a period of operation might well be advisable.

distribution characteristics of other types of keys
it is particularly instructive to examine the distribution characteristics of other types of keys, including those of fixed length, generated from various positions in the names, and to compare them with those of the optimal key-sets employed in the variety-generator approach. to this end, the file of 50,000 names was processed to produce the following keys or key-sets: 1. initial digram of surname. 2. initial trigram of surname. 3. key-set of ninety-four n-grams from the front of the surname, with first and second initials. 4. key-set consisting of first and last character of surname, with first and second initials. the figures (table 9) show clearly that all have distributions which leave no doubt as to their relative inadequacy in resolving power, where this is defined as the ratio of distinct name representations provided by the key-set used to the number of different name entries (41,469) in the file. at the digram level, the value of the resolving power is 0.009, i.e., each digram represents, on average, 110 different name entries, while no fewer than thirty-two specific digrams each represent between 500 and 1,000 different names. at the trigram level, the value of the resolving power rises to 0.08, a tenfold increase; however, one trigram still represents between 500 and 1,000 different names. use of the first and last letters of the surname plus the initials again increases the value of the resolving power to 0.627, or 1.6 distinct names per entry; eight of the representations now account for between thirty-one and forty distinct entries.
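resolving power, as defined above, is simply the number of distinct representations divided by the number of distinct name entries. a minimal sketch with a toy name list and two simplified representation schemes (hypothetical stand-ins for the schemes compared in table 9) follows.

```python
# resolving power = distinct representations / distinct name entries.
# toy data and simplified representation functions, for illustration only.
names = ["smith, j a", "smith, j b", "smythe, j a", "jones, p", "johnson, p", "johansson, p"]

def digram(name):                      # first two letters of the surname
    return name[:2]

def first_last_plus_initials(name):    # first and last surname letters + initials
    surname, _, initials = name.partition(",")
    return surname[0] + surname[-1] + initials.replace(" ", "")[:2]

def resolving_power(names, represent):
    distinct = set(names)
    return len({represent(n) for n in distinct}) / len(distinct)

print(resolving_power(names, digram))                    # 2 reps / 6 names = 0.33
print(resolving_power(names, first_last_plus_initials))  # 5 / 6 = 0.83 (johnson/johansson collide)
```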
table 9. distributions of a variety of other representations of personal names in a file of 50,000 entries
frequency f | initial digram of surname | initial trigram of surname | 94 n-grams from front of surname plus 2 initials | first and last letter of surname plus 2 initials
1 | 40 | 735 | 8,964 | 16,346
2 | 22 | 428 | 3,929 | 4,919
3 | 16 | 249 | 1,884 | 2,025
4 | 11 | 197 | 1,006 | 973
5 | 7 | 170 | 646 | 581
6 | 7 | 110 | 397 | 340
7 | 10 | 112 | 234 | 224
8 | 4 | 98 | 186 | 146
9 | 7 | 81 | 144 | 92
10 | 5 | 66 | 108 | 72
11 | 6 | 61 | 70 | 49
12 | 2 | 56 | 88 | 36
13 | 5 | 51 | 74 | 33
14 | 1 | 48 | 50 | 24
15 | 2 | 35 | 51 | 23
16 | 3 | 37 | 36 | 25
17 | 2 | 35 | 29 | 15
18 | 3 | 33 | 29 | 11
19 | 8 | 35 | 28 | 6
20 | 8 | 40 | 23 | 5
21-30 | 21 | 207 | 127 | 49
31-40 | 23 | 109 | 47 | 8
41-50 | 13 | 88 | 13 | 0
51-100 | 36 | 142 | 3 | 0
101-200 | 24 | 62 | 0 | 0
201-500 | 57 | 15 | 0 | 0
501-1000 | 32 | 1 | 0 | 0
total | 375 | 3,301 | 18,166 | 26,002
resolving power | .009 | .080 | .438 | .627

in contrast, however, the key-set of 148 keys comprising ninety-four n-gram keys from the front of the name and the first and second initials, although almost 50 percent larger than the four-character representation, has a resolving power of only 0.438 (or 2.28 entries per representation). this contrast provides particularly strong evidence for the superiority of keys from the front and rear of the surnames over those from the front alone, even when the latter are variable in length. as expected, the precision ratio of the four-character representation is low, at 37 percent (ratio of averages), compared with 64 percent for the optimal 148-key key-set.

extent of statistical association among keys
thus far, the frequency of occurrence of variable-length character strings from the front and back of the surnames is the only factor considered in their selection as keys. it is well known in other areas that statistical associations among keys can influence the effectiveness of their combinations.3 where a strong positive association between two keys exists, their intersection results in only a small reduction of the number of items retrieved over that obtained by using each independently. when the association is strongly negative, the result of intersection may be much greater than that predicted on the basis of the product of the individual probabilities of the keys. to assess the extent of associations among keys from the front and rear of surnames and initials, sets of both fixed- and variable-length keys from each of these positions were examined. the kendall correlation coefficient v was calculated for each of the twenty most frequent combinations of these; it is related to the chi-square value by the expression x² = mv², where m is the file size, or 50,000. table 10 shows the values of the association coefficient for certain of the characters in the full name; those above .012 are significant at a 99 percent confidence level. positive associations are more frequent than negative.
table 10. association coefficients for sets of the most frequent digrams from various positions in personal names
first and last letters of surname (digram, v): kv .064, wr .050, ka .038, hn .028, sa .024, sn .024, cn .022, kn -.020, ma .014, kr -.011, sv .010, rn .010, bn -.008, br .008, mn -.007, sr .007, mr .004, si -.002, gn .001, ln .001.
first letter of surname and first initial (digram, v): kv .054, hj .027, br -.024, sj -.023, dj .022, bg .018, ka .018, cj .018, sd .015, sv .013, mm .011, mj .007, bj .005, sg -.004, sr .004, ba .004, ma .004, sm -.003, mr .002, sa -.000.
first and second initials (digram, v): hv .078, mv .069, kv .069, rv -.055, dv -.053, tv .053, jv -.045, sv .034, fv .033, nv -.029, gv .022, lv -.022, iv -.019, av -.019, cv -.018, pv .017, wv -.014, yv .010, bv .005, ev -.002.

the figures indicate that intersection of certain of these characters as keys in search would result in some slight diminution in performance against that expected. the figures for the association coefficients among the twenty most frequent combinations of keys from the front and back of surnames in the 148- and 296-key key-sets show magnitudes (mostly positive) which are substantially greater than those for single characters (see table 11). the reasons for these values are obvious; in certain instances, e.g., miller, jones, and martin, common complete names are apparent, while in one case, lee, an overlap between keys from the front and rear exists. in others, linguistic variations on common names can be discerned, as with br and n (brown or braun).

table 11. association coefficients in the twenty most frequent key combinations from front and back of surnames in two key-sets
148-key key-set (keys, v): s h .146, j son .127, sc er .104, w s .043, t a .038, t i .038, w er .038, c e .034, f er .033, p s .025, d e .023, l e .022, w e .022, g in .020, m e .009, s a .008, g e .006, m a .005, m er -.004, g er -.000.
296-key key-set (keys, v): s ith .343, jo nson .297, jo nes .278, an rson .274, si gh .249, le ee .221, mu ller .214, ta or .195, gu ta .168, br n .160, mi ller .151, mar tin .145, wi s .137, f her .133, sc der .121, sa to .110, t as .084, sc er .069, ch en .055, t son .050.

such associations are inevitable. when the selection of keys is based solely on frequency, some deviation from the ideal of independence must result, becoming larger as the size of the key-sets increases and as the length of certain of the keys increases. however, since its effect in the most extreme cases is merely to lead to virtually exact definition of the most frequent surnames, no particular disadvantage results.

possible implementations of the variety-generator name search approach
the variety-generator approach permits a number of possible implementations of searches for personal names to be considered, if only in outline at this stage, using a variety of file organization methods. the most widely known methods (apart from purely sequential files) are direct access (utilizing hash-addressing), chained, and index sequential files. direct application of the concatenated key-numbers as the basis for hash-address computation appears attractive in instances where the personal name is used alone or in combination (as, for instance, with a part of the document title). the almost random distribution of the bits in this code should result in a general diminution of the collision and overflow problems commonly encountered with fixed-length keys.
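a minimal sketch of the hash-addressing idea just outlined: the four key numbers are packed into one code and reduced to a bucket (block) address. the key numbering, radices, and bucket count below are assumptions for illustration only; 296 echoes the blocks-per-cylinder figure mentioned earlier, and 155 and 27 stand for the sizes of the rear-key and initial key-sets in the 296-key configuration.

```python
# hash-addressing from the four key numbers, as suggested for the
# direct-access implementation; all constants here are illustrative.
def hash_address(front_no, rear_no, init1_no, init2_no, buckets=296):
    # pack the four small key numbers into one mixed-radix integer, then
    # reduce modulo the number of buckets (e.g., blocks on a disc cylinder)
    code = ((front_no * 155 + rear_no) * 27 + init1_no) * 27 + init2_no
    return code % buckets

# e.g., key numbers for ('jo', 'nson', 'a', 'v') under some fixed numbering
print(hash_address(23, 71, 0, 21))
```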
since only four keys are used to represent each name, and the four sets of keys from which these are selected are limited in number and of approximately equal probability, the keys can be used to construct chained indexes, to which, however, the usual constraints still apply. index sequential storage again offers opportunities, in particular since the low variety of key types means that the sorting operations which this entails can be eliminated. in effect, each name entry would be represented by an entry in each of four lists of document numbers or addresses, and documents retrieved by intersection of the lists. while four such numbers are stored for each name, in contrast to a single entry for the more conventional name list, the removal of the name list itself would more than compensate for the additional storage required for the lists. in the index sequential mode, the lists of document addresses or numbers stored with each key are more or less equally long. they may thus be replaced by bit-vectors in which the position of a bit corresponds to a name or document number. if the number of keys bears a simple relation to the number of blocks on a disc cylinder, the vectors can be stored in predetermined positions within a cylinder, resulting in the serial-parallel file. the usefulness of this file organization has yet to be fully evaluated; however, it also promises substantial economies in storage. on average, only four of the bits are set at the positions in the vectors corresponding to the name or document entry. the density of 1-bits is therefore very low, and long runs of zeros occur in the vectors. they can, therefore, be compressed using run-length coding, for instance as applied by bradley.3,4 preliminary work with the 296-key key-set has indicated already that a gross compression ratio of nine to one is attainable, so that the explicit storage requirements to identify the association between a name and a document number would be just over thirty bits.

conclusions
the work described here relates solely to searches for individual occurrences of personal names. clearly, in operational systems in which one or more author names are associated with a particular bibliographical item, it will be necessary to provide for description of each of these for access. if this is provided solely on the basis of a document number, some false coordination will occur, for instance when the initials of one entry are combined with the surname of another. a number of strategies can be envisaged to overcome this problem. the performance figures show clearly that a small number of characteristics (between 100 and 300 in this study) are sufficient to characterize the entries in large files of personal names and to provide a high degree of resolution in searches for them. while performance in much larger files, involving the extension of key-set sizes to larger numbers, has yet to be studied, the logical application of the concept of variety generation would appear to open the way to novel approaches to searches for documents associated with particular personal names, which seem likely to offer advantages in terms of the overall economic performance of search systems, not only in bibliographic but also in more general computer-based information systems.
acknowledgments
we thank m. d. martin of the institution of electrical engineers for provision of a part of the inspec data base and of file-handling software, and the potchefstroom university for c.h.e. (south africa) for awarding a national grant to d. fokker to pursue this work. we also thank dr. i. j. barton and dr. g. w. adamson for valuable discussions, and the former for n-gram generation programs.

references
1. d. w. fokker and m. f. lynch, "application of the variety-generator approach to searches of personal names in bibliographic data bases-part 1. microstructure of personal authors' names," journal of library automation 7:105-18 (june 1974).
2. i. j. barton, s. e. creasey, m. f. lynch, and m. j. snell, "an information-theoretic approach to text searching in direct-access systems," communications of the acm (in press).
3. s. d. bradley, "optimizing a scheme for run-length encoding," proceedings of the ieee 57:108-9 (1969).
4. m. f. lynch, "compression of bibliographic files using an adaptation of run-length coding," information storage and retrieval 9:207-14 (1973).

evaluation and comparison of discovery tools: an update
f. william chickering and sharon q. yang
information technology and libraries | june 2014

abstract
selection and implementation of a web-scale discovery tool by the rider university libraries (rul) in the 2011–2012 academic year revealed that the endeavor was a complex one. research into the state of adoption of web-scale discovery tools in north america and the evolution of product effectiveness provided a good starting point. in the following study, we evaluated fourteen major discovery tools (three open source and eleven proprietary), benchmarking sixteen criteria recognized as the advanced features of a "next generation catalog." some of the features have been used in previous research on discovery tools. the purpose of the study was to evaluate and compare all the major discovery tools, and the findings serve to update librarians on the latest developments and user interfaces and to assist them in their adoption of a discovery tool.

introduction
in 2004, the rider university libraries' (rul) strategic planning process uncovered a need to investigate federated searching as a means to support research. a tool was needed to search and access all journal titles available to rul users at that time, including 12,000+ electronic full-text journals. lacking the ability to provide relevancy ranking due to its real-time search operations, as well as the cost of the products then available, the decision was made to defer implementation of federated search. monitoring developments yearly revealed no improvements strong enough to adopt the approach. by 2011, the number of electronic full-text journals had increased to 51,128, and by this time federated search as a concept had metamorphosed into web-scale discovery. clearly, the time had come to consider implementing this more advanced approach to searching the ever-growing number of journals available to our clients. though rul passed on federated searching, viewing it as too cumbersome to serve our students well, we anticipated the day when improved systems would emerge. vaughn nicely describes the ability of more highly evolved discovery systems to "provide quick and seamless discovery, delivery, and relevancy-ranking capabilities across a huge repository of content."1 yang and hofmann anticipated the emergence of web-scale discovery with their evaluation of next generation catalogs.2,3
by 2011, informed by yang and hofmann's research, we believed that the systems in the marketplace were sufficiently evolved to make our efforts at assessing available systems worthwhile. this coincided nicely with an important objective in our strategic plan: investigate link resolvers and discovery tools for federated searching and opac by summer 2011.
f. william chickering (chick@rider.edu) is dean of university libraries, rider university, lawrenceville, new jersey. sharon q. yang (yangs@rider.edu) is associate professor–librarian at moore library, rider university, lawrenceville, new jersey.
heeding alexander pope's advice to "be not the first by whom the new are tried, nor yet the last to lay the old aside,"4 we set about discovering what systems were in use throughout north america and which features each provided.

some history
in 2006, antelman, lynema, and pace observed that "library catalogs have represented stagnant technology for close to twenty years." better technology was needed "to leverage the rich metadata trapped in the marc record to enhance collection browsing. the promise of online catalogs has never been realized. for more than a decade, the profession either turned a blind eye to problems with the catalog or accepted that it is powerless to fix them."6 dissatisfaction with catalog search tools led us to review the vufind discovery tool. while it had some useful features (spelling, "did you mean?" suggestions), it still suffered from inadequacies in full-text search and the cumbersome nature of searcher-designated boolean searching. it did not work well in searching printed music collections and, of course, only served as a catalog front end. with this all in mind, rul developed a set of objectives to improve information access for clients:
• to provide information seekers with
  • an easy search option for academically valid information materials
  • an effective search option for academically valid information materials
  • a reliable search option for academically valid information materials across platforms
• to recapture student academic search activity from google
• to attempt revitalizing the use of monographic collections
• to provide an effective mechanism to support offerings of e-books
• to build a firm platform for appropriate library support of distance learning coursework

literature review
marshall breeding first discussed broad-based discovery tools in 2005, shortly after the launch of google scholar. he posits that federated search could not compete with the power and speed of a tool like google scholar. he proclaims the need for, as he describes it, a "centralized search model."7 building on breeding's observations four years earlier, diedrichs astutely observed in 2009 that "user expectations for complete and immediate discovery and delivery of information have been set by their experiences in the web2.0 world. libraries must respond to the needs of those users whose needs can easily be met with google-like discovery tools, as well as those that require deeper access to our resources."10
in that same year, dolski described the common situation in many academic libraries when, in reference to the university of nevada las vegas (unlv) library, he states, "our library website serves as the de facto gateway to our electronic, networked content offerings. yet usability studies have shown that findability, when given our website as a starting point, is poor. undoubtedly this is due, at least in part, to interface fragmentation."11 this perfectly described the way we had come to view rul's situation.

in 2010, breeding reviewed the systems in the market, noting that these are not just next-generation catalogs. he stressed "equal access to content in all forms," a concept we now take for granted. a key virtue in discovery tools, he notes, is the "blending of the full text of journal articles and books alongside citation data, bibliographic, and authority records resulting in a powerful search experience. rather than being provided a limited number of access points selected by catalogers, each word and phrase within the text becomes a possible point of retrieval." breeding further points out that "web-scale discovery platforms will blur many of the restrictions and rules that we impose on library users. rather than having to explain to a user that the library catalog lists books and journal titles but not journal articles, users can simply begin with the concept, author, or title of interest and straightaway begin seeing results across the many formats within the library's collection."12 working with freshmen at rider university revealed that they are ahead of the professionals in approaching information this way, and we believed that web-scale discovery tools could help our users.

as we began the process of selecting a discovery tool, we looked at the experiences of others. fabbi at the university of nevada las vegas (unlv) folded in a strong component of organizational learning in a highly structured manner that was unnecessary at rider.13 no information was disclosed on the process of selecting a discovery vendor, though the website reveals the presence of a discovery tool (http://library.nevada.edu/). in contrast, many librarians at rider explored a variety of libraries' application of search tools. following hofmann and yang's work, a process of vendor demonstrations and analysis of feasibility led to a trial of ebsco discovery service. what we hoped for is what way at grand valley state reported in 2010 of his analysis of serials solutions' summon: "an examination of usage statistics showed a dramatic decrease in the use of traditional abstracting and indexing databases and an equally dramatic increase in the use of full text resources from full text database and online journal collections. the author concludes that the increase in full text use is linked to the implementation of a web-scale discovery tool."14

method
understanding both rul's objectives and the state of the art as reflected in the literature, we concluded that an up-to-date review of discovery tool adoptions was in order before moving forward in the process of selecting a product. the resulting study included these steps: (1) compiling a list of all the major discovery tools, (2) developing a set of criteria for evaluation, (3) examining four to seven websites where a discovery tool was deployed and evaluating each tool against each criterion, (4) recording the findings, and (5) analyzing the data. the targeted population for the study included all the major discovery tools in use in the united states. we define a discovery tool as a library user interface independent of any library systems.
a discovery tool can be used to replace the opac module of an integrated library system or live side-by-side with the opac. other names for discovery tools include stand-alone opac, discovery layer, or discovery user interface. lately, a discovery tool is more often called a discovery service because most are becoming subscription-based and reside remotely in a cloud-based saas (software as a service) model. the authors compiled a list of fourteen discovery tools based on marshall breeding's "major discovery products" guide published in "library technology guides."15 those included aquabrowser library, axiell arena, bibliocommons (bibliocore), blacklight, ebsco discovery service, encore, endeca, extensible catalog, sirsidynix enterprise, primo, summon, visualizer, vufind, and worldcat local. two open-source discovery layers, sopac (the social opac) and scriblio, were excluded from this study because very few libraries are using them. for evaluation in this study, academic libraries were preferred over public libraries during the sample selection process. however, some discovery tools, such as bibliocommons, were more popular among public libraries; therefore examples of public library websites were included in the evaluation. the sites that made the final list were chosen either from the vendor's website that maintained a customer list or breeding's "library technology guides."16 the following is the final list of libraries whose implementations were used in the study.

example library sites with proprietary discovery tools:

aquabrowser (serials solutions)
1. allen county public library at http://smartcat.acpl.lib.in.us/
2. gallaudet university library at http://discovery.wrlc.org/?skin=ga
3. harvard university at http://lib.harvard.edu/
4. norwood young america public library at http://aquabrowser.carverlib.org/
5. selco southeastern libraries cooperating at http://aquabrowser.selco.info/?c_profile=far
6. university of edinburgh (uk) at http://aquabrowser.lib.ed.ac.uk/

axiell arena (axiell)
1. doncaster council libraries (uk) at http://library.doncaster.gov.uk/web/arena
2. lerums bibliotek (lerum library, sweden) at http://bibliotek.lerum.se/web/arena
3. london libraries consortium-royal kingston library (uk) at http://arena.yourlondonlibrary.net/web/kingston
4. norddjurs (denmark) at https://norddjursbib.dk/web/arena/
5. north east lincolnshire libraries (uk) at http://library.nelincs.gov.uk/web/arena
6. someron kaupunginkirjasto (somero city library, finland) at http://somero.verkkokirjasto.fi/web/arena
7. syddjurs (denmark) at https://bibliotek.syddjurs.dk/web/arena1

bibliocore (bibliocommons)
1. halton hills public library at http://hhpl.bibliocommons.com/dashboard
2. new york public library at http://nypl.bibliocommons.com/
3. oakville public library at http://www.opl.on.ca/
4. princeton public library at http://princetonlibrary.bibliocommons.com/
5. seattle public library at http://seattle.bibliocommons.com/
6. west perth (australia) public library at http://wppl.bibliocommons.com/dashboard
7. whatcom county library system at http://wcls.bibliocommons.com/

ebsco discovery service/eds (ebsco)
1. aston university (uk) at http://www1.aston.ac.uk/library/
2. columbia college chicago library at http://www.lib.colum.edu/
3. loyalist college at http://www.loyalistlibrary.com/
4. massey university (new zealand) at http://www.massey.ac.nz/massey/research/library/library_home.cfm
5. rider university at http://www.rider.edu/library
6. santa rosa junior college at http://www.santarosa.edu/library/
7. st. edward's university at http://library.stedwards.edu/

encore (innovative interfaces)
1. adelphi university at http://libraries.adelphi.edu/
2. athens state university library at http://www.athens.edu/library/
3. california state university at http://coast.library.csulb.edu/
4. deakin university (australia) at http://www.deakin.edu.au/library/
5. indiana state university at http://timon.indstate.edu/iii/encore/home?lang=eng
6. johnson and wales university at http://library.uri.edu/
7. st. lawrence university at http://www.stlawu.edu/library/

endeca (oracle)
1. john f. kennedy presidential library and museum at http://www.jfklibrary.org/
2. north carolina state university at http://www.lib.ncsu.edu/endeca/
3. phoenix public library at http://www.phoenixpubliclibrary.org/
4. triangle research libraries network at http://search.trln.org/
5. university of technology, sydney (australia) at http://www.lib.uts.edu.au/
6. university of north carolina at http://search.lib.unc.edu/
7. university of ottawa (canada) libraries at http://www.biblio.uottawa.ca/html/index.jsp?lang=en

enterprise (sirsidynix)
1. cerritos college at http://cert.ent.sirsi.net/client/cerritos
2. maricopa county community colleges at https://mcccd.ent.sirsi.net/client/default
3. mountain state university/university of charleston at http://msul.ent.sirsi.net/client/default
4. university of mary at http://cdak.ent.sirsi.net/client/uml
5. university of the virgin islands at http://uvi.ent.sirsi.net/client/default
6. western iowa tech community college at http://wiowa2.ent.sirsi.net/client/default

primo (ex libris)
1. aberystwyth university (uk) at http://primo.aber.ac.uk/
2. coventry university (uk) at http://locate.coventry.ac.uk/
3. curtin university (australia) at http://catalogue.curtin.edu.au/
4. emory university at http://web.library.emory.edu/
5. new york university at http://library.nyu.edu/
6. university of iowa at http://www.lib.uiowa.edu/
7. vanderbilt university at http://www.library.vanderbilt.edu

visualizer (vtls)
1. blinn college at http://www.blinn.edu/library/index.htm
2. edward via virginia college of osteopathic medicine at http://vcom.vtls.com:1177/
3. george c. marshall foundation at http://gmarshall.vtls.com:6330/
4. scugog memorial public library at http://www.scugoglibrary.ca/

summon (serials solutions)
1. arizona state university at http://lib.asu.edu/
2. dartmouth college at http://dartmouth.summon.serialssolutions.com/
3. duke university at http://library.duke.edu/
4. florida state university at http://www.lib.fsu.edu/
5. liberty university at http://www.liberty.edu/index.cfm?pid=178
6. university of sydney at http://www.library.usyd.edu.au/

worldcat local (oclc)
1. boise state university at http://library.boisestate.edu/
2. bowie state university at http://www.bowiestate.edu/academics/library/
3. eastern washington university at http://www.ewu.edu/library.xml
4. louisiana state university at http://lsulibraries.worldcat.org/
5. saint john's university at http://www.csbsju.edu/libraries.htm
6. saint xavier university at http://lib.sxu.edu/home

examples of open source and free discovery tools:

blacklight (the university of virginia library)
1. columbia university at http://academiccommons.columbia.edu/
2. johns hopkins university at https://catalyst.library.jhu.edu/
3. north carolina state university at http://historicalstate.lib.ncsu.edu
4. northwestern university at http://findingaids.library.northwestern.edu/
5. stanford university at http://www-sul.stanford.edu/
6. university of hull (uk) at http://blacklight.hull.ac.uk/
7. university of virginia at http://search.lib.virginia.edu/

extensible catalog/xc (extensible catalog organization/carli/university of rochester)
1. demo at http://extensiblecatalog.org/xc/demo
2. extensible catalog library at http://xco-demo.carli.illinois.edu/dtmilestone3
3. kyushu university (japan) at http://catalog.lib.kyushu-u.ac.jp/en
4. spanish general state authority libraries (spain) at http://pcu.bage.es/
5. thailand cyber university/asia institute of technology (thailand) at http://globe.thaicyberu.go.th/

vufind (villanova university)
1. auburn university at http://www.lib.auburn.edu/
2. carnegie mellon university libraries at http://search.library.cmu.edu/vufind/search/advanced
3. colorado state university at http://lib.colostate.edu/
4. saint olaf college at http://www.stolaf.edu/library/index.cfm
5. university of michigan at http://mirlyn.lib.umich.edu
6. western michigan university at https://catalog.library.wmich.edu/vufind/
7. yale university library at http://yufind.library.yale.edu/yufind/

the following list of criteria was used for the evaluation. some criteria were based on those used in previous studies of discovery tools.17, 18, 19 the list embodied the librarians' vision for the next-generation catalog and contained some of the most desirable features for a modern opac. the authors were aware of other desirable features for a discovery layer, and the following list is by no means comprehensive, but it served the purpose of the study well.

1. one-stop search for all library resources. a discovery tool should include all library resources in its search, including the catalog with books and videos, journal articles in databases, and local archives and digital repositories. this can be accomplished by a unified index or by federated search, an essential component of a discovery tool. some discovery tools are described as web-scale because of their potential to search seamlessly across all library resources.
2. state-of-the-art web interface. a discovery tool should have a modern design similar to e-commerce sites, such as google, netflix, and amazon.
3. enriched content. discovery tools should include book cover images, reviews, and user-driven input, such as comments, descriptions, ratings, and tag clouds. the enriched content can come from library patrons, commercial sources, or both.
4. faceted navigation. discovery tools should allow users to narrow down search results by categories, also called facets. the commonly used facets include locations, publication dates, authors, formats, and more.
5. simple keyword search box with a link to advanced search on the start page. a discovery tool should start with a simple keyword search box that looks like that of google or amazon. a link to the advanced search should be present.
6. simple keyword search box on every page. the simple keyword search box should appear on every page of a discovery tool.
7. relevancy. relevancy criteria should take into consideration circulation statistics and books with multiple copies.
more frequently circulated books indicate popularity and usefulness and should be ranked nearer the top of the display. a book with multiple copies may also be an indication of importance.
8. "did you mean . . . ?" spell-checking. when an error appears in a search, the discovery tool should offer the corrected query spelling as a link so that users can simply click on it to get the search results.
9. recommendations/related materials. a discovery tool should recommend resources for readers in a manner similar to amazon or other e-commerce sites, based on transaction logs. this should take the form of "readers who borrowed this item also borrowed the following . . . " or a link to recommended readings. it would be ideal if a discovery tool could recommend the most popular articles, a service similar to ex libris' bx usage-based services.
10. user contribution. user input includes descriptions, summaries, reviews, criticism, comments, rating and ranking, and tagging or folksonomies.
11. rss feeds. a modern opac should provide rss feeds.
12. integration with social networking sites. when a discovery tool is integrated with social networking sites, patrons can share links to library items with their friends on social networks like twitter, facebook, and delicious.
13. persistent links. records in a discovery tool contain a stable url capable of being copied and pasted and serving as a permanent link to that record. these are also called permanent urls.
14. auto-completion/stemming. a discovery tool should be equipped with an algorithm that can auto-complete the search words or supply a list of previously used words or phrases for users to choose from. google has stemming algorithms.
15. mobile compatibility. there is a difference between being "mobile compatible" and having a "custom mobile website." the former indicates a website can be viewed or used on a mobile phone, and the latter denotes a different version of the user interface specially built for mobile use. in this study we counted both as "yes."
16. functional requirements for bibliographic records (frbr). the latest development of rda certainly makes a discovery tool more desirable if it can display frbr relationships. for instance, a discovery tool may display and link different versions, editions, or formats of a work, what frbr refers to as expressions and manifestations.

for record keeping and analysis, a microsoft excel file with sixteen fields based on the above criteria was created. the authors checked the discovery tools on the websites of the selected libraries and recorded those features as present or absent. rda compatibility is not used as a criterion in the study because most discovery tools allow users to add rda fields in marc. by now, all the discovery tools should be able to display, index, and search the new rda fields.

findings

one-stop searching for all library resources—this is the most desirable feature when acquiring a discovery tool. unfortunately, it also presented the biggest challenge for vendors. both librarians and vendors have been struggling with this issue for the past several years, yet no one has worked out a perfect solution.
based on the examples the authors examined, this study found that only five out of fourteen discovery tools can retrieve articles from databases along with books, videos, and digital repositories. those include ebsco discovery service, encore, primo, summon, and worldcat local. whereas encore uses an approach similar to federated search, performing live searches of databases, the other discovery tools build a single unified index. the single unified index requires libraries to send their catalog data and local information to the vendor for updates, so the discovery tool may fall behind in reflecting up-to-the-minute accuracy in local holdings; federated search does real-time searching and does not lag behind in displaying current information. both approaches are limited in what they cover. both need permission from content providers, either for inclusion in the unified index or to develop a connection to article databases for real-time searching. discovery tools that do not have their own unified index or real-time searching capability provide web-scale searching through other means. for instance, vufind has developed connectors to application programming interfaces (apis) by serials solutions or oclc to pull search results from summon and worldcat local. encore not only developed its own real-time connection to electronic databases but is enhancing its web-scale search by incorporating the unified index from other discovery tools such as the ebsco discovery service. aquabrowser is augmented by 360 federated search for the same purpose. despite those possibilities, the authors did not find article-level retrieval in the sample discovery tools other than the main five mentioned above.

comparing the coverage of each tool's web-scale index can be challenging. ebsco, summon, and worldcat local publicize their content coverage on the web, while primo and encore only share this information with their customers. this makes it hard to compare and evaluate content coverage without contacting vendors and asking for that information. at present, none of the five discovery tools (ebsco discovery service, encore, primo, summon, and worldcat local) can boast 100% coverage of all library resources. in fact, none of the internet search engines, including google or google scholar, can retrieve 100% of all resources. therefore web-scale searching is more a goal than a reality. apart from political and economic reasons, this is in part due to the nonbibliographic structure of the contents in databases such as scifinder and some others. one-stop searching is still a work in progress because discovery tools provide students with a quick and simple way to retrieve a large number, but still an incomplete list, of resources held by a library. for more in-depth research, students are still encouraged to search the catalog, discipline-specific databases, and digital repositories separately.

state-of-the-art interface—all the discovery tools are very similar in appearance to amazon.com. some are better than others. this study did not rate each discovery tool on a scale and thus did not distinguish fine degrees of difference in appearance. rather, each discovery tool is given a "yes" or "no," and the designation was based on subjective judgment. all the discovery tools received "yes" because they are very similar in appearance.
enriched content—all the discovery tools have embedded book cover images or video jacket images, but some display more, such as ratings and rankings, user-supplied or commercially available reviews, overviews, previews, comments, descriptions, title discussion, excerpts, or age suitability, just to name a few. a discovery tool may display enriched content by default out of the box, but some may need to be customized to include it. the following is a list of enriched content implemented in each discovery tool that the authors found in the sample. the number in the last column indicates how many types of enriched content were found in the discovery tool at the time of the study. bibliocommons and aquabrowser stand out from the rest and made the top two on the list based on the number of enriched content types from noncataloging sources (see figure 1). it is debatable how much nontraditional data a discovery tool should incorporate into its display, and it warrants another discussion as to how useful such data is for users.

faceted navigation—faceted navigation has become a standard feature in discovery tools over the last two years. it allows users to further divide search results into subsets based on predetermined terms. facets come from a variety of fields in marc records, and some discovery tools have more facets than others. the most commonly seen facets include location or collection, publication date, format, author, genre, and subject. faceted navigation is highly configurable, as many discovery tools allow libraries to decide on their own facets. faceted navigation has become an integral part of a discovery tool.
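as an illustration of how faceted navigation can be derived from a result set, the following minimal sketch tallies facet values across records. the record fields and values are hypothetical; real discovery tools map facets from marc and other metadata in far more elaborate ways.

```python
from collections import Counter

def facet_counts(records, facet_fields=("format", "location", "subject")):
    """tally facet values across a result set so the interface can offer
    'narrow by' links with hit counts."""
    counts = {field: Counter() for field in facet_fields}
    for record in records:
        for field in facet_fields:
            values = record.get(field, [])
            if isinstance(values, str):
                values = [values]
            for value in values:
                counts[field][value] += 1
    return counts

# hypothetical search results drawn from marc-derived metadata
results = [
    {"format": "book", "location": "main library", "subject": ["shakespeare", "drama"]},
    {"format": "dvd", "location": "media center", "subject": ["shakespeare"]},
    {"format": "book", "location": "main library", "subject": ["drama"]},
]

for field, tally in facet_counts(results).items():
    print(field, tally.most_common())
```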
simple keyword search box on the starting page with a link to advanced search—the original idea is to allow a library's user interface to resemble google by displaying a simple keyword search box with a link to advanced search on the starting page. most discovery tools provide the flexibility for libraries to choose or reject this option. however, many librarians find this approach unacceptable, as they feel it lacks precision in searching and thus may mislead users. as the keyword box is highly configurable and up to the library to decide how to present, many libraries have added a pull-down menu with options to search keywords, authors, titles, and locations. in doing so, the original intention of a google-like simple search box is lost, and only a few libraries follow the google-like box style on the starting page. most libraries altered the simple keyword search box on the starting page to include a drop-down menu or radio buttons, so the simple keyword search box is neither simple nor limited to keyword search only. nevertheless, this study gave all the discovery tools a "yes": all the systems are capable of this feature even though libraries may choose not to use it.

rank | discovery tool | enriched content | total
1 | bibliocommons | cover images, tags, similar title, private note, notices, age suitability, summary, quotes, video, comments, and rating | 11
2 | aquabrowser | cover images, previews, reviews, summary, excerpts, tags, author notes & sketches, full text from google, rating/ranking | 9
3 | enterprise | cover images, reviews, google previews, summary, excerpts | 5
4 | axiell arena | cover images, tags, reviews, and title discussion | 4
4 | vufind | cover images, tags, reviews, comments | 4
5 | primo | cover images, tags, previews | 3
5 | worldcat local | cover images, tags, reviews | 3
6 | encore | cover images, tags | 2
6 | visualizer | cover images, reviews | 2
6 | summon | cover images, reviews | 2
7 | blacklight | cover images | 1
7 | ebsco discovery service | cover images | 1
7 | endeca | cover images | 1
7 | extensible catalog | cover images | 1

figure 1. the ranked list of enriched content in discovery tools.

simple keyword search box on every page—this feature enables a user to start a new search at every step of navigation in the discovery tool. most of the discovery tools provide such a box at the top of the screen as users navigate through search results and record displays, except extensible catalog and enterprise by sirsidynix: the feature is missing from the former, while the latter has it everywhere except when displaying bib records in a pop-up box.

relevancy—traditionally, relevancy is uniformly based on a computer algorithm that calculates the frequency and relative position of a keyword (field weighting) in a record and displays the search results based on the final score. other factors have never been a part of the decision in the display of search results. in the discussion on next-generation catalogs, relevancy based on circulation statistics and other factors came up as a desirable possibility, and no discovery tool had met this challenge until now. primo by ex libris is the only one among the discovery tools under investigation that can sort the final results by popularity. "primo's popularity ranking is calculated by use. this means that the more an item record has been clicked and viewed, the more popular it is."20 even though those are not real circulation statistics, this is considered a revolutionary step and a departure from traditional relevancy. three years ago none of the discovery tools provided this option.21 to make relevancy ranking even more sophisticated, scholarrank, another service by ex libris, can work with primo to sort the search results based not only on a query match but also on an item's value score (its usage and number of citations) and a user's characteristics and information needs. this shows the possibility of more advanced relevancy ranking in discovery tools. other vendors will most likely follow in the future, incorporating more sophistication into their relevancy algorithms.
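a minimal sketch of the idea behind usage-boosted relevancy follows. the field weights, record, and usage counts are hypothetical, and vendor implementations such as primo's popularity ranking or scholarrank are considerably more elaborate.

```python
import math

# hypothetical field weights: a title hit counts more than a description hit
FIELD_WEIGHTS = {"title": 3.0, "subject": 2.0, "description": 1.0}

def relevancy(record, query_terms, usage_count=0, usage_weight=0.5):
    """combine a field-weighted keyword score with a damped popularity boost."""
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        text = record.get(field, "").lower()
        for term in query_terms:
            score += weight * text.count(term.lower())
    # log-damp usage so heavily viewed items do not swamp keyword relevance
    return score + usage_weight * math.log1p(usage_count)

record = {"title": "romeo and juliet", "subject": "drama",
          "description": "a tragedy by william shakespeare"}
print(relevancy(record, ["romeo", "juliet"], usage_count=120))
```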
spell checker/"did you mean . . . ?"—the most commonly observed way of correcting a misspelling in a query is "did you mean . . . ?" but there are other variations providing the same or similar services, some of them very user-friendly. the following is a list of the different responses when a user enters misspelled words (see figure 2). "xxx" represents the keyword being searched.

discovery tool | response to misspelled search words | notes
aquabrowser | did you mean to search: xxx, xxx, xxx? | the suggested words are hyperlinks that execute new searches.
axiell arena | your original search for xxx has returned no hits. the fuzzy search returned n hits. | automatically displays a list of hits based on fuzzy logic. "n" is a number.
bibliocommons | did you mean xxx (n results)? | displays the suggested word, along with the number of results, as a link.
blacklight | no records found. | no spell checker, but possible to add by a local technical team.
ebsco discovery service | results may also be available for xxx. | the suggested word is a link that executes a new search.
encore | did you mean xxx? | the suggested word is a link that executes a new search.
endeca | did you mean xxx? | the suggested word is a link that executes a new search.
enterprise | did you mean xxx? | the suggested word is a link that executes a new search.
extensible catalog | sorry, no results found for: xxx. | no spell checker, but possible to add by a local technical team.
primo | did you mean xxx? | the suggested word is a link that executes a new search.
summon | did you mean xxx? | the suggested word is a link that executes a new search.
visualizer | did you mean xxx? | the suggested word is a link that executes a new search.
vufind | 1. no results found in this category. search alternative words: xxx, xxx, xxx. 2. perhaps you should try some spelling variation: xxx, xxx, xxx. 3. your search xxx did not match any resources. what should i do now? (a list of suggestions, including checking a web dictionary.) | 1. alternative words are links that execute new searches. 2. suggested words are links that execute new searches. 3. suggestions on what to do next.
worldcat local | did you mean xxx? | the suggested word is a link that executes a new search.

figure 2. spell checker.

most of the discovery tools on the list provide this feature except blacklight and extensible catalog. open-source solutions sometimes provide a framework to which features can be added, which leaves many possibilities for local developers. for instance, a dictionary or spell checker may be easily installed even if a discovery tool does not come with one out of the box. this feature may be configurable.
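one simple way such "did you mean" suggestions can be generated is to match a failed query against vocabulary already present in the index. the sketch below uses python's standard-library difflib for fuzzy matching, with a hypothetical vocabulary; production systems typically rely on purpose-built spelling indexes.

```python
import difflib

# hypothetical vocabulary harvested from the discovery tool's index
index_terms = ["shakespeare", "sonnet", "sonata", "serials", "cervantes"]

def did_you_mean(query, vocabulary, max_suggestions=3):
    """return close matches for a query that produced zero hits."""
    return difflib.get_close_matches(query.lower(), vocabulary,
                                     n=max_suggestions, cutoff=0.7)

print(did_you_mean("shakespere", index_terms))  # ['shakespeare']
```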
recommendation—amazon has one of those search engines with a recommendation system such as "customers who bought item a also bought item b." the e-commerce recommendation algorithms analyze the activities of shoppers on the web and build a database of buyer profiles, and recommendations are made based on shopper behavior. applied to library content, this could become "readers who were interested in item a were also interested in item b." however, most discovery tools do not have such a recommendation system; instead, they have adopted different approaches. most discovery tools make recommendations from bibliographic data in marc records, such as subject headings for similar items. primo is one of the few discovery tools with a recommendation system similar to those used by amazon and other internet commercial sites. its bx article recommender service is based on usage patterns collected from its link resolver, sfx. developed by ex libris, bx is an independent service that integrates well with primo but can serve as an add-on for other discovery tools. bx is an excellent example that discovery tools can suggest new leads and directions for scholars in their research.

the authors counted all the discovery tools that provide some kind of recommendation, regardless of whether the technological approach uses marc data or algorithms. ten out of fourteen discovery tools provide this feature in various forms (see figure 3). those include axiell arena, bibliocommons, ebsco discovery service, encore, endeca, extensible catalog, primo, summon, worldcat local, and vufind. the following are some of the recommendations found in those discovery tools. the authors did not find any recommendation in the libraries that use aquabrowser, enterprise, visualizer, or blacklight.

discovery tool | language used for recommending or linking to related items
axiell arena | "see book recommendations on this topic"; "who else writes like this?"
bibliocommons | "similar titles & subject headings & lists that include this title"
ebsco discovery service | "find similar results"
encore | "other searches you may try"; "additional suggestions"
endeca | "recommended titles for . . . view all recommended titles that match your search"; "more like this"
extensible catalog | "more like this"; "searches related to . . . "
primo | "suggested new searches by this author"; "suggested new searches by this subject"; "users interested in this article also expressed an interest in the following:"
summon | "search related to . . . "
worldcat local | "more like this"; "similar items"; "related subjects"; "user lists with this item"
vufind | "more like this"; "similar items"; "suggested topics"; "related subjects"

figure 3. language used for recommendation.

some discovery tool recommendations are designed in a more user-friendly manner than others. most recommendations exist exclusively for items. ideally, a discovery tool should provide an article recommendation system like ex libris' bx usage-based service that shows users the most frequently used and most popular articles. at the time of this evaluation, no discovery tool had incorporated an article recommendation system except primo. research is needed to evaluate how patrons utilize recommendation services and whether they find recommendations in discovery tools beneficial.

user contribution—traditionally, bibliographic data has been safely guarded by cataloging librarians for quality control, and it has been unthinkable that users would be allowed to add data to library records. the internet has brought new perspectives on this issue. half of the discovery tools (seven) under evaluation provide this feature to varying degrees (see figure 4). designed primarily for public libraries, bibliocommons seems the most open to user-supplied data among all the discovery tools. most of the other discovery tools with this feature allow users to contribute tags and reviews. all the discovery tools allow librarians to censor user-supplied data before releasing it for public display. the following figure is a summary of the types of data these discovery tools allow users to enter.
ranking | discovery tool | user contribution
1 | bibliocommons | tags, similar title, private note, notices, age suitability, summary, quotes, video, comments, and ratings (10)
2 | aquabrowser | tags, reviews, and ratings/rankings (3)
2 | axiell arena | tags, reviews, and title discussions (3)
2 | vufind | tags, reviews, comments (3)
3 | primo | tags and reviews (2)
3 | worldcat local | tags and reviews (2)
4 | encore | tags (1)
5 | blacklight | (0)
5 | endeca | (0)
5 | enterprise | (0)
5 | extensible catalog | (0)
5 | summon | (0)
5 | visualizer | (0)

figure 4. discovery tools based on user contribution.

past research indicates that folksonomies or tags are highly useful.22 they complement library-controlled vocabularies, such as library of congress subject headings, and increase access to library collections. a few discovery tools allow user-entered tags to form "word clouds." the relative importance of tags in a word cloud is emphasized by font color and size. a tag list is another way to organize and display tags. in both cases, tags are hyperlinked to a relevant list of items. some tags serve as keywords to start new searches, while others narrow search results. only four discovery tools, aquabrowser, encore, primo, and worldcat local, provide both tag clouds and lists. bibliocommons provides only tag lists for the same purpose. the rest of the discovery tools have neither. one setback of user-supplied tags for subject access is their incomplete nature: they may lead users to partial retrieval of information because users add tags only to items that they have used, so the coverage is not systematic and inclusive of all collections. therefore data supplied by users in discovery tools remains controversial. it is possible to seed systems with folksonomies using services like librarything for libraries, which could reduce the impact of this issue.

rss feeds/email alerts—this feature can automatically send a list of new library resources to users based on their search criteria. it can be useful for experienced researchers or frequent library users. some discovery tools may offer email alerts as well. eight out of fourteen discovery tools in this evaluation provide rss feeds: aquabrowser, axiell arena, ebsco discovery service, endeca, enterprise, primo, summon, and vufind. an rss feed can be added as a plug-in in some discovery tools if it does not come as part of the base system.

integration with social networking sites—as most college students participate in social networking sites, this feature provides an easy way to share resources on those sites. users can capture the link to a resource by clicking on an icon in the discovery tool and share the resource with friends on facebook, twitter, delicious, and many other social networking sites. nine out of the fourteen discovery tools provide this feature, and some offer integration with many more social networking sites than others. those with this feature include aquabrowser, axiell arena, bibliocommons, ebsco discovery service, encore, endeca, primo, worldcat local, and extensible catalog. so far, the interaction between discovery tools and social networking sites is limited to sharing resources. social networking sites should be carefully evaluated for the possibility of integrating some of their popular features into discovery tools.
persistent link—this is also called a permanent link or permurl. not all the links displayed in a browser location box are persistent links; therefore some discovery tools specifically provide a link in the record for users to copy and keep. five out of fourteen discovery tools explicitly list this link in records: aquabrowser, axiell arena, blacklight, ebsco discovery service, and worldcat local. the authors marked a system as "no" when a permanent link is not prominently displayed in the discovery tool; in other words, only those discovery tools that explicitly provide a persistent link are counted as "yes." however, the url in a browser's location box during the display of a record may serve as a persistent link in some cases. for instance, vufind does not provide a permanent url in the record, but indicates on the project site that the url in the location box is a persistent link.

auto-completion/stemming—when a user types keywords into the search box, the discovery tool supplies a list of words or phrases that he or she can choose from readily. this is a highly useful feature that google excels at. stemming not only automatically completes the spelling of a keyword, but also supplies a list of phrases that point to existing items. the authors found this feature in six out of fourteen discovery tools: axiell arena, endeca, enterprise, extensible catalog, summon, and worldcat local.

mobile interface—the terms "mobile compatible" and "mobile interface" are two different concepts. a mobile interface is a simplified version of the normal browser interface of a discovery tool, optimized for use on mobile phones, and the authors only counted those discovery tools that have a separate mobile interface. a discovery tool may be mobile friendly or compatible without a separate mobile interface. many discovery tools, such as ebsco, can detect a request from a mobile phone and automatically direct it to the mobile interface. eleven out of fourteen claim to provide a separate mobile interface. blacklight, enterprise, and extensible catalog do not seem to have a separate mobile interface even though they may be mobile friendly.

frbr—frbr groupings denote the relationships between work, expression, manifestation, and item. for instance, a search will retrieve not only a title but also different editions and formats of the work. only three discovery tools can display frbr relationships: extensible catalog (open source), primo by ex libris, and worldcat local by oclc. so far, most discovery tools are not capable of displaying the manifestations and expressions of a work in a meaningful way. from the user's point of view, this feature is highly desirable. figure 5 is a screenshot from primo demonstrating a display indicating a large number of different adaptations of the work "romeo and juliet." figure 6 displays the same intellectual work in different manifestations such as dvd, vhs, books, and more.

figure 5. display of frbr relationships in primo.

figure 6. different versions of the same work in primo.

summary

the following are the summary tables of our comparison and evaluation. proprietary and open-source programs are listed separately in these tables. the total number of features the authors found in a particular discovery tool is displayed at the end of each column.
proprietary discovery tools seem to have more of the advanced characteristics of a modern discovery tool than their open-source counterparts. the open-source program blacklight displays fewer advanced features but seems flexible for users to add features. see figures 7, 8, and 9.

criterion | aquabrowser | axiell arena | bibliocommons | ebsco/eds | encore | endeca
1. single point of search | no | no | no | yes | yes | no
2. state-of-the-art interface | yes | yes | yes | yes | yes | yes
3. enriched content | yes | yes | yes | yes | yes | yes
4. faceted navigation | yes | yes | yes | yes | yes | yes
5. simple keyword search box on the starting page | yes | yes | yes | yes | yes | yes
6. simple keyword search box on every page | yes | yes | yes | yes | yes | yes
7. relevancy | no | no | no | no | no | no
8. spell checker/"did you mean . . . ?" | yes | yes | yes | yes | yes | yes
9. recommendation | no | yes | yes | yes | yes | yes
10. user contribution | yes | yes | yes | no | yes | no
11. rss | yes | yes | no | yes | no | yes
12. integration with social network sites | yes | yes | yes | yes | yes | yes
13. persistent links | yes | yes | no | yes | no | no
14. stemming/auto-complete | no | yes | no | no | no | yes
15. mobile interface | yes | yes | yes | yes | yes | yes
16. frbr | no | no | no | no | no | no
total | 11/16 | 13/16 | 10/16 | 12/16 | 11/16 | 11/16

figure 7. proprietary discovery tools.

criterion | enterprise | primo | summon | visualizer | worldcat local
1. single point of search | no | yes | yes | no | yes
2. state-of-the-art interface | yes | yes | yes | yes | yes
3. enriched content | yes | yes | yes | yes | yes
4. faceted navigation | yes | yes | yes | yes | yes
5. simple keyword search box on the starting page | yes | yes | yes | yes | yes
6. simple keyword search box on every page | no | yes | yes | yes | yes
7. relevancy | no | yes | no | no | no
8. spell checker/"did you mean . . . ?" | yes | yes | yes | yes | yes
9. recommendation | no | yes | yes | no | yes
10. user contribution | no | yes | no | no | yes
11. rss | yes | yes | yes | no | no
12. integration with social network sites | no | yes | no | no | yes
13. persistent links | no | no | no | no | yes
14. stemming/auto-complete | yes | no | yes | no | yes
15. mobile interface | no | yes | yes | yes | yes
16. frbr | no | yes | no | no | yes
total | 7/16 | 14/16 | 11/16 | 7/16 | 14/16

figure 8. proprietary discovery tools (continued).

criterion | blacklight | extensible catalog | vufind
1. single point of search | no | no | no
2. state-of-the-art interface | yes | yes | yes
3. enriched content | yes | yes | yes
4. faceted navigation | yes | yes | yes
5. simple keyword search box on the starting page | yes | yes | yes
6. simple keyword search box on every page | yes | yes | yes
7. relevancy | no | no | no
8. spell checker/"did you mean . . . ?" | no | no | yes
9. recommendation | no | yes | yes
10. user contribution | no | no | yes
11. rss | no | no | yes
12. integration with social network sites | no | yes | no
13. persistent links | yes | no | no
14. stemming/auto-complete | no | yes | no
15. mobile interface | no | no | yes
16. frbr | no | yes | no
total | 6/16 | 9/16 | 10/16

figure 9. free and open-source discovery tools.

as one-stop searching is the core of a discovery tool, this consideration placed five discovery tools above the rest: encore, ebsco discovery service, primo, summon, and worldcat local (see figure 10). these five are web-scale discovery services. all of them use their native unified index except encore, which has incorporated the ebsco unified index in its search. despite the great progress made in one-stop searching in the past three years, none of the discovery tools can truly search across all library resources—all of them have some limitations as to the coverage of content.
each unified index may cover different databases as well as overlap the others in many areas. one possible solution may lie in a hybrid approach that combines a unified index with federated search (also called real-time discovery); those old and new technologies may work well when complementing each other. it remains an open question whether libraries will ever have one-stop searching in its true sense.

discovery tool | one-stop searching
encore | yes
ebsco discovery service | yes
primo | yes
summon | yes
worldcat local | yes

figure 10. the discovery tools capable of one-stop searching.

it is also worth mentioning that one-stop searching is a vital and central piece of a discovery tool. those discovery tools without a native unified index or connectors to databases for real-time searching are at a disadvantage. therefore discovery tools that do not provide web-scale searching are investigating various possibilities to incorporate one-stop searching. some are drawing on the unified indexes of the discovery tools that have them through connectors to the application programming interfaces (apis) of those products. for instance, vufind includes connectors to the apis of a few other systems that have a unified index or vast resources, such as summon and worldcat. blacklight may provide one-stop searching through the primo api. such a practice may present other problems, such as calculating relevancy ranking across resources that do not live in the same centralized index, thus not achieving fully balanced relevancy ranking. nevertheless, discovery tool developers are working hard to achieve one-stop searching. as a unified index can be shared across discovery tools, more and more discovery services may offer one-stop searching in the next few years.

based on the count of the sixteen criteria in the checklist, we ranked primo and worldcat local as the top two discovery tools. based on our criteria, primo has two unique features that make it stand out: relevancy enhanced by usage statistics and value score, and the frbr relationship display. worldcat local and extensible catalog are the other two discovery tools that can display frbr relationships (see figure 11).

rank | discovery tools | number of advanced features
1 | primo and worldcat local | 14/16
2 | axiell arena | 13/16
3 | ebsco discovery service | 12/16
4 | aquabrowser, encore, and endeca | 11/16
5 | bibliocommons, summon, and vufind | 10/16
6 | extensible catalog | 9/16
7 | enterprise and visualizer | 7/16
8 | blacklight | 6/16

figure 11. ranked discovery tools.

limitations

as discovery tools go through new releases and improvements, what is true today may be false tomorrow. discovery tools constantly improve and evolve, and many features are not included in this evaluation, such as integration with google maps for the location of an item and user-driven acquisitions. this study only covers the most common features that the library community agreed a discovery tool should have. some open-source discovery tools may provide a skeleton of an application that leaves the code open for users to develop new features; therefore different implementations of an open-source discovery tool may encompass totally different features that are not part of the core application. for instance, the university of virginia developed virgo based on blacklight, adding many advanced features.
thus it is quite a challenge to distinguish what comes with the software and what are local developments. this study focused on the user interface of discovery tools. not included are content coverage, application administration, and the searching capability of the discovery tools; those three are important factors when choosing a discovery tool.

conclusion

search technology has evolved far beyond federated searching. the concept of a "next-generation catalog" has merged with this idea and spawned a generation of discovery tools bringing almost google-like power to library searching. the problems facing libraries now are the intelligent selection of a tool that fits their contexts, and structuring a process to adopt and refine that tool to meet the objectives of the library. our findings indicate that primo and worldcat local have better user interfaces, displaying more advanced features of a next-generation catalog than their peers. for rul, ebsco discovery service (eds) provides something approaching the ease of google searching from either a single search box or a very powerful advanced search. being aware of the limitations noted above, rider's libraries elected to continue displaying traditional search options in addition to what we've branded "library one search." another issue we discovered in this process is that, when negotiating for a vendor-hosted test, libraries must be sure that the test period begins when the configuration is complete rather than when the data load begins. all phases of the project took far more time than anticipated. the client institution's implementation coordinator or team needs to review progress on a daily basis and communicate often with the vendor-based implementation team. with the evaluative framework this study provides, libraries moving toward discovery tools should weigh the changing capabilities of the available discovery tools to make informed choices.

references

1. jason vaughan, "investigations into library web-scale discovery services," information technology & libraries 31, no. 1 (2012): 32–33, http://dx.doi.org/10.6017/ital.v31i1.1916.
2. sharon q. yang and melissa a. hofmann, "next generation or current generation? a study of the opacs of 260 academic libraries in the usa and canada," library hi tech 29, no. 2 (2011): 266–300.
3. melissa a. hofmann and sharon q. yang, "'discovering' what's changed: a revisit of the opacs of 260 academic libraries," library hi tech 30, no. 2 (2012): 253–74.
4. alexander pope, "alexander pope quotes," http://www.brainyquote.com/quotes/authors/a/alexander_pope.html.
5. f. william chickering, "linking information technologies: benefits and challenges," proceedings of the 4th international conference on new information technologies, budapest, hungary, december 1991, http://web.simmons.edu/~chen/nit/nit%2791/019-chi.htm.
6. kristin antelman, emily lynema, and andrew k. pace, "toward a twenty-first century library catalog," information technology & libraries 25, no. 3 (2006): 128–39, http://dx.doi.org/10.6017/ital.v25i3.3342.
7. marshall breeding, "plotting a new course for metasearch," computers in libraries 25, no. 2 (2005): 27–29.
8. judith carter, "discovery: what do you mean by that?" information technology & libraries 28, no. 4 (2009): 161–63, http://dx.doi.org/10.6017/ital.v28i4.3326.
9. priscilla caplan, "on discovery tools, opacs and the motion of library language," library hi tech 30, no. 1 (2012): 108–15.
10. carol pitts diedrichs, "discovery and delivery: making it work for users," serials librarian 56, no. 1–4 (2009): 79, http://dx.doi.org/10.1080/03615260802679127.
11. alex a. dolski, "information discovery insights gained from multipac, a prototype library discovery system," information technology & libraries 28, no. 4 (2009): 173, http://dx.doi.org/10.6017/ital.v28i4.3328.
12. marshall breeding, "the state of the art in library discovery," computers in libraries 30, no. 1 (2010): 31–34.
13. jennifer l. fabbi, "focus as impetus for organizational learning," information technology & libraries 28, no. 4 (2009): 164–71, http://dx.doi.org/10.6017/ital.v28i4.3327.
14. douglas way, "the impact of web-scale discovery on the use of a library collection," serials review 36, no. 4 (2010): 214–20, http://dx.doi.org/10.1016/j.serrev.2010.07.002.
15. marshall breeding, "library technology guides: discovery products," http://www.librarytechnology.org/discovery.pl.
16. ibid.
17. sharon q. yang and kurt wagner, "evaluating and comparing discovery tools: how close are we towards next generation catalog?" library hi tech 28, no. 4 (2010): 690–709.
18. yang and hofmann, "next generation or current generation?" 266–300.
19. melissa a. hofmann and sharon q. yang, "how next-gen r u? a review of academic opacs in the united states and canada," computers in libraries 31, no. 6 (2010): 26–29.
20. brown library of virginia western community college, "primo-frequently asked questions," http://www.virginiawestern.edu/library/primo-faq.php#popularity_ranking.
21. yang and wagner, "evaluating and comparing discovery tools," 690–709.
22. yanyi lee and sharon q. yang, "folksonomies as subject access—a survey of tagging in library online catalogs and discovery layers," paper presented at the ifla post-conference "beyond libraries-subject metadata in the digital environment and semantic web," tallinn, estonia, 18 august 2012, http://www.nlib.ee/html/yritus/ifla_jarel/papers/4-1_yan.docx.

52 journal of library automation vol. 14/1 march 1981

publishing firm. with a feeling of deja vu i listened to an explanation of how difficult it is to develop a system for the novice; one proposed solution is to allow only the first four letters of a word to be entered (one of the search methods used at the library of congress, which does suggest some cross-fertilization). whatever the trends, the reality is that librarians and information scientists are playing decreasing roles in the growth of information display technology. hardware systems analysts, advertisers, and communications specialists are the main professions with an active role to play in the information age. perhaps the answer is an immediate and radical change in the training offered by library schools today. our small role may reflect our penchant to be collectors, archivists, and guardians of the information repositories.
have we become the keepers of the system? the demand today is for service, information, and entertainment. if we librarians cannot fulfill these needs, our places are not assured. should the american library association (ala) be ensuring that libraries are a part of all ongoing tests of videotex, at least in some way, either as organizers, information providers, or in analysis? consider the force of the argument given at the ala 1980 new york annual conference that cable television should be a medium that librarians become involved with for the future. certainly involvement is an important role, but we, like the industrialists and marketers before us, must make smart decisions and choose the proper niche and the most effective way to use our limited resources if we are to serve any part of society in the future.

bibliography
1. electronic publishing review. oxford, england: learned information ltd. quarterly.
2. home video report. white plains, new york: knowledge industry publications. weekly.
3. ieee transactions on consumer electronics. new york: ieee broadcast, cable, and consumer electronics society. five times yearly.
4. international videotex/teletext news. washington, d.c.: arlen communications ltd. monthly.
5. videodisc/teletext news. westport, conn.: microform review. quarterly.
6. videoprint. norwalk, conn.: videoprint. two times monthly.
7. viewdata/videotex report. new york: link resources corp. monthly.

data processing library: a very special library

sherry cook, mercedes dumlao, and maria szabo: bechtel data processing library, san francisco, california.

the 1980s are here and with them comes the ever-broadening application of the computer. this presents a new challenge to libraries. what do we do with all these computer codes? how do we index the material? and most importantly, how do we make it accessible to our patrons or computer users? bechtel's data processing library has met these demands. the genesis of the collection was bechtel's conversion from a honeywell 6000 computer to a univac 1100 in 1974. all the programs in use at that time were converted to run on the univac system. it seemed a good time to bring the computer programs from all of the various bechtel divisions together into a controlled collection. the librarians were charged with the responsibility of enforcing standards and control of bechtel's computer programs. the major benefits derived from placing all computer programs into a controlled library were:
1. company-wide usage of the programs.
2. minimized investment in program development through common usage.
3. computer file and documentation storage by the library to safeguard the investment.
4. a central location for audits of program code and documentation.
5. centralized reporting on bechtel programs.

developing the collection involved basic cataloging techniques, greatly modified to encompass all the information that computer programs generate, including actual code, documentation, and listings. historically, this information must be kept indefinitely on an archival basis. the machine-readable codes themselves are grouped together and maintained from the library's budget. finally, a reference desk is staffed to answer questions from the entire user community. documentation for programs is strictly controlled. code changes are arranged chronologically to provide only the most current release of a program to all users.
historical information is kept and is crucial to satisfy the demands of auditors (such as the nuclear regulatory commission). additionally, the names of the people administratively connected with a program are recorded and their responsibilities defined (valuable in situations of liability for work completed yesteryear). the backbone of the operation is a standards manual that spells out and discusses the file requirements, documentation specifications, and control forms. this standard is made readily available throughout bechtel, and in-house education classes are offered on the same document. indeed, the central data processing library is the repository of computer information at bechtel. the centralization and control of computer programs eliminates the chaos that can occur if too many individuals maintain and use the same computer program.

information technology and libraries | september 2009

employing virtualization in library computing: use cases and lessons learned

arwen hutt, michael stuart, daniel suchy, and bradley d. westbrook

arwen hutt (ahutt@ucsd.edu) is metadata specialist, michael stuart (mstuart@ucsd.edu) is information technology analyst, daniel suchy (dsuchy@ucsd.edu) is public services technology analyst, and bradley d. westbrook (bradw@library.ucsd.edu) is metadata librarian and digital archivist, university of california, san diego libraries.

this paper provides a broad overview of virtualization technology and describes several examples of its use at the university of california, san diego libraries. libraries can leverage virtualization to address many long-standing library computing challenges, but careful planning is needed to determine if this technology is the right solution for a specific need. this paper outlines both technical and usability considerations, and concludes with a discussion of potential enterprise impacts on the library infrastructure.

operating system virtualization, herein referred to simply as "virtualization," is a powerful and highly adaptable solution to several library technology challenges, such as managing computer labs, automating cataloging and other procedures, and demonstrating new library services. virtualization has been used in one manner or another for decades,1 but it is only within the last few years that this technology has made significant inroads into library environments. virtualization technology is not without its drawbacks, however. libraries need to assess their needs, as well as the resources required for virtualization, before embarking on large-scale implementations. this paper provides a broad overview of virtualization technology and explains its benefits and drawbacks by describing some of the ways virtualization has been used at the university of california, san diego (ucsd) libraries.2

virtualization overview

virtualization is used to partition the physical resources (processor, hard drive, network card, etc.) of one computer to run one or more instances of concurrent, but not necessarily identical, operating systems (oss). traditionally only one instance of an operating system, such as microsoft windows, can be used at any one time. when an operating system is virtualized—creating a virtual machine (vm)—the vm communicates through virtualization middleware to the hardware or host operating system. this middleware also provides a consistent set of virtual hardware drivers that are transparent to the end user and to the physical hardware. this allows the virtual machine to be used in a variety of heterogeneous environments without the need to reconfigure or install new drivers. with the majority of hardware and compatibility requirements resolved, the computer becomes simply a physical presentation medium for a vm.
two approaches to virtualization: host-based vs. hypervisor

virtualization can be implemented using either a hosted (type 2) or a bare-metal (type 1) hypervisor architecture. a hosted, or type 2, hypervisor (figure 1), commonly referred to as "host-based virtualization," requires an os such as microsoft windows xp to host a "guest" operating system like linux or even another version of windows. in this configuration, the host os treats the vm like any other application. host-based virtualization products are often intended to be used by a single user on workstation-class hardware. in the bare-metal, or type 1, hypervisor architecture (figure 2), commonly referred to as "hypervisor-based virtualization," the virtualization middleware interacts with the computer's physical resources without the need of a host operating system. such systems are usually intended for use by multiple users, with the vms accessed over the network. realizing the full benefits of this approach requires a considerable resource commitment for both enterprise-class server hardware and information technology (it) staff.

figure 1. a host-based (type 2) hypervisor implementation

figure 2. a bare-metal (type 1) hypervisor implementation

use cases

archivists' toolkit

the archivists' toolkit (at) project is a collaboration of the ucsd libraries, the new york university libraries, and the five colleges libraries (amherst college, hampshire college, mt. holyoke college, smith college, and the university of massachusetts amherst) and is funded by the andrew w. mellon foundation. the at is an open-source archival data management system that provides broad, integrated support for the management of archives. it consists of a java client that connects to a relational database back end (mysql, mssql, or oracle). the database can be implemented on a networked server or a single workstation. since its initial release in december 2006, the at has sparked a great deal of interest and rapid uptake within the archival community. this growing interest has, in turn, created an increased demand for demonstrations of the product, workshops and training, and simpler methods for distributing the application. (of the use cases described here, the two for at distribution and the laptop classroom are exploratory, whereas the rest are in production.)

at workshops

the society of american archivists sponsors a two-day at workshop occurring on multiple dates at several locations. in addition, the at team provides one- and two-day workshops to different institutional audiences. at workshops are designed to give participants hands-on experience using the at application. accomplishing this effectively requires, at a minimum, supplying all participants with identical but separate databases so that participants can complete the same learning exercises simultaneously and independently without concern for working in each other's space. in addition, an ideal configuration would reduce the workload of the instructors, freeing them from having to set up the at instructional database onsite for each workshop.
for these workshops we needed to do the following:
- provide identical but separate databases and database content for all workshop attendees
- create an easily reproducible installation and setup for workshops by preparing and populating the at instructional database in advance

virtualization allows the at workshop instructors to predefine the workstation configuration, including the installation and population of the at databases, prior to arriving at the workshop site. to accomplish this we developed a workshop vm configuration with mysql and the at client installed within a linux ubuntu os. the workshop instructors then built the at vm with the data they require for the workshop. the at client and database are loaded on a dvd or flash drive and shipped to the classroom managers at the workshop sites, who then need only install a copy of the vm and the freely available vmplayer software (necessary to launch the at vm) onto each workstation in the classroom. the at vm, once built, can be used many times, both for multiple workstations in a classroom and for multiple workshops at different times and locations. this implementation has worked very well, saving both time and effort for the instructors and classroom support staff by reducing the time and communication necessary for deploying and reconfiguring the vm. it also reduces the chances that there will be an unexpected conflict between the application and the host workstation's configuration. but the method is not perfect. more than anything else, licensing costs motivated us to choose linux as the operating system instead of a proprietary os such as windows. this reduces the cost of using the vm, but it also requires workshop participants to use an os with which they are often unfamiliar. for some participants, unfamiliarity with linux can make the workshop more difficult than it would be if a more ubiquitous os were used.

at demonstrations

in a similar vein, members of the at team are often called upon to demonstrate the application at various professional conferences and other venues. these demonstrations require the setup and population of a demonstration database with content for illustrating all of the application's functions. one of the constraints posed by the demonstration scenario is the importance of using a local database instance rather than a networked instance, since network connections can be unreliable or outright unavailable (network connectivity being an issue we've all faced at conferences). another constraint is that portions of the demonstrations need some level of preparation (for example, knowing what search terms will return a nonempty result set), which must be customized for the unique content of a database. a final constraint is that, because portions of the demonstration (import and data merging) alter the state of the database, changes to the database must be easily reversible, or else new examples must be created before the database can be reused. building on our experience of using virtualization to implement multiple copies of an at installation, we evaluated the possibility of using the same technology to simplify the setup necessary for demonstrating the at.
as with the workshops, the use of a vm for at demonstrations allows for easy distribution of a prepopulated database, which can be used by multiple team members at disparate geographic locations and on different host oss. this significantly reduces the cost of creating (and recreating) demonstration databases. in addition, demonstration scripts can be shared between team members, creating additional time savings as well as facilitating team participation in the development and refinement of the demonstration. perhaps most important is the ability to roll back the vm to a specific state or snapshot of the database. this means the database can be quickly returned to its original state after being altered during a demonstration. overall, despite our initial anxiety about depending on the vm for presentations to large audiences, this solution has proven very useful, reliable, and cost-effective. at distribution implementing the at requires installing both the toolkit client and a database application such as mysql, instantiating an at database, and establishing the connection between database and client. for many potential customers of the at, the requirements for database creation and management can be a significant barrier due to inexperience with how such processes work and a lack of readily available it resources. many of these customers simply desire a plug-and-play version of the application that they can install and use without requiring technical assistance. it is possible to satisfy this need for a plug-and-play at by constructing a vm containing a fully installed and ready-to-use at application and database instance. this significantly reduces the number and difficulty of steps involved in setting up a functional at instance. the customer would only need to transfer the vm from a dvd or other source to their computer, download and install the vm reader, and then launch the at vm. they would then be able to begin using the at immediately. this removes the need for the user to perform database creation and management; arguably the most technically challenging portion of the setup process. users would still have the option of configuring the application (default values, lookup lists, etc.) in accord with the practices of their repository. batch processing catalog records the rapid growth of electronic resources is significantly changing the nature of library cataloging. not only are types of library materials changing and multiplying, the amount of e-resources being acquired increases each year. electronic book and music packages often contain tens of thousands of items, each requiring some level of cataloging. because of these challenges, staff are increasingly cataloging resources with specialized programs, scripts, and macros that allow for semiautomated record creation and editing. such tools make it possible to work on large sets of resources—work that would not be financially possible to perform manually item by item. however, the specialized configuration of the workstation required for using these automated procedures makes it very difficult to use the workstation for other purposes at the same time. in fact, user interaction with the workstation while the process is running can cause a job to terminate prior to completion. in either scenario, productivity is compromised. virtualization offers an excellent remedy to this problem. 
a virtual machine configured for semiautomated batch processing allows unused resources on the workstation to process the batch requests in an isolated environment while, at the same time and on the same machine, the user is able to work on other tasks. in cases where the user's machine is not an ideal candidate for virtualization, the vm can be hosted via a hypervisor-based solution, and the user can access the vm with familiar remote access tools such as remote desktop in windows xp. secure sandbox in addition to challenges posed by increasingly large quantities of acquisitions, the ucsd libraries is also encountering an increasing variety of library material types. most notable is the variety and uniqueness of digital media acquired by the library, such as specialized programs to process and view research data sets, new media formats and viewers, and application installers. cataloging some of these materials requires that media be loaded and that applications be installed and run to inspect and validate content. but running or opening these materials, which are sometimes from unknown sources, poses a security risk to both the user's workstation and to the larger pool of library resources accessible via the network. many installers require a user to have administrative privileges, which can pose a threat to network security. the virtual machine allows a user to have administrative privileges within the vm, but not outside of the vm. the user can be provided with the privileges needed for installing and validating content without modifying their privileges on the host machine. in addition, the vm can be isolated by configuring its network connection so that any potential security risks are limited to the vm instance and do not extend to either the host machine or the network. laptop classroom instructors at the ucsd libraries need a laptop classroom that meets the usual requirements for this type of service (mobility, dependability, etc.) but also allows for the variety of computing environments and applications in use throughout our several library locations. in a least-common-denominator scenario, computers are configured to meet a general standard (usually microsoft windows with a standard browser and office suite) and allow minimal customization. while this solution has its advantages and is easy to configure and maintain from the it perspective, it leaves much to be desired for an instructor who needs to use a variety of tools in the classroom, often on demand. the goal in this case is not to settle for a single generic build but instead to look for a solution that accommodates three needs: ■ the ability to switch quickly between different customized os configurations ■ the ability to add and remove applications on demand in a classroom setting ■ the ability to restore a computer modified during class to its original state of course, regardless of the approach taken, the laptops still needed to retain a high level of system security, application stability, and regular hardware maintenance. after a thorough review of the different technologies and tools already in use in the libraries, we determined that virtualization might also serve to meet the requirements of our laptop classroom. the need to support multiple users and multiple vms makes this scenario an ideal candidate for hypervisor-based virtualization.
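the isolation and restore-to-original-state needs just described map onto two ordinary hypervisor operations: disconnecting the guest's network adapter and rolling back to a saved snapshot. the sketch below is a minimal illustration only, assuming virtualbox's vboxmanage command-line tool rather than the vmware products discussed in this article; the vm and snapshot names are hypothetical.

```python
# minimal sketch, assuming virtualbox's vboxmanage cli; vm and snapshot names are hypothetical.
import subprocess

VM_NAME = "cataloging-sandbox"

def vbox(*args):
    subprocess.run(["VBoxManage", *args], check=True)

def prepare_sandbox():
    # disconnect the first network adapter so untrusted installers cannot reach the network
    vbox("modifyvm", VM_NAME, "--nic1", "null")
    # record a clean state to return to after the material has been inspected
    vbox("snapshot", VM_NAME, "take", "clean-state")
    vbox("startvm", VM_NAME, "--type", "gui")

def restore_sandbox():
    # power off and roll the vm back to the clean state; the same step resets a
    # classroom laptop or a demonstration database between sessions
    vbox("controlvm", VM_NAME, "poweroff")
    vbox("snapshot", VM_NAME, "restore", "clean-state")
```

the same two commands cover the demonstration rollback and classroom restore scenarios described earlier, which is why a single prebuilt vm image can serve several of these use cases.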
we decided to use vdi (virtual desktop infrastructure), a commercially available hypervisor product from vmware. vmware is one of the largest providers of virtualization software, and we were already familiar with several iterations of its host-based vm services. the core of our project plan consists of a base vm to be created and managed by our it department. to support a wide variety of applications and instruction styles, instructors could create a customized vm specific to their library’s instruction needs with only nominal assistance from it staff. the custom vm would then be made available on demand to the laptops from a central server (as depicted in figure 2 above). in this manner, instructors could “own” and maintain a personal instructional computing environment, while the classroom manager could still ensure the laptop classroom as a whole maintained the necessary secure software environment required by it. as an added benefit, once these vms are established, they could be accessed and used in a variety of diverse locations. n considerations for implementation before implementing any virtualization solution, in-depth analysis and testing is needed to determine which type of solution, if any, is appropriate for a specific use case in a specific environment. this analysis should include three major areas of focus: user experience, application performance in the virtualized environment, and effect on the enterprise infrastructure. in this section of this paper, we review considerations that, in hindsight, we would have found to be extremely valuable in the ucsd libraries’ various implementations of virtualization. user experience traditionally, system engineers have developed systems and tuned performance according to engineering metrics (e.g., megabytes per second and network latency). while such metrics remain valuable to most assessments of a 114 information technology and libraries | september 2009 computer application, performance assessments are being increasingly defined by usability and user experience factors. in an academic computing environment, especially in areas such as library computer labs, these newer kinds of performance measures are important indicators of how effectively an application performs and, indirectly, of how well resources are being used. virtualization can be implemented in a way that allows library users to have access to both the virtualized and host oss or to multiple virtualized oss. since virtualization essentially creates layers within the workstation, multiple os layers (either host or virtualized) can cause the users to become confused as to which os they are interacting with at a given moment. in that kind of implementation, the user can lose his or her way among the host and guest oss as well as become disoriented by differing features of the virtualized oss. for example, the user may choose to save a file to the desktop, but may not be aware that the file will be saved to the desktop of the virtualized os and not the host os. external device support can also be problematic for the end user, particularly with regard to common devices such as flash drives. the user needs to be aware of which operating system is in use, since it is usually the only one with which an external device is configured to work. authentication to a system is another example of how the relationship between the host and guest os can cause confusion. 
the introduction of a second os implicitly creates a second level of authentication and authorization that must be configured separately from that of the host os. user privileges may differ between the host and guest os for a particular vm configuration. for instance, a user might need to remember two logins or at least enter the same login credentials twice. these unexpected differences between the host and guest os produce negative effects on a user's experience. this can be a critical factor in a time-sensitive environment such as a computer lab, where the instructor needs to devote class time to teaching and not to preparing the computers for use and navigating students through applications. interface latency and responsiveness latency (meaning here the responsiveness or "sluggishness" of the software application or the os) in any interface can be a problem for usability. developers devote a significant amount of time to improving operating systems and application interfaces to specifically address this issue. however, users will often be unable to recognize when an application is running in a virtualized os and will thus expect virtualized applications to perform with the same responsiveness as applications that are not virtualized. in our experience, some vm implementations exhibit noticeable interface latency because of inherent limitations of the virtualization software. perhaps the most notable and restrictive limitation is the lack of advanced 3d video rendering capability. this is due to the lack of support for hardware-accelerated graphics, thus adding an extra layer of communication between the application and the video card and slowing down performance. in most hardware-accelerated 3d applications (e.g., google earth pro or second life), this latency is such a problem that the application becomes unusable in a virtualized environment. recent developments have begun to address and, in some cases, overcome these limitations.3 in every virtualization solution there is overhead for the virtualization software to do its job and delegate resources. in our experience, this has been found to cause an approximately 10–20 percent performance penalty. most applications will run well with little or moderate changes to configuration when virtualized, but the overhead should not be overlooked or assumed to be inconsequential. it is also valuable to point out that the combination of applications in a vm, as well as vms running together on the same host, can create further performance issues. traditional bottlenecks the bottlenecks faced in traditional library computing systems also remain in almost every virtualization implementation. general application performance is usually limited by the specifications of one or more of the following components: processor, memory, storage, and network hardware. in most cases, assuming adequate hardware resources are available, performance issues can be easily addressed by reconfiguring the resources for the vm. for example, performance problems in a vm whose application is memory-bound (i.e., limited by the memory available to the vm) can be resolved by adjusting the amount of memory allocated to the vm. a critical component of planning a successful virtualization deployment is a thorough analysis of user workflow and the ways in which the vm will be utilized. although the types of user workflows may vary widely, analysis and testing serve to predict and possibly avoid potential bottlenecks in system performance.
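as a concrete illustration of reconfiguring vm resources for a memory-bound workload, the sketch below grows a guest's memory and cpu allocation before restarting it. it again assumes virtualbox's vboxmanage tool; the vm name and sizes are hypothetical examples, not recommendations.

```python
# minimal sketch, assuming virtualbox's vboxmanage cli; the vm name and sizes are hypothetical.
import subprocess

VM_NAME = "batch-processing"

def vbox(*args):
    subprocess.run(["VBoxManage", *args], check=True)

def grow_vm(memory_mb=4096, cpus=2):
    # the hardware profile can only be changed while the vm is powered off
    vbox("controlvm", VM_NAME, "poweroff")
    vbox("modifyvm", VM_NAME, "--memory", str(memory_mb), "--cpus", str(cpus))
    # restart headless so the batch job keeps running in the background
    vbox("startvm", VM_NAME, "--type", "headless")
```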
enterprise impact when assessing the effect virtualization will have on your library infrastructure, it is important to have an accurate understanding of the resources and capabilities that will form the foundation for the virtualized infrastructure. it is a misconception that it is necessary to purchase state-of-the-art hardware to implement virtualization. not only are organizations realizing how to utilize existing hardware better with virtualization for specific projects, they are discovering that the technology can be extended to the rest of the organization and be successfully integrated into their it management practices. virtualization does, however, impose certain performance requirements for large-scale deployments that will be used in a 24/7 production environment. in such scenarios, organizations should first compare the level of performance offered by their current hardware resources with the performance of new hardware. the most compelling reasons to buy new servers include the economies of scale that can be obtained by running more vms on fewer, more robust servers, as well as the enhanced performance supplied by newer, more virtualization-aware hardware. in addition, virtualization allows for resources to be used more efficiently, resulting in lower power consumption and cooling costs. also, the network is often one of the most overlooked factors when planning a virtualization project. while a local virtualized environment (i.e., a single computer) may not necessarily require a high-performance network environment, any solution that calls for a hypervisor-based infrastructure requires considerable planning and scaling for bandwidth requirements. the current network hardware available in your infrastructure may not perform or scale adequately to meet the needs of this vm use. again, this highlights the importance of thorough user workflow analyses and testing prior to implementation. depending on the scope of your virtualization project, deployment in your library can potentially be expensive and can have many indirect costs. while the initial investment in hardware is relatively easy to calculate, other factors, such as ongoing staff training and system administration overhead, are much more difficult to determine. in addition, virtualization adds an additional layer to oftentimes already complex software licensing terms. to deal with the increased use of virtualization, software vendors are devoting increasing attention to the intricacies of licensing their products for use in such environments. while virtualization can ameliorate some licensing constraints (as noted in the at workshop use case), it can also conceal and promote licensing violations, such as multiple uses of a single-license application or access to license-restricted materials. license review is a prudent and highly recommended component of implementing a virtualization solution. finally, concerning virtualization software itself, it also should be noted that while commercial vm companies usually provide plentiful resources for aiding implementation, several worthy open-source options also exist. as with any open-source software, the total cost of operation (e.g., the costs of development, maintenance, and support) needs to be considered. ■ conclusion as our use cases illustrate, there are numerous potential applications and benefits of virtualization technology in the library environment.
while we have illustrated a number of these, many more possibilities exist, and further opportunities for its application will be discovered as virtualization technology matures and is adapted by a growing number of libraries. as with any technology, there are many factors that must be taken into account to evaluate if and when virtualization is the right tool for the job. in short, successful implementation of virtualization requires thoughtful planning. when so implemented, virtualization can provide libraries with cost-effective solutions to long-standing problems. references and notes 1. alessio gaspar et al., “the role of virtualization in computing education,” in proceedings of the 39th sigcse technical symposium on computer science education (new york: acm, 2008): 131–32; paul ghostine, “desktop virtualization: streamlining the future of university it,” information today 25, no. 2 (2008): 16; robert p. goldberg, “formal requirements for virtualizable third generation architectures,” in communications of the acm 17, no. 7 (new york: acm, 1974): 412–21; and karissa miller and mahmoud pegah, “virtualization: virtually at the desktop,” in proceedings of the 35th annual acm siguccs conference on user services (new york: acm, 2007): 255–60. 2. for other, non–ucsd use cases of virtualization, see joel c. adams and w. d. laverell, “configuring a multi-course lab for system-level projects,” sigcse bulletin 37, no. 1 (2005): 525–29; david collins, “using vmware and live cd’s to configure a secure, flexible, easy to manage computer lab environment,” journal of computing for small colleges 21, no. 4 (2006): 273–77; rance d. necaise, “using vmware for dual operating systems,” journal of computing in small colleges 17, no. 2 (2001): 294–300; and jason nieh and chris vaill, “experiences teaching operating systems using virtual platforms and linux,” sigcse bulletin 37, no 1 (2005): 520–24. 3. h. andrés lagar-cavilla, “vmgl (formerly xen-gl): opengl hardware 3d acceleration for virtual machines,” www .cs.toronto.edu/~andreslc/xen-gl/ (accessed oct. 21, 2008). editorial board thoughts: “india does not exist.” mark cyzyk information technology and libraries | june 2013 4 often, i find myself trolling online forums, searching for and praying i find a bona-fide solution to a technical problem. typically, my process begins with the annoying discovery that many others are running into the same, or very similar, difficulty. many others. once i get over my initial frustration ("why isn't this problem fixed by now?"), i proceed to read, to attempt to determine which of the often conflicting and even contradictory suggestions for fixing the problem might actually work. i thought it would be instructive to step back for a moment and examine this experience. to do so, i want to use as my example, as my straw man, not a technical question, but a more generic question, the sort of question anyone might conceivably ask. i'll ask this question, then i'll list what i think might be answers, in form and substance, from the technical forums had it been asked there: "i want to go to india. how best to get there?" why would you want to go there? you could fly. you could take a ship. why go to india? iceland is much better. i went to india once and it wasn't that great. you never specify where in india you want to go. we can't help you until you tell us where in india you want to go. i am sick and tired of these people who don't read the forums. your query has been answered before. 
the only way to get there is to fly first class on continental. you could ride a mule to india. new zealand is much better. you should go there instead. it is impossible to go to india. you can get from india to anywhere in europe very easily via india air. you should read a passage to india, i forget who wrote it. i read it as an undergraduate. it was very good. you are an idiot for wanting to go to india. india does not exist. mark cyzyk (mcyzyk@jhu.edu), a member of lita and the ital editorial board, is the scholarly communication architect in the sheridan libraries, the johns hopkins university, baltimore, maryland. i think it's safe to say that the signal-to-noise ratio here is low. if we truly want to answer a question, we don't want to add noise. pontificating, posturing, and automatically posing as a mentor in a mentor/protégé relationship will typically be construed as adding nothing but noise to the signal. in most cases, we who answer such questions are not here to educate, except insofar as we provide a clear and concise answer to a technical query issued by one of our peers. what should we assume? first off, we should assume that the person writing the question is sincere: he truly does want to go to india. we need not question his motives. the best way to think about this is that the query is a hypothetical: if he were to want to go to india, how best to do it? if you were to want to go to india, how best to do it? this requires a certain level of empathy on the part of the one answering the question, a level of empathy of which the technical forums are all but devoid. many answers on those forums are so tone-deaf to human need they may as well have been written by robots. "how best to get there" is tricky because you must make some assumptions. assumptions are fine as long as you're explicit about them. one assumption might be: he is leaving from the east coast of the united states. another assumption might be: he is going to india only for a short while, for a conference or vacation. yet another one might be: by "best" he means "quickest, most efficient, least expensive." stating these assumptions, then stating your answer to the question, is appropriate and is what is most helpful. stating your assumptions is tantamount to stating your understanding of the original question, its scope and context. this is always a helpful thing to do when attempting to communicate with another human being. now, communication and plumbing the depth of human need, at least with respect to informational and bibliographic needs, has always been a strong suit of librarians, so what i write here is not really directed at librarians. it is, though, directed at those of us who straddle both the library world and the technology world, if that distinction is not a false one and can be usefully made. i think it important for those of us split between two cultures to ensure that we fall to one side and not the other, in particular that we do not fall into the oftentimes loutish and ultimately unproductive communication mores exhibited by many of the online technical forums. whenever my wife and i hear a news story on tv or radio openly wondering why more women do not go into i.t., i blurt out something like: "you wanna know why? just go read the comments section of most posts at slashdot.com. why on earth would anyone who didn't have to put up with that kind of culture actually choose to put up with it?"
isn't "india does not exist" exactly the kind of response one would find on slashdot.com if the initial question was, "i want to go to india -how best to get there?"? with all this in mind, i hereby issue my own question, this time a technical one: information technology and libraries | june 2013 6 "i want to programmatically convert a largish set of documents from pdf to docx format. how best to do it?" i hope you don't think i'm an idiot. editorial board thoughts: requiring and demonstrating technical skills for library employment emily morton-owens information technologies and libraries | september 2016 6 recently i’ve been involved in a number of conversations about technical skills for library jobs, sparked by an ital article by monica maceli1 and a code4lib presentation by jennie rose halperin.2 maceli performed a text analysis of job postings on code4lib to reveal what skills are cooccurring and most frequent. halperin problematized the expense of the mls credential in comparison to the qualifications actually required by library technology jobs and the salaries offered for technical versus nontechnical work. this work has inspired many conversations about the shift in skills required for library work, the value placed on different kinds of labor, and how mls programs can teach library technology. during a period of hiring at my institution and through teaching a library school course in which many of the students are on the brink of graduation, my attention has been called particularly to one point in the library employment process: job postings. these advertisements are the first step in matching aspiring library staff with the real-life needs of libraries—where the rubber meets the road between employer expectations and new-grad experience. most libraries already use the practice of distinguishing between required and preferred qualifications, which is a good start, especially for technology jobs where candidates may offer strong learning proficiency yet lack a few particular tools. although there have been conflicting interpretations of the hewlett-packard research suggesting that men are more likely than women to apply to jobs when they don’t meet all the requirements,3 i observe a general tendency among graduating students to err on the side of caution because they’re not sure which qualifications they can claim. among my students, for example, constant confusion attends the years of experience required. is this library experience? general job experience? experience at the same type of library? paid or unpaid? postings are often ambiguous and students may choose to apply or not. similarly, there are questions about what extent of experience qualifies someone to know a technology: mastering it through creating new projects at a paid job, experience maintaining it, or merely basic familiarity? not knowing who has been hired, and on the basis of what kind of experience, is a gap for researchers trying to close the loop on job advertisements. even when a job posting has avoided an overlong list of required technical skills, it might still be expressing a narrow sense of what’s required to qualify. someone who understands subversion will be capable of understanding git, so we see plenty of job advertisements that ask for experience with a “a version control system (e.g. git, subversion, or mercurial).” i recently polled staff in our department and found very few of us with bachelor’s degrees in technical subjects. 
more of us had come to working in library technology through work experience or graduate programs. and yet, our job postings contained long statements that conflated education and experience, such as “bachelor’s degree in computer science, information science, or other emily morton-owens (egmowens@upenn.edu), a member of the ital editorial board, is director of digital library development and systems, university of pennsylvania libraries, philadelphia, pennsylvania. mailto:egmowens@upenn.edu editorial board thoughts | morton-owens doi: 10.6017/ital.v35i3.9527 7 relevant field and at least 3 years of experience application development in object oriented and scripting languages or equivalent combination of education and experience. master’s desirable.” i edited our statement to more clearly allow a combination of factors that would show sufficient preparation: “bachelor’s degree and a minimum of 3-5 years of experience, or an equivalent combination of education and experience, are required; a master’s degree is preferred,” followed by a separate description of technical skills needed. this increased the number and quality of our applications, so i’ll remain on the lookout for opportunities to represent what we want to require more faithfully and with an open mind. meanwhile, on the other side of the table, students and recent grads are uncertain how to demonstrate their skills. first, they’re wondering how to show clearly enough that they meet requirements like “three years of work experience” or “experience with user testing” so that their application is seriously considered. second, they ask about possibilities to formalize skills. recently, i’ve gotten questions about a certificate program in ux and whether there is any formal certification to be a systems librarian. surveying the past experience of my own network—with very diverse paths into technology jobs ranging from undergraduate or second master’s degrees to learning scripting as a technical services librarian to pre-mls work experience—doesn’t suggest any standard method for substantiating technical knowledge. once again, the truth of the situation may be that libraries will welcome a broad range of possible experience, but the postings don’t necessarily signal that. some advice from the tech industry about how to be more inviting to candidates applies to libraries too; for example, avoiding “rockstar”/ “ninja” descriptions, emphasizing the problem space over years of experience,4 and designing interview processes that encourage discussion rather than “gotcha” technical tasks. at penn libraries, for example, we’ve been asking developer candidates to spend a few hours at most on a take-home coding assignment, rather than doing whiteboard coding on the spot. this gives us concrete code to discuss in a far more realistic and relaxed context. while it may be helpful to express requirements better to encourage applicants to see more clearly whether they should respond to a posting, this is a small part of the question of preparing new mls grads for library technology jobs. the new grads who are seeking guidance on substantiating their skills are the ones who are confident they possess them. others have a sense that they should increase their comfort with technology but are not sure how to do it, especially when they’ve just completed a whole new degree and may not have the time or resources to pursue additional training. 
even if we make efforts to narrow the gap between employers and jobseekers, much remains to be discussed regarding the challenge of readying students with different interests and preparation for library employment. library school provides a relatively brief window to instill in students the fundamentals and values of the profession and it can't be repurposed as a coding academy. there persists a need to discuss how to help students interested in technology learn and demonstrate competencies rather than teaching them rapidly shifting specific technologies. references 1. monica maceli, "what technology skills do developers need? a text analysis of job listings in library and information science (lis) from jobs.code4lib.org," information technology and libraries 34, no. 3 (2015): 8–21, doi:10.6017/ital.v34i3.5893. 2. jennie rose halperin, "our $50,000 problem: why library school?" code{4}lib, http://code4lib.org/conference/2015/halperin. 3. tara sophia mohr, "why women don't apply for jobs unless they're 100% qualified," harvard business review, august 25, 2014, https://hbr.org/2014/08/why-women-dont-apply-for-jobs-unless-theyre-100-qualified. 4. erin kissane, "job listings that don't alienate," https://storify.com/kissane/job-listings-that-don-t-alienate. editorial board thoughts: metadata training in canadian library technician programs sharon farnel information technology and libraries | december 2016 the core metadata team at my institution is small but effective. in addition to myself as coordinator, we include two librarians and two full-time metadata assistants. our metadata assistant positions are considered to be similar, in some ways, to other senior assistant positions within the organization which require or at least prefer that individuals have a library technician diploma. however, neither of our metadata assistants has such a diploma. their credentials, in fact, are quite different. in part, this difference is driven by the nature of the work that our metadata assistants do. they work regularly with different metadata standards such as mods, dc, and ddi in addition to marc. they perform operations on large batches of metadata using languages such as xslt or r. this is quite different in many ways from the work of their colleagues who work with the ils, many of whom do have a library technician diploma. as we prepare for an upcoming short-term leave of one of our team members, i have been thinking a great deal about the work our metadata assistants do and whether or not we would find an individual who came through a library technician program who had the skills and knowledge we need a replacement to have. and i have also been reminded of conversations i have had with recently graduated library technicians who felt their exposure to metadata standards, practices, and tools beyond rda and marc had been lacking in their programs. this got me thinking about the presence or absence of metadata courses in library technician programs in canada.
i reached out to two colleagues from macewan university—norene erickson and lisa shamchuk—who are doing in-depth research into library technician education in canada. they kindly provided me with a list of canadian institutions that offer a library technician program so i could investigate further. now, i must begin with two caveats. one, this is very much a surface level scan rather than an indepth examination, although this is simply the first step in what i hope will be a longer term investigation. second, although several francophone institutions in canada offer library technician programs, i did not review their programs; i was concerned that my lack of fluency in the french language could lead to inadvertent misrepresentations. sharon farnel (sharon.farnel@ualberta.ca), a member of the ital editorial board, is metadata coordinator, university of alberta libraries, edmonton, alberta. editorial board thoughts | farnel https://doi.org/10.6017/ital.v35i4.9601 4 canadian institutions offering a library technician program (by province) are: alberta ● macewan university (http://www.macewan.ca/wcm/schoolsfaculties/business/programs/libraryandinforma tiontechnology/) ● southern alberta institute of technology (http://www.sait.ca/programs-and-courses/fulltime-studies/diplomas/library-information-technology) british columbia ● langara college (http://langara.ca/programs-and-courses/programs/library-informationtechnology/) ● university of the fraser valley (http://www.ufv.ca/programs/libit/) manitoba ● red river college (http://me.rrc.mb.ca/catalogue/programinfo.aspx?progcode=libifdp®ioncode=wpg) nova scotia ● nova scotia community college (http://www.nscc.ca/learning_programs/programs/plandescr.aspx?prg=lbtn&pln=libin ftech) ontario ● algonquin college (http://www.algonquincollege.com/healthandcommunity/program/library-andinformation-technician/) ● conestoga college (https://www.conestogac.on.ca/parttime/library-and-informationtechnician) ● confederation college (http://www.confederationcollege.ca/program/library-andinformation-technician) ● durham college (http://www.durhamcollege.ca/programs/library-and-informationtechnician) ● seneca college (http://www.senecacollege.ca/fulltime/lit.html) ● mohawk college (http://www.mohawkcollege.ca/ce/programs/community-services-andsupport/library-and-information-technician-diploma-800) information technologies and libraries | december 2016 5 quebec ● john abbott college (http://www.johnabbott.qc.ca/academics/careerprograms/information-library-technologies/) saskatchewan ● saskatchewan polytechnic (http://saskpolytech.ca/programs-andcourses/programs/library-and-information-technology.aspx) my method was quite simple. using the program websites listed above, i reviewed the course listings looking for ‘metadata’ either in the title or in the description when it was available. of the fourteen (14) programs examined, nine (9) had no course with metadata in the title or description. two (2) programs had courses where metadata was listed as part of the content but not the focus: langara college as part of “special topics: creating and managing digital collections” and seneca college as part of “cataloguing iii” which has a partial focus on metadata for digital collections. three (3) of the programs had a course with metadata in the title or description; all are a variation on “introduction to metadata and metadata applications”. 
(importantly, the three institutions in question, namely conestoga college, confederation college, and mohawk college, are all connected and share courses online). so, what do these very preliminary and impressionistic findings tell us? it seems that there is little opportunity for students enrolled in library technician programs in canada to be exposed to the metadata standards, practices, and tools that are increasingly necessary for positions involved in work with digital collections, research data management, digital preservation, and the like. admittedly, no program can include courses on all potentially relevant topics. in addition, formal course work is only one aspect of training and education that can prepare graduates for their career; practica and work placements and other more informal activities during a program are crucial, as are the skills and knowledge that can only be developed once hired and on the job. nevertheless, based on the investigation above, one would be justified in asking if we are disadvantaging students by not working to incorporate additional coursework focused on metadata standards, application, and tools, as well as on basic skills in manipulation of metadata in large batches.
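to give a concrete sense of the "large batches of metadata" work mentioned above, the sketch below applies a single xslt stylesheet to a folder of xml records. it is only an illustration: the stylesheet name, the folder names, and the use of python's lxml library (rather than the xslt or r workflows the metadata assistants actually use) are assumptions.

```python
# minimal sketch, assuming lxml is installed; file and folder names are hypothetical.
from pathlib import Path
from lxml import etree

transform = etree.XSLT(etree.parse("mods-to-dc.xsl"))  # hypothetical stylesheet

def convert_batch(src_dir="mods_records", out_dir="dc_records"):
    # apply the same transformation to every record in the source folder
    Path(out_dir).mkdir(exist_ok=True)
    for record in sorted(Path(src_dir).glob("*.xml")):
        result = transform(etree.parse(str(record)))
        (Path(out_dir) / record.name).write_bytes(
            etree.tostring(result, xml_declaration=True, encoding="UTF-8", pretty_print=True))

if __name__ == "__main__":
    convert_batch()
```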
a semantic model of selective dissemination of information for digital libraries j. m. morales-del-castillo, r. pedraza-jiménez, a. a. ruíz, e. peis, and e. herrera-viedma in this paper we present the theoretical and methodological foundations for the development of a multi-agent selective dissemination of information (sdi) service model that applies semantic web technologies for specialized digital libraries. these technologies make it possible to achieve more efficient information management, improve agent–user communication processes, and facilitate accurate access to relevant resources. other tools used are fuzzy linguistic modelling techniques (which ease the interaction between users and the system) and natural language processing (nlp) techniques for semiautomatic thesaurus generation. also, rss feeds are used as "current awareness bulletins" to generate personalized bibliographic alerts. nowadays, one of the main challenges faced by information systems at libraries or on the web is to efficiently manage the large number of documents they hold. information systems make it easier to give users access to relevant resources that satisfy their information needs, but a problem emerges when the user has a high degree of specialization and requires very specific resources, as in the case of researchers.1 in "traditional" physical libraries, several procedures have been proposed to try to mitigate this issue, including the selective dissemination of information (sdi) service model that makes it possible to offer users potentially interesting documents by accessing users' personal profiles kept by the library. nevertheless, the progressive incorporation of new information and communication technologies (icts) into information services, the widespread use of the internet, and the diversification of resources that can be accessed through the web have led libraries through a process of reinvention and transformation to become "digital" libraries.2 this reengineering process requires a deep revision of work techniques and methods so librarians can adapt to the new work environment and improve the services provided.
in this paper we present a recommendation and sdi model, implemented as a service of a specialized digital library (in this case, specialized in library and information science), that can increase the accuracy of accessing information and the satisfaction of users’ information needs on the web. this model is built on a multi-agent framework, similar to the one proposed by herrera-viedma, peis, and morales-del-castillo,3 that applies semantic web technologies within the specific domain of specialized digital libraries in order to achieve more efficient information management (by semantically enriching different elements of the system) and improved agent–agent and user–agent communication processes. furthermore, the model uses fuzzy linguistic modelling techniques to facilitate the user–system interaction and to allow a higher grade of automation in certain procedures. to increase improved automation, some natural language processing (nlp) techniques are used to create a system thesaurus and other auxiliary tools for the definition of formal representations of information resources. in the next section, “instrumental basis,” we briefly analyze sdi services and several techniques involved in the semantic web project, and we describe the preliminary methodological and instrumental bases that we used for developing the model, such as fuzzy linguistic modelling techniques and tools for nlp. in “semantic sdi service model for digital libraries,” the bulk of this work, the application model that we propose is presented. finally, to sum up, some conclusive data are highlighted. n instrumental basis filtering techniques for sdi services filtering and recommendation services are based on the application of different process-management techniques that are oriented toward providing the user exactly the information that meets his or her needs or can be of his or her interest. in textual domains, these services are usually developed using multi-agent systems, whose main aims are n to evaluate and filter resources normally represented in xml or html format; and n to assist people in the process of searching for and retrieving resources.4 j. m. morales-del-castillo (josemdc@ugr.es) is assistant professor of information science, library and information science department, university of granada, spain. r. pedrazajiménez (rafael.pedraza@upf.edu) is assistant professor of information science, journalism and audiovisual communication department, pompeu fabra university, barcelona, spain. a. a. ruíz (aangel@ugr.es) is full professor of information science, library and information science department, university of granada. e. peis (epeis@ugr.es) is full professor of information science, library and information science department, university of granada. e. herrera-viedma (viedma@decsai.ugr.es) is senior lecturer in computer science, computer science and artificial intelligence department, university of granada. 22 information technology and libraries | march 2009 traditionally, these systems are classified as either content-based recommendation systems or collaborative recommendation systems.5 content-based recommendation systems filter information and generate recommendations by comparing a set of keywords defined by the user with the terms used to represent the content of documents, ignoring any information given by other users. by contrast, collaborative filtering systems use the information provided by several users to recommend documents to a given user, ignoring the representation of a document’s content. 
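a content-based filter of the kind just described can be reduced to a simple overlap score between the keywords in a user profile and the descriptors of each document. the sketch below is a minimal, self-contained illustration with made-up profiles and documents; it is not the matching function used by any of the systems cited.

```python
# minimal sketch of content-based matching; the profiles and documents are made up.
def jaccard(profile_keywords, doc_terms):
    profile, doc = set(profile_keywords), set(doc_terms)
    union = profile | doc
    return len(profile & doc) / len(union) if union else 0.0

def recommend(profile_keywords, documents, threshold=0.2):
    # score every document against the profile and keep the best matches
    scored = ((jaccard(profile_keywords, terms), doc_id) for doc_id, terms in documents.items())
    return [(doc_id, round(s, 2)) for s, doc_id in sorted(scored, reverse=True) if s >= threshold]

documents = {
    "doc-1": ["metadata", "digital libraries", "rdf"],
    "doc-2": ["circulation", "ils", "acquisitions"],
}
print(recommend(["rdf", "metadata", "semantic web"], documents))
```

a collaborative filter would instead score documents by the ratings or recommendations of similar users, and the hybrid systems mentioned below combine both signals.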
it is common to group users into different categories or stereotypes that are characterized by a series of rules and preferences, defined by default, that represent the information needs and common behavioural habits of a group of related users. the current trend is to develop hybrids that make the most of content-based and collaborative recommendation systems. in the field of libraries, these services usually adopt the form of sdi services that, depending on the profile of subscribed users, periodically (or when required by the user) generate a series of information alerts that describe the resources in the library that fit a user’s interests.6 sdi services have been studied in different research areas, such as the multi-agent systems development domain,7 and, of course, the digital libraries domain.8 presently, many sdi services are implemented on web platforms based on a multi-agent architecture where there is a set of intermediate agents that compare users’ profiles with the documents, and there are input-output agents that deal with subscriptions to the service and display generated alerts to users.9 usually, the information is structured according to a certain data model, and users’ profiles are defined using a series of keywords that are compared to descriptors or the full text of the documents. despite their usefulness, these services have some deficiencies: n the communication processes between agents, and between agents and users, are hindered by the different ways in which information is represented. n this heterogeneity in the representation of information makes it impossible to reuse such information in other processes or applications. a possible solution to these deficiencies consists of enriching the information representation using a common vocabulary and data model that are understandable by humans as well as by software agents. 
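the "common vocabulary and data model" called for above is, in the model described next, rdf. as a minimal illustration of what such enriched, machine-readable statements look like, the sketch below builds two triples about a hypothetical library resource with the rdflib library and dublin core terms; the uris and literal values are invented for the example.

```python
# minimal sketch, assuming rdflib is installed; uris and literals are invented.
from rdflib import Graph, URIRef, Literal, Namespace

DCTERMS = Namespace("http://purl.org/dc/terms/")

g = Graph()
doc = URIRef("http://example.org/library/doc/42")  # hypothetical resource uri
g.add((doc, DCTERMS.title, Literal("semantic sdi services for digital libraries")))
g.add((doc, DCTERMS.subject, Literal("selective dissemination of information")))

# serialize the assertions as turtle so both people and software agents can read them
print(g.serialize(format="turtle"))
```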
the semantic web project takes this idea and provides the means to develop a universal platform for the exchange of information.10 semantic web technologies the semantic web project tries to extend the model of the present web by using a series of standard languages that enable enriching the description of web resources and make them semantically accessible.11 to do that, the project bases itself on two fundamental ideas: (1) resources should be tagged semantically so that information can be understood both by humans and computers, and (2) intelligent agents should be developed that are capable of operating at a semantic level with those resources and that infer new knowledge from them (shifting from the search of keywords in a text to the retrieval of concepts).12 the semantic backbone of the project is the resource description framework (rdf) vocabulary, which provides a data model to represent, exchange, link, add, and reuse structured metadata of distributed information sources, thereby making them directly understandable by software agents.13 rdf structures the information into individual assertions (i.e., "resource, property, property value" triples) and uniquely characterizes resources by means of uniform resource identifiers (uris), allowing agents to make inferences about them using web ontologies or other, simpler semantic structures, such as conceptual schemes or thesauri.14 even though the adoption of the semantic web and its application to systems like digital libraries is not free from trouble (because of the nature of the technologies involved in the project and because of the project's ambitious objectives,15 among other reasons), the way these technologies represent the information is a significant improvement over the quality of the resources retrieved by search engines, and it also allows the preservation of platform independence, thus favouring the exchange and reuse of contents.16 as we can see, the semantic web works with information written in natural language that is structured in a way that can be interpreted by machines. for this reason, it is usually difficult to deal with problems that require operating with linguistic information that has a certain degree of uncertainty (e.g., when quantifying the user's satisfaction in relation to a product or service). a possible solution could be the use of fuzzy linguistic modelling techniques as a tool for improving system–user communication. fuzzy linguistic modelling fuzzy linguistic modelling supplies a set of approximate techniques appropriate for dealing with qualitative aspects of problems.17 the ordinal linguistic approach is defined according to a finite set of tags (s), completely ordered and with odd cardinality (seven or nine tags): $S = \{ s_i \mid i \in H = \{0, \ldots, T\} \}$. the central term has a value of approximately 0.5, and the rest of the terms are arranged symmetrically around it. the semantics of each linguistic term is given by the ordered structure of the set of terms, considering that each linguistic term of the pair $(s_i, s_{T-i})$ is equally informative. each label $s_i$ is assigned a fuzzy value defined in the interval [0, 1] that is described by a linear trapezoidal membership function represented by the 4-tuple $(a_i, b_i, \alpha_i, \beta_i)$. (the first two parameters show the interval where the membership value is 1.0; the third and fourth parameters show the left and right limits of the distribution.)
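to make the linguistic term set more concrete, the sketch below builds a seven-label set $S = \{s_0, \ldots, s_6\}$ with evenly spaced, symmetric triangular membership functions (trapezoids whose flat top collapses to a single point) on [0, 1]. the label names and breakpoints are illustrative assumptions, not values taken from the article.

```python
# minimal sketch of an ordinal linguistic term set; labels and breakpoints are illustrative.
T = 6  # highest index, giving odd cardinality (seven labels s0..s6)
LABELS = ["none", "very_low", "low", "medium", "high", "very_high", "perfect"]

def membership(i, x):
    """membership of a value x in [0, 1] for label s_i (symmetric, evenly spaced)."""
    center = i / T      # the point where membership is 1.0
    spread = 1 / T      # left and right limits of the distribution
    if x < center - spread or x > center + spread:
        return 0.0
    return 1.0 - abs(x - center) / spread

print(LABELS[3], membership(3, 0.5))   # the central term covers values around 0.5
print(LABELS[5], membership(5, 0.7))
```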
additionally, we need to define the following properties: (1) the set is ordered: $s_i \geq s_j$ if $i \geq j$; (2) there is a negation operator: $\mathrm{neg}(s_i) = s_j$, with $j = T - i$; (3) a maximization operator: $\max(s_i, s_j) = s_i$ if $s_i \geq s_j$; and (4) a minimization operator: $\min(s_i, s_j) = s_i$ if $s_i \leq s_j$. it also is necessary to define aggregation operators, such as linguistic weighted averaging (lwa),18 capable of operating with and combining linguistic information. focusing on facilitating the interaction between users and system, the other starting objective is to achieve the development and implementation of the model proposed in the most automated way possible. to do this, we use a basic auxiliary tool, a thesaurus, that, among other tasks, assists users in the creation of their profile and enables automating the generation of alerts. that is why it is critical to define the way in which we create this tool, and in this work we propose a specific method for the semiautomatic development of thesauri using nlp techniques. nlp techniques and other automating tools nlp consists of a series of linguistic techniques, statistical approaches, and machine learning algorithms (mainly clustering techniques) that can be used, for example, to summarize texts in an automatic way, to develop automatic translators, and to create voice recognition software. another possible application of nlp would be the semiautomatic construction of thesauri using different techniques. one of them consists of determining the lexical relations between the terms of a text (mainly synonymy, hyponymy, and hyperonymy),19 and extracting terms that are more representative of the text's specific domain.20 it is possible to elicit these relations by using linguistic tools, like princeton's wordnet (http://wordnet.princeton.edu), and clustering techniques. wordnet is a powerful multilanguage lexical database where each one of its entries is defined, among other elements, by its synonyms (synsets), hyponyms, and hyperonyms.21 as a consequence, once given the most important terms of a domain, wordnet can be used to create from them a thesaurus (after leaving out all terms that have not been identified as belonging or related to the domain of interest).22 this tool can also be used with clustering techniques, for example, to group the documents of a collection in a set of nodes or clusters, depending on their similarity. each of these clusters is described by the most representative terms of its documents. these terms make up the most specific level of a thesaurus and are used to search in wordnet for their synonyms and most general terms, contributing (with the repetition of this procedure) to the bottom-up development process of the thesaurus.23 although there are many others, these are some of the most well-known techniques of semiautomatic thesaurus generation (semiautomatic because, needless to say, the supervision of experts is necessary to determine the validity of the final result). for specialized digital libraries, we propose developing, on a multi-agent platform and using all these tools, sdi services capable of generating alerts and recommendations for users according to their personal profiles.
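the wordnet lookups described above can be reproduced in a few lines of code. the sketch below collects synonyms, broader terms (hypernyms), and narrower terms (hyponyms) for a seed term as raw material for a thesaurus entry; it assumes the nltk interface to wordnet rather than the wordnet 2.1 installation used by the authors, and the seed term is just an example.

```python
# minimal sketch, assuming nltk and its wordnet corpus are installed; the seed term is an example.
from nltk.corpus import wordnet as wn

def thesaurus_entry(term):
    entry = {"synonyms": set(), "broader": set(), "narrower": set()}
    for synset in wn.synsets(term):
        entry["synonyms"].update(lemma.name() for lemma in synset.lemmas())
        for hypernym in synset.hypernyms():
            entry["broader"].update(lemma.name() for lemma in hypernym.lemmas())
        for hyponym in synset.hyponyms():
            entry["narrower"].update(lemma.name() for lemma in hyponym.lemmas())
    return entry

print(thesaurus_entry("library"))
```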
in particular, the model presented here is the result of merging several previous models, and its service is based on the definition of "current-awareness bulletins," where users can find a basic description of the resources recently acquired by the library or of those that might be of interest to them.24

the semantic sdi service model for digital libraries

the sdi service includes two agents (an interface agent and a task agent) distributed in a four-level hierarchical architecture: user level, interface level, task level, and resource level. its main components are a repository of full-text documents (which make up the stock of the digital library) and a series of elements described using different rdf-based vocabularies: one or several rss feeds that play a role similar to that of current-awareness bulletins in traditional libraries; a repository of recommendation log files that store the recommendations made by users about the resources; and a thesaurus that lists and hierarchically relates the most relevant terms of the specialization domain of the library.25 also, the semantics of each element (that is, its characteristics and the relations the element establishes with other elements in the system) are defined in a web ontology developed in web ontology language (owl).26 next, we describe these main elements as well as the different functional modules that the system uses to carry out its activity.

elements of the model

there are four basic elements that make up the system: the thesaurus, user profiles, rss feeds, and recommendation log files.

thesaurus

an essential element of this sdi service is the thesaurus, an extensible tool used in traditional libraries that enables organizing the most relevant concepts in a specific domain and defining the semantic relations established between them, such as equivalence, hierarchical, and associative relations. the functions defined for the thesaurus in our system include helping in the indexing of rss feed items and in the generation of information alerts and recommendations. to create the thesaurus, we followed the method suggested by pedraza-jiménez, valverde-albacete, and navia-vázquez.27 the learning technique used for the creation of a thesaurus includes four phases: preprocessing of documents, parameterizing the selected terms, conceptualizing their lexical stems, and generating a lattice or graph that shows the relations between the identified concepts. essentially, the aim of the preprocessing phase is to prepare the documents for parameterization by removing elements regarded as superfluous. we have developed this phase in three stages: eliminating tags (stripping), standardizing, and stemming. in the first stage, all the tags (html, xml, etc.) that can appear in the collection of documents are eliminated. the second stage is the standardization of the words in the documents in order to facilitate and improve the parameterization process. at this stage, the acronyms and n-grams (bigrams and trigrams) that appear in the documents are identified using lists created for that purpose. once the acronyms and n-grams have been detected, the rest of the text is standardized: dates and numerical quantities are substituted with a label that identifies them, all terms (except acronyms) are changed to lowercase, and punctuation marks are removed.
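a rough sketch of the stripping and standardization stages described so far (stop-word removal and stemming follow below); the regular expressions, placeholder tokens, and sample acronym and n-gram lists are illustrative assumptions, not the authors' implementation.

```python
# rough sketch of the standardization stage: protect known n-grams, replace
# dates and numbers with placeholder tokens, lowercase everything except
# recognized acronyms, and drop punctuation. all lists and patterns here are
# invented for the example.
import re

ACRONYMS = {"OWL", "RDF", "SDI"}                 # hypothetical acronym list
NGRAMS = {"digital library", "semantic web"}      # hypothetical n-gram list

def standardize(text):
    # join known n-grams so they survive tokenization as single terms
    for ngram in NGRAMS:
        pattern = re.compile(re.escape(ngram), flags=re.IGNORECASE)
        text = pattern.sub(ngram.replace(" ", "_"), text)
    # replace dates and numeric quantities with placeholder tokens
    text = re.sub(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b", "<date>", text)
    text = re.sub(r"\b\d+([.,]\d+)?\b", "<num>", text)
    # lowercase everything except recognized acronyms, keep word-like tokens only
    tokens = []
    for token in re.findall(r"[\w<>]+", text):
        tokens.append(token if token in ACRONYMS else token.lower())
    return tokens

if __name__ == "__main__":
    sample = "The Digital Library added 120 RDF records on 14/03/2007."
    print(standardize(sample))
```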
finally, a list of function words is used to eliminate from the texts articles, determiners, auxiliary verbs, conjunctions, prepositions, pronouns, interjections, contractions, and degree adverbs. all the terms are stemmed to facilitate the search for the final terms and to improve their calculation during parameterization. to carry out this task, we have used morphy, the stemming algorithm used by wordnet. this algorithm implements a group of functions that check whether a term is an exception that does not need to be stemmed and then convert words that are not exceptions to their basic lexical form. those terms that appear in the documents but are not identified by morphy are eliminated from our experiment. the parameterization phase has minimal complexity: once identified, the final terms (roots or bases) are quantified by being assigned a weight. such a weight is obtained by applying the term frequency-inverse document frequency (tf-idf) scheme, a statistical measure that quantifies the importance of a term or n-gram in a document depending on its frequency of appearance both in the document and in the collection the document belongs to. finally, once the documents have been parameterized, the meanings associated with each term (lemma) are extracted by searching for them in wordnet (specifically, we use wordnet 2.1 for unix-like systems). thus we get the group of synsets associated with each word. the groups of hyperonyms and hyponyms are also extracted from the vocabulary of the analyzed collection of documents. the generation of our thesaurus (that is, the identification of the descriptors that best represent the content of the documents and of the underlying relations between them) is achieved using formal concept analysis techniques. this categorization technique uses the theory of lattices and ordered sets to find abstraction relations from the groups it generates. furthermore, this technique enables clustering the documents depending on the terms (and synonyms) they contain. also, a lattice graph is generated according to the underlying relations between the terms of the collection, taking into account the hyperonyms and hyponyms extracted. in that graph, each node represents a descriptor (namely, a group of synonym terms) and clusters the set of documents that contain it, linking them to those with which it has any relation (of hyponymy or hyperonymy). once the thesaurus is obtained by identifying its terms and the underlying relations between them, it is automatically represented using the simple knowledge organization system (skos) vocabulary (see figure 1).28

user profiles

user profiles can be defined as structured representations that contain the personal data, interests, and preferences of users, with which agents can operate to customize the sdi service. in the model proposed here, these profiles are basically defined with friend of a friend (foaf), a specific rdf/xml vocabulary for describing people (which favours profile interoperability, since it is a widespread vocabulary supported by an owl ontology), and another nonstandard vocabulary of our own to define fields not included in foaf (see figure 2).29 profiles are generated the moment the user is registered in the system, and they are structured in two parts: a public profile that includes data related to the user's identity and affiliation, and a private profile that includes the user's interests and preferences about the topic of the alerts he or she wishes to receive.
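as a rough illustration of what such a foaf-based profile might look like (along the lines of the sample in figure 2), the sketch below uses rdflib to combine foaf properties with a hypothetical custom namespace for the preference and satisfaction-frequency fields; the namespace uri, property names, and values are invented, not the vocabulary actually used by the model.

```python
# rough sketch of a foaf-based user profile extended with a custom vocabulary
# for preferences and satisfaction frequencies; every uri, property name, and
# value below is a hypothetical example.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import FOAF, RDF

SDI = Namespace("http://example.org/sdi#")      # hypothetical custom vocabulary

g = Graph()
g.bind("foaf", FOAF)
g.bind("sdi", SDI)

user = URIRef("http://example.org/users/u001")
g.add((user, RDF.type, FOAF.Person))
g.add((user, FOAF.name, Literal("diego allione")))            # public profile
g.add((user, FOAF.mbox_sha1sum, Literal("af9fa7601df46e95566")))

pref = URIRef("http://example.org/users/u001/pref/1")          # private profile
g.add((user, SDI.hasPreference, pref))
g.add((pref, SDI.topic, Literal("library management")))
g.add((pref, SDI.satisfactionFrequency, Literal("often")))

print(g.serialize(format="turtle"))
```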
to define their preferences, users must specify keywords and concepts that best define their information needs. later, the system compares those concepts with the terms in the thesaurus, using the edit tree algorithm as a similarity measure.30 this function matches character strings and returns the term introduced (if there is an exact match) or the lexically most similar term (if not). consequently, if the suggested term satisfies the user's expectations, it will be added to the user's profile together with its synonyms (if any). in those cases where the suggested term is not satisfactory, the system must provide some tool or application that enables users to browse the thesaurus and select the terms that better describe their needs. an example of this type of application is thmanager (http://thmanager.sourceforge.net), a project of the universidad de zaragoza, spain, that enables editing, visualizing, and navigating structures defined in skos. each of the terms selected by the user to define his or her areas of interest has an associated linguistic frequency value (tagged as ) that we call the "satisfaction frequency." it represents the regularity with which a particular preference value has been used in alerts positively evaluated by the user. this frequency measures the relative importance of the preferences stated by the user and allows the interface agent to generate a ranked list of results. the range of possible values for these frequencies is defined by a group of seven labels that we get from the fuzzy linguistic variable "frequency," whose expression domain is defined by the linguistic term set s = {always, almost_always, often, occasionally, rarely, almost_never, never} ("occasionally" being the central value).

figure 1. sample entry of a skos core thesaurus
figure 2. user profile sample

rss feeds

thanks to the popularization of blogs, there has been widespread use of several vocabularies specifically designed for the syndication of contents (that is, for making the content of a website accessible to other internet users by means of hyperlink lists called "feeds"). to create our current-awareness bulletin we use rss 1.0, a vocabulary that enables managing hyperlink lists in an easy and flexible way. it utilizes the rdf/xml syntax and data model and is easily extensible because of the use of modules that extend the vocabulary without modifying its core each time new describing elements are added. in this model several modules are used: the dublin core (dc) module, to define the basic bibliographic information of the items using the elements established by the dublin core metadata initiative (http://dublincore.org); the syndication module, to help software agents synchronize and update rss feeds; and the taxonomy module, to assign topics to feed items. the structure of the feeds comprises two areas: one where the channel itself is described by a series of basic metadata like a title, a brief description of the content, and the updating frequency; and another where the descriptions of the items that make up the feed (see figure 3) are defined (including elements such as title, author, summary, hyperlink to the primary resource, date of creation, and subjects).

figure 3. rss feed item sample
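a minimal sketch of how one such feed item might be expressed with rdflib, combining the rss 1.0 and dublin core namespaces; the item uri and property values below are loose, invented approximations of the sample shown in figure 3, not the feed actually produced by the system.

```python
# minimal sketch of an rss 1.0 feed item described with dublin core elements;
# the uris, title, creator, date, and subject are invented example values.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DC, RDF

RSS = Namespace("http://purl.org/rss/1.0/")

g = Graph()
g.bind("rss", RSS)
g.bind("dc", DC)

item = URIRef("http://example.org/feed/item/42")
g.add((item, RDF.type, RSS.item))
g.add((item, RSS.title, Literal("broadcasting and the internet")))
g.add((item, RSS.link, URIRef("http://eprints.rclis.org/")))
g.add((item, DC.creator, Literal("escudero sánchez, manuel")))
g.add((item, DC.description, Literal("this paper is about ...")))
g.add((item, DC.date, Literal("2002")))
g.add((item, DC.subject, Literal("virtual communities")))

print(g.serialize(format="pretty-xml"))
```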
recommendation log file

each document in the repository has an associated recommendation log file in rdf that includes the listing of evaluations assigned to that resource by different users since the resource was added to the system. each entry in a recommendation log file consists of a recommendation value, a uri that identifies the user who made the recommendation, and the date of the record (see figure 4). the expression domain of the recommendations is defined by the following set of five fuzzy linguistic labels extracted from the linguistic variable "quality of the resource": q = {very_low, low, medium, high, very_high}.

figure 4. recommendation log file sample

these elements represent the raw materials for the sdi service that enable it to develop its activity through four processes or functional modules: the profiles updating process, the rss feeds generation process, the alert generation process, and the collaborative recommendation process.

system processes

profiles updating process

since the sdi service's functions are based on generating passive searches to rss feeds from the preferences stored
in a user's profile, updating the profiles becomes a critical task. user profiles are meant to store long-term preferences, but the system must be able to detect any subtle change in these preferences over time to offer accurate recommendations. in our model, user profiles are updated using a simple mechanism that enables finding users' implicit preferences by applying fuzzy linguistic techniques and taking into account the feedback users provide. users are asked about their satisfaction degree (ej) in relation to the information alert generated by the system (i.e., whether the items retrieved are interesting or not). this satisfaction degree is obtained from the linguistic variable "satisfaction," whose expression domain is the set of seven linguistic labels s' = {total, very_high, high, medium, low, very_low, null}. this mechanism updates the satisfaction frequency associated with each user preference according to the satisfaction degree ej. it requires the use of a matching function similar to those used to model threshold weights in weighted search queries.31 the function proposed here rewards the frequencies associated with the preference values present when the resources assessed are satisfactory, and it penalizes them when this assessment is negative. let ej ∈ s' be the degree of satisfaction, and let fi,l ∈ s be the frequency of property i (in this case, i = "preference") with value l; we then define the updating function g: s' × s → s.

we have provided the information below as a downloadable pdf should you wish to keep it for your records. the purpose of the study is to establish an understanding of the degree of institutional engagement in web content strategy within academic and research libraries, and what trends may be detected in this area of professional practice. our primary subject population consists of academic and research libraries that are members of the following nationally and regionally significant membership organizations (excluding nonacademic member institutions): association of research libraries, big ten academic alliance, greater western library alliance, and/or the oberlin group. if you opt to participate, we expect that you will be in this research study for the duration of the time it takes to complete our web-based survey. you will not be paid to be in this study. whether or not you take part in this research is your choice. you can leave the research at any time and it will not be held against you. we expect about 210 people, representing their institutions, in the entire study internationally. this survey will be available over a four-week period in the spring of 2020, through friday, may 1.

** confidentiality
-----------------------------------------------------------
information obtained about you for this study will be kept confidential to the extent allowed by law.
research information that identifies you may be shared with the university of colorado boulder institutional review board (irb) and others who are responsible for ensuring compliance with laws and regulations related to research, including people on behalf of the office for human information technology and libraries march 2021 web content strategy in practice within academic libraries | mcdonald and burkhardt 28 research protections. the information from this research may be published for scientific purposes; however, your identity will not be given out. ** questions ----------------------------------------------------------- if you have questions, concerns, or complaints, or think the research has hurt you, contact the research team at crmcdonald@colorado.edu. this research has been reviewed and approved by an irb. you may talk to them at (303) 735 3702 or irbadmin@colorado.edu if: * your questions, concerns, or complaints are not being answered by the research team. * you cannot reach the research team. * you want to talk to someone besides the research team. * you have questions about your rights as a research subject. * you want to get information or provide input about this research. thank you for your consideration, courtney mcdonald crmcdonald@colorado.edu heidi burkhardt heidisb@umich.edu ============================================================ not interested in participating? you can ** unsubscribe from this list (*|unsub|*). this email was sent to *|email|* (mailto:*|email|*) why did i get this? (*|about_list|*) unsubscribe from this list (*|unsub|*) update subscription preferences (*|update_profile|*) information technology and libraries march 2021 web content strategy in practice within academic libraries | mcdonald and burkhardt 29 recruitment email: named recipients dear library colleague, we are writing today to ask for your participation in a research project “content strategy in practice within academic libraries,” (cu boulder irb protocol #18-0670), led by co-investigators courtney mcdonald and heidi burkhardt (university of michigan). our primary subject population consists of academic and research libraries that are members of the following nationally and regionally significant membership organizations (excluding non academic member institutions): association of research libraries, big ten academic alliance, greater western library alliance, and/or the oberlin group. we ask that you forward this message to the person in your organization whose role includes oversight of your public web site. we are only requesting a response from one person at each institution contacted. thank you for your assistance in routing this request. we have provided the information below as a downloadable pdf should you wish to keep it for your records. the purpose of the study is to establish an understanding of the degree of institutio nal engagement in web content strategy within academic and research libraries, and what trends may be detected in this area of professional practice. if someone within your library opts to participate, we expect that person will be in this research study for the duration of the time it takes to complete our web-based survey. the participant will not be paid to be in this study. whether or not someone in your library takes part in this research is an individual choice. the participant can leave the research at any time and it will not be held against them. we expect about 210 people, representing their institutions, in the entire study internationally. 
this survey will be available over a four-week period in the spring of 2020, through friday, may 1. ** confidentiality ----------------------------------------------------------- information obtained about you for this study will be kept confidential to the extent allowed by law. research information that identifies you may be shared with the university of co lorado boulder institutional review board (irb) and others who are responsible for ensuring compliance with laws and regulations related to research, including people on behalf of the office for human research protections. the information from this research may be published for scientific purposes; however, your identity will not be given out. information technology and libraries march 2021 web content strategy in practice within academic libraries | mcdonald and burkhardt 30 ** questions ----------------------------------------------------------- if you have questions, concerns, or complaints, or think the research has hurt you, contact the research team at crmcdonald@colorado.edu. this research has been reviewed and approved by an irb. you may talk to them at (303) 735 3702 or irbadmin@colorado.edu if: * your questions, concerns, or complaints are not being answered by the research team. * you cannot reach the research team. * you want to talk to someone besides the research team. * you have questions about your rights as a research subject. * you want to get information or provide input about this research. thank you for your consideration, courtney mcdonald crmcdonald@colorado.edu heidi burkhardt heidisb@umich.edu ============================================================ not interested in participating? you can ** unsubscribe from this list (*|unsub|*). this email was sent to *|email|* (mailto:*|email|*) why did i get this? (*|about_list|*) unsubscribe from this list (*|unsub|*) update subscription preferences (*|update_profile|*) information technology and libraries march 2021 web content strategy in practice within academic libraries | mcdonald and burkhardt 31 appendix c: survey questions web content strategy methods and maturity start of block: introduction q1 web content strategy methods and maturity in academic libraries (cu boulder irb protocol #20-0581) purpose of the study the purpose of the study is to gather feedback from practitioners on the proposed content strategy maturity model for academic libraries, and to further enhance our understanding of web content strategy practice in academic libraries and the needs of its community of practice. q2 please make a selection below, in lieu of your signature, to document that you h ave read and understand the consent form, and voluntarily agree to take part in this research. o yes, i consent to take part in this research. (1) o no, i do not grant my consent to take part in this research. (2) skip to: end of survey if q2 = no, i do not grant my consent to take part in this research. 
end of block: introduction start of block: demographic information information technology and libraries march 2021 web content strategy in practice within academic libraries | mcdonald and burkhardt 32 q3 estimated total number of employees (fte) at your library organization: o less than five (12) o 5-10 (13) o 11-20 (14) o 21-99 (15) o 100-199 (16) o 200+ (17) q4 estimated number of employees with editing privileges within your primary library website: o less than five (12) o 5-10 (13) o 11-20 (14) o 21-99 (15) o 100-199 (16) o 200+ (17) q5 does your library have a documented web content strategy and / or a web content governance policy? o no (1) o yes (2) information technology and libraries march 2021 web content strategy in practice within academic libraries | mcdonald and burkhardt 33 q6 are there position(s) within your library whose primary duties are focused on creation, management, and/or editing of web content? o no (1) o yes, including myself (2) o yes, not including myself (3) end of block: demographic information start of block: web content strategy q7 please indicate the degree to which each of the five elements of content strategy are currently in practice at your library. q8 creation employ editorial workflows, consider content structure, support writing. definitely true (48) somewhat true (49) somewhat false (50) definitely false (51) this is currently in practice at my institution. (1) o o o o information technology and libraries march 2021 web content strategy in practice within academic libraries | mcdonald and burkhardt 34 q9 delivery consider findability, discoverability, and search engine optimization, plus choice of content platform or channels. definitely true (48) somewhat true (49) somewhat false (50) definitely false (51) this is currently in practice at my institution. (1) o o o o q10 governance support maintenance and lifecycle of content, as well as measurement and evaluation. definitely true (31) somewhat true (32) somewhat false (33) definitely false (34) this is currently in practice at my institution. (1) o o o o q11 planning use an intentional and strategic approach, including brand, style, and writing best practices. definitely true (31) somewhat true (32) somewhat false (33) definitely false (34) this is currently in practice at my institution. (1) o o o o information technology and libraries march 2021 web content strategy in practice within academic libraries | mcdonald and burkhardt 35 q12 user experience consider needs of the user to produce relevant, current, clear, concise, and in context. definitely true (31) somewhat true (32) somewhat false (33) definitely false (34) this is currently in practice at my institution. (1) o o o o q13 please rank the elements of content strategy (as defined above) in order of their priority based on your observations of practice in your library. • ______ creation (1) • ______ delivery (2) • ______ governance (3) • ______ planning (4) • ______ user experience (5) q14 how would you assess the content strategy maturity of your organization? o basic (1) o intermediate (2) o advanced (3) end of block: web content strategy information technology and libraries march 2021 web content strategy in practice within academic libraries | mcdonald and burkhardt 36 start of block: thank you! q15 your name: ________________________________________________________________ q16 thank you very much for your willingness to be interviewed as part of our research study. 
prior to continuing on to finalize your survey submission, please sign up for an interview time: [link] (this link will open in a new window in order to allow you to finalize and submit your survey response after scheduling an appointment) please contact courtney mcdonald, crmcdonald@colorado.edu, if you experience any difficulty in registering or if there is not a time available that works for your schedule. end of block: thank you! information technology and libraries march 2021 web content strategy in practice within academic libraries | mcdonald and burkhardt 37 appendix d: informed consent document permission to take part in a human research study page 37 of 28 title of research study: content strategy in practice within academic libraries irb protocol number: 18-0670 investigators: courtney mcdonald and heidi burkhardt purpose of the study the purpose of the study is to establish an understanding of the degree of institutional engagement in web content strategy within academic and research libraries, and what trends may be detected in this area of professional practice. our primary subject population consists of academic and research libraries that are members of the following nationally and regionally significant membership organizations (excluding nonacademic member institutions): association of research libraries, big ten academic alliance, and/or greater western library alliance. we expect that you will be in this research study for the duration of the time it takes to complete our web-based survey. we expect about 210 people, representing their institutions, in the entire study internationally. explanation of procedures we are directly contacting each library to request that the appropriate individual(s) complete a web-based survey. this survey will be available over a four-week period in the spring of 2020. voluntary participation and withdrawal whether or not you take part in this research is your choice. you can leave the research at any time and it will not be held against you. the person in charge of the research study can remove you from the research study without your approval. possible reasons for removal include an incomplete survey submission. confidentiality information obtained about you for this study will be kept confidential to the extent allowed by law. research information that identifies you may be shared with the university of colorado boulder institutional review board (irb) and others who are responsible for ensuring compliance with laws and regulations related to research, including people on behalf of the office for human research protections. the information from this research may be published for scientific purposes; however, your identity will not be given out. payment for participation you will not be paid to be in this study. contact for future studies we would like to keep your contact information on file so we can notify you if we have future research studies we think you may be interested in. this information will be used by only th e information technology and libraries march 2021 web content strategy in practice within academic libraries | mcdonald and burkhardt 38 principal investigator of this study and only for this purpose. you can opt-in to provide your contact information at the end of the online survey. questions if you have questions, concerns, or complaints, or think the research has hurt you, contact to the research team at crmcdonald@colorado.edu this research has been reviewed and approved by an irb. 
you may talk to them at (303) 7353702 or irbadmin@colorado.edu if: • your questions, concerns, or complaints are not being answered by the research team. • you cannot reach the research team. • you want to talk to someone besides the research team. • you have questions about your rights as a research subject. • you want to get information or provide input about this research. signatures in lieu of your signature, your acknowledgement of this statement in the online survey document documents your permission to take part in this research. mailto:crmcdonald@colorado.edu mailto:irbadmin@colorado.edu information technology and libraries march 2021 web content strategy in practice within academic libraries | mcdonald and burkhardt 39 appendix e: other content management systems mentioned by respondents question #4: which of the following content management systems does your library use to manage library-authored web content? write-in responses for ‘proprietary system hosted by institution’ ● xxxxxxxxxxx • archivesspace • pressbooks • preservica • hippo cms • siteleaf • cascade • dotcms • terminal four • acquia drupal • fedora based digital collections system built in house write-in responses for ‘other” • wiki and blog • we draft content in google docs & also use gather content for auditing. • google sites • cascade • ebsco stacks • modx • islandora and online journal system • contentful • we also have some in-house-built tools such as for room booking; some of these are quite old and we would like to upgrade or improve them when we can. (very few people can make edits in these tools.) • cascade • the majority of the library website (and university website) is managed by a locally developed cms; however, the university is in the process of migrating to the acquia drupal cms. • blacklight, vivo, fedora • most pages are just non-cms for the website information technology and libraries march 2021 web content strategy in practice within academic libraries | mcdonald and burkhardt 40 appendix f: organizational responsibility for content; and position titles question 6 please explain how your organization distributes responsibility for content hosted in your content management system(s). if different parties (individuals, departments, collaborative groups) are responsible for managing content in different platforms please describe. • we have one primary website manager who oversees the management of the website, including content strategy and editing, and 2 editors who assist with small editing tasks. • we have content editors that edit content for individual libraries and collections. there is a content creator network managed by library communications. they provide trainings and guidance for content editors and act as reviewers, but not every single thing gets reviewed. • we have a team of developers and product owners who are responsible for managing web content. • we currently have a very distributed model, where virtually any library staff member or student assistant can request a drupal account and then make changes to existing content or develop new pages. we have a cross-departmental team that oversees the libraries' web interfaces and makes decisions about library homepage content, the menu navigation, overall ia, etc. we have web content guidelines to help staff as they develop new content. we have identified functional and technical owners for each of our cmss and have slightly different processes for managing content in those cmss. 
our general approach, however, is very inclusive (for better or worse ;) )-lots of staff have access to creating and editing content. we are, however, moving to a less distributed content for drupal in particular. moving forward, we'll have a small team responsible for editing and developing new content. this is to ensure that content is more consistent and user-centered. we attempted to identify funding for a full-time content manager but were unsuccessful, so this team will attempt to fill the role of a full-time content manager. • ux is the product owner and admin. if staff want content added to the website, they send a request to ux, we structure and edit content in a google doc, and then ux posts to the website. • there's no method for how or why responsibility is distributed. it ends up being something like, someone wants to add some content, they get editing access, they can now edit anything for as long as they're at the library. we are a super decentralized and informal library. • the primary content managers are the xxxxxx librarian and the xxxxxx. other individuals (primarily librarians) that are interested in editing their content have access on our development server. their edits are vetted by the xxxxxxand/or the xxxxxx librarian before being moved into production. • the xxxxxx department (6 staff) manages content and helps staff throughout the organization create and maintain content. ux staff sometimes teach others how to manage content, and sometimes do it for them. if design or content is complex, usually ux staff do the work. many staff don't maintain any content beyond their staff pages. subject specialists and instruction librarians maintain content [like] libguides-like content, but we don't use libguides. branch library staff maintain most of the content for their library pages. information technology and libraries march 2021 web content strategy in practice within academic libraries | mcdonald and burkhardt 41 • in addition, the xxxxxx manages the catalog. the xxxxxx department manages special web projects. and the xxxxxx department manages social media, publications, and news. • a web content team made up of two administrators and librarians from xxxxxx and xxxxxx makes executive-level decisions about web content. • the xxxxxx team (xxxxxx) provides oversight and consulting for online user interfaces chaired by a xxxxxxposition which is new and is not yet filled. • for the public website, content editing is distributed to many groups and teams throughout the libraries. • the xxxxxxteam manages the main portions of the site including the homepage, news, maps, calendars, etc. the research librarians and subject liaisons manage the research guides. the xxxxxx provides guidance regarding overall responsibilities and style guidelines. • site structure and top-level pages for our main website resides with xxxxxx. page content is generally distributed to the departments closest to the services described by the pages. • right now editing of pages is distributed to those individuals who have the closest relationship to the pages being edited, with a significantly smaller number of people having administrative access to all of the libraries' websites. • primary website is co-managed by xxxxxx team (4 people) and xxxxxx team (3 people). xxxxxxteam creates timely content about news/events/initiatives while xxxxxx team manages content on evergreen topics. • research librarians and staff manage libguides content, which is in sore need of an inventory and pruning. 
• primarily me, plus two colleagues who serve with me as a web editorial board • one librarian manages the content and makes changes based on requests from other library staff • my role (xxxxxx) is xxxxxx. we also have a web content creator in our xxxxxx. i chair our xxxxxxgroup (xxxxxx), which has representatives from each division in the library and they are the primary stewards of supporting library authored web content. our "speciality" platforms (libguides, omeka, and wordpress for microsites) all have service leads, but content is managed by the respective stakeholders. the lead for libguides is a xxxxxx [group] member due to its scope and scale. in our primary website, we are currently structured around drupal organic groups for content management with xxxxxx [group] having broad editing access. in our new website, all content management will go through the xxxxxx, with communications for support and dynamic content (homepage, news, events) management. • management is somewhat in flux right now. we recently migrated our main web site to acquia drupal; there is a very new small committee consisting of xxxxxx, and three representatives from elsewhere in the library. for libguides, all reference, instructio n, and subject librarians can edit their own guides; the xxxxxx has tended to have final oversight but i don't know if this has ever been formally delegated. • librarians manage their own libguides subject guides; several members of xxxxxx can make administrative changes to coding, certificates, etc. on the entire site; there are individuals in different departments who control their own pages/libguides. there is a group within the library that administers wordpress for the institution. other content systems are administered by individuals within the library. information technology and libraries march 2021 web content strategy in practice within academic libraries | mcdonald and burkhardt 42 • librarians are responsible for their own libguides. the xxxxxx department manages changes to most content, although some staff do manage their own wordpress content. they tend not to want to. • individuals. mainly one person authors content. the other individual has created some research guides. • individuals in different positions and departments within the library are assigned roles based on the type of content they frequently need to edit. • for instance, xxxxxx staff have the ability to create and edit exhibition content in drupal. xxxxxx staff and xxxxxx staff have the ability to create and edit equipment content. the event coordinator and librarians and staff involved in instruction are allowed to create and edit event and workshop listings. • only the communication coordinator is permitted to create news items that occupy real estate on the home page and various service point home pages. • as for general content, the primary internal stakeholders for that content typically create and edit that content, but if any staff notice a typo or factual error they are encouraged to correct them on their own, although they can also submit a request to the it department if they are not comfortable doing so. • subject specific content is hosted in libguides, and is maintained by subject liaison librarians. other content in libguides, software tutorials or information related to electronic resources for example, is created and maintained by appropriate specialists. 
• the drupal site when launched had internal stakeholders explicitly defined for each page, and only staff from the appropriate group could edit that content (e.g. if xxxxxx was tagged as the only stakeholder for a page about xxxxxx policies, then only staff from the xxxxxx department with editing privileges could edit that page). this system was abandoned after about two years as it was considered too much overhead to maintain and also the introduction of a content revisioning module that kept a history of edits alleviated fears of malicious editing. • individuals are assigned pages to keep content updated. the xxxxxx is responsible for coordinating with those staff and offers training to make sure content gets updated. • individual liaison librarians are responsible for their own libguides. i and the "xxxxxx" are the primary editors of the wordpress site, although 4 others have editing access (an employee who writes and posts news articles, the liaison librarian who spearheaded our new video tutorials, and two who work in special collections to update finding aids on that site, which is still on wordpress and i would consider under the main libraries web page, but is part of a multisite installation.) • in omeka and libguides, librarians are pretty self-sufficient and responsible for all of their own content. the three or four digital projects faculty and staff who work with omeka manage it internally alongside one of our developers. our omeka instance is relatively small-scale. • i (xxxxxx) oversee our libguides environment. while i am in the process of creating and implementing formal libguides content and structure guidelines, as of now it's a bit of a free-for-all with everyone responsible for the content pertaining to their own liaison department(s). content is made available to patrons via automatically populating legacy landing pages (we've had libguides for a decade and i've been with the institution not yet a year). information technology and libraries march 2021 web content strategy in practice within academic libraries | mcdonald and burkhardt 43 • as the xxxxxx, i am ultimately responsible for almost all of the content in our wordpress environment. that said, i try to distribute content upkeep responsibilities to the relevant department for each piece of the site. managers and committee chairs provide me with what they want on the web, and as needed (and in consultation with them) i review/rewrite it for the web (plain language), develop information architecture, design the front-end, and accessibly publish the content. there are only a few faculty and staff at my library who are comfortable with wordpress -but one of my long-term goals is to empower more folks to enact their own minor edits (e.g., updating hours, lending policies, etc.) while i oversee large-scale content creation, overall architecture, and strategy. we have a blog portion of our wordpress site which is not managed by anyone in particular, but i tend to clean it up if things go awry. • generally all of our web authors *can* publish to most parts of the site. (a very few content types (mostly featured images that display on home pages) can be edited only by admins and a small number of super-users.) however the great majority of people who can post content very rarely do (and some never do). some edit or post only to specific blogs, some only to their own guides or to very specific pages or suites of pages (e.g. liaison librarians to their own guides; thesis assistant to thesis pages). 
our small group in xxxxxx reviews new and updated pages and edits for in-house style and usability guidelines, and also trains and works collaboratively with web authors to create more usable content and reduce duplication -but given the large number of authors (with varied priorities, skills, and preferences) and pages we have trouble keeping up. we also more actively manage content on home pages. • for the main website and intranet, we have areas broken apart by unit area. we use workbench access to determine who can edit which pages. libguides is managed by committee, but most of the librarians have access. proprietary systems have separate accounts for those who need access. • for libguides, librarians can create content as they like, though there is a group that provides some (light) oversight. for main library website, most content is overseen by departments (in practice, one person each from a handful of “areas”, such as the branches, access services, etc.). • dotcms is primarily managed in systems (2 staff), with delegates from admin and outreach allowed to make limited changes to achieve their goals. libguides is used by all librarians and several staff, with six people given admin privileges. wordpress is used only in special collections. • xxxxxx dept manages major public facing platforms (drupal, wordpress, and shares libguides responsibilities with xxxxxx dept). xxxxxx manages omeka. within platforms, responsibilities are largely managed by department with individuals assigned content duties & permissions as needed. • different units maintain their content; one unit has overall management and checks for uniformity, needed updates, and broken links. • developers/communications office oversees some aspects, library management, research and collections librarians, and key staff edit other pieces. • currently, content is maintained by the xxxxxx librarian in coordination with content stakeholders from around the organization. we are in the process of migrating our site from drupal to omniupdate. once that is complete, we will develop a new model for content responsibilities. information technology and libraries march 2021 web content strategy in practice within academic libraries | mcdonald and burkhardt 44 • content is provided by department/services. • 5 librarians manage the libguides question 9 titles of positions in your organization whose primary duties involve creation, management and/or editing of web content: • head of web services; developer; web designer; user experience librarian • user experience librarian, lead librarian for discovery systems, digital technologies development librarian, lead librarian for software development. and we have titles that are university system it titles that don't mean a whole lot, such as technology support specialist and business and technology applications analyst. • web content specialist • user experience strategist, user experience designer, user experience student assistants , director of marketing communications and events • sr. 
ux specialist • web support consultant; coordinator, web services & library technology • editor & content strategist in library communications • web manager • discovery & systems librarian • head of library systems and technology • web services and data librarian • communications manager • web content and user experience specialist • metadata and discovery systems librarian, systems analyst, outreach librarian • digital services librarian; manager, communication services; communication specialist • (1) web project manager and content strategist, (2) web content creator • web services librarian • web developer ii • sr. software engineer, program director for digital services • user experience librarian • digital initiatives & scholarly communication librarian; senior library associate in digital scholarship and services • web services and usability librarian • senior library specialist -web content • web developer, software development librarian information technology and libraries march 2021 web content strategy in practice within academic libraries | mcdonald and burkhardt 45 appendix g: definitions of web content strategy question 11 in your own words, please define web content strategy. • a cohesive plan to create an overall strategy for web content that includes tone, terminology, structure, and deployment to best communicate the institution's message and enable the user. for the next question, the true answer is sort of. we have the start of a style guide. we also have the university's branding policies. we also have a web governance committee that is university-wide, of which i'm a part of. however, we don't have a complete strategy and it is certainly not documented. so you pick. • planning, development, and management of web content. two particularly important parts of web content strategy for academic library websites: 1. keeping content up to date and unpublishing outdated content. 2. building consensus for the creation and maintenance of a web style guide and ensuring that content across the large website adheres to the style guide. • strategies for management of content over its entire lifecycle to ensure it is accurate, timely, usable, accessible, appropriate, findable, and well-organized. • a system of workflows, training, and governance that supports the entire lifecycle of content, including creation, maintenance, and updating of content across all communications channels (e.g. websites, social media, signage). • a comprehensive, coordinated, planned approach to content across the site including components such as style guides, accessibility, information architecture, discoverability, seo. • not terribly familiar with the concept in a formal sense but think of it related to how the institution considers the intersection of content made available by the institution, the management and governance of issues such as branding/identity, accessibility, design, marketing, etc. • intentional and coordinated vision for content on the website • content strategy is the planning for the lifecycle of content. it includes creating, editing, reviewing, and deleting content. we also use a content strategy framework to determine each of the following for the content on our websites: audience, page goal, value proposition, validation, and measurement strategy. 
• website targets the community to ensure they can find what they need • the process of creating and enacting a vision for the organization and display of web content so that it is user friendly, accurate, up-to-date, and effective in its message. web content strategy often involves considering the thoughts and needs of many stakeholders, and creating one cohesive voice to represent them all. • web content strategy is the planning, design, delivery and governance plan for a website. this responsibility is guided by the library website management working group. • a web content strategy is a cohesive approach to managing and editing online content. an effective strategy takes into account web accessibility standards and endeavors to produce and maintain consistent, reliable, user-centered content. an effective content strategy evolves to meet the needs of online users and involves regular user testing and reviews of web traffic/analytics. • web content strategy is the theory and practice of creating, managing, and publishing web content according to evidence-based best practices for usability and readability information technology and libraries march 2021 web content strategy in practice within academic libraries | mcdonald and burkhardt 46 • making sure your content aligns with both your business goals and your audience needs. • a plan to oversee the life cycle of useful, usable content from its creation through maintenance and ultimately removal. • web content strategy is the overarching strategy for how you develop and disseminate web content. ideally, it would be structured and user tested to ensure that the content you are spending time developing is meeting the needs of your library and your community. • a web content strategy guides the full lifecycle of web content, including creation, maintenance, assessment, and retirement. it also sets guiding principles, makes responsibility and authority clear, and documents workflows. • an overarching method of bringing user experience best practices together on the website including: heuristics, information architecture, and writing for the web • planning and management of online content • a defined strategy for creating and delivering effective content to a defined audience at the right time. • in the most basic sense, web content strategy is matching the content, services and functionality of web properties with the organizational strategic goals. • web content strategy can include guidelines, processes, and/or approaches to making your website(s) usable, sustainable, and findable. it's a big-picture or higher-level way of thinking about your site(s), rather than page by page or function by function. • deliberate structures and practices to plan, deliver, and evaluate web content. • producing content that will be useful to users and easy for them to access • tying content to user behavior/user experience? • web content strategy is the thoughtful planning and construction of website content to meet users' needs. • n/a • cohesive planning, development, and management of web content, to engage and support library users. • working with teams and thinking strategically and holistically about the usability, functions, services, information, etc. provided on the website to best meet the needs of the site's users, as well as incorporating the marketing/promotional perspectives offered by the website. 
• planning and managing web content • web content strategy is the idea that all written and visual information on a certain site would conform to or align with the goals for that site. • ensuring that the most accurate and appropriate words, images, and other assets are presented to patrons at the point of need, while using web assets to tell stories patrons might not know they want to know. abstract introduction background maturity models application of maturity models within user experience work in libraries assessing the maturity of content strategy practice in libraries methods findings demographic information infrastructure & organizational structure content management systems dedicated positions, position titles, and organizational workflows web content strategy practices discussion proposed maturity model content strategy maturity model for academic libraries level 1: ad hoc level 2: establishing level 3: scaling level 4: sustaining level 5: thriving conclusion endnotes can bibliographic data be put directly onto the semantic web? | yee 55 martha m. yee can bibliographic data be put directly onto the semantic web? this paper is a think piece about the possible future of bibliographic control; it provides a brief introduction to the semantic web and defines related terms, and it discusses granularity and structure issues and the lack of standards for the efficient display and indexing of bibliographic data. it is also a report on a work in progress—an experiment in building a resource description framework (rdf) model of more frbrized cataloging rules than those about to be introduced to the library community (resource description and access) and in creating an rdf data model for the rules. i am now in the process of trying to model my cataloging rules in the form of an rdf model, which can also be inspected at http://myee. bol.ucla.edu/. in the process of doing this, i have discovered a number of areas in which i am not sure that rdf is sophisticated enough yet to deal with our data. this article is an attempt to identify some of those areas and explore whether or not the problems i have encountered are soluble—in other words, whether or not our data might be able to live on the semantic web. in this paper, i am focusing on raising the questions about the suitability of rdf to our data that have come up in the course of my work. t his paper is a think piece about the possible future of bibliographic control; as such, it raises more complex questions than it answers. it is also a report on a work in progress—an experiment in building a resource description framework (rdf) model of frbrized descriptive and subject-cataloging rules. here my focus will be on the data model rather than on the frbrized cataloging rules for gathering data to put in the model, although i hope to have more to say about the latter in the future. the intent is not to present you with conclusions but to present some questions about data modeling that have arisen in the course of the experiment. my premise is that decisions about the data model we follow in the future should be made openly and as a community rather than in a small, closed group of insiders. if we are to move toward the creation of metadata that is more interoperable with metadata being created outside our community, as is called for by many in our profession, we will need to address these complex questions as a community following a period of deep thinking, clever experimentation, and astute political strategizing. 
n the vision the semantic web is still a bewitching midsummer night’s dream. it is the idea that we might be able to replace the existing html–based web consisting of marked-up documents—or pages—with a new rdf– based web consisting of data encoded as classes, class properties, and class relationships (semantic linkages), allowing the web to become a huge shared database. some call this web 3.0, with hyperdata replacing hypertext. embracing the semantic web might allow us to do a better job of integrating our content and services with the wider internet, thereby satisfying the desire for greater data interoperability that seems to be widespread in our field. it also might free our data from the proprietary prisons in which it is currently held and allow us to cooperate in developing open-source software to index and display the data in much better ways than we have managed to achieve so far in vendor-developed ils opacs or in giant, bureaucratic bibliographic empires such as oclc worldcat. the semantic web also holds the promise of allowing us to make our work more efficient. in this bewitching vision, we would share in the creation of uniform resource identifiers (uris) for works, expressions, manifestations, persons, corporate bodies, places, subjects, and so on. at the uri would be found all of the data about that entity, including the preferred name and the variant names, but also including much more data about the entity than we currently put into our work (name-title and title), such as personal name, corporate name, geographic, and subject authority records. if any of that data needed to be changed, it would be changed only once, and the change would be immediately accessible to all users, libraries, and library staff by means of links down to local data such as circulation, acquisitions, and binding data. each work would need to be described only once at one uri, each expression would need to be described only once at one uri, and so forth. very much up in the air is the question of what institutional structures would support the sharing of the creation of uris for entities on the semantic web. for the data to be reliable, we would need to have a way to ensure that the system would be under the control of people who had been educated about the value of clean and accurate entity definition, the value of choosing “most commonly known” preferred forms (for display in lists of multiple different entities), and the value of providing access martha m. yee (myee@ucla.edu) is cataloging supervisor at the university of california, los angeles film and television archive. 56 information technology and libraries | june 2009 under all variant forms likely to be sought. at the same time, we would need a mechanism to ensure that any interested members of the public could contribute to the effort of gathering variants or correcting entity definitions when we have had inadequate information. for example, it would be very valuable to have the input of a textual or descriptive bibliographer applied to difficult questions concerning particular editions, issues, and states of a significant literary work. it would also be very valuable to be able to solicit input from a subject expert in determining the bounds of a concept entity (subject heading) or class entity (classification). n the experiment (my project) to explore these bewitching ideas, i have been conducting an experiment. 
as part of my experiment, i designed a set of cataloging rules that are more frbrized than is rda in the sense that they more clearly differentiate between data applying to expression and data applying to manifestation. note that there is an underlying assumption in both frbr (which defines expression quite differently from manifestation) and on my part, namely that catalogers always know whether a given piece of data applies at either the expression or the manifestation level. that assumption is open to questioning in the process of the experiment as well. my rules also call for creating a more hierarchical and degressive relationship between the frbr entities work, expression, manifestation, and item, such that data pertaining to the work does not need to be repeated for every expression, data pertaining to the expression does not need to be repeated for every manifestation, and so forth. degressive is an old term used by bibliographers for bibliographies that provide great detail about first editions and less detail for editions after the first. i have adapted this term to characterize my rules, according to which the cataloger begins by describing the work; any details that pertain to all expressions and manifestations of the work are not repeated in the expression and manifestation descriptions. this paper would be entirely too long if i spent any more time describing the rules i am developing, which can be inspected at http://myee.bol.ucla .edu. here, i would like to focus on the data-modeling process and the questions about the suitability of rdf and the semantic web for encoding our data. (by the way, i don’t seriously expect anyone to adopt my rules! they are radically different than the rules currently being applied and would represent a revolution in cataloging practice that we may not be up to undertaking in the current economic climate. their value lies in their thought-experiment aspect and their ability to clarify what entities we can model and what entities we may not be able to model.) i am now in the process of trying to model my cataloging rules in the form of an rdf model (“rdf” as used in this paper should be considered from now on to encompass rdf schema [rdfs], web ontology language [owl], and simple knowledge organization system [skos] unless otherwise stated); this model can also be inspected at http://myee.bol .ucla.edu. in the process of doing this, i have discovered a number of areas in which i am not sure that rdf is yet sophisticated enough to deal with our data. this article is an attempt to outline some of those areas and explore whether the problems i have encountered are soluble, in other words, whether or not our data might be able to live on the semantic web eventually. i have already heard from rdf experts bruce d’arcus (miami university) and rob styles (developer of talis, as semantic web technology company), whom i cite later, but through this article i hope to reach a larger community. my research questions can be found later, but first some definitions. 
definition of terms

the semantic web is a way to represent knowledge; it is a knowledge-representation language that provides ways of expressing meaning that are amenable to computation; it is also a means of constructing knowledge-domain maps consisting of class and property axioms with a formal semantics.

rdf is a family of specifications for methods of modeling information that underpins the semantic web through a variety of syntax formats. an rdf metadata model is based on making statements about resources in the form of triples that consist of 1. the subject of the triple (e.g., “new york”); 2. the predicate of the triple, which links the subject and the object (e.g., “has the postal abbreviation”); and 3. the object of the triple (e.g., “ny”). xml is commonly used to express rdf, but it is not a necessity; it can also be expressed in notation 3, or n3, for example.1

rdfs is an extensible knowledge-representation language that provides basic elements for the description of ontologies, also known as rdf vocabularies. using rdfs, statements are made about resources in the form of 1. a class (or entity) as subject of the rdf triple (e.g., “new york”); 2. a relationship (or semantic linkage) as predicate of the rdf triple, which links the subject and the object (e.g., “has the postal abbreviation”); and 3. a property (or attribute) as object of the rdf triple (e.g., “ny”).

owl is a family of knowledge-representation languages for authoring ontologies compatible with rdf.

skos is a family of formal languages built upon rdf and designed for the representation of thesauri, classification schemes, taxonomies, and subject-heading systems.

research questions

actually, the full-blown semantic web may not be exactly what we need. remember that the fundamental definition of the semantic web is “a way to represent knowledge.” the semantic web is a direct descendant of the attempt to create artificial intelligence, that is, of the attempt to encode enough knowledge of the real world to allow a computer to reason about reality in a way indistinguishable from the way a human being reasons. one of the research questions should probably be whether the technology developed to support the semantic web can be used to represent information rather than knowledge. fortunately, we do not need to represent all of human knowledge—we simply need to describe and index resources to facilitate their retrieval. we need to encode facts about the resources and about what the resources discuss (what they are “about”), not facts about “reality.” based on our past experience, doing even this is not as simple as people think it is. the question is whether we could do what we need to do within the context of the semantic web. sometimes things that sound simple do not turn out to be so simple in the doing. my research questions are as follows: 1. is it possible for catalogers to tell in all cases whether a piece of data pertains to the frbr expression or the frbr manifestation? 2. is it possible to fit our data into rdf? given that rdf was designed to encode knowledge rather than information, perhaps it is the wrong technology to use for our purposes? 3.
if it is possible to fit our data into rdf, is it possible to use that data to design indexes and displays that meet the objectives of the catalog (i.e., providing an efficient instrument to allow a user to find a particular work of which the author and title are known, a particular expression of a work, all of the works of an author, all of the works in a given genre or form, or all of the works on a particular subject)? as stated previously, i am not yet ready to answer these questions. i hope to find answers in the course of developing the rules and the model. in this paper, i am focusing on raising the questions about the suitability of rdf to our data that have come up in the course of my work. n other relevant projects other relevant projects include the following: 1. frbr, functional requirements for authority data (frad), funtional requirements for subject authority records (frsar), and frbr-objectoriented (frbroo). all are attempts to create conceptual models of bibliographic entities using an entity-relationship model that is very similar to the class-property model used by rdf.2 2. various initiatives at the library of congress (lc), such as lc subject headings (lcsh) in skos,3 the lc name authority file in skos,4 the lccn permalink project to create persistent uris for bibliographic records,5 and initiatives to provide skos representations for vocabularies and data elements used in marc, premis, and mets. these all represent attempts to convert our existing bibliographic data into uris that stand for the bibliographic entities represented by bibliographic records and authority records; the uris would then be available for experiments in putting our data directly onto the semantic web. 3. the dc-rda task group project to put rda data elements into rdf.6 as noted previously and discussed further later, rda is less frbrized than my cataloging rules, but otherwise this project is very similar to mine. 4. dublin core’s (dc’s) work on an rdf schema.7 dublin core is very focused on manifestation and does not deal with expressions and works, so it is less similar to my project than is the dc-rda task groups’s project (see further discussion later). n why my project? one might legitimately ask why there is a need for a different model than the ones already provided by frbr, frad, frsar, frbroo, rda, and dc. the frbr and rda models are still tied to the model that is implicit in our current bibliographic data in which expression and manifestation are undifferentiated. this is because publishers publish and libraries acquire and shelve manifestations. in our current bibliographic practice, a new 58 information technology and libraries | june 2009 bibliographic record is made for either a new manifestation or a new expression. thus, in effect, there is no way for a computer to tell one from the other in our current data. despite the fact that frbr has good definitions of expression (change in content) and manifestation (mere change in carrier), it perpetuates the existing implicit model in its mapping of attributes to entities. for example, frbr maps the following to manifestation: edition statements (“2nd rev. ed.”); statements of responsibility that identify translators, editors, and illustrators; physical description statements that identify illustrated editions; and extent statements that differentiate expressions (the 102-minute version vs. the 89-minute version); etc. 
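to make the mapping question concrete, the hedged sketch below attaches an edition statement to an expression uri and carrier details to a manifestation uri, which is the kind of differentiation under discussion here; the property names are invented for illustration and are not drawn from frbr, rda, or the yee rules.

```python
# a hedged sketch, not the actual frbr or rda element set: data about a
# change in content hangs off the expression uri, data about a mere change
# of carrier hangs off the manifestation uri. all names here are invented.
from rdflib import Graph, Namespace, Literal

EX = Namespace("http://example.org/")
g = Graph()

work = EX["work/w1"]
expression = EX["expression/w1-2nd-rev-ed"]
manifestation = EX["manifestation/w1-1998-hardback"]

g.add((expression, EX.isExpressionOf, work))
g.add((manifestation, EX.isManifestationOf, expression))

# content-level data: the 2nd revised edition marks a new expression
g.add((expression, EX.editionStatement, Literal("2nd rev. ed.")))

# carrier-level data: pagination, size, and publisher belong to the manifestation
g.add((manifestation, EX.extent, Literal("xii, 340 p. ; 24 cm")))
g.add((manifestation, EX.publisherName, Literal("example press")))
```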
thus the frbr definition of expression recognizes that a 2nd revised edition is a new expression, but frbr maps the edition statement to manifestation. in my model, i have tried to differentiate more cleanly data applying to expressions from data applying to manifestations.8 frbr and rda tend to assume that our current bibliographic data elements map to one and only one group 1 entity or class. there are exceptions, such as title, which frbr and rda define at work, expression, and manifestation levels. however, there is a lack of recognition that, to create an accurate model of the bibliographic universe, more data elements need to be applied at the work and expression level in addition to (or even instead of) the manifestation level. in the appendix i have tried to contrast the frbr, frad, and rda models with mine. in my model, many more data elements (properties and attributes) are linked to the work and expression level. after all, if the expression entity is defined as any change in work content, the work entity needs to be associated with all content elements that might change, such as the original extent of the work, the original statement of responsibility, whether illustrations were originally present, whether color was originally present in a visual work, whether sound was originally present in an audiovisual work, the original aspect ratio of a moving image work, and so on. frbr also tends to assume that our current data elements map to one and only one entity. in working on my model, i have come to the conclusion that this is not necessarily true. in some cases, a data element pertaining to a manifestation also pertains to the expression and the work. in other cases, the same data element is specific to that manifestation, and, in other cases, the same data element is specific to its expression. this is true of most of the elements of the bibliographic description. frad, in attempting to deal with the fact that our current cataloging rules allow a single person to have several bibliographic identities (or pseudonyms), treats person, name, and controlled access point as three separate entities or classes. i have tried to keep my model simpler and more elegant by treating only person as an entity, with preferred name and variant name as attributes or properties of that entity. frbroo is focused on the creation process for works, with special attention to the creation of unique works of art and other one-off items found in museums. thus frbroo tends to neglect the collocation of the various expressions that develop in the history of a work that is reproduced and published, such as translations, abridged editions, editions with commentary, etc. dc has concentrated exclusively on the description of manifestations and has neglected expression and work altogether. one of the tenets of semantic web development is that, once an entity is defined by a community, other communities can reuse that entity without defining it themselves. the very different definitions of the work and expression entities in the different communities described above raise some serious questions about the viability of this tenet. n assumptions it should be noted that this entire experiment is based on two assumptions about the future of human intervention for information organization. 
these two assumptions are based on the even bigger assumption that, even though the internet seems to be an economy based on free intellectual labor, and, even though human intervention for information organization is expensive (and therefore at more risk than ever), human intervention for information organization is worth the expense. n assumption 1: what we need is not artificial intelligence, but a better human–machine partnership such that humans can do all of the intellectual labor and machines can do all of the repetitive clerical labor. currently, catalogers spend too much time on the latter because of the poor design of current systems for inputting data. the universal employment provided by paying humans to do the intellectual labor of building the semantic web might be just the stimulus our economy needs. n assumption 2: those who need structured and granular data—and the precise retrieval that results from it—to carry out research and scholarship may constitute an elite minority rather than most of the people of the world (sadly), but that talented and intelligent minority is an important one for the cultural and technological advancement of humanity. it is even possible that, if we did a better job of providing access to such data, we might enable the enlargement of that minority. can bibliographic data be put directly onto the semantic web? | yee 59 n granularity and structure issues as soon as one starts to create a data model, one encounters granularity or cataloger-data parsing issues. these issues have actually been with us all along as we developed the data model implicit in aacr2r and marc 21. those familiar with rda, frbr, and frad development will recognize that much of that development is directed at increasing structure and granularity in catalogerproduced data to prepare for moving it onto the semantic web. however, there are clear trade-offs in an increase in structure and granularity. more structure and more granularity make possible more powerful indexing and more sophisticated display, but more structure and more granularity are more complex and expensive to apply and less likely to be implemented in a standard fashion across all communities; that is, it is less likely that interoperable data would be produced. any switching or mapping that was employed to create interoperable data would produce the lowest common denominator (the simplest and least granular data), and once rendered interoperable, it would not be possible for that data to swim back upstream to regain its lost granularity. data with less structure and less granularity could be easier and cheaper to apply and might have the potential to be adopted in a more standard fashion across all communities, but that data would limit the degree to which powerful indexing and sophisticated display would be possible. take the example of a personal name: currently, we demarcate surname from forename by putting the surname first, followed by a comma and then the forename. even that amount of granularity can sometimes pose a problem for a cataloger who does not necessarily know which part of the name is surname and which part is forename in a culture unfamiliar to the cataloger. in other words, the more granularity you desire in your data, the more often the people collecting the data are going to encounter ambiguous situations. 
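the trade-off can be seen in miniature with a personal name. in the sketch below (rdflib again, with invented property names), the same name is recorded once as a single literal and once split into typed parts; the second form supports better sorting and indexing, but it is exactly the form that requires the cataloger to know which part is the surname.

```python
# a sketch of the granularity trade-off for a personal name: one literal
# versus typed parts. neither form is prescribed by the models discussed
# here; the property names are invented.
from rdflib import Graph, Namespace, Literal

EX = Namespace("http://example.org/")
g = Graph()
p = EX["person/p1"]

# low granularity: cheap to record, hard to index or re-order reliably
g.add((p, EX.nameAsRecorded, Literal("garcía márquez, gabriel")))

# high granularity: supports sorting and display control, but the cataloger
# must know which part is the surname, which is not always obvious across cultures
g.add((p, EX.surname, Literal("garcía márquez")))
g.add((p, EX.forename, Literal("gabriel")))
```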
another example: currently, we do not collect information about gender self-identification; if we were to increase the granularity of our data to gather that information, we would surely encounter situations in which the cataloger would not necessarily know if a given creator was self-defined as a female or a male or of some other gender identity. presently, if we are adding a birth and death date, whatever dates we use are all together in a $d subfield without any separate coding to indicate which date is the birth date and which is the death date (although an occasional “b.” or “d.” will tell us this kind of information). we could certainly provide more granularity for dates, but that would make the marc 21 format much more complex and difficult to learn. people who dislike the marc 21 format already argue that it is too granular and therefore requires too much of a learning curve before people can use it. for example, tennant claims that “there are only two kinds of people who believe themselves able to read a marc record without referring to a stack of manuals: a handful of our top catalogers and those on serious drugs.”9 how much of the granularity already in marc 21 is used either in existing records or, even if present, is used in indexing and display software? granularity costs money, and libraries and archives are already starving for resources. granularity can only be provided by people, and people are expensive. granularity and structure also exist in tension with each other. more granularity can lead to less structure (or more complexity to retain structure along with granularity). in the pursuit of more granularity of data than we have now, rda, attempting to support rdf–compliant xml encoding, has been atomizing data to make it useful to computers, but this will not necessarily make the data more useful to humans. to be useful to humans, it must be possible to group and arrange (sort) the data meaningfully, both for indexing and for display. the developers of skos refer to the “vast amounts of unstructured (i.e., human readable) information in the web,”10 yet labeling bits of data as to type and recording semantic relationships in a machine-actionable way do not necessarily provide the kind of structure necessary to make data readable by humans and therefore useful to the people the web is ultimately supposed to serve. consider the case of music instrumentation. if you have a piece of music for five guitars and one flute, and you simply code number and instrumentation without any way to link “five” with “guitars” and “one” with “flute,” you will not be able to guarantee that a person looking for music for five flutes and one guitar will not be given this piece of music in their results (see figure 1).11 the more granular the data, the less the cataloger can build order, sequencing, and linking into the data; the coding must be carefully designed to allow the desired order, sequencing, and linking for indexing and display to be possible, which might call for even more complex coding. it would be easy to lose information about order, sequencing, and linking inadvertently. actually, there are several different meanings for the term structure: 1. structure is an object of a record (structure of document?); for example, elings and waibel refer to “data fields . . . also referred to as elements . . . which are organized into a record by a data structure.”12 2. structure is the communications layer, as opposed to the display layer or content designation.13 3. 
structure is the record, field, and subfield. 4. structure is the linking of bits of data together in the form of various types of relationships. 5. structure is the display of data in a structured, ordered, and sequenced manner to facilitate human understanding. 6. data structure is a way of storing data in a computer so that it can be used efficiently (this is how computer programmers use the term). i hasten to add that i am definitely in favor of adding more structure and granularity to our data when it is necessary to carry out the fundamental objectives of our profession and of our catalogs. i argued earlier that frbr and rda are not granular enough when it comes to the distinction between data elements that apply to expression and those that apply to manifestation. if we could just agree on how to differentiate data applying to the manifestation from data applying to the expression instead of our current practice of identifying works with headings and lumping all manifestation and expression data together, we could increase the level of service we are able to provide to users a thousandfold. however, if we are not going to commit to differentiating between expression and manifestation, it would be more intellectually honest for frbr and rda to take the less granular path of mapping all existing bibliographic data to manifestation and expression undifferentiated, that is, to use our current data model unchanged and state this openly. i am not in favor of adding granularity for granularity's sake or for the sake of vague conceptions of possible future use. granularity is expensive and should be used only in support of clear and fundamental objectives.

[figure 1a. extract from the yee rdf model that illustrates one technique for modeling musical instrumentation at the expression level, using a blank node to group a repeated number and instrument type. figure 1b. example of encoding musical instrumentation at the expression level based on that model (an expression scored for 5 guitars and 1 flute).]

the goal: efficient displays and indexes

my main concern is that we model and then structure the data in a way that allows us to build the complex displays that are necessary to make catalogs appear simple to use. i am aware that the current orthodoxy is that recording data should be kept completely separate from indexing and display (“the applications layer”). because i have spent my career in a field in which catalog records are indexed and displayed badly by systems people who don't seem to understand the data contained in them, i am a skeptic. it is definitely possible to model and structure data in such a way that desired displays and indexes are impossible to construct. i have seen it happen! the lc working group report states that “it will be recognized that human users and their needs for display and discovery do not represent the only use of bibliographic metadata; instead, to an increasing degree, machine applications are their primary users.”14 my fear is that the underlying assumption here is that users need to (and can) retrieve the single perfect record. this will never be true for bibliographic metadata.
users will always need to assemble all relevant records (of all kinds) as precisely as possible and then browse through them before making a decision about which resources to obtain. this is as true in the semantic web—where “records” can be conceived of as entity or class uris—as it is in the world of marc–encoded metadata. some of the problems that have arisen in the past in trying to index bibliographic metadata for humans are connected to the fact that existing systems do not group all of the data related to a particular entity effectively, such that a user can use any variant name or any combination of variant names for an entity and do a successful search. currently, you can only look for a match among two or more keywords within the bounds of a single manifestation-based bibliographic record or within the bounds of a single heading, minus any variant terms for that entity. thus, when you do a keyword search for two keywords, for example, “clemens” and “adventures,” you will retrieve only those manifestations of mark twain’s adventures of tom sawyer that have his real name (clemens) and the title word “adventures” co-occurring within the bounded space created by a single manifestation-based bibliographic record. instead, the preferred forms and the variant forms for a given entity need to be bounded for indexing such that the keywords the user employs to search for that entity can be matched using co-occurrence rules that look for matches within a single bounded space representing the entity desired. we will return to this problem in the discussion of issue 3 in the later section “rdf problems encountered.” the most complex indexing problem has always proven to be the grouping or bounding of data related to a work, since it requires pulling in all variants for the creator(s) of that work as well. otherwise, a user who searches for a work using a variant of the author’s name and a variant of the title will continue to fail (as they do in all current opacs), even when the desired work exists in the catalog. if we could create a uri for the adventures of tom sawyer that included all variant names for the author and all variant titles for the work (including the variant title tom sawyer), the same keyword search described above (“clemens” and “adventures”) could be made to retrieve all manifestations and expressions of the adventures of tom sawyer, instead of the few isolated manifestations that it would retrieve in current catalogs. we need to make sure that we design and structure the data such that the following displays are possible: n display all works by this author in alphabetical order by title with the sorting element (title) appearing at the top of each work displayed. n display all works on this subject in alphabetical order by principal author and title (with principal author and title appearing at top of each work displayed), or title if there is no principal author (with title appearing at top of each work displayed). we must ensure that we design and structure the data in such a way that our structure allows us to create subgroups of related data, such as instrumentation for a piece of music (consisting of a number associated with each particular instrument), place and related publisher for a certain span of dates on a serial title change record, and the like. n which standards will carry out which functions? 
currently, we have a number of different standards to carry out a number of different functions; we can speculate about how those functions might be allocated in a new semantic web–based dispensation, as shown in table 1.

table 1. possible reallocation of current functions in a new semantic web–based dispensation

function | current | future?
data content, or content guidelines (rules for providing data in a particular element) | defined by aacr2r and marc 21 | defined by rda and rdf/rdfs/owl/skos
data elements | defined by isbd–based aacr2r and marc 21 | defined by rda and rdf/rdfs/owl/skos
data values | defined by lc/naco authority file, lcsh, marc 21 coded data values, etc. | defined as ontologies using rdf/rdfs/owl/skos
encoding or labeling of data elements for machine manipulation; same as data format? | defined by iso 2709–based marc 21 | defined by rdf/rdfs/xml
data structure (i.e., what a record stands for) | defined by aacr2r and marc 21; also frbr? | defined by rdf/rdfs/owl/skos
schematization (constraint on structure and content) | marc 21, mods, dcmi abstract model | defined by rdf/rdfs/owl/skos
encoding of facts about entity relationships | carried out by matching data value strings (headings found in the lc/naco authority file and lcsh, issns, and the like) | carried out by rdf/rdfs/owl/skos in the form of uri links
display rules | ils software, formerly isbd–based aacr2r | “application layer” or yee rules
indexing rules | ils software | sparql, “application layer,” or yee rules

in table 1, data structure is taken to mean what a record represents or stands for; traditionally, a record has represented an expression (in the days of hand-press books) or a manifestation (ever since reproduction mechanisms have become more sophisticated, allowing an explosion of reproductions of the same content in different formats and coming from different distributors). rda is record-neutral; rdf would allow uris to be established for any and all of the frbr levels; that is, there would be a uri for a particular work, a uri for a particular expression, a uri for a particular manifestation, and a uri for a particular item. note that i am not using data structure in the sense that a computer programmer does (as a way of storing data in a computer so that it can be used efficiently). currently, the encoding of facts about entity relationships (see table 1) is carried out by matching data-value character strings (headings or linking fields using issns and the like) that are defined by the lc/naco authority file (following aacr2r rules), lcsh (following rules in the subject cataloging manual), etc. in the future, this function might be carried out by using rdf to link the uri for a resource to the uri for a data value. display rules (see table 1) are currently defined by isbd and aacr2r but widely ignored by systems, which frequently truncate bibliographic records arbitrarily in displays, supply labels, and the like; rda abdicates responsibility, pushing display out of the cataloging rules. the general principle on the web is to divorce data from display and allow anyone to display the data any way they want. display is the heart of the objects (or goals) of cataloging: the point is to display to the user the works of an author, the editions of a work, or the works on a subject. all of these goals only can be met if complex, high-quality displays can be built from the data created according to the data model. indexing rules (see table 1) were once under the control of catalogers (in book and card catalogs) in that users had to navigate through headings and cross-references to find what they wanted; currently indexing is in the hands of system designers who prefer to provide keyword indexing of bibliographic (i.e., manifestation-based) records rather than provide users with access to the entities they are really interested in (works, authors, and subjects), all represented currently by authority records for headings and cross-references. rda abdicates responsibility, pushing indexing concerns completely out of the cataloging rules. the general principle on the web is to allow resources to be indexed by any web search engines that wish to index them. current web data is not structured at all for either indexing or display. i would argue that our interest in the semantic web should be focused on whether or not it will support more data structure—as well as more logic in that data structure—to support better indexes and better displays than we have now in manifestation-based ils opacs. crucial to better indexing than we have ever had before are the co-occurrence rules for keyword indexing, that is, the rules for when a co-occurrence of two or more keywords should produce a match. we need to be able to do a keyword search across all possible variant names for the entity of interest, and the entity of interest for the average catalog user is much more likely to be a particular work than a particular manifestation. unfortunately, catalog-use studies have only studied so-called known-item searches without investigating whether a known-item searcher was looking for a particular edition or manifestation of a work or was simply looking for a particular work in order to make a choice as to edition or manifestation once the work was found. however, common sense tells us that it is a rare user who approaches the catalog with prior knowledge about all published editions of a given work. the more common situation is surely one in which a user desires to read a particular shakespeare play or view a particular david lean film and discovers that the desired work exists in more than one expression or manifestation only after searching the catalog. we need to have the keyword(s) in our search for a particular work co-occur within a bounded space that encompasses all possible keywords that might refer to that particular work entity, including both creator and title keywords. notice in table 1 the unifying effect that rdf could potentially have; it could free us from the use of multiple standards that can easily contradict each other, or at least not live peacefully together. examples are not hard to find in the current environment. one that has cropped up in the course of rda development concerns family names. presently the rules for naming families are different depending on whether the family is the subject of a work (and established according to lcsh) or whether the family is responsible for a collection of papers (and established according to rda).
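one way to picture that bounded space is as a set of keywords hung directly on the work entity's uri. the sketch below is only an illustration of the idea, with an invented boundedKeyword property and example.org uris; a real implementation would more likely derive the bounded set from the variant names and titles already linked to the work, but the co-occurrence rule would be the same.

```python
# an illustration of the "bounded space" idea: every keyword that can refer
# to the work entity, whether from a variant creator name or a variant title,
# is bounded to the work's uri, and the co-occurrence test runs against that
# entity rather than against a single manifestation-based record.
from rdflib import Graph, Namespace, Literal

EX = Namespace("http://example.org/")
g = Graph()
work = EX["work/adventures-of-tom-sawyer"]

for kw in ["twain", "mark", "clemens", "samuel",
           "adventures", "tom", "sawyer"]:
    g.add((work, EX.boundedKeyword, Literal(kw)))

q = """
PREFIX ex: <http://example.org/>
SELECT ?work WHERE {
  ?work ex:boundedKeyword "clemens", "adventures" .
}
"""
for row in g.query(q):
    print(row.work)  # found, even though no single record contains both words
```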
n types of data rda has blurred the distinctions among certain types of data, apparently because there is a perception that on the semantic web the same piece of data needs to be coded only once, and all indexing and display needs can be supported from that one piece of data. i question that assumption on the basis of my experience with bibliographic cataloging. all of the following ways of encoding the same piece of data can still have value in certain circumstances: n transcribed; in rdf terms, a literal (i.e., any data that is not a uri, a constant value). transcribed data is data copied from an item being cataloged. it is valuable for providing access to the form of the name used on a title page and is particularly useful for people who use pseudonyms, corporate bodies that change name, and so on. transcribed data is an important part of the historical record and not just for off-line materials; it can be a historical record of changing data on notoriously fluid webpages. n composed; in rdf terms, also a literal. composed data is information composed by a cataloger on the basis of observation of the item in hand; it can be valuable for historical purposes to know which data was composed. n supplied; in rdf terms, also a literal. supplied data is information supplied by a cataloger from outside sources; it can be valuable for historical purposes to know which data was supplied and from which outside sources it came. n coded; in rdf, represented by a uri. coded data would likely transform on the semantic web into links to ontologies that could provide normalized, human-readable identification strings on demand, thus causing coded and normalized data to merge into one type of data. is it not possible, though, that the coded form of normalized data might continue to provide for more efficient searching for computers as opposed to humans? coded data also has great cross-cultural value, since it is not as language-dependent as literals or normalized headings. n normalized headings (controlled headings); in rdf, represented by a uri. normalized or controlled headings are still necessary to provide users with coherent, ordered displays of thousands of entities that all match the user’s search for a particular entity (work, author, subject, etc.). the reason google displays are so hideous is that, so far, the data searched lacks any normalized display data. if variant language forms of the name for an entity 64 information technology and libraries | june 2009 are linked to an entity uri, it should be possible to supply headings in the language and script desired by a particular user. n the rdf model those who have become familiar with frbr over the years will probably not find it too difficult to transition from the frbr conceptual model to the rdf model. what frbr calls an “entity,” rdf calls a “subject” and rdfs calls a “class.” what frbr calls an “attribute,” rdf calls an “object” and rdfs calls a “property.” what frbr calls a “relationship,” rdf calls a “predicate” and rdfs calls a “relationship” or a “semantic linkage” (see table 2). the difficulty in any data-modeling exercise lies in deciding what to treat as an entity or class and what to treat as an attribute or property. the authors of frbr decided to create a class called expression to deal with any change in the content of a work. when frbr is applied to serials, which change content with every issue, the model does not work well. 
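the coexistence of these types of data is easy to sketch in rdf terms: nothing prevents recording both a transcribed literal and a link to a controlled entity uri for the same element. the fragment below does so for a publisher, with invented property names and uris; it is an illustration of the principle, not a proposal for a specific element set.

```python
# a sketch of recording one element twice: as a transcribed literal (what the
# item itself says) and as a link to a controlled entity uri. rdf allows a
# subject to carry both; the property names and uris are invented.
from rdflib import Graph, Namespace, Literal

EX = Namespace("http://example.org/")
g = Graph()
manifestation = EX["manifestation/m1"]

# transcribed form, kept as part of the historical record
g.add((manifestation, EX.publisherStatementTranscribed,
       Literal("printed for j. dodsley, in pall-mall")))

# controlled form, a uri at which preferred and variant names could be found
g.add((manifestation, EX.publisher, EX["corporatebody/james-dodsley"]))
```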
in my model, i found it useful to create a new entity at the manifestation level, the serial title, to deal with the type of change that is more relevant to serials, the change in title. i also created another new entity at the manifestation level, title-manifestation, to deal with a change of title in a nonserial work that is not associated with a change in content. one hundred years ago, this entity would have been called title-edition. i am also in the process of developing an entity at the expression level—surrogate—to deal with reproductions of original artworks that need to inherit the qualities of the original artwork they reproduce without being treated as an edition of that original artwork, which ipso facto is unique. these are just examples of cases in which it is not that easy to decide on the classes or entities that are necessary to accurately model bibliographic information. see the appendix for a complete comparison of the classes and entities defined in four different models: frbr, frad, rda, and the yee cataloging rules (ycr). the appendix also shows variation among these models concerning whether a given data element is treated as a class/entity or as an attribute/property. the most notable examples are name and preferred access point, which are treated as classes/entities in frad, as attributes in frbr and ycr, and as both in rda. n rdf problems encountered my goal for this paper is to institute discussion with data modelers about which problems i observed are insoluble and which are soluble: 1. is there an assumption on the part of semantic web developers that a given data element, such as a publisher name, should be expressed as either a literal or using a uri (i.e., controlled), but never both? cataloging is rooted in humanistic practices that require careful recording of evidence. there will always be value in distinguishing and labeling the following types of data: n copied as is from an artifact (transcribed) n supplied by a cataloger n categorized by a cataloger (controlled) tim berners-lee (the father of the internet and the semantic web) emphasizes the importance of recording not just data but also its provenance for the sake of authenticity.15 for many data elements, therefore, it will be important to be able to record both a literal (transcribed or composed form or both) and a uri (controlled form). is this a problem in rdf? as a corollary, if any data that can be given a uri cannot also be represented by a literal (transcribed and composed data, or one or the other), it may not be possible to design coherent, readable displays of the data describing a particular entity. among other things, cataloging is a discursive writing skill. does rdf require that all data be represented only once, either by a literal or by a uri? or is it perhaps possible that data that has a uri could also have a transcribed or composed form as a property? perhaps it will even be possible to store multiple snapshots of online works that change over time to document variant forms of a name for works, persons, and so on. 2. will the internet ever be fast enough to assemble the equivalent of our current records from a collection of hundreds or even thousands of uris? in rdf, links are one-to-one rather than one-to-many. this leads to a great proliferation of reciprocal links. the more granularity there is in the data, the more linking is necessary to ensure that atomized data elements are linked together. 
potentially, every piece of data describing a particular entity could be represented by a uri leading out to a skos list of data values. the number of links necessary to pull together all of the data just to describe one manifestation could become astronomical, as could the number of one-to-one links necessary to create the appearance of a one-to-many link, such as the link between an author and all the works of an author. is the internet really fast enough to assemble a record from hundreds of uris in a reasonable amount of time? given the often slow network throughput typical of many of our current internet connections, is it really practical to expect all of these pieces to be pulled together efficiently to create a single display for a single user? we may yet feel nostalgia for the single manifestation-based record that already has all of the relevant data in it (no assembly required). bruce d'arcus points out, however, that

i think if you're dealing with rdf, you wouldn't necessarily be gathering these data in real-time. the uris that are the targets for those links are really just global identifiers. how you get the triples is a separate matter. so, for example, in my own personal case, i'm going to put together an rdf store that is populated with data from a variety of sources, but that data population will happen by script, and i'll still be querying a single endpoint, where the rdf is stored in a relational database.16

in other words, d'arcus essentially will put them all in one place, or in one database that "looks" from a uri perspective to be "one place" where they're already gathered.

table 2. the frbr conceptual model translated into rdf and rdfs

frbr | rdf | rdfs
entity | subject | class
attribute | object | property
relationship | predicate | relationship/semantic linkage

3. is rdf capable of dealing with works that are identified using their creators? we need to treat author as both an entity in its own right and as a property of a work, and in many cases the latter is the more important function for user service. lexical labels, or human-readable identifiers for works that are identified using both the principal author and the title, are particularly problematic in rdf given that the principal author is an entity in its own right. is rdf capable of supporting the indexing necessary to allow a user to search using any variant of the author's name and any variant of the title of a work in combination and still retrieve all expressions and manifestations of that work, given that author will have a uri of its own, linked by means of a relationship link to the work uri? is rdf capable of supporting the display of a list of one thousand works, each identified by principal author, in order first by principal author, then by title, then by publication date, given that the preferred heading for each principal author would have to be assembled from the uri for that principal author and the preferred title for each work would have to be assembled from the uri for that work? for fear that this will not, in fact, be possible, i have put a human-readable work-identifier data element into my model that consists of principal author and title when appropriate, even though that means the preferred name of the principal author may not be able to be controlled by the entity record for the principal author. any guidance from experienced data modelers in this regard would be appreciated.
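as one hedged, application-layer sketch of what the indexing in question might look like, the fragment below stores variant names on the author uri and variant titles on the work uri, then uses a sparql query (run here through rdflib) to join across the creator link, so that the combination "clemens" plus "adventures" still finds the work. the vocabulary is invented, and this is only one possible approach.

```python
# a hedged, application-layer sketch: the author is its own entity with its
# own variant names, the work has its own variant titles, and the query joins
# across the (invented) creator link to test keyword co-occurrence.
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import SKOS

EX = Namespace("http://example.org/")
g = Graph()

author = EX["person/mark-twain"]
work = EX["work/adventures-of-tom-sawyer"]

g.add((author, SKOS.prefLabel, Literal("twain, mark, 1835-1910")))
g.add((author, SKOS.altLabel, Literal("clemens, samuel langhorne")))
g.add((work, SKOS.prefLabel, Literal("the adventures of tom sawyer")))
g.add((work, SKOS.altLabel, Literal("tom sawyer")))
g.add((work, EX.hasPrincipalCreator, author))

q = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX ex:   <http://example.org/>
SELECT DISTINCT ?work WHERE {
  ?work ex:hasPrincipalCreator ?author .
  ?work skos:prefLabel|skos:altLabel ?title .
  ?author skos:prefLabel|skos:altLabel ?name .
  FILTER(CONTAINS(LCASE(STR(?name)), "clemens"))
  FILTER(CONTAINS(LCASE(STR(?title)), "adventures"))
}
"""
for row in g.query(q):
    print(row.work)
```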
according to bruce d’arcus, this is purely an interface or application question that does not require a solution at the data layer.17 since we have never had interfaces or applications that would do this correctly, even though the data is readily available in authority records, i am skeptical about this answer! perhaps bruce’s suggestion under item 9 of designating a sortname property for each entity is the solution here as well. my human-readable work identifier consisting of the name of the principal creator and uniform title of work could be designated the sortname poperty for the work. it would have to be changed whenever the preferred form of the name for the principal creator changed, however. 4. do all possible inverse relationships need to be expressed explicitly, or can they be inferred? my model is already quite large, and i have not yet defined the inverse of every property as i really should to have a correct rdf model. in other words, for every property there needs to be an inverse property; for example, the property iscreatorof needs to have the inverse property iscreatedby; thus “twain” has the property iscreatorof, while “adventures of tom sawyer” has the property iscreatedby. perhaps users and inputters will not actually have to see the huge, complex rdf data model that would result from creating all the inverse relationships, but those who maintain the model will have to deal with a great deal of complexity. however, since i’m not a programmer, i don’t know how the complexity of rdf compares to the complexity of existing ils software. 5. can rdf solve the problems we are having now because of the lack of transitivity or inheritance in the data models that underlie current ilses, or will rdf merely perpetuate these problems? we have problems now with the data models that underlie our current ilses because of the inability of these models to deal with hierarchical inheritance, such that whatever is true of an entity in the hierarchy is also true of every entity below that entity in the hierarchy. one example is that of cross-references to a parent corporate body that should be held to apply to all subdivisions of that corporate body but never are in existing ils systems. there is a cross-reference from “fbi” to “united states. federal bureau of investigation,” but not from “fbi counterterrorism division” to “united states. federal bureau of investigation. counterterrorism division.” for that reason, a search in any opac name index for “fbi counterterrorism division” will fail. we need systems that recognize that data about a parent corporate body is relevant to all subdivisions of that parent body. we need systems that recognize that data about a work is relevant to all expressions and manifestations of that work. rdf allows you to link a work to an expression 66 information technology and libraries | june 2009 and an expression to a manifestation, but i don’t believe it allows you to encode the information that everything that is true of the work is true of all of its expressions and manifestations. rob styles seems to confirm this: “rdf doesn’t have hierarchy. in computer science terms, it’s a graph, not a tree, which means you can connect anything to anything else in any direction.”18 of course, not all links should be this kind of transitive or inheritance link. 
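where a link is judged to be genuinely hierarchical, one query-layer workaround is to mark it with its own property and follow that property transitively at search time. the sketch below does this for the fbi example with an invented isSubdivisionOf property and a sparql property path; plain rdf does not propagate the parent body's variant names downward by itself, so the query has to ask for it.

```python
# a hedged, query-layer sketch for the hierarchy problem: mark the hierarchical
# link explicitly and follow it with a property path, so variant names attached
# to a parent corporate body also answer searches for its subdivisions.
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import SKOS

EX = Namespace("http://example.org/")
g = Graph()

fbi = EX["corporatebody/us-federal-bureau-of-investigation"]
division = EX["corporatebody/us-fbi-counterterrorism-division"]

g.add((fbi, SKOS.altLabel, Literal("fbi")))
g.add((division, SKOS.prefLabel, Literal("counterterrorism division")))
g.add((division, EX.isSubdivisionOf, fbi))   # hierarchical link, made explicit

q = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX ex:   <http://example.org/>
SELECT ?body WHERE {
  ?body ex:isSubdivisionOf* ?ancestor .      # zero or more steps up the hierarchy
  ?ancestor skos:prefLabel|skos:altLabel ?name .
  FILTER(CONTAINS(LCASE(STR(?name)), "fbi"))
  ?body skos:prefLabel|skos:altLabel ?own .
  FILTER(CONTAINS(LCASE(STR(?own)), "counterterrorism"))
}
"""
for row in g.query(q):
    print(row.body)   # finds the division via the parent body's variant name
```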
one expression of work a is linked to another expression of work a by links to work a, but whatever is true of one of those expressions is not necessarily true of the other; one may be illustrated, for example, while the other is not. whatever is true of one work is not necessarily true of another work related to it by related work link. it should be recognized that bibliographic data is rife with hierarchy. it is one of our major tools for expressing meaning to our users. corporate bodies have corporate subdivisions, and many things that are true for the parent body also are true for its subdivisions. subjects are expressed using main headings and subject subdivisions, and many things that are true for the main heading (such as variant names) also are true for the heading combined with one of its subdivisions. geographic areas are contained within larger geographic areas, and many things that are true of the larger geographic area also are true for smaller regions, counties, cities, etc., contained within that larger geographic area. for all these reasons, i believe that, to do effective displays and indexes for our bibliographic data, it is critical that we be able to distinguish between a hierarchical relationship and a nonhierarchical relationship. 6. to recognize the fact that the subject of a book or a film could be a work, a person, a concept, an object, an event, or a place (all classes in the model), is there any reason we cannot define subject itself as a property (a relationship) rather than a class in its own right? in my model, all subject properties are defined as having a domain of resource, meaning there is no constraint as to the class to which these subject properties apply. i’m not sure if there will be any fall-out from that modeling decision. 7. how do we distinguish between the corporate behavior of a jurisdiction and the subject behavior of a geographical location? sometimes a place is a jurisdiction and behaves like a corporate body (e.g., united states is the name of the government of the united states). sometimes place is a physical location in which something is located (e.g., the birds discussed in a book about the birds of the united states). to distinguish between the corporate behavior of a jurisdiction and the subject behavior of a geographical location, i have defined two different classes for place: place as jurisdictional corporate body and place as geographic area. will this cause problems in the model? will there be times when it prevents us from making elegant generalizations in the model about place per se? there is a similar problem with events. some events are corporate bodies (e.g., conferences that publish papers) and some are a kind of subject (e.g., an earthquake). i have defined two different classes for event: conference or other event as corporate body creator and event as subject. 8. what is the best way to model a bound-with or an issuedwith relationship, or a part–whole relationship in which the whole must be located to obtain the part? the bound-with relationship is actually between two items containing two different works, while the issued-with relationship is between two manifestations containing two different works (see figure 2). is this a work-to-work relationship? will designating it a work-to-work relationship cause problems for indicating which specific items or manifestation-items of each work are physically located in the same place? 
this question may also apply to those part–whole relationships in which the part is physically contained within the whole and both are located in the same place (sometimes known as analytics). one thing to bear in mind is that in all of these cases the relationship between two works does not hold between all instances of each work; it only holds for those particular instances that are contained in the particular manifestation or item that is bound with, issued with, or part of the whole. however, if the relationship is modeled as a work-1manifestation to work-2-manifestation relationship, or a work-1-item to work-2-item relationship,, care must be taken in the design of displays to pull in enough information about the two or more works so as not to confuse the user. 9. how do we express the arrangement of elements that have a definite order? i am having trouble imagining how to encode the ordering of data elements that make up a larger element, such as the pieces of a personal name. this is really a desire to control the display of those atomized elements so that they make sense to human beings rather than just to machines. could one define a property such as natural language order of forename, surname, middle name, patronymic, matronymic and/or clan name of a person given that the ideal order of these elements might vary from one person to another? could one define properties such as sorting element 1, sorting element 2, sorting element 3, etc., and assign them to the various pieces that will be assembled to make a particular heading for an entity, such as an lcsh heading for a historical period? (depending on the answer to the question in item 11, it may or may not be possible to assign a property to a property in this fashion.) are there standard sorting rules we need to be aware of (in unicode, for example)? are there other rdf techniques available to deal with sorting and arrangement? bruce d’arcus suggests that, instead of coding the name parts, it would be more useful to designate sortname properties;19 might it not be necessary to designate a sortname property for each variant name, as well, can bibliographic data be put directly onto the semantic web? | yee 67 for cases in which variants need to appear in sorted displays? and wouldn’t these sortname properties complicate maintenance over time as preferred and variant names changed? 10. how do we link related data elements in such a way that effective indexing and displays are possible? some examples: number and kind of instrument (e.g., music written for two oboes and three guitars); multiple publishers, frequencies, subtitles, editors, etc., with date spans for a serial title change (or will it be necessary to create a new manifestation for every single change in subtitle, publisher name, place of publication, etc?). the assumption seems to be that there will be no repeatable data elements. based on my somewhat limited experience with rdf, it appears that there are record equivalents (every data element—property or relationship—pertaining to a particular entity with a uri), but there are no field or subfield equivalents that allow the sublinking of related pieces of data about an entity. 
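one device often suggested for this missing field/subfield level, and the one used in figure 1 earlier, is a blank node that groups the related pieces. the sketch below applies it to the instrumentation example, with invented property names rather than the yee model's actual elements.

```python
# a sketch of blank-node grouping: the blank node stands in for the missing
# field/subfield level, keeping "5" attached to "guitar" and "1" attached to
# "flute" so that music for five guitars and one flute is not confused with
# music for five flutes and one guitar. the properties are invented.
from rdflib import Graph, Namespace, Literal, BNode

EX = Namespace("http://example.org/")
g = Graph()
expression = EX["expression/e1"]

for number, instrument in [(5, "guitar"), (1, "flute")]:
    part = BNode()                       # anonymous grouping node
    g.add((expression, EX.hasInstrumentation, part))
    g.add((part, EX.numberOfInstruments, Literal(number)))
    g.add((part, EX.instrumentType, Literal(instrument)))

print(g.serialize(format="turtle"))
```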
indeed, rob styles goes so far as to argue that ultimately there is no notion of a “record” in rdf.20 it is possible that blank nodes might be able to fill in for fields and subfields in some cases for grouping data, but there are dangers involved in their use.21 to a cataloger, it looks as though the plan is for rdf data to float around loose without any requirement that there be a method for pulling it together into coherent displays designed for human beings. 11. can a property have a property in rdf? as an example of where it might be useful to define a property of a property, robert maxwell suggests that date of publication is really an attribute (property) of the published by relationship (another property).22 another example: in my model, a variant title for a serial is a property. can that property itself have the property type of variant title to encompass things like spine title, key title, etc.? another example appeared in item 9, in which it is suggested that it might be desirable to assign sort-element properties to the various elements of a name property. 12. how do we document record display decisions? there is no way to record display decisions in rdf itself; it is completely display-neutral. we could not safely commit to a particular rdf–based data model until a significant amount of sample bibliographic data had been created and open-source indexing and display software had been designed and user-tested on that data. it may be that we will need to supplement rdf with some other encoding mechanism that allows us to record display decisions along with the data. current cataloging rules are about display as much as they are about content designation. isbd concerns the order in which the elements should be displayed to humans. the cataloging objectives concern display to users of such entity groups as the works of an author, the editions of a work, and the works on a subject. 13. can all bibliographic data be reduced to either a class or a property with a finite list of values? another way to put this is to ask if all that catalogers do could be reduced to a set of pull-down menus. cataloging is the art of writing discursive prose as much as it is the ability to select the correct value for a particular data element. we must deal with ambiguous data (presented by joe blow could mean that joe created the entire work, produced it, distributed it, sponsored it, or merely funded it). we must sometimes record information without knowing its exact meaning. we must deal with situations that have not been anticipated in advance. it is not possible to list every possible kind of data and every possible value for each type of figure 2. examples of part–whole relationships. how might these be best expressed in rdf? issued-with relationship a copy of charlie chaplin’s 1917 film the immigrant can be found on a videodisc compilation called charlie chaplin, the early years along with two other chaplin films. this compilation was published and collected by many different libraries and media centers. if a user wants to view this copy of the immigrant, he or she will first have to locate charlie chaplin, the early years, then look for the desired film at the beginning of the first videodisc in the set. the issued-with relationship between the immigrant and the other two films on charlie chaplin, the early years is currently expressed in the bibliographic record by means of a “with” note: first on charlie chaplin, the early years, v. 1 (62 min.) with: the count – easy street. 
figure 2. examples of part–whole relationships. how might these be best expressed in rdf?

issued-with relationship: a copy of charlie chaplin's 1917 film the immigrant can be found on a videodisc compilation called charlie chaplin, the early years along with two other chaplin films. this compilation was published and collected by many different libraries and media centers. if a user wants to view this copy of the immigrant, he or she will first have to locate charlie chaplin, the early years, then look for the desired film at the beginning of the first videodisc in the set. the issued-with relationship between the immigrant and the other two films on charlie chaplin, the early years is currently expressed in the bibliographic record by means of a “with” note: first on charlie chaplin, the early years, v. 1 (62 min.) with: the count – easy street.

bound-with relationship: the university of california, los angeles film & television archive has acquired a reel of 16 mm. film from a collector who strung five warner bros. cartoons together on a single reel of film. we can assume that no other archive, library, or media collection will have this particular compilation of cartoons, so the relationship between the five cartoons is purely local in nature. however, any user at the film & television archive who wishes to view one of these cartoons will have to request a viewing appointment for the entire reel and then find the desired cartoon among the other four on the reel. the bound-with relationship among these cartoons is currently expressed in a holdings record by means of a “with” note: fourth on reel with: daffy doodles – tweety pie – i love to singa – along flirtation walk.

what are the next steps? in a sense, this paper is a first crude attempt at locating unmapped territory that has not yet been explored. if we were to decide as a community that it would be valuable to move our shared cataloging activities onto the semantic web, we would have a lot of work ahead of us. if some of the rdf problems described above are insoluble, we may need to work with semantic web developers to create a more sophisticated version of rdf that can handle the transitivity and complex linking required by our data. we will also need to encourage a very complex existing community to evolve institutional structures that would enable a more efficient use of the internet for the sharing of cataloging and other metadata creation. this is not just a technological problem, but also a political one. in the meantime, the experiment continues. let the thinking and learning begin!

references and notes 1. “notation3, or n3 as it is more commonly known, is a shorthand non-xml serialization of resource description framework models, designed with human-readability in mind: n3 is much more compact and readable than xml rdf notation. the format is being developed by tim berners-lee and others from the semantic web community.” wikipedia, “notation 3,” http://en.wikipedia.org/wiki/notation_3 (accessed feb. 19, 2009). 2. frbr review group, www.ifla.org/vii/s13/wgfrbr/; frbr review group, franar (working group on functional requirements and numbering of authority records), www.ifla.org/vii/d4/wg-franar.htm; frbr review group, frsar (working group, functional requirements for subject authority records), www.ifla.org/vii/s29/wgfrsar.htm; frbroo, frbr review group, working group on frbr/crm dialogue, www.ifla.org/vii/s13/wgfrbr/frbr-crmdialogue_wg.htm. 3. library of congress, response to on the record: report of the library of congress working group on the future of bibliographic control (washington, d.c.: library of congress, 2008): 24, 39, 40, www.loc.gov/bibliographic-future/news/lcwgrptresponse_dm_053008.pdf (accessed mar. 25, 2009). 4. ibid., 39. 5. ibid., 41. 6. dublin core metadata initiative, dcmi/rda task group wiki, http://www.dublincore.org/dcmirdataskgroup/ (accessed mar. 25, 2009). 7. mikael nilsson, andy powell, pete johnston, and ambjorn naeve, expressing dublin core metadata using the resource description framework (rdf), http://dublincore.org/documents/2008/01/14/dc-rdf/ (accessed mar. 25, 2009). 8.
see for example table 6.3 in frbr, which maps to manifestation every kind of data that pertains to expression change with the exception of language change. ifla study group on the functional requirements for bibliographic records, functional requirements for bibliographic records (munich: k. g. saur, 1998): 95, http://www.ifla.org/vii/s13/frbr/frbr.pdf (accessed mar. 4, 2009). 9. roy tennant, “marc must die,” library journal 127, no. 17 (oct. 15, 2002): 26. 10. w3c, skos simple knowledge organization system reference, w3c working draft 29 august 2008, http://www.w3.org/ tr/skos-reference/ (accessed mar. 25, 2009). 11. the extract in figure 1 is taken from my complete rdf model, which can be found at http://myee.bol.ucla.edu/ ycrschemardf.txt. 12. mary w. elings and gunter waibel, “metadata for all: descriptive standards and metadata sharing across libraries, archives and museums,” first monday 12, no. 3 (mar. 5, 2007), http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/ article/view/1628/1543 (accessed mar. 25, 2009). 13. oclc, a holdings primer: principles and standards for local holdings records, 2nd ed. (dublin, ohio: oclc, 2008), 4, http:// www.oclc.org/us/en/support/documentation/localholdings/ primer/holdings%20primer%202008.pdf (accessed mar. 25, 2009). 14. the library of congress working group, on the record: report of the library of congress working group on the future of bibliographic control (washington, d.c.: library of congress, 2008): 30, http:// www.loc.gov/bibliographic-future/news/lcwg-ontherecord -jan08-final.pdf (accessed mar. 25, 2009). 15. talis, sir tim berners-lee talks with talis about the semantic web: transcript of an interview recorded on 7 february 2008, http://talis-podcasts.s3.amazonaws.com/twt20080207_timbl .html (accessed mar. 25, 2009). 16. bruce d’arcus, e-mail to author, mar. 18, 2008. 17. ibid. 18. rob styles, e-mail to author, mar. 25, 2008. 19. bruce d’arcus, e-mail to author, mar. 18, 2008. 20. rob styles, e-mail to author, mar. 25, 2008. 21. w3c, “section 2.3, structured property values and blank nodes,” in rdf primer: w3c recommendation 10 february 2004, http://www.w3.org/tr/rdf-primer/#structuredproperties (accessed mar. 25, 2009). 22. robert maxwell, frbr: a guide for the perplexed (chicago: ala, 2008). can bibliographic data be put directly onto the semantic web? | yee 69 entities/classes in rda, frbr, frad compared to yee cataloging rules (ycr) rda, frbr, and frad ycr group 1: work work group 1: expression expression surrogate group 1: manifestation manifestation title-manifestation serial title group 1: item item group 2: person person fictitious character performing animal group 2: corporate body corporate body corporate subdivision place as jurisdictional corporate body conference or other event as corporate body creator jurisdictional corporate subdivision family (rda and frad only) group 3: concept concept group 3: object object group 3: event event or historical period as subject group 3: place place as geographic area discipline genre/form name identifier controlled access point rules (frad only) agency (frad only) appendix. 
entity/class and attribute/property comparisons 70 information technology and libraries | june 2009 attributes/properties in frbr compared to frad model entity frbr frad work title of the work form of work date of the work other distinguishing characteristics intended termination intended audience context for the work medium of performance (musical work) numeric designation (musical work) key (musical work) coordinates (cartographic work) equinox (cartographic work) form of work date of the work medium of performance subject of the work numeric designation key place of origin of the work original language of the work history other distinguishing characteristic expression title of the expression form of expression date of expression language of expression other distinguishing characteristics extensibility of expression revisability of expression extent of the expression summarization of content context for the expression critical response to the expression use restrictions on the expression sequencing pattern (serial) expected regularity of issue (serial) expected frequency of issue (serial) type of score (musical notation) medium of performance (musical notation or recorded sound) scale (cartographic image/object) projection (cartographic image/object) presentation technique (cartographic image/object) representation of relief (cartographic image/object) geodetic, grid, and vertical measurement (cartographic image/ object) recording technique (remote sensing image) special characteristic (remote sensing image) technique (graphic or projected image) form of expression date of expression language of expression technique other distinguishing characteristic surrogate can bibliographic data be put directly onto the semantic web? | yee 71 model entity frbr frad manifestation title of the manifestation statement of responsibility edition/issue designation place of publication/distribution publisher/distributor date of publication/distribution fabricator/manufacturer series statement form of carrier extent of the carrier physical medium capture mode dimensions of the carrier manifestation identifier source for acquisition/access authorization terms of availability access restrictions on the manifestation typeface (printed book) type size (printed book) foliation (hand-printed book) collation (hand-printed book) publication status (serial) numbering (serial) playing speed (sound recording) groove width (sound recording) kind of cutting (sound recording) tape configuration (sound recording) kind of sound (sound recording) special reproduction characteristic (sound recording) colour (image) reduction ratio (microform) polarity (microform or visual projection) generation (microform or visual projection) presentation format (visual projection) system requirements (electronic resource) file characteristics (electronic resource) mode of access (remote access electronic resource) access address (remote access electronic resource) edition/issue designation place of publication/distribution publisher/distributor date of publication/distribution form of carrier numbering title-manifestation serial title item item identifier fingerprint provenance of the item marks/inscriptions exhibition history condition of the item treatment history scheduled treatment access restrictions on the item location of item attributes/properties in frbr compared to frad (cont.) 
72 information technology and libraries | june 2009 model entity frbr frad person name of person dates of person title of person other designation associated with the person dates associated with the person title of person other designation associated with the person gender place of birth place of death country place of residence affiliation address language of person field of activity profession/occupation biography/history fictitious character performing animal corporate body name of the corporate body number associated with the corporate body place associated with the corporate body date associated with the corporate body other designation associated with the corporate body place associated with the corporate body date associated with the corporate body other designation associated with the corporate body type of corporate body language of the corporate body address field of activity history corporate subdivision place as jurisdictional corporate body conference or other event as corporate body creator jurisdictional corporate subdivision family type of family dates of family places associated with family history of family concept term for the concept type of concept object term for the object type of object date of production place of production producer/fabricator physical medium event term for the event date associated with the event place associated with the event attributes/properties in frbr compared to frad (cont.) can bibliographic data be put directly onto the semantic web? | yee 73 model entity frbr frad place term for the place coordinates other geographical information discipline genre/form name type of name scope of usage dates of usage language of name script of name transliteration scheme of name identifier type of identifier identifier string suffix controlled access point type of controlled access point status of controlled access point designated usage of controlled access point undifferentiated access point language of base access point script of base access point script of cataloguing transliteration scheme of base access point transliteration scheme of cataloguing source of controlled access point base access point addition rules citation for rules rules identifier agency name of agency agency identifier location of agency attributes/properties in frbr compared to frad (cont.) 74 information technology and libraries | june 2009 attributes/properties in rda compared to ycr model entity rda ycr work title of the work form of work date of work place of origin of work medium of performance numeric designation key signatory to a treaty, etc. 
other distinguishing characteristic of the work original language of the work history of the work identifier for the work nature of the content coverage of the content coordinates of cartographic content equinox epoch intended audience system of organization dissertation or theses information key identifier for work language-based identifier (preferred lexical label) variant language-based identifier (alternate lexical label) language-based identifier (preferred lexical label) for work language-based identifier for work (preferred lexical label) identified by principalcreator in combination with uniform title language-based identifier (preferred lexical label) for work identified by title alone (uniform title) supplied title for work variant title for work original language of work responsibility for work original publication statement of work dates associated with work original publication/release/broadcast date of work copyright date of work creation date of work date of first recording of a work date of first performance of a work finding date of naturally occurring object original publisher/distributor/broadcaster of work places associated with work original place of publication/distribution/broadcasting for work country of origin of work place of creation of work place of first recording of work place of first performance of work finding place of naturally occurring object original method of publication/distribution/broadcast of work serial or integrating work original numeric and/or alphabetic designations—beginning serial or integrating work original chronological designations— beginning serial or integrating work original numeric and/or alphabetic designations—ending serial or integrating work original chronological designations— ending encoding of content of work genre/form of content of work original instrumentation of musical work instrumentation of musical work—number of a particular instrument instrumentation of musical work—type of instrument original voice(s) of musical work voice(s) of musical work—number of a particular type of voice voice(s) of musical work—type of voice original key of musical work numeric designation of musical work coordinates of cartographic work equinox of cartographic work original physical characteristics of work original extent of work original dimensions of work mode of issuance of work can bibliographic data be put directly onto the semantic web? | yee 75 model entity rda ycr work (cont.) 
original aspect ratio of moving image work original image format of moving image work original base of work original materials applied to base of work work summary work contents list custodial history of work creation of archival collection censorship history of work note about relationship(s) to other works expression content type date of expression language of expression other distinguishing characteristic of the expression identifier for the expression summarization of the content place and date of capture language of the content form of notation accessibility content illustrative content supplementary content colour content sound content aspect ratio format of notated music medium of performance of musical content duration performer, narrator, and/or presenter artistic and/or technical credits scale projection of cartographic content other details of cartographic content awards key identifier for expression language-based identifier (preferred lexical label) for expression variant title for expression nature of modification of expression expression title expression statement of responsibility edition statement scale of cartographic expression projection of cartographic expression publication statement of expression place of publication/distribution/release/broadcasting for expression place of recording for expression publisher/distributor/releaser/broadcaster for expression publication/distribution/release/broadcast date for expression copyright date for expression date of recording for expression numeric and/or alphabetic designations for serial expressions chronological designations for serial expressions performance date for expression place of performance for expression extent of expression content of expression language of expression text language of expression captions language of expression sound track language of sung or spoken text of expression language of expression subtitles language of expression intertitles language of summary or abstract of expression instrumentation of musical expression instrumentation of musical expression—number of a particular instrument instrumentation of musical expression—type of instrument voice(s) of musical expression voice(s) of musical expression—number of a particular type of voice voice(s) of musical expression—type of voice key of musical expression appendages to the expression expression series statement mode of issuance for expression notes about expression surrogate [under development] attributes/properties in rda compared to ycr (cont.) 
76 information technology and libraries | june 2009 model entity rda ycr manifestation title statement of responsibility edition statement numbering of serials production statement publication statement distribution statement manufacture statement copyright date series statement mode of issuance frequency identifier for the manifestation note media type carrier type base material applied material mount production method generation layout book format font size polarity reduction ratio sound characteristics projection characteristics of motion picture film video characteristics digital file characteristics equipment and system requirements terms of availability key identifier for manifestation publication statement of manifestation place of publication/distribution/release/broadcast of manifestation manifestation publisher/distributor/releaser/broadcaster manifestation date of publication/distribution/release/broadcast carrier edition statement carrier piece count carrier name carrier broadcast standard carrier recording type carrier playing speed carrier configuration of playback channels process used to produce carrier carrier dimensions carrier base materials carrier generation carrier polarity materials applied to carrier carrier encoding format intermediation tool requirements system requirements serial manifestation illustration statement manifestation standard number manifestation isbn manifestation issn manifestation publisher number manifestation universal product code notes about manifestation titlemanifestation key identifier for title-manifestation variant title for title-manifestation title-manifestation title title-manifestation statement of responsibilities title-manifestation edition statement publication statement of title-manifestation place of publication/distribution/release/broadcasting of titlemanifestation publisher/distributor/releaser, broadcaster of title-manifestation date of publication/distribution/release/broadcast of titlemanifestation title-manifestation series title-manifestation mode of issuance notes about title-manifestation title-manifestation standard number attributes/properties in rda compared to ycr (cont.) can bibliographic data be put directly onto the semantic web? | yee 77 model entity rda ycr serial title key identifier for serial title variant title for serial title title of serial title serial title statement of responsibility serial title edition statement publication statement of serial title place of publication/distribution/release/broadcast of serial title publisher/distributor/releaser/broadcaster of serial title date of publication/distribution/release/broadcast of serial title serial title beginning numeric and/or alphabetic designations serial title beginning chronological designations serial title ending numeric and/or alphabetic designations serial title ending chronological designations serial title frequency serial title mode of issuance serial title illustration statement notes about serial title serial title issn-l item preferred citation custodial history immediate source of acquisition identifier for the item item-specific carrier characteristics key identifier for item item barcode item location item call number or accession number item copy number item provenance item condition item marks and inscriptions item exhibition history item treatment history item scheduled treatment item access restrictions attributes/properties in rda compared to ycr (cont.) 
78 information technology and libraries | june 2009 model entity rda ycr person name of the person preferred name for the person variant name for the person date associated with the person title of the person fuller form of name other designation associated with the person gender place of birth place of death country associated with the person place of residence address of the person affiliation language of the person field of activity of the person profession or occupation biographical information identifier for the person key identifier for person language-based identifier (preferred lexical label) for person clan name of person forename/given name/first name of person matronymic of person middle name of person nickname of person patronymic of person surname/family name of person natural language order of forename, surname, middle name, patronymic, matronymic and/or clan name of person affiliation of person biography/history of person date of birth of person date of death of person ethnicity of person field of activity of person gender of person language of person place of birth of person place of death of person place of residence of person political affiliation of person profession/occupation of person religion of person variant name for person fictitious character [under development] performing animal [under development] corporate body name of the corporate body preferred name for the corporate body variant name for the corporate body place associated with the corporate body date associated with the corporate body associated institution other designation associated with the corporate body language of the corporate body address of the corporate body field of activity of the corporate body corporate history identifier for the corporate body key identifier for corporate body language-based identifier (preferred lexical label) for corporate body dates associated with corporate body field of activity of corporate body history of corporate body language of corporate body place associated with corporate body type of corporate body variant name for corporate body corporate subdivision [under development] place as jurisdictional corporate body [under development] attributes/properties in rda compared to ycr (cont.) can bibliographic data be put directly onto the semantic web? 
| yee 79 model entity rda ycr conference or other event as corporate body creator [under development] jurisdictional corporate subdivision [under development] family name of the family preferred name for the family variant name for the family type of family date associated with the family place associated with the family prominent member of the family hereditary title family history identifier for the family concept term for the concept preferred term for the concept variant term for the concept type of concept identifier for the concept key identifier for concept language-based identifier (preferred lexical label) for concept qualifier for concept language-based identifier variant name for concept object name of the object preferred name for the object variant name for the object type of object date of production place of production producer/fabricator physical medium identifier for the object key identifier for object language-based identifier (preferred lexical label) for object qualifier for object language-based identifier variant name for object event name of the event preferred name for the event variant name for the event date associated with the event place associated with the event identifier for the event key identifier for event or historical period as subject language-based identifier (preferred lexical label) for event or historical period as subject beginning date for event or historical period as subject ending date for event or historical period as subject variant name for event or historical period as subject place name of the place preferred name for the place variant name for the place coordinates other geographical information identifier for the place key identifier for place as geographic area language-based identifier (preferred lexical label) for place as geographic area qualifier for place as geographic area variant name for place as geographic area discipline key identifier for discipline language-based identifier (preferred lexical label) (name or classification number or symbol) for discipline translation of meaning of classification number or symbol for discipline attributes/properties in rda compared to ycr (cont.) 80 information technology and libraries | june 2009 model entity rda ycr genre/form key identifier for genre/form language-based identifier (preferred lexical label) for genre/form variant name for genre/form name scope of usage date of usage identifier controlled access point rules agency note: in rda, the following attributes have not yet been assigned to a particular class or entity: extent, dimensions, terms of availability, contact information, restrictions on access, restrictions on use, uniform resource locator, status of identification, source consulted, cataloguer’s note, status of identification, and undifferentiated name indicator. name is being treated as both a class and a property. identifier and controlled access point are treated as properties rather than classes in both rda and ycr. attributes/properties in rda compared to ycr (cont.) index to volume 24 150 book reviews networks and disciplines; !proceedings of the educom fall conference, october 11-13, 1972, ann arbor, michigan. princeton: educom, 1973. 209p. $6.00. as with so many conferences, the principal beneficiaries of this one are those who attended the sessions, and not those who will read the proceedings. 
except for a few prepared papers, the text is the somewhat edited version of verbatim, ad lib summaries of a number of workshop sessions and two panels that purport to summarize common themes and consensus. since few people are profound in ad lib commentaries, the result is shallow and repetitive. the forest of themes is completely lost among a bewildering array of trees. the conference was, i am sure, exciting and thought-provoking for the participants. it was simply organized, starting with statements of networking activities in a number of disciplines, i.e., chemistry, language studies, economics, libraries, museums, and social research. the paper on economics is by far the best organized presentation of the problems and potential of computers in any of the fields considered, and perhaps the best short presentation yet published for economics. the paper on libraries was short, that on chemistry lacking in analytical quality, that on language provocative, that on social research highly personal, and that on museums a neat mixture of reporting and interpreting. much of the information is conditional, that is, it described what might or could be in the realm of the application of computers to the various subjects. the speakers all directed their papers to the concept of networks, interpreted chiefly as widespread remote access to computational facilities. the papers are followed by very brief transcripts of the summaries of workshops in which the application of computers to each of the disciplines was presumably discussed in detail. much of each summary is indicative and not really informative about the discussions. the concluding text again is the transcript of two final panels on themes and relationships among computer centers. the only description for this portion of the text is turgid. in the midst of all this is the banquet paper presented by ed parker, who as usual was thoughtful and insightful, and several presentations by national science foundation officials that must have been useful at the time to guide those relying on federal funding for computer networks in developing proposals. i can't think of another reference that touches on the potential of computers in so many different disciplines, but it is apparent from the breadth of ideas and the range of suggested or tested applications that a coherent and analytical review should be done. this volume isn't it. russell shank smithsonian institution the analysis of information systems, by charles t. meadow. second edition. los angeles: melville publishing co., 1973. a wiley-becker & hayes series book. this is a revised edition of a book first published in 1967. the earlier edition was written from the viewpoint of the programmer interested in the application of computers to information retrieval and related problems. the second edition claims to be "more of a textbook for information science graduate students and users" (although it is not clear who these "users" are) . elsewhere the author indicates that his emphasis is on "software technology of information systems" and that the book is intended "to bridge the communications gap among information users, librarians and data processors." 
the book is divided into four parts: language and communication (dealing largely with indexing techniques and the properties of index languages) , retrieval of information (including retrieval strategies and the evaluation of system performance), the organization of information (organization of records, of ffies, file sets), computer processing of information (basic file processes, data access systems, interactive information retrieval, programming languages, generalized data management systems). the second two sections are, i feel, . much better than the first. these are the areas in which the author has had the most direct experience, and the topics covered, at least in their information retrieval applications, are not discussed particularly well or particularly fully elsewhere. it is these sections of the book that make it of most value to the student of information science. i am less happy about meadow's discussion of indexing and index languages, which i find unclear, incomplete, and inaccurate in places. the distinction drawn between pre-coordinate and post-coordinate systems is inaccurate; meadow tends to refer to such systems simply as keyword systems, although it is perfectly possible to have a post-coordinate system based on, say, class numbers, which can hardly be considered keywords, while it is also possible to have keyword systems that are essentially precoordinate. in fact, meadow relates the characteristic of being post-coordinate to the number of terms an indexer may use (" ... permit their users to select several descriptors for an index, as many as are needed to describe a particular document"), but this is not an accurate distinction between the two types of system. the real difference is related to how the terms are used (not how many are used), including how they are used at the time of searching. the references to faceted classification are also confusing and a number of statements are made throughout the discussion on index languages that are completely untrue. for example, meadow states (p. 51) that "a hierarchical classification language has no syntax to combine descriptors into terms." this is not at all accurate since several hierarchical classification schemes, including udc, do have synthetic elements which allow combination of descriptors, and some of these are highly synthetic. in fact, meadow himself gives an example (p. 3839) of this synthetic feature in the udc. it is also perhaps unfortunate that the student could read all through meadow's discussion of index languages without getting any clear idea of the structure of a thesaurus for information retrieval and how this thesaurus is applied in practice. book reviews 151 moreover, meadow used medical subject headings as his example of a thesaurus (p. 33-34), although this is not at all a conventional thesaurus and does not follow the usual thesaurus structure. my other criticism is that the book is too selective in its discussion of various aspects of information retrieval. for example, the discussion on automatic indexing is by no means a complete review of techniques that have been used in this field. likewise, the discussion of interactive systems is very limited, because it is based solely on nasa's system, recon. the student who relied only on meadow's coverage of these topics would get a very incomplete and one-sided view of what exists and what has been done in the way of research. in short, i would recommend this book for those sections (p. 
183-412) that deal with the organization of records and files and with related programming considerations. the author has handled these topics well and perhaps more completely, in the information retrieval context, than anyone else. indexing and index languages, on the other hand, are subjects that have been covered more completely, clearly, and accurately by various other writers. i would not recommend the discussion on index languages to a student unless read in conjunction with other texts. f. w. lancaster university of illinois application of computer technology to librm·y processes, a syllabus, by joseph becker and josephine s. pulsifer. metuchen, n.j.: scarecrow press, 1973. 173p. $5.00. despite the large number of institutions offering courses related to library automation, including just about every library school in north america, accredited or not, there is a remarkable shortage of published material to assist in this instruction. with the publication of this small volume a light has been kindled; let us hope it will be only the first of many, for larger numbers of better educated librarians must surely result in higher standards in the field. this syllabus covers eight topics related 152 journal of library automation vol. 7/2 jtme 1974 to the use of computers in libraries, titled as follows: bridging the gap (librarians and automation); computer technology; systems analysis and implementation; marc program; library clerical processes (which encompasses acquisitions, cataloging, serials, circulation, and management information) ; reference services; related technologies; and library networks. each topic is treated as a unit of instruction, and each receives the identical treatment as follows. the units each start with an introductory paragraph, explaining what the field encompasses, and indicating the purpose of teaching that topic. the purpose of systems analysis, for example, is "to develop the sequence of steps essential to the introduction of automated systems into the library." a series of behavioral objectives are then listed, to show what the student will be able to do (after he has learned the material) that he presumably was unable to do before. for example, there are seven behavioral objectives in the unit on computer technology, of which the first four are: "1) the student will be able to discuss the two-fold requirement to represent data by codes and data structures for purposes of machine manipulation, 2) the student will be able to identify the basic components of computer systems and describe their purposes, 3) the student will be able to differentiate hardware and software and describe briefly the part that programming plays in the overall computer processing operation, 4) the student will be able to define the various modes of computer operation and indicate the utility of each in library operations." the remaining three objectives refer to the student's ability to enumerate and compare types of input, output, and storage devices. then an outline of the instructional material is presented, followed by the detailed and well-organized material for instruction. in no case can the material presented here be considered all that an instructor would need to know about the field, but a surprising amount of specific detail is included, along with a carefully organized framework within which to place other knowledge. 
the end result is to present to the instructor a series of outlines that would encompass much of the material included in a basic introductory course in library automation. every instructor would, presumably, want to add other topics of his own in addition to adding other material to the topics treated in this volume, but he has here an extremely helpful guide to a basic course, and the only work of its kind to be published to date. peter simmons school of librarianship university of british columbia the larc reports, vol. 6, issue 1. online cataloging and circulation at western kentucky university: an approach to automated instructional resources ~anagement. 1973. 78p. this is a detailed account of the design, development, and implementation of online cataloging and circulation which have been in operation at western kentucky university for several years. the library's reasons for using computers are similar to those of many college and university libraries that experienced rapid growth during the 1960s. the faculty of the division of library services first prepared a detailed proposal with appropriate feasibility studies and cost analyses to reclassify the collection from dewey decimal to library of congress classification. the proposal was approved by the administration of the university, and the decision was made to utilize campus computer facilities via online input techniques for reclassification, cataloging, and circulation. "project reclass" was accomplished during 1970-71 using ibm 2741 ats/360 terminals. a circulation file was subsequently generated from the master record file. the main library is housed in a new building and has excellent computer facilities within the library that are connected to the university computer center. cataloging information is input directly into the system via ats terminals; ibm 2260 visual display terminals are used for inquiry into the status of books and patrons; and ibm 1031/1033 data collection terminals are used to charge out and check in books. catalog cards and book catalogs in upper/lower case are produced in batch mode on regular schedule. the on-line circulation book record file is used in conjunction with the on-line student master record and payroll master record files for preparation of overdue and fine notices. apparently the communication between library staff and computer personnel has been well above average, and cooperation of the administration and other interested parties has been outstanding. the attention given to planning, scheduling, training, and implementation is impressive. what has been accomplished to date is considered very successful, and plans are book reviews 153 underway to develop on-line acquisitions ordering and receiving procedures. the report has some annoying shortcomings such as referring to the library of congress as "national library"; frequent use of the word "xeroxing," which the xerox corporation is attempting to correct; "inputing" for "inputting"; and several other misspelled words. some parts are poorly organized and unclear, but the report does provide rriany useful details for those considering a similar undertaking. lavahn overmyer school of library science case western reserve university subject access to a data base of library holdings alice s. clark: assistant dean for readers' services, university of new mexico general library, albuquerque. at the time this research was undertaken, the author was head of the undergraduate libraries at ohio state university. 
267 as more academic and public libraries have some form of bibliographic description of their complete collection available in machine-readable form, public service librarians are devising ways to use the information for better retrieval. research at the ohio state university tested user 1'esponse to paper and com output from selected areas of the shelflist. results indicated usm·s at remote locations found such lists helpful, with some indication that paper printout was more popular than microfiche. while many of the computer applications in special libraries were designed to improve subject access to the collections, the systems adopted in academic and public libraries have often been those which would handle various file operations and improve control of circulation or technical processing functions. once some of the data describing the items in the collection became available in machine-readable form, reference librarians have been tempted to find ways to use it for subject retrieval. in november 1970, the ohio state university ( osu) libraries began to use its automated circulation system using a data base representing its complete shelflist with limited information on each title: field no. field 1 call number 2 author 3 title 4 lc number-or nolc if none available 5 title number 6 publication date (if available) 7 ser-serial indicator. when present indicates the title is a serial. 8 neng-non-english indicator. when present indicates the title is non-english. 9 size-oversize indicator. when present indicates the book is an oversize book. 268 journal of library automation vol. 7 i 4 december 197 4 field field no. 10 portxx:xx-portfolio number in which book is located (main library only). 11 mono-monographic set indicator. when present indicates 12 13 14 15 16 17 18 19 20 21 22 title has been designated a monographic set. number of holdings (not displayed if copy 1, main library) reference line number volume number copy number holdings· condition code library location patron identification number of specific saves for the copy circulation status date charged in the form of year, month, day date due in the form of year, month, day the system, modified from time to time, provided access by call number, record number, or author-title with an algorithm consisting of the first four letters of the author's name plus the first five letters of the title. a title search was also possible by entering four letters of the first significant word and five letters of the second significant word or five dashes. as soon as the system was implemented, it was immediately evident that the search option was one of the most important features of the system. the circulation clerk at any location either in the main library or in any department library could search the author and title and find: ( 1) if the osu libraries had the book; ( 2) where it was regularly housed; and ( 3) its status (charged out, missing, lost, or available for circulation). all of this was possible without checking the card catalog except when problems of identifying the main entry existed. the immediate lack was, of course, the subject approach. as use of the system continued and library personnel became more sophisticated, various procedures offering some kind of subject approach were developed. the title search option is one possibility for finding subject access. 
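a minimal python sketch of how such derived search keys might be computed follows; the lowercasing, punctuation handling, and stoplist are assumptions made for illustration, not a description of the actual osu programs.

def author_title_key(author, title):
    # derived key: first four letters of the author's name plus the
    # first five letters of the title
    def letters(s, n):
        return "".join(ch for ch in s.lower() if ch.isalpha())[:n]
    return letters(author, 4) + letters(title, 5)

def title_key(title):
    # title search key: four letters of the first significant word plus
    # five letters of the second significant word, padded with dashes
    stopwords = {"a", "an", "the", "of", "on", "and"}  # hypothetical stoplist
    words = [w.strip(".,;:") for w in title.lower().split()]
    words = [w for w in words if w and w not in stopwords]
    first = (words[0][:4] if words else "").ljust(4, "-")
    second = (words[1][:5] if len(words) > 1 else "").ljust(5, "-")
    return first + second

print(author_title_key("huxley, thomas henry", "evolution and ethics"))  # huxlevolu
print(title_key("child psychology"))                                     # chilpsych
print(title_key("evolution"))                                            # evol-----

the worked terminal searches below use exactly these derived keys (tls/evol and tls/chilpsych for title searches, ats/huxlevolu for an author-title search).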
for example, to find a book on "evolution" one can enter the title search command tls/evol----and receive a report that there are 757 titles in which evolution is the first significant word. the terminal will then print out items as follows: tls/evol----page 1 757 matches 01 lan, h. j. 02 moody, paul amos. 1903 03 brosseau, george e 04 adler, irving 05 lotsy, j. p. 0 skipped evolutie (not all retrieved) introduction to evolution evolution evolution evolution 1946 1970 1967 1965 subject access/clark 269 06 smith, john maynard, 192007 miller, edward on evolution evolution evolution evolution evolution . 1972 1917 19-1924 1951 08 watson, j. a. s. 09 kellogg, v. l; 10 shull, a. franklin when the user types in pg2 or pg3, more titles will come up, and if more than thirty titles are desired, the original command can be reentered with a /skip 30 option to display others including all 757 titles if necessary. it is also possible to manipulate this option further since this first. search may tum up the name of an author recognized as an authority on the subject. in this case, when thomas huxley's evolution and ethics appears, the terminal attendant changes to an author-title search, ats/huxlevolu, and finds eight matches, four books by thomas huxley and four by julian sorell huxley on the same subject: ats/huxlevolu page 1 8 matches 01 huxley, thomas henry 02 huxley, thomas henry 03 huxley, julian 04 huxley, thomas henry 05 huxley, julian sorell 06 huxley, thomas henry 07 huxley, julian sorell 08 huxley, julian sorell 0 skipped (all retrieved in 1) evolution and ethics, and other essays evolution and ethics and other essays evolution, the modern synthesis evolution and ethics and other essays evolution as a process evolution and ethics and other essays evolution in action 1st ed evolution as a process 2d ed 1970 1916 1942 1897 1954 1896 1953 1958 to find the call number of any of these, the attendant merely enters a detailed line search dsl/1: dsl/1 hm106h91896a huxley, thomas henry evolution and ethics, and other nolc 902452 1970 1 01 001 3week und page 1 end the ability to search by a word in the title, which in the above example gives a form of kwic subject index, is even more specific if two words are used. for example, the attendant may enter tsl/chilpsych to bring up titles containing the words "child" and "psychology" as follows: tls / chilpsych page 1 52 matches 0 skipped (not all retrieved) 01 jersild, arthur thomas, 1902child psychology. 4th 1954 02 jersild, arthur thomas, 1902child psychology 5th ed 1960 270 journal of library automation vol. 7/4 december 1974 03 thompson, george greene, 191404 kanner, leo 05 curti, margaret (wooster) 06 clarke, paul a 07 greenberg, harold a 08 english, horace bidwell 09 chess, stella 10 curti, margaret (wooster) child psychology 1952 child psychiatry 3d ed 1957 child psychology 1930 child-adolescent psychology 1968 child psychiatry in the commun 1950 child psychology 1951 an introduction to child psych 1969 child psychology 2d ed 1938 the obvious subject approach is, of course, by call number. the system contains an option that permits a search on the general call number. the operator may enter either a real or an imaginary call number and receive the fifteen titles preceding and the fifteen titles subsequent to it in the shelflist. 
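before the worked sps/ example that follows, note that the browse amounts to a binary search into the sorted shelflist followed by a window of neighboring entries. a minimal python sketch, with the fifteen-before/fifteen-after window taken from the description above and a sample abbreviated from the transcript below (the shelflist is assumed to be a list of call number and title pairs already sorted by call number):

import bisect

def shelflist_browse(shelflist, call_number, before=15, after=15):
    # the call number searched for may be real or imaginary, as the text notes
    keys = [cn for cn, _ in shelflist]
    i = bisect.bisect_left(keys, call_number)
    return shelflist[max(0, i - before):i + after]

sample = [
    ("hm106g77", "graubard, man the slave and master"),
    ("hm106h3", "haycraft, darwinism and race progress"),
    ("hm106h91896", "huxley, evolution and ethics and other essays"),
    ("hm106k29", "keller, societal evolution"),
]
for cn, title in shelflist_browse(sample, "hm106h9", before=2, after=2):
    print(cn, title)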
for example, with the command sps/hm106h9, using the call number from the previous example, the following ten titles will appear with that call number as the central item: sps/hm106h9 11 hm106g77 graubard, man the slave and master 12 hm106h3 haycraft, darwinism and race progress 13 hm106h57 herter, c. biological aspects of human problems 14 hm106h6 hill, g. c. heredity and selection in sociology 15 hm106h63 hoagland, evolution and man's progress 16 °hm106h9 17 hm106h91896 huxley, evolution and ethics and other essays 18 hm106h91896a huxley, evolution and ethics and other essays 19 hm106h91897 huxley, evolution and ethics and other essays 20 hm106h91916 huxley, evolution and ethics and other essays 21 hm106k29 keller, societal evolution; a study of the evolutionary basis page 2 input:hm106h9 entering pgl will bring up the ten preceding titles and pg3 the ten sub:sequent titles. one of the best features of this system is that the patron may call in by telephone and have at least some of this information read to him; if he is at a circulation area, he may receive a printout as an instant bibliography. recently an attempt has been made to use the file of data in other ways. in an attempt to provide better access to the main campus collection for the people at the five regional campuses of the university, an experiment was tried using a computer printout of certain selected parts of the shelflist. since microfiche is less expensive and more compact to handle, there were good reasons for using this form rather than the paper printout form. this was an obvious application for computer output microfiche (com). once subject access/clark 271 a master frame has been produced by com, the cost of additional copies is negligible. in order to test acceptance of form more accurately, it was decided to provide a list in each form to test on sample populations. to cover some of the subjects taught at the agricultural and technical institute at wooster, a total of 20,672 titles were selected in the following areas: agricultural economics botany agriculture agricultural machinery wood technology woodworking hd1401-2210 qk10-942 s tj148g-1496 ts80g-937 tt13g-200 2,121 titles 1,039 titles 17,157 titles 6 titles 197 titles 152 titles these titles were printed in a hard-copy printout in the following format with a program designed by gerry guthrie of the research and development division of the osu libraries: call number = tj1496c3a3 title number = 196795 author = caterpillar tractor company title = fifty years on tracks publ. date = 1954 holdings = cool com regular lc number = 55-20529 the physical form of the resulting documents varied somewhat due to the fact that each subject area was put in one cover. this meant "agriculture" ( s) with 17,157 titles was too bulky to carry around, but "wood technology" was compact and easily carried to one's office or home for leisurely browsing. a brief questionnaire was used to test the reaction to the list. responses were received from 6 percent of the students and faculty at the agricultural and technical institute. with the usual assumption that some students are not library users, there was some validity to the sample. results tabulated from these questionnaires fell into three categories: ( 1) nature of use; ( 2) value of the list; and ( 3) response to its form and format. since some questions were left blank, the totals were often less than 100 percent. 
nature of use the responses turned out to be evenly divided between faculty and students, 46 percent for each with some leaving this question blank. the faculty indicated that two-thirds of the use was for themselves and one-third for the students. students, of course, used it totally for their own purposes. the actual purpose of the list had been envisioned as access to the main campus collection, and increases in interlibrary loans indicated that it was 272 journal of libmry automation vol. 7/4 december 1974 effective. loans during the month of october 1973 totaled four while november's loans totaled thirty-four, showing a marked difference after the delivery of this search tool on october 31. the questionnaire showed that 77 percent indicated they used the information for this purpose. it should not have been a surprise to librarians to find that 34 percent of the sample population used the information to order a duplicate copy for the wooster ati library, an indication of readers' known proclivity for wanting their material close at hand. users' evaluation the increase in interlibrary loans was probably a better reflection of the users' approval than the actual questionnaire results, although the results themselves were also highly positive. seventy-seven percent checked that they found it valuable, against 15 percent who did not. eighty-five percent said they wanted more lists. requests for additional suggestions included a request to keep it up to date and a request to limit it to just recently published items, while another person asked for all of the titles located in the agricultural engineering library. the requests indicated that several additional subject areas were wanted: communication skills, personnel management, human relations, use of airplanes in agricultural, irrigation, and drainage engineering, and environmental pollution. suitability of form and format some attempt was made to determine how people react to the admittedly inconvenient form of a computer printout. since financial considerations limited the possibilities to either this form or microfiche, those options were presented in the questionnaire. preference for the paper form was expressed by the users of the list in this form-84 percent to 8 percent who would have preferred microfiche. · the population was evenly divided as to whether or not they wished to have the list in this call number order-50 percent wanted it by straight shelflist or call number order and 50 percent wanted it alphabetically by author. the latter response may very well reflect the proportionally large number of respondents who were faculty and who supposedly would know the authors in their fields and do not use a subject approach when seeking materials. while the original purpose of the research was to provide better subject access to a remote collection, it was also important to find out more about the user's response to microfiche if he could be given an improvement in service or a service he did not previously have. microfiche would be both more compact and less expensive if lists of this type were to be provided in many subjects and continually updated. for the microfiche section of the research project the library of congress classifications covering classics and related fields were chosen, partly subject access/clark 273 on the basis that faculty in these areas had agreed to participate and encourage their students to use the list. 
included were: de1-de98 df101-df289 dg11-dg209 n563q-n5790 na20q-na335 pa-all z7001-z7005 history-the mediterranean world history-greece history-italy history of art-greek and roman architecture-history-greek and roman language and literature of greece and rome bibliographies in linguistics, roman and greek literature, teaching languages this subset produced about eleven thousand titles. the format of the com was the same as that on the paper printout, with general titles appearing at the top of each sheet or frame, e.g., shelflist-classics-greece. this took twenty-two microfiche with sixty-nine frames each listing seven or eight titles. the last frame on each fiche was an index to that fiche. a nonreduced (eyeball) character at the top listed the first call number on the fiche. it was envisioned that the user might know the general classification number, search for it by the eyeball character, then consult the index in the last frame to locate the proper frame for a specific class. in this way the user could browse through the subject area. the chief advantage of com lay in the fact that the small envelope of microfiche and a portable reader were easy to check out of the library and carry home or to an office where the user could browse through the library shelflist at a leisurely pace. since initial reaction was negative, a subject index was prepared to make the list more usable to undergraduate students. this index was made up of the appropriate entries which appeared in the library of congress classification schedules, with all entries consolidated into one alphabet. 1 using this index to find an entry-for example, "caesar, c. julius" -the student would find two areas to search: dg261-267 and pa6235-6269. he would find these areas on the microfiche with the eyeball characters, then search the index frame to find the appropriate pages. the classics list with its index and instructions was packaged in neat, loose-leaf notebook form and, together with a portable reader, presented to classics faculty at two regional campuses. a set was also available in the library. the results were completely negative. reliance upon the cooperation of too small a number of cooperating teachers may have invalidated this part of the research, but the contrast in response to the similar printed list raised serious questions about user response to microfiche in an index or reference book situation.2 it had been anticipated that a population in the humanities or social sciences would have had more need than the science group for what was essentially a book list since serial titles did not include 27 4 j oumal of library automation vol. 7 i 4 december 197 4 holdings. the complete lack of interest from the faculty in the field of classics was an unexpected disappointment but no firm conclusions could be drawn without a research strategy designed to remove any possible variables. conclusion increased use of marc cataloging through such systems as oclc and ballots will mean many more libraries will have their total holdings in machine-readable form with the capability of using their records in new ways. programs for distributing microfiche copies of library catalogs such as georgia tech's lends program provide inspiration for public service librarians to make use of the data and technology that technical services automation projects are supplying. 
8 this experiment in manipulating machine-readable library records for use in subject searching was an attempt toward better retrieval of a library's collection and indicated that such programs would be useful to extend service outside a single library location. references 1. it may soon be possible to do this in a much simpler fashion by using the combined indexes to the libl'ary of congmss classification schedttles (washington, d.c.: u.s. historical documents institute, 1974). 2. doris bole£, "computer-output microfilm," special libraries 65:169-75 (april 1974). in describing the use of com at the washington university school of medicine, bole£ said, "there is, however, an additional disadvantage, namely, the resistance of users to the use of microforms because of their inconvenience. patrons will sometimes choose not to read a publication when told it is available in some sort of microform only. it is assumed that librarians are not quite as reluctant, but it would be a mistake not to take this reluctance into consideration. this resistance by both librarians and patrons is stronger than is usually reported by com manufacturers and service bureaus" ( p.170-71). 3. the georgia tech libl'ary's complete card catalog is now available in microfiche form, brochure (atlanta: price gilbert memorial library, georgia institute of technology, 1972). 324 journal of library automation vol. 7/4 december 1974 book reviews current awareness and the chemist, a study of the use of ca con.densates by chemists, by elizabeth e. duncan. metuchen, n.j.: scarecrow press, 1972. 150p. $5.00. this book starts with a five-page foreword by allen kent entitled "kwic indexes have come a long way-or have they?" kent is always interesting but when one detects that his foreword is becoming almost an. apologia, one wonders just what is to come. the remainder of the book (apart from the index) appears already to have been presented as dr. duncan's ph.d. thesis at the university of pittsburgh. the first two chapters are the usual sort of stuff, taking us from alexandria in the third century to columbus, ohio in 1970, with undistinguished reviews of user studies and the history of the chemical abstracts service. the remaining sixty-four pages of text report and discuss a study of the use of ca condensates by quite a small sample of academic and industrial chemists in the pittsburgh area. the objective appears to have been to compare profile hits with periodical holdings and interlibrary loan requests at the client's library so that a decision model for the acquisition of periodicals could be developed. on the author's own admission, this objective was not achieved. a certain amount of data is presented but it is difficult to draw many conclusions from it, other than the fact that chemists do not appear to follow up the majority of profile hits that they receive nor do they use the current issues of chemical abstracts very frequently. it is difficult to understand why this material was published in book form. it could have been condensed to one or possibly two papers for ].chem.doc. or perhaps even left for the really diligent seeker to find on the shelves of university microfilms-but, as the old testament scribe bemoaned, "of making many books there is no end." at the bottom of page 118 a reference is made to the paper by abbott et al. in aslib proceedings (feb. 1968); at the top of page 119 the same paper's date is given as january 1968. 
other errors are less obvious, but one really questions whether the provision of a short foreword and an index makes even a good thesis worth publishing in hard covers. r. t. bottle the city university london, u.k. computer-based reference service, by m. lorrai'ne mathies and peter g. watson. chicago: american library assn., 1973. 200p. $9.95. the archetypal title and model for all works of explication is ....... without tears. lorraine mathies and peter watson have attempted the praiseworthy task of explaining computer-produced indexes to the ordinary reference librarian, but for a number of reasons, some of them probably beyond the control of the authors, the tears will remai'n, perhaps one difficulty is that this book was, in its beginnings at least, the product of a committee. back in 1968 the information retrieval committee of the reference services division of the ala wanted to present to "working reference librarians the essentials of the reference potential of computers and the machine-readable data they produce" (p.xxix). the proposal worked its way (not untouched, of course) through several other groups and eventually resulted in a preconference workshop on computer-based reference service being given at the dallas convention of 1971. the present book is based on the tutor's manual which mathies and watson prepared for that workshop but incorporates revisions suggested by the ala publishing services as well as changes initiated by the authors themselves. with so many people getting into the planning act, it is not surprising that the various parts of the book should end up by working at cross purposes to each other. unfortunately, the principal conflicts come at just those points where a volume of exposition needs to be most definite and precise: just what is the book trying to do and for whom? at the original workshop, the eric data base was chosen as a "model system" since educational terminology was more likely to be understood than that of the sciences. and because the participants were to learn by doing, they were told a great deal about eric so as to be able to "practice" on it. the trouble is that these objectives do not translate well from workshop to print. the detafls about eric, which may have been necessary as tutors' instructions, seem misplaced in book form. almost half the present book is devoted to a laborious explanation of how eric works and this is a great deal more than most workaday reference librarians will want to know about it. moreover, it is no longer clear whether mathies and watson aim to train "producers" or "consumers." the welter of detail suggests that they expect their readers to learn hereby to construct profiles and to program searches but it is highly doubtful that skills of this kind can or should be imparted on a "teach yourself" basis. once mathies and watson leave eric behind, they seem on surer ground. part ii (computer searching: principles and strategies) begins with a fairly routine chapter on binary numeration which is perhaps unnecessary since this material is easily available elsewhere. however, the section quickly moves on to an excellent explanation of boolean logic and weighting, describes their application in the formulation of search strategies, and ends with an admirably succinct and demystifying account of how one evaluates the output (principles of relevance and recall). the reader might well have been better served if the book had indeed begun with this part. 
the last section (part iii: other machine readable data bases) is also very useful, particularly for the "critical bibliography" (p.153) in which the authors describe and evaluate ten of the major bibliographic data bases. this critical bibliography is apparently a first of its kind, which makes the authors' perceptive and frank comments all the more welcome. part iii also contains chapters on marc and the 1970 census but, sh·angely enough, does not include a final resume and conclusions. it is true that in each book reviews 325 chapter there is a paragraph or so of summary but this is hardly a satisfactory substitute for the overall recapitulation one would expect. in the final analysis, indeed, one's view of the book will depend on just thatwhat one expects of it. if "working reference librarians" expect to read this book in order to be no longer "intimidated by these electronic tools" (p.ix), they are apt to be disappointed. the inordinate emphasis on eric, the rather dense language, and the fact that the main ideas are never pulled together at the end will all prevent easy enlightenment. however, if our workaday reference librarians are willing to work their way through a fairly difficult manual on computer-based indexing as in effect a substitute 'for a workshop on the subject, they will find this book a worthwhile investment of their time-and tears. samuel rothstein school of lihl'arianship university of british columbia the circulation system at the university of missouri-columbia library: an evolutionary approach. sue mccollum and charles r. sievert, issue eds. the larc reports, vol. 5, issue 2, 1972. 101p. in 1958 the university of missouri-columbia library was one of the first libraries to mechanize circulation by punching a portion of the charge slip with book and borrower and/ or loan information. in 1964 an ibm 357 data collection system utilizing a modified 026 keypunch was installed, but not until 1966 was 026 output processed on the library owned and operated ibm 1440 computer. however, budgetary constraints forced a transfer of operations in 1970 to the data processing center, which undertook rewriting of library programs in 1971. after explanation of hardware changes and an overview of the circulation department organization and data processing center operation, this report deals in depth with the major files of the circulation system-circulation master flle and location master file-and the main components of the circulation system-edit, update, overdues, fines, interlibrary loans, 326 journal of libmry automation vol. 7/4 december 1974 address file, location file, reserve book, listing of files, special requests, and utility programs. many examples of report layouts are included, particularly those accomplished by utilizing data gathered from main collection and reserve book loans. although this off-line batch processing circulation system is limited in that it does not handle any borrower reserve or lookup (tracer) routines, both of which are possible in off-line systems, the university of missouri-columbia system has merit as a pioneer system which influenced other university library circulation system designs in the 1960s. detailed reference given throughout the report to changes in the original library programs not only makes it of value as a case history for any library interested in circulation automation but also indicates the important fact that library programs do change and evolve in response to new demands and technological capabilities. lois m. 
kershnm university of pennsylvania libraries national science information systems, a guide to science information systems in bulgaria, czechoslovakia, hungary, poland, rumania, and yugoslavia, by david h. kraus, pranas zunde, and vladimir slamecka. (national science information series) cambridge, mass.: the m.i.t. press, 1972. 325p. $12.50. as indicated by the title, this volume provides a comparative description and analysis of the various organizational or political structures which have been adopted by six counb·ies of central and eastern europe in their attempts to develop effective national systems for the dissemination of scientific and technical information. for each country there is a detailed account of the national information system now existing, with a brief outline of its antecedents, a directory of information or documentation centers, a list of serials published by these centers, and a bibliography of recent papers dealing with the development of information systems in that country. this main section of the book is preceded by a brief review of the common characteristics of the six national systems and an outline of steps being taken to achieve international cooperation for the exchange of information in specific subjects. of particular interest is the description of the international center of scientific and technical information established in moscow in 1969, and which is now linked to five of these national systems. no attempt is made to describe the techniques being used to store, retrieve, and disseminate information. the authors point out that the six countries being examined "have experimented intensely with organizational variants of national science information systems." unfortunately, they do not attempt to indicate which of these organizational structures was most effective in bringing about the desired results. undoubtedly, this would have been an impossible task and probably not worth the effort, since a successful type of organization in a socialist country would not necessarily be effective in a democracy. the book will be of interest to political scientists and to those seeking the most effective ways of coordinating the information processing efforts of all types of government bodies. it will be only of academic interest to the information specialist concerned primarily with information processing techniques. jack e. brown national science library of canada ottawa information retrieval: on-line, by f. w. lancaster and e. g. fayen. los angeles: melville publishing co., 1973. 597p. lc: 73-9697. isbn: 0-471-51235-4. have you been reading the asis annual review of information science and technology year after year and wishing for a compendium of the best information and examples of the latest systems, user manuals, cost data, and other facts so that you would not have to go searching in a library for the interesting reports, journal articles, and books? well, if you have (and who hasn't), your prayers have been answered if you are interested in online bibliographic retrieval systems. the authors of the handy reference book have collected and reprinted, among other things, the complete dialog terminal users reference manual, the supars user manual, the user instructions for aim-twx, obar, and the caruso tutorial program. each of these systems, and several others (arranged alphabetically from aim-twx [medline] to to xi con [toxline]), is described and illustrated. 
features and functions of on-line systems, such as vocabulary control and indexing, cataloging, instruction of users, equipment, and file design, are all covered in a straightforward manner, simply enough for the uninformed and carefully enough so that a system operator could compare his system's features and functions with the data provided. richly illustrated with tables, charts, graphs, and figures, up-to-date bibliographies (only serious omission noticed was the afips conference proceedings edited by d. walker), and subject and author indexes, this volume will stand as another landmark in the state-of-the-art review series which the wiley-becker & hayes information science series has come to represent. emphasis has been placed on the design, evaluation, and use of on-line retrieval systems rather than the hardware or programming aspects. several of the chapters have a broader base of interest than on-line systems, covering as they do performance criteria of retrieval systems, evaluating effectiveness, human factors, and cost-performance-benefits factors. easy to use and as up to date and balanced a book as any in a rapidly changing field can be, lancaster and fayen have given students of information studies and planners and managers of information services a very valuable reference aid. pauline a. atherton school of information studies syracuse university national library of australia. australian marc specification. canberra: national library of australia, 1973. 83p. $2.50. isbn: 0-642-99014-x for those readers who are familiar with book reviews 327 the library of congress marc format, the australian marc specification will be, for the most part, self-explanatory. the intent of the document is to describe the basic format structure and to list the various content designators that are used in the format. no effort was made to include any background information or explanation of data elements. because of this, the reviewer found it necessary to refer to other documents, e.g., precis: a rotated subiect index system, by derek austin and peter butcher, in order to complete a comparative analysis of the australian format with other similar formats. perhaps the value of reviewing a descriptive document of this type lies in discovering how the format it describes compares to other existing formats developed for the same purpose. the international organization for standardization published a format for bibliographic information interchange on magnetic tape in 1973, international standard iso 2709, the australian format structure is the same throughout as the international standard. the only variance is in character positions 20 and 21 of the leader, which the australian format left undefined. a comparison of content designators cannot be made with the international standard because it specifies only the position and length of the identifiers in the structure of the format, but not the actual identifier (except for the three-digit tags 001-999 that identify the data fields). the best comparison of content designators can be made with the lc marc format, since the australian format uses many of the same tags, indicators, and subfield codes for the same purposes. the australian format has assigned to the same character positions the same fixed-length data elements as the lc format except for position 38, which is the periodical code in the australian format and the modified record code in the lc format. in the fixed-length character. 
positions for form of contents, publisher (government publication in lc marc), and literary text (fiction in lc 328 journal of library automation vol. 7/4 december 1974 marc) , the australian format assigned different codes than lc. in general, the australian format uses the same three-digit tags as lc to identify the primary access fields in their records, e.g., 100, 110, 111 for main entries; 400, 410, 411, 440, 490 for series notes; 600, 610, 611, 650, 651 for subject headings; and 700, 710, 711 for added entries. for the remaining bibliographic fields there are some variations in tagging between the two formats. the australian marc has chosen a different method of identifying uniform titles, and has identified five more note fields in the 5xx series of tags than has lc. the australians have also added some manufactured fields to their record. these fields do not contain actual data from the bibliographic record, but rather are fields consisting of data created by program for control and manipulation purposes, or from lists such as the precis subject index. the australian format has also included, as part of its record, a series of cross-reference fields identified by 9xx tags. lc has reserved the 9xx block of tags for local use. the use of indicators differs in most instances between the two formats. both allow for two indicator positions in each field as specified by the international standard format structure. however, the information conveyed by the indicators differs except where the first indicator conwhich means no intelligence carried in this position. in the australian format the indicators in the 6xx block of tags have three different patterns. inconsistency of this kind does not tend to destroy compatibility with other coding systems using the same format structure, as long as sufficient explanation and examples are given from which conversion tables may be developed by the institutions with whom one wants to exchange, or interchange, bibliographic data. an even greater degree of difference exists between the two formats in the subfield codes used to identify data elements. the australian marc has identified some data elements that lc has not, e.g., in personal name main entries, the australian record identifies first names with subfield code "h," whereas lc does not identify parts of a personal name, only the form of the name, i.e., forename form, single surname, family name, etc. in most of the fields the two formats have defined some of the same data elements, but each uses a different subfield code to represent the element. in the australian document, under each field heading, the subfield codes are listed alphabetically with a data element following each code. this arrangement causes the data elements to fall out of their normal order of occurrence in the field, i.e., name, numeration, titles, dates, relator, etc. for example: personal name main entry (tag 100) subfield code a b amtralian marc entry element ( name) relator lc marc entry element (name ) numeration c dates d e second or subsequent additions to name numeration titles ( honorary) dates relator f additions to name other than date date (of a work) veys form of name for personal and corporate name headings. within each block of tags, lc has made an effort to remain consistent in the use of indicators, e.g., in the 6xx block for subject headings, the first indicator specifies form of name where a form of name can be discerned. 
where no form of name is discernable such as in a topical subject heading (tag 650), a null indicator or blank is used the example demonstrates the need for precise definition and documentation of data elements for the purpose of conversion or translation when interchanging data with other institutions. the australian format has included the capability of identifying analytical entries by using an additional digit (called the level digit) placed between the tag and the indicators to identify the analytical entries. a subrecord directory (tag 002) is present in each record containing data for analytical entries. the australian document includes appendixes for the country of publication codes, language codes, and geographical area codes that were developed by the library of congress. their only deviabook reviews 329 tion from lc marc usage is in the country of publication codes, where the australians have added entities and codes for australian first-level administrative subdivisions. patricia e. parker marc development office library of congress mitchell multimedia will have a profound effect on libraries during the next decade. this rapidly developing technology permits the user to combine digital still images, video, animation, graphics, and audio. it can be delivered in a variety of finished formats, including streaming video on the web, video on dvd/vcd, embedded digital objects within a web page or presentation software such as powerpoint, utilized within graphic designs, or printed as hardcopy. this article examines the elements of multimedia creation, as well as requirements and recommendations for implementing a multimedia facility in the library. t he term multimedia, which some may remember being used in the early 1970s as the name for slide shows set to music, now is used to describe “a number of diverse technologies that allow visual and audio media to be combined in new ways for the purpose of communicating.”1 almost all personal computers sold today are capable of viewing multimedia; many can, with minor modifications, also create multimedia. one of the most important features of multimedia is its flexibility. multimedia creation has several distinct elements—inputs, processes performed on those inputs, and outputs (see figure 1). each element can be described as follows. � inputs—new video can be recorded, or existing video, stored on a hard disk, cd/dvd, or tape can be imported. the same is true of audio, with the added flexibility of creating soundtracks or sound effects later, during the editing process. digital still images can be used, either shot on a camera or created by scanning an existing picture. digital artwork or animated sequences created in other software also can be brought in. � processing—regardless of the source, these digital inputs are loaded into the editing software. at this stage, the user will select and arrange the images and sounds, and the software may permit special effects to be created. in addition, the editing software may compress the file so that it is easier to use than the large file sizes used in raw video and audio recording. � outputs—at this point, the user has more choices to make. the new multimedia file can be sent to a program that will encode it for a streaming video in any one of a variety of popular formats, such as windows media, realmedia, or clipstream. 
then it can be mounted on a web site (either a regular page or within courseware such as webct or blackboard), or the file could be burned onto a cd or dvd, or it could be used within presentation software such as microsoft powerpoint. or the output file from the editing process could be encoded and embedded so that it is an avatar running as part of a web page with a product such as rovion bluestream. the possibilities are nearly endless. all of this is made possible by advances in technology on a variety of fronts. one of the happy anomalies in technology is that greater performance is frequently accompanied by lower costs. this is certainly the case with much of the activity surrounding multimedia. the following factors have fostered advances in multimedia: � increase in processing power and decrease in cost of computer hardware; � quality and affordability of video equipment; � compression of multimedia files; � consumer broadband internet access; and � current multimedia editing software the first two technology factors concern the equipment involved in multimedia production. leading off is the familiar, ever-increasing speed of processors and improved memory and hard-drive space, all delivered for less money. this trend is something that many people take for granted, but a reality check is sometimes in order. the processor in the typical desktop machine on advertised special today is approximately forty-four times as fast as the first pentium processor sold ten years ago, and is equipped with sixteen times as much ram and 117 times as much hard-drive space—at 20 percent of the cost of the old machine (not even adjusted for inflation!). the second factor is the incredible quality available in consumer-market video equipment at reasonable costs. while the images produced with consumer-grade video would not play well at the local megaplex movie theater, they look very good on the small screens found on computers, televisions, and classroom projectors. the third factor is that tremendous compression of multimedia files can be achieved during the editing process. an incoming raw-video file (in the standard .avi format) can be compressed with editing, encoding, and dedicated third-party compression software to an incredible 1 to 2 percent of its original size, and it will still retain very good quality as a digital object on the web and in other desktop viewing applications. the fourth factor is extremely critical for the success of multimedia web applications. home access is shifting away from dial-up access to broadband, with its greatly increased transfer rates. half of all united states homes with internet access are already using broadband, and the 32 information technology and libraries | march 2005 gregory a. mitchell (mitchellg@utpa.edu) is assistant director, resource management at the university of texas—pan american library, edinburg, texas. distinctive expertise: multimedia, the library, and the term paper of the future gregory a. mitchell forecast is for steady increase in these numbers.2 although not all broadband is created equal, it is all significantly faster than dial-up access. the final technology factor concerns the software that is currently available to the multimedia web developer. a developer can achieve some quite professional results with even the most basic products, and then can grow into more complex software that supports increasing levels of expertise. once again, this software is being sold in the price range that typical consumers can afford. 
� small really is beautiful creating a multimedia lab in the library need not be a large, complex undertaking. in fact, it can be very low cost and as simple as a single workstation. so it is scalable, allowing the library to start small and build in complexity and cost as time, money, and human resources will permit. at the bare-bones minimum, a multimedia lab would consist of a workstation with the software necessary for acquiring, editing, and outputting the files. for practical purposes, though, the workstation should be equipped with a network connection, a cd/dvd burner, a scanner, and a webcam with microphone. another very useful option is an analog-digital bridge device, which enables the capture of analog input (such as vhs tape) into digital files for the editor. to achieve better-quality video when shooting original content, a digital-video camera, tripod, wireless microphone, and portable light kit would be recommended. since more time typically is spent at the editing station than with the camera, the lab can be expanded with additional workstations before investing in another camera. experience at the author’s institution has shown that it is possible to operate a lab with ten workstations and only three video cameras and three still cameras. finally, output from the editing process will likely be printed, so a photoquality printer is another convenient option. this illustrates that the entry into multimedia work need not be a large expense, especially if an existing workstation and any other equipment is already available. if a fairly recent workstation is available to dedicate to the project, the library’s total startup cost could range from $200 to $1,000. not many new library services can be launched for as little as that. rather than dwell on equipment specifications, as that is not the intent of this discussion, the reader may consult the excellent tutorials available from desktop video and pc magazine’s online product guide.3 finally, the creation of a studio is a worthwhile option. although some video will need to be shot on location, many times it is possible to set up and shoot in just one place. a studio is the best place in which to work because it is a controlled environment. it does not need to be large or complicated, and a quiet office or study room can be set up with little effort and expense. the studio gives the users control over the sound and the lighting, and involves minimal setup time for projects. � the research paper of the future multimedia has begun to attract attention in the library community. joe janes, chair of library and information science at the information school at the university of washington and the person responsible for developing the internet public library, recently stated he foresees a growing role for multimedia in the library. it will replace much of the traditional, text-based communication that people are accustomed to. for example, multimedia projects can become the research paper of the future for students.4 it is the media in which many library customers will be working. experience from the author’s institution with creating a multimedia lab would seem to confirm his observation. during the first year and a half of operation, use of the lab has steadily increased (see figure 2). � collaboration the multimedia lab opens the doors to collaborative opportunities with faculty and students from a variety of disciplines across campus. 
this is because multimedia, like geographic information systems (gis) or other electronic information and communication technologies, is a tool and is not discipline-specific. as important as it is to make the connection with faculty, this media is something with which the students will frequently lead the figure 1. multimedia creation process distinctive expertise: multimedia, the library, and the term paper of the future | mitchell 33 34 information technology and libraries | march 2005 way. they are, after all, the mtv generation, and multimedia has an incredible appeal to their visual orientation. faculty themselves have used it to augment their web-based courses as well as traditional classroom instruction. the author ’s library has even initiated a multimedia résumé service for graduating students. the students can record a video introduction of themselves, encode this as a rovion bluestream avatar, and post it with their résumés on the web. this creates a much stronger impression than a standard résumé, hopefully giving the students an edge in promoting themselves on the job market. even more impressive is the variety of projects that are created in the lab by the students. one might expect to see interest from students in art and communications classes, but students come from many other disciplines as well. for example, business students have effectively used multimedia in their graduate-school business-plan presentations, while biology students like to use the graphics capabilities to study close-ups of slides. education students have employed it to produce multimedia instructional aids, and a sociology student put together a presentation on underserved, low-income neighborhoods. the library supplies the facility and instruction—only the imagination of students is needed. libraries have always been involved in the students’ research and writing process, by providing content, instruction, and facilities for producing the final research product. the same is true in the multimedia environment, although implementing a multimedia lab calls for some new skills for librarians. these include familiarity with basic principles of videography, learning how to use the cameras and other equipment, and gaining some mastery of the editing and encoding software. � why put it in the library? in addition to the research-paper analogy, the author believes that librarians can point with pride to the values and value that libraries offer their communities. it is a central and neutral location—not in one department’s or college’s turf. libraries are conveniently open for many hours per week. many of the information resources that students might use to prepare the presentation are in the library. and librarians have a professional ethic that drives them to provide instruction and assistance for the services the library offers. since multimedia production does have a learning curve and most new users need help in mastering the technology, it does not fit very well with the typical 24/7 drop-in computer lab that the campus information technology (it) often operates. this is a good opportunity for librarians to recognize some of their strengths and capitalize on them. in addition, this can be a breath of fresh air for librarians. here is an opportunity to learn about something new and creative. 
most people find that they have less room for creativity as time goes by.5 with a multimedia lab in the building, it will offer the librarians the opportunity to create multimedia productions for the library, besides assisting students and faculty with their projects. � potential problems there are some obstacles to overcome, of course. they need not be seen as major, but it is best to be realistic when beginning any new venture. it is almost always a good idea to start small, with a pilot project that will yield valuable lessons before venturing into anything big. � equipment—define what specifications are needed, see what is already available to use or borrow, then figure out what you will actually need to buy. � software—check out the variety of software for editing and production; think about how you want to begin using multimedia (primarily on the web, in presentation software such as powerpoint, as standalone videos on cds and dvds). � money—if funding permits, a library can invest several thousand dollars in a high-end multimedia computer, associated peripherals such as a color printer and one or more scanners, and a software suite to meet initial anticipated demands for multimedia creation and editing. if funding is scarce, you may want to investigate what existing equipment could be used in support of a pilot project. � location—this needs some space of its own, accessible to students and monitored by staff. although the figure 2. university of texas—pan american library multimedia lab usage editing workstation could be in an area with other computers, a quiet area is needed for shooting video so that there will not be interference from noise and unwanted foot traffic through the shots. � staffing and training—a multimedia lab is not a good candidate for self-service. librarians and staff who will provide the service need to learn how to use the equipment and software. make sure that they all have an acceptable level of competence and confidence so that the library can shine with its new service, but expect that everyone will need to continue to learn and grow in their proficiency. if your library plans to produce its own multimedia sessions as well, it would be a good investment to attend a class on television or video production. � hours—how many hours per week will the new service be available? if it is the entire time the library is open, be prepared to train plenty of staff. repeat users will need less help as their skills increase (by the way, some of these students can be great workstudy employees). � instruction—plan to offer formal orientation and instruction sessions to faculty and their classes. if your lab is small, this is challenging, but it can be accomplished with some creativity. for example, a general instruction session on concepts can be done in a classroom, followed up by a series of small groups working by appointment for the appliedlearning component in the multimedia lab. the author and a colleague have even done instruction outside the library using laptops and cameras, creating a de facto mobile studio. � copyright—if there are already vcrs or photocopiers in the library, you have had to deal with this issue. the pan american library at university of texas does not allow people to use its lab to copy movies, which is a request that surely will come to you, and we post the usual copyright notices just as we do at our photocopiers. for some excellent information on copyright, visit the american library association web site (www.ala.org). 
� evaluation—plan on at least basic evaluation of the service. this can include an assessment of the effectiveness of the instruction sessions, a survey of satisfaction with the lab itself, a questionnaire on the intended uses of the multimedia projects, demographic data on the students, or other student input. logs of the number of uses and peak-demand periods are extremely useful for planning and for justifying further expenditures and staffing requests. � flexibility for the future—whatever you do in a pilot phase, always keep in mind that you want to keep an open mind—you are trying to learn from the experience so that you can make good decisions for the direction of this new service. it may not go exactly the way you originally thought, because of serendipity, or changes in technology, or very strong demand from some segments of the campus instead of others, or other environmental factors. � conclusion benefits to the library from the multimedia lab are many. one of the most important benefits is that it keeps the library involved in the process of academic communication, as the medium of the communication changes with technology. by being involved in this evolving medium at its early stages, the library is poised to pounce on opportunities to employ it to the benefit of the library in instruction and content delivery. the library also would position itself on campus as a key player in it and the leading local expert in the growing field of multimedia. since multimedia is a tool that crosses the entire range of subject disciplines on campus, it opens the doors of faculty to collaborate with librarians in exciting new ways. just as many campuses already have learning and collaborative communities that grew around their web courseware or gis endeavors, so too can one develop around multimedia. the appendix offers a list of multimedia web sites to consider. libraries are more than warehouses of books and periodicals. as more and more of our resources have been made available electronically, and indeed more of higher education has moved to electronic delivery, many libraries have been faced with declining gate counts, circulations, and reference statistics. as someone observed, we are victims of our own success. so what is the role of the library? we are intrinsically involved in the process of instruction, academic research, and communication. as kling observed, “one important strategic idea is that libraries configure their it services and activities to emphasize the distinctive expertise of their librarians rather than simply concentrate on the size and character of the documentary collection.”7 it is imperative therefore that libraries pick out the new trends that will allow them to excel by capitalizing on their traditional strengths. references 1. scala, inc. multimedia directory. accessed apr. 21, 2004, www.scala.com/multimedia/multimedia-definition.html. 2. nielsen/netratings as of june, 2004. accessed aug. 10, 2004, www.websiteoptimization.com/. 3. about.com, dvt101. accessed apr. 15, 2004, http:// desktopvideo.about.com/library/weekly/aa040703a.htm; “anatomy of a video editing workstation,” pc magazine. accessed apr. 16, 2004, www.pcmag.com/article2/0,1759,1264650 ,00.asp. distinctive expertise: multimedia, the library, and the term paper of the future | mitchell 35 36 information technology and libraries | march 2005 4. college of dupage, “joe janes and colleagues: preparing for the future of digital reference,” a satellite broadcast from the college of dupage, 16 apr. 2004. 5. 
sandra kerka, creativity in adulthood (columbus, ohio: eric clearinghouse on adult career and vocational education, eric digest no. 204, ed429186, 1999). 6. american library association, “copyright issues, primer on the digital millennium.” accessed may 10, 2004, www.ala .org/ala/washoff/woissues/copyrightb/dmca/dmcprimer.pdf. 7. rob kling, “the internet and the strategic reconfiguration of libraries,” library administration & management 15, no. 3 (summer 2001): 144–51. appendix. for further reading: a multimedia web-site tour the following is a sampling of some of the most popular and interesting multimedia software, with examples of completed productions. this is not an official endorsement of any one product over another, whether listed here or not. a look at these sites will, however, give the reader an idea about the power and possibilities of multimedia communications. adobe (www.adobe.com) the well-known makers of some of the most powerful and popular editing software packages for graphics and video. camtasia (www.camtasia.com) easy to use, this is a good example of the type of software that does screen capture and recording, which is handy for producing online tutorials. clipstream (www.clipstream.com) an excellent example of the type of newer encoding software that achieves incredible compression of video and delivers it over the web with no viewer or plug-ins required for the user. finalcut pro (www.apple.com/finalcutpro) a perennial favorite among the mac crowd, this software is relatively easy to learn and lets the developer achieve dramatic results. flashants (www.flashants.com) a handy program that converts flash animation into .avi video format so that you can integrate animated sequences into a video production. macromedia (www.macromedia.com) the makers of flash and director, which are some of the most popular graphics, animation, and mulitimedia editing tools in the business. pinnacle (www.pinnaclesys.com) what finalcut pro is to the mac, this package is for the pc environment. easy to use, yet sophisticated in the results achieved. rovion (www.rovion.com) rovion bluestream is an encoder that enables the creation of avatar characters to appear live on your web page. a plugin is required for the user, but this approach definitely gets attention. serious magic (www.seriousmagic.com) an award-winning software package that allows you to turn a workstation into a studio, complete with teleprompter capability, sound effects, graphics, and editing. university of texas—pan american library (www.lib.panam.edu/libinfo/media.asp) links to multimedia projects at the author’s institution, including productions made by staff and students. reproduced with permission of the copyright owner. further reproduction prohibited without permission. is this a geolibrary? a case of the idaho geospatial data center jankowska, maria anna;jankowski, piotr information technology and libraries; mar 2000; 19, 1; proquest pg. 4 is this a geolibrary? a case of the idaho geospatial data center maria anna jankowska and piotr jankowski the article presents the idaho geospatial data center (igdc), a digital library of public-domain geographic data for the state of idaho. the design and implementation of igdc are introduced as part of the larger context of a geolibrary model. the article presents methodology and tools used to build igdc with the focus on a geolibrary map browser. the use of igdc is evaluated from the perspective of access and demand for geographic data. 
finally, the article offers recommendations for future development of geospatial data centers. i n the era of integrated transnational economies, demand for fast and easy access to information has become one of the great challenges faced by the traditional repositories of information-libraries. globalization and the growth of market-based economies have brought about, faster than ever before, acquisition and dissemination of data, and the increasing demand for open access to information, unrestricted by time and location. these demands are mobilizing libraries to adopt digital information technologies and create new methods of cataloging, storing, and disseminating information in digital formats. libraries encounter new challenges constantly. participation in the global information infrastructure requires them to support public demand for new information services, to help the society in the process of selfeducation, and to promote the internet as a tool for sharing information. these tasks are becoming easier to accomplish thanks to the growing number of digital libraries. since 1994, when the digital library initiative originated as part of the national information infrastructure program, the internet has accommodated many digital libraries with spatial data content. for example, the electronic environmental library project at the university of california, berkeley (http:/ /elib.cs. berkeley.edu/) provides botanical and geographic data; the university of michigan digital library teaching and learning project (www.si.umich.edu/umdl/) focuses on earth and space sciences; the carnegie mellon's informedia digital video library (www.informedia. cs.cmu.edu) distributes digital video, audio, and images maria anna jankowska (majanko@uidaho.edu) is associate network resources librarian, university of idaho library, and piotr jankowski (piotrj@uidaho.edu) is associate professor, department of geography, university of idaho, moscow, idaho. 4 information technology and libraries i march 2000 with text; and the alexandria digital library at santa barbara (http:/ /alexandria.sdc.ucsb.edu/) provides geographically referenced information. the alexandria digital library is of special interest in this article because it implements a model of a geolibrary. a geolibrary stores georeferenced information searchable by geographic location in addition to traditional searching methods such as by author, title, and subject. the purpose of this article is to present the idaho geospatial data center (igdc) in the larger context of a geolibrary model. igdc is a digital library of publicdomain geographic and statistical data for the state of idaho. the article discusses methodology and tools used to build igdc and contrast its capabilities with a geolibrary model. the usage of igdc is evaluated from the perspective of access and demand for geographic data. finally, the article offers recommendations for future development of geospatial data centers. i geographic information systems for public services terms such as digital, electronic, virtual, or image libraries have existed long enough to inspire diverse interpretations. 
the broad definition by covi and king concentrates on the main objective of digital libraries, which is the collection of electronic resources and services for the delivery of materials in different formats.1 the common motivation for initiatives leading to the development of digital libraries is to allow conventional libraries to move beyond their traditional roles of gathering, selecting, organizing, accessing, and preserving information. digital libraries provide new tools allowing their users not only to access the existing data but also to create new information. the creation of new information using the existing data sources is essential to the very idea of the digital library. since the information in a digital library exists in virtual form, it can be manipulated instantaneously by computer-based information processing tools. this is not possible using traditional information media (e.g., paper, microfilm) where the information must first be transferred from non-digital into digital format. since late 1994, when the u.s. national science foundation founded the alexandria digital library project, the number of internet sites devoted to spatially referenced information has grown dramatically. today, it would require a serious expenditure of time and effort to visit all geographic data sites created by state agencies, universities, and commercial organizations. in 1997 karl musser wrote, "there are now more than 140 sites featuring interactive maps, most of which have been created in the last two years." 2 this incredible boom in publishing reproduced with permission of the copyright owner. further reproduction prohibited without permission. spatial data is possible thanks to geographic information system (gis) technology and data development efforts brought about by the rapidly increasing use of gis. this new technology provides its users with capabilities to automate, search, query, manage, and analyze geographic data using the methods of spatial analysis supported by data visualization. traditionally, geographic data were presented on maps considered as public assets. according to a norwegian survey, the aggregate benefit accrued from using maps was three times the total cost of their production, even though maps provided only static information.3 today, the conventional distribution of geographic data on printed maps has become less efficient than distributing them in the digital format through wide area data networks. this happened largely due to gis's ability to separate data storage from data presentation. as a result, data can be presented in a dynamic way, according to users' needs. often gis is termed "data mixing system" because it can process data from different sources and formats such as vector-format maps with full topological and attribute information, digital images of scanned maps and photos, satellite data, video data, text data, tabular data, and databases. 4 all of these data types provide a rich informational infrastructure about locations and properties of entities and phenomena distributed in terrestrial and subterrestrial space. the definition of gis changes according to the discipline using it. gis can be used as a map-making machine, a 3-d visualization tool, and as an analytical, planning, collaboration, and business information management tool. today, it is hard to find a planning agency, city engineering department, or utility company (not to mention individual internet users) that has not used digital maps. 
this is why the number of users seeking spatial data in digital format has increased so dramatically. data discovery can be for gis users the most time-consuming part of using the technology. 5 as a result, libraries are faced with the growing demand for services that help discover, retrieve, and manipulate spatial data. the web greatly improved the availability and accessibility of spatial data but, at the same time, stimulated public interest in using geographic information. the continuing migration to popular operating systems (i.e., microsoft windows family) and the adoption of their common functionality has brought gis software to many desktops. tools such as arcview gis from environmental systems research institute, inc. (esri, www.esri.com) or maplnfo from maplnfo corporation (maplnfo, www.mapinfo.com) have become popular gis desktop systems. new software tools such as arcexplorer, released by esri, are focused on making gis more accessible, simpler, and available for use by the public. by taking advantage of the popularity of the web, attempts are being made to gain a wider acceptance of gis. in the wake of the simplification of gis tools and improved access to spatial data, a new exciting area of gis use has recently emerged-public participation gis.6 public participation gis by definition is a pluralistic, inclusive, and nondiscriminatory tool that focuses on the possibility of reducing the marginalization of societies by means of introducing geographic information operable on a local level.7 it promotes an understanding of spatial problems by those who are most likely to be affected by the implementation of problem solutions, and encourages transfer of control and knowledge to these parties. this approach leads to a broader use of gis tools and spatial data and creates new challenges for libraries storing and serving geographic data in digital formats. broadening the use of data and gis tools requires attention to data access. traditional libraries have often fulfilled the crucial role of being an impartial information provider for all parties involved in public decision-making processes. will they be capable of serving the society in this capacity in the digital age? i geolibrary as a repository of georeferenced information according to brandon plewe, the user of spatial data can choose among seven types of distributed geographic information services available on the intemet. 8 they range from raw data download, through static map display, metadata search, dynamic map browsing, data processing, web-based gis query and analysis, to net-savvy gis software. yet, another important new category of geographic data service that can be added to this list is geolibrary. goodchild defines a geolibrary as a library filled with georeferenced information where the primary basis of representation and retrieval are spatial footprints that determine the location by geographic coordinates. "the footprints can be precise, when they refer to areas with precise boundaries, or they can be fuzzy when the limits of the area are unclear." 9 according to buttenfield, "the value of a geolibrary is that catalogs and other indexing tools can be used to attach explicit locational information to implicit or fuzzy requests, and once accomplished, can provide links to specific books, maps, photographs, and other materials." 10 a geolibrary is distinguished from a traditional library in being fully electronic, with digital tools to access digital catalogs and indexes. 
it is anticipated that most of the information is archived in digital form. the value of a geolibrary is that it can be more than a traditional, physical library in electronic form.11 is this a geolibrary? i jankowska and jankowski 5 reproduced with permission of the copyright owner. further reproduction prohibited without permission. since its introduction, the concept of a geolibrary has been synonymous with the alexandria digital library (aol) project. once aol was defined as the internetbased archive providing comprehensive browsing and retrieval services for maps, images, and spatial information.12 a more recent definition characterizes aol as a geolibrary where a primary attribute of collection objects is their location on earth, represented by geographic footprints. a footprint is the latitude and longitude values that represent a point, a bounding box, a linear feature, or a complete polygonal boundary.13 according to goodchild (1998) a geolibrary' s components include: • the browser-a specialized software application running on the user's computer and providing access to geolibrary via a computer network. • the basemap-a geographic frame of reference for the browser's searches. a basemap provides the image of an area corresponding to the geographical extent of geolibrary collection. for the worldwide collection this would be the image of the earth. for the statewide collection this could be the image of a state. the basemap may be potentially large, in which case it is more advantageous to include it in the browser then to download it from a geolibrary server each time a geolibrary is accessed. • the gazetteer-the index that links place names to a map. the gazetteer allows geographic searches by place name instead of by area. • server catalogs-collection catalogs maintained on distributed computer servers. the servers can be accessed over a network with the browser, utilizing basic server-client architecture. the value of a geolibrary lies in providing open access to a multitude of information with geographic footprints regardless of the storage media. because all information in a digital library is stored using the same digital medium, traditional problems of physical storage, accessibility, portability, and concurrent use (e.g., many patrons wanting to view the one and only copy of a map) do not exist. i idaho geospatial data center in 1996, inspired by the aol project, a team of geographers, geologists, and librarians started to work on a digital library of public-domain geographic data for the state of idaho. the main goal of the project was the development of a geographic digital data repository accessible through a flexible browsing tool. the project 6 information technology and libraries i march 2000 was funded by a grant from the idaho board of education's technology incentive program. the project resulted in the creation of the idaho geospatial data center (igdc, http://geolibrary.uidaho.edu). the first in the state of idaho, this digital library is comprised of a database containing geospatial datasets, and geolibrary software that facilitates access, browsing, and retrieval of data in popular gis data formats including digital line graph (dlg), digital raster graphics (drg), usgs digital elevation model (dem), and u.s. bureau of census tiger boundary files for the state of idaho. the site also provides an interactive visual analysis of selected demographic/economic data for idaho counties. 
the site additionally provides interactive links to other idaho and national spatial data repositories. the key component of the library is the geolibrary software. the name "geolibrary" is not synonymous with the model of a geolibrary defined by goodchild (1998); it was rather adopted as a reference to the geolibrary browser, one of the components of a geolibrary. the geolibrary browser (gl) supports online retrieval of spatial information related to the state of idaho. it was implemented using microsoft visual basic 5.0/6.0 and esri mapobjects technology. the software allows users to query an area of interest using a search based on map selection, as well as selection by area name (based on the usgs 7.5-minute quad naming convention). queries return gis data including dems, dlgs, drgs, and tiger files. queries are intended both for professionals seeking gis-format data and nonprofessionals seeking topographic reference maps in the drg format. the interface of gl consists of three panels resembling the microsoft outlook user interface. our intent in designing the interface was to have panels that would be used in the following order. first, the map panel is used to explore the geographic coverage of the geolibrary and to select the area of interest. next, the query panel is used to execute a query, and finally the result panel allows the user to analyze results and to download spatial data. users can use a shortcut to go directly to the query panel and type their query. both approaches result in the output being displayed as the list of files available for download from participating servers.

the map panel (figure 1) includes a navigable map of idaho, a vertical command toolbar, and a map finder tool. the command toolbar allows the user to zoom in, zoom out, pan the map, identify by name the entities visible on the map canvas, and select a geographic area of interest. geographic entity name identification was implemented as a dynamic feature whereby the name of the entity changes as the user moves the mouse over the map. spatial selection provides a tool to select a rectangular area of interest directly on the map canvas. the map finder provides additional means to simplify the exploration of the map: the user can select a county or a quad name and zoom in on the selected geographic unit.

figure 1. map panel. the vertical toolbar provides zooming, panning, as well as labeling and simple feature querying capabilities. the map finder allows finding and selecting an area by county or usgs quad name. the screen copy here presents the selection of latah county in idaho.

the query panel (figure 2) allows the user to perform a query, based either on the selection made on the map or a new selection using one of the available query tools (figure 3). in the latter case, the user can enter geographic coordinates (in decimal degrees) defining the area of interest. this approach is equivalent to selecting a rectangular area directly on the map, and will return all data files that spatially intersect with the selected area. optionally, the user can handpick quads of interest from the list. finally, a name can be entered to execute a more flexible query. for instance, a search containing the word "moscow" returns spatial data related to three quads containing "moscow" within their names. the query is executed when the user presses the query button. after the results are received, the application automatically switches to the results panel.

figure 2. query panel. the interface was set to query spatial selection from the map panel.

figure 3. query panel. the query is based on the selection of usgs quads. optionally, the user can enter geographic coordinates of the area or a text to search.

the results panel shows the outcome of the query and includes important information about the data files: their size, type, projection, scale, the name of the server providing the data, as well as the access path (figure 4). based on this information, the user has the option of manually connecting to the server, using the ftp protocol, and retrieving the selected files. a much more convenient approach, however, is to rely on the gl software to automatically retrieve the files through the software interface. as an option, the result of the query can also be exported to a plain html document that contains links to all listed files. this feature can be very useful in the case of multiple files selected by the user and slow or limited-time internet access. this way the user can open the saved list of files in a web browser and download individual files as needed, without having to download all the files at once and tie up the internet connection for a long period of time. the result panel provides a flexible way to review and organize the outcomes of queries before commencing the download. one can sort files by name, size, scale, projection, and server name. this feature may be useful if the user decides to retrieve data of only one type (e.g., dems), of one scale, or when the user prefers to connect only to a specific server. in addition, individual records as well as entire file types can be deselected to prevent files from being downloaded. the user can also remove selected files to scale down the set of data in the list.

figure 4. the results panel. results of a query can be sorted; individual items can be removed from the list or can be deselected to prevent them from being downloaded.

one of the most important assets of the gl browser is that all of the user activities described up to this point, with the exception of file download, take place entirely on the client side without any network traffic. in fact, area/file selection as well as queries do not require an active internet connection. map exploration is based on vector-format maps contained in the gl software, and queries are run against the local database. such an approach limits bandwidth consumption and unnecessary network traffic. an internet connection is only necessary to perform retrieval of selected files. the vulnerability of the client-side approach to data query is that the user may be left with a potentially outdated local database. in order to prevent this problem from happening, gl is equipped with a database synchronization mechanism that allows users to keep up with the server database updates. the client-side database contained in the gl software, which mirrors the schema of the server database, can be synchronized automatically or at the user's request. in either case, the gl client contacts the database synchronizer on the server side and handles all necessary processes. since the synchronization is limited to database record updates, the network traffic is kept low, making gl suitable for limited internet connections. igdc is an open solution. new local datasets can be added or removed, making the collection easily adaptable to different geographical areas.
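as a rough illustration of the client-side query behavior described above, where a rectangular area of interest in decimal degrees is intersected with the footprints of locally cataloged data files, the sketch below shows one way such a check could work. the catalog entries, footprints, and function names are invented for the example and are not taken from the gl software.

# hypothetical local catalog: each record carries a bounding-box footprint
# given as (west, south, east, north) in decimal degrees; entries are illustrative only.
CATALOG = [
    {"file": "moscow_east.dem", "type": "dem", "bbox": (-117.000, 46.625, -116.875, 46.750)},
    {"file": "moscow_west.drg", "type": "drg", "bbox": (-117.125, 46.625, -117.000, 46.750)},
    {"file": "bovill.dlg",      "type": "dlg", "bbox": (-116.500, 46.750, -116.375, 46.875)},
]

def intersects(a, b):
    # true when two (west, south, east, north) boxes overlap
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

def query(area_of_interest, catalog=CATALOG):
    # return every record whose footprint intersects the area of interest
    return [rec for rec in catalog if intersects(rec["bbox"], area_of_interest)]

# the whole query runs against the local catalog; no network connection is needed
# until the user decides to download the selected files
for rec in query((-117.05, 46.60, -116.90, 46.80)):
    print(rec["file"], rec["type"])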
datasets can also physically reside on multiple servers, taking full advantage of the internet's distributed nature.

evaluation of igdc use

geospatial information is among the most common public information needs; almost 80 percent of all information is geographic in nature. published research reflecting those needs and the role of libraries in resolving them is not extensive. the efforts of federal, state, and local agencies collecting digital geospatial data and the growth of gis created an interest in the role of libraries as repositories of geospatial data.14 the main obstacle to providing access to digital spatial information is its complexity. this is why a user-friendly interface is critical for presenting spatially referenced information.15 the igdc has been a first attempt at creating a user-friendly interface in the form of a map-based data browser allowing users to access and retrieve geographic datasets about idaho. in order to track and evaluate the use of geospatial data, webtrends software was installed on the igdc server. the webtrends software produces customized web log statistics and allows tracking information on traffic and dataset dissemination. during a one-year timeframe the number of successful hits was more than twenty-five thousand. almost 40 percent of users came from the .com domain, 35 percent were .net domain users, 15 percent were .org, and 10 percent were .edu users (figure 5). tracking the geographic origin of users by state, the biggest number of users came from virginia, followed by washington, california, ohio, and idaho. the high number of users from virginia can be explained by the linking of the igdc site to one of the most popular geospatial data sites in the country, the united states geological survey (usgs) site. eighty-four percent of user sessions were from the united states; the rest originated from sweden, canada, and germany. the average number of hits per day on weekdays was around one hundred.

figure 5. distribution of igdc users in percent by origin domain.

since the opening of igdc for public use (april 1998), the geolibrary map browser was downloaded 1,352 times. the software proved to be relatively easy to use by the public. out of forty-four bug reports/user questions submitted to igdc, most were concerned with filling out the software registration form and not with software failure. the igdc project spurred an interest in geographic information among students, faculty, and librarians at the university of idaho. in a direct response to this interest, the university of idaho library installed a new dedicated computer at the reference desk with geolibrary software to access, view, and retrieve igdc data.
the most popular retrievable information was digital raster graphics (drg) data, which present scanned images of usgs standard series topographic maps at 1:24,000 scale. digital elevation models (dem) and digital line graphs (dlg) were less popular. the tiger boundary files for the state of idaho were in small demand. the popularity of drg-format maps, and the fact that most of the users accessed igdc via the usgs web site, make plausible a speculation that most of the users were non-gis specialists interested in general reference geographic information about idaho, including topography and basic land use information.

conclusion

the idaho geospatial data center is the first geospatial digital library for the state of idaho. it does not fulfill all requirements of the geolibrary model proposed by goodchild and others: the igdc has only two components of the geolibrary model, the geolibrary map browser and the basemap. the main difference between the geolibrary map browser and the web-based browser solution adopted by other spatial repositories is a client-side solution to geospatial data query and selection. spatial data query is done locally on the user's machine, using the library database schema contained in the geolibrary map browser. this saves time by eliminating client-server communication delays during data searches, gives the user an experience of almost instantaneous response to queries, and reduces the network communication to the data download time. in comparison with the geolibrary model, igdc is missing the gazetteer. this component can help improve the ease of user navigation through a geospatial data collection. the other useful component includes online mapping and spatial data visualization services. the idea of such services is to provide the user with a simple-to-operate mapping tool for visualizing and exploring the results of user-run queries. one such service, currently under implementation at igdc, includes thematic mapping of economic and demographic variables for idaho using descartes software.16 descartes is a knowledge-based system supporting users in the design and utilization of thematic maps. the knowledge base incorporates domain-independent visualization rules determining which map presentation technique to employ in response to the user selection of variables. an intelligent map generator such as descartes can enhance the utility of a geolibrary by providing tools to transform georeferenced data into information.

references and notes

1. l. covi and r. king, "organizational dimensions of effective digital library use: closed rational and open natural systems models," journal of the american society for information science 47, no. 9 (1996): 697.
2. k. musser, "interactive mapping on the world wide web" (1997), accessed march 6, 2000, www.min.net/~boggan/mapping/thesis.htm.
3. t. bernhardsen, geographic information systems (arendal, norway: viak it and norwegian mapping authority, 1992), 2.
4. ibid., 4.
5. j. stone, "stocking your gis data library," issues in science and technology librarianship (winter 1999), accessed march 6, 2000, www.library.ucsb.edu/istl/99-winter/article1.html.
6. p. schroeder, "gis in public participation settings" (1997), accessed june 2, 1999, www.spatial.maine.edu/ucgis/testproc/schroeder/ucgisdft.htm.
7. w. j.
craig and others, "empowerment, marginalization, and public participation gis," report of a specialist meeting held under the auspices of the varenius project, santa barbara, california, oct. 15–17, 1998, ncgia, uc santa barbara.
8. b. plewe, gis online: information retrieval, mapping, and the internet (santa fe, n.m.: onword press, 1997), 71–91.
9. m. f. goodchild, "the geolibrary," in innovations in gis 5: selected papers from the fifth national conference on gis research uk (gisruk), ed. s. carver (london: taylor and francis, 1998), 59, accessed march 6, 2000, www.geog.ucsb.edu/~good/geolibrary.html.
10. b. p. buttenfield, "making the case for distributed geolibraries" (1998), accessed march 6, 2000, www.nap.edu/html/geolibraries/app_b.html.
11. ibid.
12. m. rock, "monitoring user navigation through the alexandria digital library" (master's thesis abstract, 1998), accessed march 6, 2000, http://greenwich.colorado.edu/projects/rockm.htm.
13. l. l. hill and others, "geographic names: the implementation of a gazetteer in a georeferenced digital library," d-lib magazine 5, no. 1 (1999), accessed march 6, 2000, www.dlib.org/dlib/january99/hill/01hill.html.
14. m. gluck and others, "public librarians' views of the public's geospatial information needs," library quarterly 66, no. 4 (1996): 409.
15. b. p. buttenfield, "user evaluation for the alexandria digital library project" (1995), accessed march 6, 2000, http://edfu.lis.uiuc.edu/allerton/95/s2/buttenfield.html.
16. g. andrienko and others, "thematic mapping in the internet: exploring census data with descartes," in proceedings of telegeo '99, first international workshop on telegeoprocessing, lyon, may 6–7, r. laurini, ed. (seiten, france: claude bernard univ. of lyon, 1999), 138–45.

hitting the road towards a greater digital destination: evaluating and testing dams at university of houston libraries
annie wu, santi thompson, rachel vacek, sean watkins, and andrew weidner
annie wu (awu@uh.edu) is head of metadata and digitization services, santi thompson (sathompson3@uh.edu) is head of repository services, rachel vacek (evacek@uh.edu) is head of web services, sean watkins (slwatkins@uh.edu) is web projects manager, and andrew weidner (ajweidner@uh.edu) is metadata services coordinator, university of houston libraries.
information technology and libraries | june 2016

abstract

since 2009, tens of thousands of rare and unique items have been made available online for research through the university of houston (uh) digital library. six years later, the uh libraries' new digital initiatives call for a more dynamic digital repository infrastructure that is extensible, scalable, and interoperable. the uh libraries' mission and the mandate of its strategic directions drive the pursuit of seamless access and expanded digital collections. to answer the calls for technological change, the uh libraries administration appointed a digital asset management system (dams) implementation task force to explore, evaluate, test, recommend, and implement a more robust digital asset management system. this article focuses on the task force's dams selection activities: needs assessment, systems evaluation, and systems testing. the authors also describe the task force's dams recommendation based on the evaluation and testing data analysis, a comparison of the advantages and disadvantages of each system, and system cost. finally, the authors outline their dams implementation strategy comprised of a phased rollout with the following stages: system installation, data migration, and interface development.

introduction

since the launch of the university of houston digital library (uhdl) in 2009, the uh libraries have made tens of thousands of rare and unique items available online for research using contentdm.
as we began to explore and expand into new digital initiatives, we realized that the uh libraries' digital aspirations require a more dynamic, flexible, scalable, and interoperable digital asset management system that can manage larger amounts of materials in a variety of formats. we plan to implement a new digital repository infrastructure that accommodates creative workflows and allows for the configuration of additional functionalities such as digital exhibits, data mining, cross-linking, geospatial visualization, and multimedia presentation. the new system will be designed with linked data in mind and will allow us to publish our digital collections as linked open data within the larger semantic web environment. the uh libraries strategic directions set forth a mandate for us to “work assiduously to expand our unique and comprehensive collections that support curricula and spotlight research. we will pursue seamless access and expand digital collections to increase national recognition.”1 to fulfill the uh libraries' mission and the mandate of our strategic directions, the uh libraries administration appointed a digital asset management system (dams) implementation task force to explore, evaluate, test, recommend, and implement a more robust digital asset management system that would provide multiple modes of access to the uh libraries' unique collections and accommodate digital object production at a larger scale. the collaborative task force comprises librarians from four departments: metadata and digitization services (mds), web services, digital repository services, and special collections. the core charge of the task force is to:
• perform a needs assessment and build criteria and policies based on evaluation of the current system and requirements for the new dams
• research and explore dams on the market and identify the top three systems for beta testing in a development environment
• generate preliminary recommendations from stakeholders' comments and feedback
• coordinate installation of the new dams and finish data migration
• communicate the task force work to uh libraries colleagues

literature review

libraries have maintained dams for the publication of digitized surrogates of rare and unique materials for over two decades. during that time, information professionals have developed evaluation strategies for testing, comparing, and evaluating library dams software. reviewing these models and associated case studies provided insight into common practices for selecting systems and informed how the uh libraries dams implementation task force conducted its evaluation process.
one of the first publications of its kind, “a checklist for evaluating open source digital library software” by dion hoe-lian goh et al., presents a comprehensive list of criteria for library dams evaluation.2 the researchers developed twelve broad categories for testing (e.g., content management, metadata, and preservation) and generated a scoring system based on the assignment of a weight and a numeric value to each criterion.3 while the checklist was created to assist with the evaluation process, the authors note that an institution’s selection decision should be guided primarily by defining the scope of their digital library, the content being curated using the software, and the uses of the material.4 through their efforts, the authors created a rubric that can be utilized by other organizations when selecting a dams. subsequent research projects have expanded upon the checklist evaluation model. in “choosing software for a digital library,” jody deridder outlines major issues that librarians should address when choosing dams software, including many of the hardware, technological, and metadata concerns that goh et al. identified.5 additionally, she emphasizes the need to account for personnel and service requirements with a variety of activities: usability testing and estimating associated costs; conducting a formal needs assessment to guide the evaluation process; and a tiered-testing approach, which calls upon evaluators to winnow the number of systems.6 by considering stakeholder needs, from users to library administrators, deridder’s contributions inform a more comprehensive dams evaluation process. in addition to creating evaluation criteria, the literature on dams selection has also produced case studies that reflect real-world scenarios and identify use cases that help determine user needs and desires. in “evaluation of digital repository software at the national library of medicine,” jennifer l. marill and edward c. luczak discuss the process that the national library of medicine (nlm) used to compare ten dams, both proprietary and open-source.7 echoing goh et al. and deridder, marill and luczak created broad categories for testing and developed a scoring system for comparing dams.8 additionally, marill and luczak enriched the evaluation process by implementing two testing phases: “initial testing of ten systems” and “in-depth testing of three systems.”9 this method allowed nlm to conduct extensive research on the most promising systems for their needs before selecting a dams to implement. the tiered approach appealed to the task force, and influenced how it conducted the evaluation process, because it balances efficiency and comprehensiveness. in another case study, dora wagner and kent gerber describe the collaborative process of selecting a dams across a consortium. in their article “building a shared digital collection: the experience of the cooperating libraries in consortium,”10 the authors emphasize additional criteria that are important for collaborating institutions: the ability to brand consortial products for local audiences; the flexibility to incorporate differing workflows for local administrators; and the shared responsibility of system maintenance and costs.11 while the uh libraries will not be managing a shared repository dams, the task force appreciated the article’s emphasis on maximizing customizations to improve the user experience.
in “evaluation and usage scenarios of open source digital library and collection management tools,” georgios gkoumas and fotis lazarinis describe how they tested multiple open-source systems against typical library functions—such as acquisitions, cataloging, digital libraries, and digital preservation—to identify typical use cases for libraries.12 some of the use cases formulated by the researchers address digital platforms, including features related to supporting a diverse array of metadata schema and using a simple web interface for the management of digital assets.13 these use cases mirror local feature and functionality requests incorporated into the uh libraries’ evaluation criteria. in “digital libraries: comparison of 10 software,” mathieu andro, emmanuelle asselin, and marc maisonneuve discuss a rubric they developed to compare six open-source platforms (invenio, greenstone, omeka, eprints, ori-oai, and dspace) and four proprietary platforms (mnesys, digitool, yoolib, and contentdm) around six core areas: document management, metadata, engine, interoperability, user management, and web 2.0.14 the authors note that each solution is “of good quality” and that institutions should consider a variety of factors when selecting a dams, including the “type of documents you will want to upload” and the “political criteria (open source or proprietary software)” desired by the institution.15 this article provided the uh libraries with additional factors to include in their evaluation criteria. finally, heather gilbert and tyler mobley’s article “breaking up with contentdm: why and how one institution took the leap to open source” provides a case study for a new trend: selecting a dams for migration from an existing system to a new one.16 the researchers cite several reasons for their need to select a new dams, primarily their current system’s limitations with searching and displaying content in the digital library.17 they evaluated alternatives and selected a suite of open-source tools, including fedora, drupal, and blacklight, which combine to make up their new dams.18 gilbert and mobley also reflect on the migration process and identify several hurdles they had to overcome, such as customizing the open-source tools to meet their localized needs and confronting inconsistent metadata quality.19 gilbert and mobley’s article most closely matches the scenario faced by the uh libraries. our study adds to the limited literature on evaluating and selecting dams for migration in several ways. it demonstrates another model that other institutions can adapt to meet their specific needs. it identifies new factors for other institutions to take into account before or during their own migration process. finally, it adds to the body of evidence for a growing movement of libraries migrating from proprietary to open-source dams.

dams evaluation and analysis methodology

needs assessment

the dams implementation task force fulfilled the first part of its charge by conducting a needs assessment. the goal of the needs assessment was to collect the key requirements of stakeholders, identify future features of the new dams, and gather data in order to craft criteria for evaluation and testing in the next phase of its work.
the task force employed several techniques for information gathering during the needs assessment phase:
• identified stakeholders and held internal focus group interviews to identify system requirement needs and gaps
• reviewed scholarly literature on dams evaluation and migration
• researched peer/aspirational institutions
• reviewed national standards around dams
• determined both the current use of uhdl as well as its projected use
• identified uhdl materials and users
task force members took detailed notes during each focus group interview session. the literature research on dams evaluation helped the task force to find articles with comprehensive dams evaluation criteria. the niso criteria for core types of entities in digital library collections were also listed and applied to the evaluation after reviewing the niso framework of guidance for building good digital collections.20 more than forty peer and aspirational institutions’ digital repositories were benchmarked to identify web site names, platform architecture, documentation, and user and system features. the task force analyzed the rich data gathered from needs assessment activities and built the dams evaluation criteria that prepared the task force for the next phase of evaluation.

evaluation, testing, and recommendation

the task force began its evaluation process by identifying twelve potential dams for consideration that were ultimately narrowed down to three systems for in-depth testing. using data from focus group interviews, literature reviews, and dams best practices, the group generated a list of benchmark criteria. these broad evaluation criteria covered features in categories of system functionality, content management, metadata, user interface, and search support. members of the task force researched dams documentation, product information, and related literature to score each system against the evaluation criteria. table 1 contains the scores of the initial evaluation. from this process, five systems emerged with the highest scores:
● fedora (and, closely associated, fedora/hydra and fedora/islandora)
● collective access
● dspace
● rosetta
● contentdm
the task force eliminated collective access from the final systems for testing because of its limited functionality: it is based around archival content only, and is not widely deployed. the task force decided not to test contentdm because of the system’s known functionalities that we identified through firsthand experience. after the initial elimination process, fedora (including fedora/hydra and fedora/islandora), dspace, and rosetta remained for in-depth testing.

table 1. evaluation scores of twelve dams using broad evaluation criteria (total possible score: 29)
fedora: 27
fedora/hydra: 26
fedora/islandora: 26
collective access: 24
dspace: 24
rosetta: 20
contentdm: 20
trinity (ibase): 19
preservica: 16
luna imaging: 15
roda: 6 (removed from evaluation because the system does not support dublin core metadata)
invenio: 5 (removed from evaluation because the system does not support dublin core metadata)

the task force then created detailed evaluation and testing criteria by drawing from the same sources used previously: focus groups, literature review, and best practices.
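as a hypothetical illustration of the kind of tallying behind the broad evaluation in table 1, the sketch below scores each candidate system against a handful of criteria and ranks the totals to narrow the field. the criteria names are those listed above, but every score here is invented for the example; this is not the task force's data or code.

# hypothetical broad-criteria scoring: each system gets a 0/1 score per criterion;
# the scores below are illustrative only, not the task force's actual evaluation data.
CRITERIA = ["system functionality", "content management", "metadata",
            "user interface", "search support"]

scores = {
    "fedora":            [1, 1, 1, 0, 1],
    "dspace":            [1, 1, 1, 1, 0],
    "collective access": [0, 1, 1, 1, 0],
    "rosetta":           [1, 0, 1, 0, 1],
}

def rank(score_table):
    # tally each system's criterion scores and sort from highest to lowest total
    totals = {name: sum(vals) for name, vals in score_table.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

# keep the top three systems for in-depth testing, mirroring the narrowing step above
for name, total in rank(scores)[:3]:
    print(f"{name}: {total}")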
while the broad evaluation focused on high-level functions, the detailed evaluation and testing criteria for the final three systems closely analyzed the specific features of each dams in eight categories:
● system environment and function
● administrative access
● content ingest and management
● metadata
● content access
● discoverability
● report and inquiry capabilities
● system support
prior to the in-depth testing of the final three systems, the task force researched timelines for system setup. rosetta’s timeline for system setup proved to be prohibitive. consequently, the task force eliminated rosetta from the testing pool and moved forward with fedora and dspace. to conduct the detailed evaluation, the task force scored the specific features under each category utilizing systems testing and documentation. a score range from zero to three (0 = none, 1 = low, 2 = moderate, 3 = high) was assigned for each feature evaluated. after evaluating all features, the score was tallied for each category. our testing revealed that fedora outperformed dspace in over half of the testing sections: content ingest and management, metadata, content access, discoverability, and report and inquiry capabilities. see table 2 for the tallied scores in each testing section.

table 2. scores of top two dams from testing using detailed evaluation criteria (dspace score / fedora score / possible score)
system environment and testing: 21 / 21 / 36
administrative access: 15 / 12 / 18
content ingest and management: 59 / 96 / 123
metadata: 32 / 43 / 51
content access: 14 / 18 / 18
discoverability: 46 / 84 / 114
report and inquiry capabilities: 6 / 15 / 21
system support: 12 / 11 / 12
total score: 205 / 300 / 393

after review of the testing results, the task force conducted a facilitated activity to summarize the advantages and disadvantages of each system. based on this comparison, the dams task force recommended that the uh libraries implement a fedora/hydra repository architecture with the following course of action:
● adapt the uhdl user interface to fedora and re-evaluate it for possible improvements
● develop an administrative content management interface with the hydra framework
● migrate all uhdl content to a fedora repository

table 3. fedora/hydra advantages and disadvantages
advantages: open source; large development community; linked data ready; modular design through api; scalable, sustainable, and extensible; batch import/export of metadata; handles any file format
disadvantages: steep learning curve; long setup time; requires additional tools for discovery; no standard model for multi-file objects

the primary advantages of a dams based on fedora/hydra are: a large and active development community; a scalable and modular system that can grow quickly to accommodate large-scale digitization; and a repository architecture based on linked data technologies. this last advantage, in particular, is unique among all systems evaluated, and will give the uh libraries the ability to publish our collections as linked open data.
fedora 4 conforms to the world wide web consortium (w3c) recommendation for linked data platforms.21 the main disadvantage of a fedora/hydra system is the steep learning curve associated with designing metadata models and developing a customized software suite, which translates to a longer implementation time compared to off-the-shelf products. the uh libraries must allocate an appropriate amount of time and resources for planning, implementation, and staff training. the long-term return on investment for this path will be a highly skilled technical staff with the ability to maintain and customize an open-source, standards-based repository architecture that can be expanded to support other uh libraries content such as geospatial data, research data, and institutional repository materials.

table 4. dspace advantages and disadvantages
advantages: open source; easy installation / ready out of the box; existing familiarity through texas digital library; user group / profile controls; metadata quality module; batch import of objects
disadvantages: flat file and metadata structure; limited reporting capabilities; limited metadata features; does not support linked data; limited api; not scalable / extensible; poor user interface

the main advantages of dspace are ease of installation, familiarity of workflows, and additional functionality not found in contentdm.22 installation and migration to a dspace system would be relatively fast, and staff could quickly transition to new workflows because they are similar to contentdm. dspace also supports authentication and user roles that could be used to limit content to the uh community only. commercial add-on modules, although expensive, could be purchased to provide more sophisticated content management tools than are currently available with contentdm. the disadvantages of a dspace system are the same long-term, systemic problems as with the current contentdm repository. dspace uses a flat metadata structure, has a limited api, does not scale well, and is not customizable to the uh libraries’ needs. consultations with peers indicated that both contentdm and dspace institutions are exploring the more robust capabilities of fedora-based systems. migration of the digital collections in contentdm to a dspace repository would provide few, if any, long-term benefits to the uh libraries. of all the systems considered, implementation of a fedora/hydra repository aligns most clearly with the uh libraries strategic directions of attaining national recognition and improving access to our unique collections. the fedora and hydra communities are very active, with project management overseen by duraspace and hydra respectively.23,24 over the long term, a repository based on fedora/hydra will give the uh libraries a low-cost, scalable, flexible, and interoperable platform for providing online access to our unique collections.
cost considerations

to balance the current digital collections production schedule with the demands of a timely implementation and migration, the task force identified the following investments as cost effective for fedora/hydra and dspace, respectively:

table 5. start-up costs associated with fedora/hydra and dspace
fedora/hydra: a metadata librarian (annual salary), who manages daily metadata unit operations during implementation and streamlines the migration process
dspace: a metadata librarian (annual salary), who manages daily metadata unit operations during implementation and streamlines the migration process, plus @mire modules at $41,500 (content delivery (3): $13,500; metadata quality: $10,000; image conversion suite: $9,000; content & usage analysis: $9,000); these modules require one-time fees to @mire that recur when upgrading to a new version of dspace

the task force determined that an investment in one librarian’s salary is the most cost-effective course of action. the new metadata librarian will manage daily operations of the metadata unit in metadata & digitization services while the metadata services coordinator, in close collaboration with the web projects manager, leads the dams implementation process. in contrast to fedora, migration to dspace would require a substantial investment in third party software modules from @mire to deliver the best possible content management environment and user experience.

implementation strategies

the implementation of the new dams will occur in a phased rollout comprised of the following stages: system installation, data migration, and interface development. mds and web services will perform the majority of the work, in consultation with key stakeholders from special collections and other units. throughout this process, the dams implementation task force will consult with the digital preservation task force* to coordinate the preservation and access systems.

table 6. overview of dams phased implementation
phase one, system installation: set up production and server environment; rewrite uhdl front-end application for fedora/solr; create metadata models; coordinate workflows with digital preservation task force; begin development of administrative hydra head for content management
phase two, data migration: formulate content migration strategy and schedule; migrate test collections and document exceptions; conduct the data migration; create preservation metadata for migrated data; continue development of the hydra administrative interface
phase three, interface development: reevaluate front-end user interface; rewrite uhdl front end as a hydra head or update current front end; establish interdepartmental production workflows; refine administrative hydra head for content management

phase one: system installation

during the first phase of dams implementation, web services and mds will work closely together to install an open-source repository software stack based on fedora, rewrite the current php front-end interface to provide public access to the data in the new system, and create metadata content models for the uhdl based on the portland common data model,25 in consultation with the coordinator of digital projects from special collections and other key stakeholders. the dams task force will consult with the digital preservation task force† to determine how closely the preservation and access systems will be integrated and at what points.
the two groups will also jointly outline a dams migration strategy that aligns with the preservation system. web services and mds will collaborate on research and development of an administrative interface, based on the hydra framework, for day-to-day management of uhdl content.
* an appointed task force to create a digital preservation policy and identify strategies, actions, and tools needed to sustain long-term access to digital assets maintained by uh libraries.
† a working team at uh libraries that enforces the digital preservation policy and maintains the digital preservation system.

phase two: data migration

in the second phase, mds will migrate legacy content from contentdm to the new system and work with web services, special collections, and the architecture and art library to resolve any technical, metadata, or content problems that arise. the second phase will begin with the development of a strategy for completing the work in a timely fashion, followed by migration of representative sample collections to the new system to test and refine its capabilities. after testing is complete, all legacy content will be migrated from contentdm to fedora, and preservation metadata for migrated collections will be created and archived. development work on the hydra administrative interface will also continue. after the data migration is complete, all new collections will be ingested into fedora/hydra, and the current contentdm installation will be retired.

phase three: interface development

in the final phase, web services will reevaluate the current front-end user interface (ui) for the uhdl by conducting user tests to better understand how and why users are visiting the uhdl. web services will also analyze web and system analytics and gather feedback from special collections and other stakeholders. depending on the outcome of this research, web services may create a new ui based on the hydra framework or choose to update the current front-end application with modifications or new features. web services and mds will also continue to develop or adopt tools for the management of uhdl content and work with special collections and the branch libraries to establish production workflows in the new system. continued development work on the front-end and administrative interfaces, for the life of the new digital asset management system, is both expected and desirable as we maintain and improve the uhdl infrastructure and contribute to the open source software community in line with the uh libraries strategic directions.

ongoing: assessment, enhancement, training, and documenting

throughout the transition process mds and web services will undergo extensive training in workshops and conferences to develop the skills necessary for developing and maintaining the new system. they will also establish and document workflows to ensure the long-term viability of the system. regular consultation with special collections, the branch libraries, and other stakeholders will be conducted to ensure that the new system satisfies the requirements of colleagues and patrons. ongoing activities will include:
● assessing service impact of new system
● user testing on ui
● regular system enhancements
● establishing new workflows
● creating and maintaining documentation
● training: conferences, webinars, workshops, etc.
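returning to the phase-two data migration described above, the export-and-ingest work amounts to pulling each record and its metadata out of the legacy system and writing it into the new repository. the sketch below is a generic illustration of that loop, not the uh libraries' migration code: the export function, field names, metadata mapping, and endpoint url are all assumptions made for the example, and the ingest step simply posts each record as turtle to an ldp-style container of the kind fedora 4 exposes.

# a generic, illustrative migration loop: export records from the legacy system,
# map their descriptive metadata to simple dublin core, and post each one to the
# new repository. export_legacy_records(), the field names, and the url are hypothetical.
import requests

FEDORA_ENDPOINT = "http://localhost:8080/rest/uhdl/"  # assumed ldp container url

def export_legacy_records():
    # placeholder for an export from the legacy system; returns dicts of metadata
    return [
        {"identifier": "item-0001", "title": "sample photograph", "creator": "unknown"},
    ]

def to_turtle(record):
    # serialize one record as simple dublin core turtle (an illustrative mapping only)
    return (
        '@prefix dc: <http://purl.org/dc/elements/1.1/> .\n'
        '<> dc:identifier "{identifier}" ;\n'
        '   dc:title "{title}" ;\n'
        '   dc:creator "{creator}" .\n'
    ).format(**record)

for rec in export_legacy_records():
    resp = requests.post(
        FEDORA_ENDPOINT,
        data=to_turtle(rec).encode("utf-8"),
        headers={"Content-Type": "text/turtle"},
    )
    print(rec["identifier"], resp.status_code)

in practice a migration script of this kind would also carry over files, validate the metadata mapping, and record preservation metadata for each migrated object, as the phases above describe.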
conclusion

transitioning from contentdm to a fedora/hydra repository will place the uh libraries in a position to sustainably grow the amount of content in the uh digital library and customize the uhdl interfaces for a better user experience. publishing our data in a linked data platform will give the uh libraries the ability to more easily publish our data for the semantic web. in addition, the fedora/hydra architecture can be adapted to support a wide range of uh libraries projects, including a geospatial data portal, a research data repository, and a self-deposit institutional repository. over the long term, the return on investment for implementing an open-source repository architecture based on industry standard software will be: improved visibility of our unique collections on the web; expanded opportunities for aggregating our collections with high-profile repositories such as the digital public library of america; and increased national recognition for our digital projects and staff expertise.

references

1. “the university of houston libraries strategic directions, 2013–2016,” accessed july 22, 2015, http://info.lib.uh.edu/sites/default/files/docs/strategic-directions/2013-2016-libraries-strategic-directions-final.pdf.
2. dion hoe-lian goh et al., “a checklist for evaluating open source digital library software,” online information review 30, no. 4 (july 13, 2006): 360–79, doi:10.1108/14684520610686283.
3. ibid., 366.
4. ibid., 364.
5. jody l. deridder, “choosing software for a digital library,” library hi tech news 24, no. 9 (2007): 19–21, doi:10.1108/07419050710874223.
6. ibid., 21.
7. jennifer l. marill and edward c. luczak, “evaluation of digital repository software at the national library of medicine,” d-lib magazine 15, no. 5/6 (may 2009), doi:10.1045/may2009-marill.
8. ibid.
9. ibid.
10. dora wagner and kent gerber, “building a shared digital collection: the experience of the cooperating libraries in consortium,” college & undergraduate libraries 18, no. 2–3 (2011): 272–90, doi:10.1080/10691316.2011.577680.
11. ibid., 280–84.
12. georgios gkoumas and fotis lazarinis, “evaluation and usage scenarios of open source digital library and collection management tools,” program: electronic library and information systems 49, no. 3 (2015): 226–41, doi:10.1108/prog-09-2014-0070.
13. ibid., 238–39.
14. mathieu andro, emmanuelle asselin, and marc maisonneuve, “digital libraries: comparison of 10 software,” library collections, acquisitions, & technical services 36, no. 3–4 (2012): 79–83, doi:10.1016/j.lcats.2012.05.002.
15. ibid., 82.
16. heather gilbert and tyler mobley, “breaking up with contentdm: why and how one institution took the leap to open source,” code4lib journal, no. 20 (2013), http://journal.code4lib.org/articles/8327.
17. ibid.
18. ibid.
19. ibid.
20.
niso framework working group with support from the institute of museum and library services, a framework of guidance for building good digital collections (baltimore, md: national information standards organization (niso), 2007).
21. “linked data platform 1.0,” w3c, accessed july 22, 2015, http://www.w3.org/tr/ldp/.
22. “dspace,” accessed july 22, 2015, http://www.dspace.org/.
23. “fedora repository home,” accessed july 22, 2015, https://wiki.duraspace.org/display/ff/fedora+repository+home.
24. “hydra project,” accessed july 22, 2015, http://projecthydra.org/.

bridging the gap: using linked data to improve discoverability and diversity in digital collections
jason boczar, bonita pollock, xiying mi, and amanda yeslibas
information technology and libraries | december 2021
https://doi.org/10.6017/ital.v40i4.13063
jason boczar (jboczar@usf.edu) is digital scholarship and publishing librarian, university of south florida. bonita pollock (pollockb1@usf.edu) is associate director of collections and discovery, university of south florida. xiying mi (xmi@usf.edu) is digital initiative metadata librarian, university of south florida. amanda yeslibas (ayesilbas@usf.edu) is e-resource librarian, university of south florida. © 2021.

abstract

the year of covid-19, 2020, brought unique experiences to everyone in their daily as well as their professional life. facing many challenges of division in all aspects (social distancing, political and social divisions, remote work environments), the university of south florida libraries took the lead in exploring how to overcome these various separations by providing access to its high-quality information sources to its local community and beyond. this paper shares insights from using linked data technology to provide easy access to digital cultural heritage collections, not only for scholarly communities but also for underrepresented user groups. the authors present the challenges of this particular moment in history, discuss possible solutions, and propose future work to further the effort.

introduction

we are living in a time of division. many of us are adjusting to a new reality of working separated from our colleagues and the institutions that formerly brought us together physically and socially due to covid-19. even if we can work in the same physical locale, we are careful and distant with each other. our expressions are covered by masks, and we take pains with hygiene that might formerly have felt offensive. but the largest divisions and challenges being faced in the united states go beyond our physical separation. the nation has been rocked and confronted by racial inequality in the form of black lives matter, a divisive presidential campaign, income inequality exacerbated by covid-19, the continued reckoning with the #metoo movement, and the wildfires burning the west coast.
it feels like we are burning both literally and metaphorically as a country. adding fuel to this fire is the consumption of unreliable information. ironically, even as our divisions become more extreme, we are increasingly more connected and tuned into news via the internet. sadly, fact checking and sources are few and far between on social media platforms, where many are getting their information. the pew foundation report the future of truth and misinformation online warns that we are on the verge of a very serious threat to the democratic process due to the prevalence of false information. lee rainie, director of the pew research center’s internet and technology project, warns, “a key tactic of the new anti-truthers is not so much to get people to believe in false information. it’s to create enough doubt that people will give up trying to find the truth, and distrust the institutions trying to give them the truth.”1 libraries and other cultural institutions have moved very quickly to address and educate their populations and the community at large, trying to give a voice to the oppressed and provide reliable sources of information. the university of south florida (usf) libraries reacted by expanding antiracism holdings. usf’s purchases were informed by work at other institutions, such as the university of minnesota’s antiracism reading lists, which have in turn grown into a rich resource that includes other valuable resources like the mapping prejudice project and a link to the umbra search.2 the triad black lives matter protest collection at the university of north carolina greensboro is another example of a cultural institution reacting swiftly to document, preserve, and educate.3 these new pages and lists being generated by libraries and cultural institutions seem to be curated by hand using tools that require human intervention to make them and keep them up to date. this is also a challenge the usf libraries faced when constructing its new african american experience in florida portal, a resource that leverages already existing digital collections at usf to promote social justice. another key challenge is linking new digital collections and tools to already established collections and holdings. beyond the new content being created in reaction to current movements, there is already a wealth of information established in rich archives of material, especially regarding african american history. digital collections need to be discoverable by a wide audience to achieve resource sharing and educational purposes. this is a challenge many digital collections struggle with, because they are often siloed from library and archival holdings even within their own institutions. all the good information in the world is not useful if it is not findable. an example of a powerful discovery tool that is difficult to find and use is the umbra search (https://www.umbrasearch.org/), linked to the university of minnesota’s anti-racism reading list. umbra search is a tool that aggregates content from more than 1,000 libraries, archives, and museums.4 it is also supported by high-profile grants from the institute of museum and library services, the doris duke charitable foundation, and the council on library and information resources. however, the website is difficult to find in a web search. umbra search was named after society of umbra, a collective of black poets from the 1960s.
the terms umbra and society of umbra do not return useful results for finding the portal, nor do broader searches of african american history; the portal is difficult to find through basic web searches. one of the few chances for a user to find the site is if they came upon the human-made link in the university of minnesota anti-racism reading list. despite enthusiasm from libraries and other cultural institutions, new purchases and curated content are not going to reach the world as fully as hoped. until libraries adopt open data formats instead of locking away content in closed records like marc, library and digital content will remain siloed from the internet. the library catalog and digital platforms are even siloed from each other. we make records and enter metadata that is fit for library use but not shareable to the web. as karen coyle asked in her lita keynote address a decade ago, the question is how can libraries move from being “on the web” to being “of the web”?5 the suggested answer, and the answer the usf libraries are researching, is linked data.

literature review

the literature on linked data for libraries and cultural heritage resources reflects an implementation that is “gradual and uneven.” as national libraries across the world and the library of congress develop standards and practices, academic libraries are still trying to understand their role in implementation and identify their expertise.6 in 2006 tim berners-lee, the creator of the semantic web concept, outlined four rules of linked data:
1. use uris as names for things.
2. use http uris so that people can look up those names.
3. when someone looks up a uri, provide useful information, using the standards (rdf, sparql).
4. include links to other uris so that they can discover more things.7
it was not too long after this that large national libraries began exploring linked data and experimenting with uses. in 2010 the british library presented its prototype of linked data. this move was made in accordance with the uk government’s commitment to transparency and accountability, along with the user’s expectation that the library would keep up with cutting-edge trends.8 today the british library has released the british national bibliography as linked data instead of the entire catalog, because it is authoritative and better maintained than the catalog.9 the national libraries of europe, spurred on by government edicts and europeana (https://www.europeana.eu/en), are leading the progress in implementation of linked data. national libraries are uniquely suited to the development and promotion of new technologies because of their place under the government and proximity to policy making, bridging communication between interested parties and the ability to make projects into sustainable services.10 a 2018 survey of all european national libraries found that 15 had implemented linked data, two had taken steps for implementation, and three intended to implement it. even national libraries that were unable to implement linked data were contributing to the linked data open cloud by providing their data in datasets to the world.11 part of the difficulty with earlier implementation of linked data by libraries and cultural heritage institutions was the lack of a “killer example” that libraries could emulate.12 the relatively recent success of european national libraries might provide those examples.
many other factors have slowed the implementation of linked data. a survey of norwegian libraries in 2009 found a considerable gap in the semantic web literature between the research undertaken in the technological field and the research in the socio-technical field. implementing linked data requires reorganization of the staff, commitment of resources, education throughout the library, and buy-in from the leadership to make it strategically important.13 the survey of european national libraries cited the exact same factors as limitations in 2018.14 outside of european national libraries the implementation of linked data has been much slower. many academic institutions have taken on projects that tend to languish in a prototype or proof-of-concept phase.15 the library-centric talis group of the united kingdom “embraced a vision of developing an infrastructure based on semantic web technologies” in 2006, but abandoned semantic web-related business activities in 2012.16 it has been suggested that it is premature to wholly commit to linked data, but that it should be used for spin-off projects in an organization for experimentation and skill development.17 linked data is also still proving to be technologically challenging for implementation by cultural heritage aggregators. if many human resources are needed to facilitate linked data, it will remain an obstacle for cultural heritage aggregators. a study has shown that automated interpretation of ontologies is hampered by a lack of inter-ontology relations; cross-domain applications will not be able to use these ontologies without human intervention.18 aiding in the development and awareness of linked data practices for libraries is the creation and implementation of bibframe by the library of congress. the library of congress’s announcement in july 2018 that bibframe would be the replacement for marc definitively shows that the future of library records is focused on linking out and integrating into the web.19 the new rda (resource description and access) cataloging standards made it clear that marc is no longer the best encoding language for making library resources available on the web.20 while rda has adapted the cataloging rules to meet a variety of new library environments, the marc encoding language makes it difficult for computers to interpret and apply logic algorithms to the marc format. in response, the library of congress commissioned the consulting agency zepheira to create a framework that would integrate with the web and be flexible enough to work with various open formats and technologies, as well as be able to adapt to change. using the principles and technologies of the open web, the bibframe vocabulary is made of “resource description framework (rdf) properties, classes, and relationships between and among them.”21 eric miller, the ceo of zepheira, says bibframe “works as a bridge between the description component and open web discovery. it is agnostic with regards to which web discovery tool is employed” and though we cannot predict every technology and application, bibframe can “rely on the ubiquity and understanding of uris and the simple descriptive power of rdf.”22 the implementation of linked data in the cultural heritage sphere has been erratic but seems to be moving forward. 
it is important to pursue, though, because bringing local data out of the “deep web” and making it open and universally accessible “means offering minority cultures a democratic opportunity for visibility.”23 linked data linked data is one way to increase the access and discoverability of critical digital cultural heritage collections. also referred to as semantic web technologies, linked data follows the w3c resource description framework (rdf) standards.24 according to tim berners-lee, the semantic web will bring structure and well-defined meaning to web content, allowing computers to perform more automated processes.25 by providing structure and meaning to digital content, information can be more readily and easily shared between institutions. this provides an opportunity for digital cultural heritage collections of underrepresented populations to get more exposure on the web. following is a brief overview of linked data to illustrate how semantic web technologies function. linked data is created by forming semantic triples. each rdf triple contains uniform resource identifiers, or uris. these identifiers allow computers (machines) to “understand” and interpret the metadata. each rdf triple consists of three parts: a subject, a predicate, and an object. the subject defines what the metadata rdf triple is about, while the object contains information about the subject, which is further defined by the relationship link in the predicate. figure 1. example of a linked data rdf triple describing william shakespeare’s authorship of hamlet. for example, in figure 1, “william shakespeare wrote hamlet” is a triple. the subject and predicate of the triple are written as a uri containing the identifier information and the object of the triple is a literal piece of information. the subject of the triple, william shakespeare, has an identifier which in this example links to the library of congress name authority file for william shakespeare. the predicate of the rdf triple describes the relationship between the subject and object. the predicate also typically defines the metadata schema being used. in this example, dublin core is the metadata schema being used, so “wrote” would be identified by the dublin core creator field. the object of this semantic triple, hamlet, is a literal. literals are text that are not linked because they do not have a uri. subjects and predicates always have uris to allow the computer to make links. the object may have a uri or be a literal. together these uris, along with the literal, tell the computer everything it needs to know about this piece of metadata, making it self-contained. rdf triples with their uris are stored in a triple-store graph-style database, which functions differently from a typical relational database. relational databases rely on table headers to define the metadata stored inside. moving data between relational databases can be complex because tables must be redefined every time data is moved. graph databases don’t need tables since all the defining information is already stored in each triple. this allows for bidirectional flow of information between pieces of metadata and makes transferring data simpler and more efficient.26 information in a triple-store database is then retrieved using sparql, a query language developed for linked data. 
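to make the shakespeare example concrete, the following is a minimal sketch in python using the rdflib library. the library of congress name authority identifier shown and the choice of the dublin core terms creator property are illustrative assumptions that follow the description of figure 1, not a prescription of how any particular institution models its data.

```python
from rdflib import Graph, URIRef, Literal
from rdflib.namespace import DCTERMS

g = Graph()

# subject: a library of congress name authority uri for william shakespeare
# (identifier shown for illustration; confirm the exact record on id.loc.gov)
shakespeare = URIRef("http://id.loc.gov/authorities/names/n78095332")

# predicate: the article maps "wrote" to the dublin core creator property;
# object: the literal "hamlet", which carries no uri of its own
g.add((shakespeare, DCTERMS.creator, Literal("hamlet")))

# the same in-memory graph can already be queried with sparql,
# previewing how retrieval works against a full triple store
for row in g.query(
    "SELECT ?who ?what WHERE { ?who <http://purl.org/dc/terms/creator> ?what }"
):
    print(row.who, row.what)
```

the small query at the end is the same sparql that would later run against a dedicated triple store; only the storage layer changes.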
because linked data is stored as self-contained triples, machines have all the information needed to process the data and perform advanced reasoning and logic programming. this leads to better search functionality and lends itself well to artificial intelligence (ai) technologies. many of today’s modern websites make use of these technologies to enhance their displays and provide greater functionality for their users. the internet is an excellent avenue for libraries to un-silo their collections and make them globally accessible. once library collections are on the web, advanced keyword search functionalities and artificial intelligence machine learning algorithms can be developed to automate metadata creation workflows and enhance search and information technology and libraries december 2021 bridging the gap | boczar, pollock, mi, and yeslibas 6 retrieval of library resources. the use of linked data metadata in these machine-learning functions will add a layer of semantic understanding to the data being processed and analyzed for patron discovery. ai technology can also be used to create advanced graphical displays making connections for patrons between various resources on a research topic. sharing digital cultural heritage data with other institutions often involves transferring data and is considered one of the greatest difficulties in sharing digital collections. for example, if one institutional repository uses dublin core to store its metadata for a certain cultural heritage collection and another repository uses mods/mets to store digital collections, there must first be a data conversion before the two repositories could share information. dublin core and mods/mets are two completely different schemas with different fields and metadata standards. these two schemas are incompatible with each other and must be crosswalked into a common schema. this typically results in some data loss during the transformation process. this makes combining two collections from different institutions into one shared web portal difficult. linked data allows institutions to share collections more easily. because linked data triples are self-contained, there is no need to crosswalk metadata stored in triples from one schema into another when transferring data. the uris contained in the rdf triples allow the computer to identify the metadata schema and process the metadata. rdf triples can be harvested from one linked data system and easily placed into another repository or web portal. a variety of schemas can all be stored together in one graph database. storing metadata in this way increases the interoperability of digital cultural heritage collections. collections stored in triple-store databases have sparql endpoints that make harvesting the metadata in a collection more efficient. libraries can easily share metadata on important collections increasing the exposure and providing greater access for a wider audience. philip schreur, author of “bridging the worlds of marc and linked data,” sums this concept up nicely: “the shift to the web has become an inextricable part of our day-to-day lives. 
by moving our carefully curated metadata to the web, libraries can offer a much-needed source of stable and reliable data to the rapidly growing world of web discovery.”27 linked data also makes it easier to harvest metadata and import collections into larger cultural heritage repositories like the digital public library of america (dpla), which uses linked data to “empower people to learn, grow, and contribute to a diverse and better-functioning society by maximizing access to our shared history, culture, and knowledge.”28 europeana, the european cultural heritage database, uses semantic web technologies to support its mission, which is to “empower the cultural heritage sector in its digital transformation.”29 using linked data to transfer data into these national repositories is more efficient and there is less loss of data because the triples do not have to be transformed into another schema. this increases access to many cultural heritage collections that might not otherwise be seen. one of the big advantages to linked data is the ability to create connections between other cultural heritage collections worldwide via the web. incorporating triples harvested from other collections into the local datasets enables libraries to display a vast amount of information about cultural heritage collections in their web portals. libraries thus can provide a much richer display and allow users access to a greater variety of resources. linked data also allows web developers to use uris to implement advanced search technologies, creating a multifaceted search environment for patrons. current research points to the fact that using semantic web technologies makes the creation of advanced logic and reasoning functionalities possible. according to liyang yu in the book introduction to the semantic web and semantic web services, “the semantic web is an extension of the current web. it is constructed by linking current web pages to a structured data set that indicates the semantics of this linked page. a smart agent, which is able to understand this structure data set, will then be able to conduct intelligent actions and make educated decisions on a global scale.”30 many digital cultural heritage collections in libraries live in siloed resources and are therefore only accessible to a small population of users. linked data helps to break down traditional library silos in these collections. by using linked data, an institution can expand the interoperability of the collection and make it more easily accessible. many institutions are starting to incorporate linked data technologies into digital collections, thereby increasing the ability for institutions to share collections. this allows for a greater audience to have access to critical cultural heritage collections for underrepresented populations. in the article “bridging the worlds of marc and linked data,” the author states, “the shift to linked data within this closed world of library resources will bring tremendous advantages to discovery both within a single resource … as well as across all the resources in your collections, and even across all of our collective collections. but there are other advantages to moving to linked data. 
through the use of linked data, we can connect to other trusted sources on the web.… we can also take advantage of a truly international web environment and reuse metadata created by other national libraries.”31 university of south florida libraries practice university of south florida libraries digital collections house a rich collection varying from cultural heritage objects to natural science and environment history materials to collections related to underrepresented populations. most of the collections are unique to usf and have significant research and educational value. the library is eager to share the collections as widely as possible and hopes the collections can be used at both document and data level. linked data creates a “web of data” instead of a “web of documents,” which is the key to bringing structure and meaning to web content, allowing computers to better understand the data. however, collections are mostly born at the document level. therefore, the first problem librarians need to solve is how to transform the documents to data. for example, there is a beautiful natural history collection called audubon florida tavernier research papers in usf libraries digital collections. the audubon florida tavernier research papers is an image collection which includes rookeries, birds, people, bodies of water, and man-made structures. the varied images come from decades of research and are a testament to the interconnectedness of bird population health and human interaction with the environment. the images reflect the focus of audubon’s work in research and conservation efforts both to wildlife and to the habitat that supports the wildlife.32 this was selected to be the first collection the authors experimented with to implement linked data at usf libraries. the lessons learned from working with this collection are applied to later work. when the collection was received to be housed in the digital platform, it was carefully analyzed to determine how to pull the data out of all the documents as much as possible. the authors designed a metadata schema of the combination of mods and darwin core (darwin core, abbreviated to dwc, is an extension of dublin core for biodiversity informatics) to pull out and properly store the data. information technology and libraries december 2021 bridging the gap | boczar, pollock, mi, and yeslibas 8 figure 2. american kestrel. figure 3. american kestrel metadata. information technology and libraries december 2021 bridging the gap | boczar, pollock, mi, and yeslibas 9 figure 2 is one of the documents in the collection, which is a photo of an american kestrel. figure 3 shows the data collected from the document and the placement of the data in the metadata schema. the authors put the description of the image in free text in the abstract field. this field is indexed and searchable through the digital collections platform. location information is put in the hierarchical spatial field. the subject heading fields describe the “aboutness” of the image, that is, what is in the image. all the detailed information about the bird is placed in darwin core fields. thus, the document is dissembled into a few pieces of data which are properly placed into metadata fields where they can be indexed and searched. having data alone is not sufficient to meet linked data requirements. 
the first of the four rules of linked data is to name things using uris.33 to add uris to the data, the authors needed to standardize the data and reconcile it against widely-used authorities such as library of congress subject headings, wikidata, and the getty thesaurus of geographic names. standardized data tremendously increases the percentage of data reconciliation, which will lead to more links with related data once published. figure 4. amenaprkitch khachkar. figure 4 shows an example from the armenia heritage and social memory program. this is a visual materials collection with photos and 3d digital models. it was created by the digital heritage and humanities collection team at the library. the collection brings together comprehensive information and interactive 3d visualization of the fundamentals of armenian identity, such as their architectures, languages, arts, etc.34 when preparing the metadata for the items in this collection, the authors devoted extra effort to adding geographic location metadata. this effort serves two purposes: one is to respectfully and honestly include the information in the collection; and the second is to provide future reference to the location of each item as the physical items are in danger and could disappear or be ruined. the authors employed the getty thesaurus of geographic names because it supports a hierarchical location structure. the location names at each level can be reconciled and have their own uris. the authors also paid extra attention on the subject headings. figure 5 shows how the authors used library of congress subject headings, local subject headings assigned by the researchers, and the getty art and architecture thesaurus for this collection. in the data reconciliation stage, the metadata can be compared against both library of congress subject headings authority files and the getty aat vocabularies so that as many uris as possible can be fetched and added to the metadata. the focus information technology and libraries december 2021 bridging the gap | boczar, pollock, mi, and yeslibas 10 on geographic names and subject headings is to standardize the data and use controlled vocabularies as much as possible. once moving to the linked data world, the data will be ready to be added with uris. therefore, the data can be linked easily and widely. figure 5. amenaprkitch khachkar metadata. information technology and libraries december 2021 bridging the gap | boczar, pollock, mi, and yeslibas 11 one of the goals of building linked data is to make sense out of data and to generate new knowledge. as the librarians explored how to bring together multiple usf digital collections to highlight african american history and culture, three collections seemed particularly appropriate: • an african american sheet music collection from the early 20th century (https://digital.lib.usf.edu/sheet-music-aa) • the “narratives of formerly enslaved floridians” collection from 1930s (https://digital.lib.usf.edu/fl-slavenarratives) • the “otis r. anthony african american oral history collection” from 19781979(https://digital.lib.usf.edu/ohp-otisanthony) these collections are all oral expressions of african american life in the us. they span the first three-quarters of the 20th century around the time of the civil rights movement. creating linked data out of these collections will help shed light on the life of african americans through the 20th century and how it related to the civil rights movement. 
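as a complement to openrefine's reconciliation services, a term can also be checked programmatically against one of the authorities named above. the sketch below queries wikidata's public wbsearchentities api; the function name, the sample term, and the decision about which candidate counts as an exact match are illustrative assumptions, not the library's actual workflow.

```python
import requests

def wikidata_candidates(term, limit=5):
    """return (label, uri) candidates for a metadata term from wikidata.

    uses the public wbsearchentities api; a cataloger (or a later script)
    still has to decide which candidate, if any, is an exact match before
    the uri is added to the record.
    """
    resp = requests.get(
        "https://www.wikidata.org/w/api.php",
        params={
            "action": "wbsearchentities",
            "search": term,
            "language": "en",
            "format": "json",
            "limit": limit,
        },
        timeout=30,
    )
    resp.raise_for_status()
    return [
        (hit.get("label", ""), hit.get("concepturi", ""))
        for hit in resp.json().get("search", [])
    ]

# hypothetical term drawn from image metadata like the audubon collection
for label, uri in wikidata_candidates("american kestrel"):
    print(label, uri)
```

restricting acceptance to exact label matches, as the pilot described below does, keeps human review to a minimum at the cost of leaving some terms without uris.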
with semantic web technology support, these collections can be turned into machine-actionable datasets to assist research and education activities on racism and anti-racism and to fit into a holistic knowledge base. usf libraries started to partner with dpla in 2018. dpla leverages linked data technology to increase discoverability of the collections contributed to it. dpla employs javascript object notation for linked data (json-ld) as the serialization for its data, which is in rdf/xml format. json-ld has a method of identifying data with iris (internationalized resource identifiers). the use of this method can effectively avoid data ambiguity, considering that dpla holds a fairly large amount of data. json-ld also provides computational analysis in support of semantic services, which enriches the metadata and, as a result, makes search more effective.35 in the 18 months since usf began contributing selected digital collections to dpla, usf materials have received more than 12,000 views. it is exciting to see the increase in the usage of the collections and it is the hope that they will be accessed by more diverse user groups. usf libraries are exploring ways to scale up the project and eventually transition all the existing digital collections metadata to linked data. one possible way of achieving this goal would be through metadata standardization. a pilot project at usf libraries is to process one medium-size image collection of 998 items. the original metadata is in mods/mets xml files. we first decided to use the dpla metadata application profile as the data model. if the pilot project is successful, we will apply this model to all of our linked data transformation processes. in our pilot, we are examining the fields in our mods/mets metadata and identifying those that will be meaningful in the new metadata schema. then we export the metadata in those fields to excel files. the next step is to use openrefine to reconcile the data in these excel files to fetch uris for exact-match terms. during this step, we are employing reconciliation services from the library of congress, getty tgn, and wikidata. after all the metadata is reconciled, we are transforming the excel file to triples. the column headers of the excel file become the predicates, and the metadata as well as their uris will be the objects of the triples. next, these triples will be stored in an apache jena triple-store database so that we can start designing sparql queries to facilitate search. the final step will be designing a user-friendly interface to further optimize the user experience. in this process, to make the workflow as scalable as possible, we are focusing on testing two processes: first, creating a universal metadata application profile to apply to most, if not all, of the collections; and second, only fetching uris for exactly matching terms during the reconciliation process. both of these processes aim to reduce human interaction with the metadata so that the process is more affordable to the library. conclusion and future work linked data can help collection discoverability. in the past six months, usf has seen an increase in materials going online. usf special collections department rapidly created digital exhibits to showcase their materials. 
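a minimal sketch of the spreadsheet-to-triples step in the pilot workflow, written in python with rdflib. the file name, column names, and the use of dublin core terms properties are assumptions made for illustration; the actual pilot follows the dpla metadata application profile, and the resulting turtle file would still need to be loaded into apache jena (for example through its bulk loader or a fuseki endpoint).

```python
import csv
from rdflib import Graph, URIRef, Literal
from rdflib.namespace import DCTERMS

g = Graph()

# hypothetical spreadsheet exported from openrefine after reconciliation;
# columns item_uri, title, subject_label, subject_uri are illustrative names
with open("reconciled_collection.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        item = URIRef(row["item_uri"])
        g.add((item, DCTERMS.title, Literal(row["title"])))
        # keep the human-readable label and, where an exact match was found,
        # also record the authority uri (lcsh, getty tgn, or wikidata)
        g.add((item, DCTERMS.subject, Literal(row["subject_label"])))
        if row.get("subject_uri"):
            g.add((item, DCTERMS.subject, URIRef(row["subject_uri"])))

# serialize to turtle, which can then be bulk-loaded into an apache jena store
g.serialize(destination="collection.ttl", format="turtle")
```

keeping both the literal label and the reconciled uri mirrors the earlier point that objects may be literals or uris; downstream sparql queries can use whichever is available.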
if the trend in remote work continues, there is reason to believe that digital materials may be increasingly present and, given enough time and expertise, libraries can leverage linked data to better support current and new collections. the societal impact of covid-19 worldwide sheds light on the importance of technologies such as linked data that can help increase discoverability. when items are being created and shared online, either directly related to covid-19 or a result of its impact, linked data can help connect those resources. for instance, new covid-19 research is being developed and published daily. the publications office of the european union datathon entry “covid-19 data as linked data” states that “[t]he benefit of having covid-19 data as linked data comes from the ability to link and explore independent sources. for example, covid-19 sources often do not include other regional or mobility data. then, even the simplest thing, having the countries not as a label but as their uri of wikidata and dbpedia, brings rich possibilities for analysis by exploring and correlating geographic, demographic, relief, and mobility data.”36 the more institutions that contribute to this, the greater the discoverability and impact of the data. in 2020 there has been an increase in black lives matter awareness across the country. this affects higher education. usf libraries are not the only ones engaged in addressing racial disparities. many institutions have been doing this for years. others are beginning to focus on this area. no matter whether it’s a new digital collection or one that’s been around for decades, the question remains: how do people find these resources? perhaps linked data technologies can help solve that problem. linked data is a technology that can help accentuate the human effort put forth to create those collections. linked data is a way to assist humans and computers in finding interconnected materials around the internet. usf libraries faced many obstacles implementing linked data. there is a technological barrier that takes well-trained staff to surmount, i.e., creating a linked data triple store database and having linked data interact correctly on webpages. there is a time commitment necessary to create the triples and sparql queries. sparql queries themselves vary from being relatively simple to incredibly complicated. the authors also had the stumbling block of understanding how linked data worked together on a theoretical level. taking all of these considerations into account, we can say that creating linked data for a digital collection is not for the faint of heart. a cost/benefit analysis must be taken and assessed. the authors of this paper must continue to determine the need for linked data. at usf, the authors have taken the first steps in converting digital collections into linked data. we’ve moved from understanding the theoretical basis of linked data and into the practical side where the elements that make up linked data start coming together. the work to create triples, sparql queries, and uris has begun, and full implementation has started. our linked data group has learned the fundamentals of linked data. the next, and current, step is to develop workflows for existing metadata conversion into appropriate linked data. the group meets regularly and has created a triple store database and converted data into linked data. 
while the process is slow moving due to group members’ other commitments, progress is being made by looking at the most relevant collections we would like to transform and moving forward from there. we’ve located the collections we want to work on, taking an iterative approach to creating linked data as we go. with linked data, there is a lot to consider. how do you start up a linked data program at your institution? how will you get the required expertise to create appropriate and high-quality linked data? how will your institution crosswalk existing data into triples format? is it worth the investment? it may be difficult to answer these questions, but they’re questions that must be addressed. the usf libraries will continue pursuing linked data in meaningful ways and showcasing linked data’s importance. linked data can help highlight all collections but more importantly those of marginalized groups, which is a priority of the linked data group. endnotes 1 peter perl, “what is the future of truth?” pew trust magazine, february 4, 2019, https://www.pewtrusts.org/en/trust/archive/winter-2019/what-is-the-future-of-truth. 2 “anti-racism reading lists,” university of minnesota library, accessed september 24, 2020, https://libguides.umn.edu/antiracismreadinglists. 3 “triad black lives matter protest collection,” unc greensboro digital collections, accessed december 9, 2020, http://libcdm1.uncg.edu/cdm/blm. 4 “umbra search african american history,” umbra search, accessed december 10, 2020, https://www.umbrasearch.org/. 5 karen coyle, “on the web, of the web” (keynote at lita, october 1, 2011), https://kcoyle.net/presentations/lita2011.html. 6 donna ellen frederick, “disruption or revolution? the reinvention of cataloguing (data deluge column),” library hi tech news 34, no. 7 (2017): 6–11, https://doi.org/10.1108/lhtn-07-2017-0051. 7 tim berners-lee, “linked data,” w3, last updated june 18, 2009, https://www.w3.org/designissues/linkeddata.html. 8 neil wilson, “linked data prototyping at the british library” (paper presentation, talis linked data and libraries event, 2010). 9 diane rasmussen pennington and laura cagnazzo, “connecting the silos: implementations and perceptions of linked data across european libraries,” journal of documentation 75, no. 3 (2019): 643–66, https://doi.org/10.1108/jd-07-2018-0117. 10 jane hagerlid, “the role of the national library as a catalyst for an open access agenda: the experience in sweden,” interlending and document supply 39, no. 2 (2011): 115–18, https://doi.org/10.1108/02641611111138923. 11 pennington and cagnazzo, “connecting the silos,” 643–66. 12 gillian byrne and lisa goddard, “the strongest link: libraries and linked data,” d-lib magazine 16, no. 11/12 (2010): 2, https://doi.org/10.1045/november2010-byrne. 13 bendik bygstad, gheorghita ghinea, and geir-tore klæboe, “organisational challenges of the semantic web in digital libraries: a norwegian case study,” online information review 33, no. 5 (2009): 973–85, https://doi.org/10.1108/14684520911001945. 14 pennington and cagnazzo, “connecting the silos,” 643–66. 15 heather lea moulaison and anthony j. million, “the disruptive qualities of linked data in the library environment: analysis and recommendations,” cataloging & classification quarterly 52, no. 
4 (2014): 367–87, https://doi.org/10.1080/01639374.2014.880981. 16 marshall breeding, “linked data: the next big wave or another tech fad?” computers in libraries 33, no. 3 (2013): 20–22. 17 moulaison and million, “the disruptive qualities of linked data,” 369. 18 nuno freire and sjors de valk, “automated interpretability of linked data ontologies: an evaluation within the cultural heritage domain,” (workshop, ieee conference on big data, 2019). 19 “bibframe update forum at the ala annual conference 2018,” (washington, dc: library of congress, july 2018), https://www.loc.gov/bibframe/news/bibframe-update-an2018.html. 20 jacquie samples and ian bigelow, “marc to bibframe: converting the pcc to linked data,” cataloging & classification quarterly 58, no. 3–4 (2020): 404. 21 oliver pesch, “using bibframe and library linked data to solve real problems: an interview with eric miller of zepheira,” the serials librarian 71, no. 1 (2016): 2. 22 pesch, 2. 23 gianfranco crupi, “beyond the pillars of hercules: linked data and cultural heritage,” italian journal of library, archives & information science 4, no. 1 (2013): 25–49, http://dx.doi.org/10.4403/jlis.it-8587. 24 “resource description framework (rdf),” w3c, february 25, 2014, https://www.w3.org/rdf/. 25 tim berners-lee, james hendler, and ora lassila, “the semantic web,” scientific american 284, no. 5 (2001): 34–43, https://www.jstor.org/stable/26059207. 26 dean allemang and james hendler, “semantic web application architecture,” in semantic web for the working ontologist: effective modeling in rdfs and owl, (saint louis: elsevier science, 2011): 54–55. information technology and libraries december 2021 bridging the gap | boczar, pollock, mi, and yeslibas 15 27 philip e. schreur and amy j. carlson, “bridging the worlds of marc and linked data: transition, transformation, accountability,” serials librarian 78, no. 1–4 (2020), https://doi.org/10.1080/0361526x.2020.1716584. 28 “about us,” dpla: digital public library of america, accessed december 11, 2020. https://dp.la/about. 29 “about us,” europeana, accessed december 11, 2020, https://www.europeana.eu/en/about-us. 30 liyang yu, “search engines in both traditional and semantic web environments,” in introduction to semantic web and semantic web services (boca raton: chapman & hall/crc, 2007): 36. 31 schreur and carlson, “bridging the worlds of marc and linked data.” 32 “audubon florida tavernier research papers,” university of south florida libraries digital collections, accessed november 30, 2020, https://lib.usf.edu/?a64/. 33 berners-lee, “linked data,” https://www.w3.org/designissues/linkeddata.html. 34 “the armenian heritage and social memory program,” university of south florida libraries digital collections, accessed november 30, 2020, https://digital.lib.usf.edu/armenianheritage/. 35 erik t. mitchell, “three case studies in linked open data,” library technology reports 49, no. 5 (2013): 26-43. 36 “covid-19 data as linked data,” publications office of the european union, accessed december 11, 2020, https://op.europa.eu/en/web/eudatathon/covid-19-linked-data. usability test results for encore in an academic library megan johnson information technology and libraries | september 2013 59 abstract this case study gives the results a usability study for the discovery tool encore synergy, an innovative interfaces product, launched at appalachian state university belk library & information commons in january 2013. 
nine of the thirteen participants in the study rated the discovery tool as more user friendly, according to a sus (standard usability scale) score, than the library’s tabbed search layout, which separated the articles and catalog search. all of the study’s participants were in favor of switching the interface to the new “one box” search. several glitches in the implementation were noted and reported to the vendor. the study results have helped develop belk library training materials and curricula. the study will also serve as a benchmark for further usability testing of encore and appalachian state library’s website. this article will be of interest to libraries using encore discovery service, investigating discovery tools, or performing usability studies of other discovery services. introduction appalachian state university’s belk library & information commons is constantly striving to make access to libraries resources seamless and simple for patrons to use. the library’s technology services team has conducted usability studies since 2004 to inform decision making for iterative improvements. the most recent versions (since 2008) of the library’s website have featured a tabbed layout for the main search box. this tabbed layout has gone through several iterations and a move to a new content management system (drupal). during fall semester 2012, the library website’s tabs were: books & media, articles, google scholar, and site search (see figure 1). some issues with this layout, documented in earlier usability studies and through anecdotal experience, will be familiar to other libraries who have tested a tabbed website interface. user access issues include the belief of many patrons that the “articles” tab looked for all articles the library had access to. in reality the “articles” tab searched seven ebsco databases. belk library has access to over 400 databases. another problem noted with the tabbed layout was that patrons often started typing in the articles box, even when they knew they were looking for a book or dvd. this is understandable, since when most of us see a search box we just start typing, we do not read all the information on the page. megan johnson (johnsnm@appstate.edu) is e-learning and outreach librarian, belk library and information commons, appalachian state university, boone, nc. mailto:johnsnm@appstate.edu usability test results for encore in an academic library | johnson 60 figure 1. appalachian state university belk library website tabbed layout search, december 2012. a third documented user issue is confusion over finding an article citation. this is a rather complex problem, since it has been demonstrated through assessment of student learning that many students cannot identify the parts of a citation, so this usability issue goes beyond the patron being able navigate the library’s interface, it is partly a lack of information literacy skills. however, even sophisticated users can have difficulty in determining if the library owns a particular journal article. this is an ongoing interface problem for belk library and many other academic libraries. google scholar (gs) often works well for users with a journal citation, since on campus they can often simply copy and paste a citation to see if the library has access, and, if so, the full text it is often is available in a click or two. however, if there are no results found using gs, the patrons are still not certain if the library owns the item. 
background in 2010, the library formed a task force to research the emerging market of discovery services. the task force examined summon, ebsco discovery service, primo, and encore synergy and found the products, at that time, to still be immature and lacking value. in april 2012, the library reexamined the discovery market and conducted a small benchmarking usability study (the results are discussed in the methodology section and summarized in appendix a). the library felt enough improvements had been made to innovative interfaces’ encore synergy product to justify purchasing this discovery service. an encore synergy implementation working group was formed, and several subcommittees were created, including end-user preferences, setup & access, training, and marketing. to help inform the decisions of these subcommittees, the author conducted a usability study in december 2012, which was based on, and expanded upon, the april 2012 study. the goal of this study was to test users’ experience and satisfaction with the current tabbed layout, in contrast to the “one box” encore interface. the library had committed to implementing encore synergy, but there are options in the layout of the search box on the library’s homepage. if users expressed a strong preference for tabs, the library could choose to leave a tabbed layout for access to the articles part of encore, for the catalog part, and create tabs for other options like google scholar and a search of the library’s website. a second goal of the study was to benchmark the user experience for the implementation of encore synergy so that, over time, improvements could be made to promote seamless access to appalachian state university library’s resources. a third goal of this study was to document problems users encountered and report them to innovative. figure 2. appalachian state university belk library website encore search, january 2013. literature review there have been several recent reviews of the literature on library discovery services. thomsett-scott and reese conclude that discovery tools are a mixed blessing.1 users can easily search across broad areas of library resources and limiting by facets is helpful. downsides include loss of individual database specificity and user willingness to look beyond the first page of results. longstanding library interface problems, such as patrons’ lack of understanding of holdings statements and knowing when it is appropriate to search in a discipline-specific database, are not solved by discovery tools.2 in a recent overview of discovery services, hunter lists four vendors whose products have both a discovery layer and a central index: ebsco’s discovery service (eds); ex libris’ primo central index; serials solutions’ summon; and oclc’s worldcat local (wcl).3 encore does not currently offer a central index or pre-harvested metadata for articles, so although encore has some of the features of a discovery service, such as facets and connections to full text, it is important for libraries considering implementing encore to understand that the part of encore that searches for articles is a federated search. when appalachian purchased encore, not all the librarians and staff involved in the decision making were fully aware of how this would affect the user experience. further discussion of this is in the “glitches revealed” section. fagan et al. 
discuss james madison university’s implementation of ebsco discovery service and their customizations of the tool. they review the literature of discovery tools in several areas, including articles that discuss the selection processes, features, and academic libraries’ decisions process following selection. they conclude, the “literature illustrates a current need for more usability studies related to discovery tools.” 4 the most relevant literature to this study are case studies documenting a library’s experience with implementing a discovery services and task based usability studies of discovery services. thomas and buck5 sought to determine with a task based usability study whether users were as successful performing common catalog-related tasks in worldcat local (wcl) as they are in the library’s current catalog, innovative interfaces’ webpac. the study helped inform the library’s decision, at that time, to not implement wcl. beecher and schmidt6 discuss american university’s comparison of wcl and aquabrowser (two discovery layers), which were implemented locally. the study focused on user preferences based on students “normal searching patterns” 7 rather than completion of a list of tasks. their study revealed undergraduates generally preferred wcl, and upperclassmen and graduates tended to like aquabrower better. beecher and schmidt discuss the research comparing assigned tasks versus user-defined searches, and report that a blend of these techniques can help researchers understand user behavior better.8 information technology and libraries | september 2013 63 this article reports on a task-based study, in which the last question asks the participant to research something they had looked for within the past semester, and the results section indicates that the most meaningful feedback came from watching users research a topic they had a personal interest in. having assigned tasks also can be very useful. for example, an early problem noted with discovery services was poor search results for specific searches on known items, such as the book “the old man and the sea.” assigned tasks also give the user a chance to explore a system for a few searches, so when they search for a topic of personal interest, it is not their first experience with a new system. blending assigned tasks with user tasks proved helpful in this study’s outcomes. encore synergy has not yet been the subject of a formally published task-based usability study. allison reports on an analysis of google analytic statistics at university of nebraska-lincoln after encore was implemented.9 the article concludes that encore increases the user’s exposure to all the library’s holdings, describes some of the challenges unl faced and gives recommendations for future usability studies to evaluate where additional improvements should be made. the article also states unl plans to conduct future usability studies. although there are not yet formal published task-based studies on encore, at least one blogger from southern new hampshire university documented their implementation of the service. 
singley reported in 2011, “encore synergy does live up to its promise in presenting a familiar, user-friendly search environment.10 she points out, “to perform detailed article searches, users still need to link out to individual databases.” this study confirms that users do not understand that articles are not fully indexed and integrated; articles remain, in encore’s terminology, in “database portfolios.” see the results section, task 2, for a fuller discussion of this topic. method this study included a total of 13 participants. these included four faculty members, and six students recruited through a posting on the library’s website offering participants a bookstore voucher. three student employees were also subjects (these students work in the library’s mailroom and received no special training on the library’s website). for the purposes of this study, the input of undergraduate students, the largest target population of potential novice users, was of most interest. table 3 lists demographic details of the student or faculty’s college, and for students, their year. this was a task-based study, where users were asked to find a known book item and follow two scenarios to find journal articles. the following four questions/tasks were handed to the users on a sheet of paper: 1. find a copy of the book the old man and the sea. 2. in your psychology class, your professor has assigned you a 5-page paper on the topic of eating disorders and teens. find a scholarly article (or peer-reviewed) that explores the relation between anorexia and self-esteem. http://www.snhu.edu/ usability test results for encore in an academic library | johnson 64 3. you are studying modern chinese history and your professor has assigned you a paper on foreign relations. find a journal article that discusses relations between china and the us. 4. what is a topic you have written about this year? search for materials on this topic. the follow up questions where verbally asked either after a task, or asked as prompts while the subject was working. 1. after the first task (find a copy of the book the old man and the sea) when the user finds the book in appsearch, ask: “would you know where to find this book in the library?” 2. how much of the library’s holdings do you think appsearch/ articles quick search is looking across? 3. does “peer reviewed” mean the same as “scholarly article”? 4. what does the “refine by tag” block the right mean to you? 5. if you had to advise the library to either stay with a tabbed layout, or move to the one search box, what would you recommend? participants were recorded using techsmith’s screen-casting software camtasia, which allows the user’s face to be recorded along with their actions on the computer screen. this allows the observer to not rely solely on notes or recall. if the user encounters a problem with the interface, having the session recorded makes it simple to create (or recreate) a clip to show the vendor. in the course of this study, several clips were sent to innovative interfaces, and they were responsive to many of the issues revealed. further discussion is in the “glitches revealed” section. seven of the subjects first used the library site’s tabbed layout (which was then the live site) as seen in figure 1. after they completed the tasks, participants filled in a system usability scale (sus) form. the users then completed the same tasks on the development server using encore synergy. 
participants next filled out a sus form to reflect their impression of the new interface. encore is locally branded as appsearch and the terms are used interchangeably in this study. the six other subjects started with the appsearch interface on a development server, completed a sus form, and then did the same tasks using the library’s tabbed interface. the time it took to conduct the studies ranged from fifteen to forty minutes per participant, depending on how verbal the subject was, and how much they wanted to share about their impressions and ideas for improvement. jakob nielsen has been quoted as saying you only need to test with five users: “after the fifth user, you are wasting your time by observing the same findings repeatedly but not learning much new.”11 he argues for doing tests with a small number of users, making iterative improvements, and then retesting. this is certainly a valid and ideal approach if you have full control of the design. in the case of a vendor-controlled product, there are serious limitations to what the librarians can iteratively improve. the most librarians can do is suggest changes to the vendor, based on the results of studies and observations. when evaluating discovery services in the spring of 2012, appalachian state libraries conducted a four-person, task-based study (see appendix a), which used the university of nebraska-lincoln’s implementation of encore as a test site to benchmark our students’ initial reaction to the product in comparison to the library’s current tabbed layout. in this small study, the average sus score for the library’s current search box layout was 62, and for unl’s implementation of encore, it was 49. this helped inform the decision of belk library, at that time, not to purchase encore (or any other discovery service), since students did not appear to prefer them. this paper reports on a study conducted in december 2012 that showed a marked improvement in users’ gauge of satisfaction with encore. several factors could contribute to the improvement in sus scores. first is the larger sample size of 13 compared to the earlier study with four participants. another factor is that in the april study, participants were using an external site they had no familiarity with, and a first experience with a new interface is not a reliable gauge of how someone will come to use the tool over time. this study was also more robust in that it added the task of asking the user to search for something they had researched recently, and the follow up questions were more detailed. overall it appears that, in this case, having more than four participants and a more robust design gave a better representation of user experience. the system usability scale (sus) the system usability scale has been widely used in usability studies since its development in 1996. many libraries use this tool in reporting usability results.12,13 it is simple to administer, score, and understand the results.14 sus is an industry standard with references in over 600 publications.15 an “above average” score is 68. scoring a scale involves a formula where odd items have one subtracted from the user response, and with even-numbered items, the user response is subtracted from five. the total converted responses are added up, and then multiplied by 2.5. this makes the answers easily grasped on the familiar scale of 1-100. 
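the scoring rule just described can be written out as a short function; this is a generic sketch of the standard sus formula, and the sample responses shown in the usage line are hypothetical, not data from this study.

```python
def sus_score(responses):
    """score ten 1-5 sus responses on the 0-100 scale described above.

    odd-numbered items contribute (response - 1), even-numbered items
    contribute (5 - response), and the sum is multiplied by 2.5, which is
    why half-point scores such as 82.5 appear in table 1.
    """
    if len(responses) != 10:
        raise ValueError("sus requires exactly ten responses")
    total = sum(
        (r - 1) if i % 2 == 1 else (5 - r)
        for i, r in enumerate(responses, start=1)
    )
    return total * 2.5

# hypothetical response sheet (not taken from the study data)
print(sus_score([4, 2, 5, 1, 4, 2, 5, 2, 4, 1]))  # -> 85.0
```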
due to the scoring method, it is possible that results are expressed with decimals.16 a sample sus scale is included in appendix d. results the average sus score for the 13 users for encore was 71.5, and for the tabbed layout, the average sus score was 68. this small sample set indicates there was a user preference for the discovery service interface. in a relatively small study like this, these results do not imply a scientifically valid statistical measurement. as used in this study, the sus scores are simply a way to benchmark how “usable” the participants rated the two interfaces. when asked the subjective follow up question, “if you had to advise the library to either stay with a tabbed layout, or move to the one search box, what would you recommend?” 100% of the participants recommended the library change to appsearch (although four users actually rated the tabbed layout with a higher sus score). these four participants said things along the lines of, “i can get used to anything you put up.”

table 1. demographic details and individual and average sus scores.
participant | sus encore | sus tabbed layout | year and major or college | appsearch first
student a | 90 | 70 | senior/social work/female | no
student b | 95 | 57.5 | freshman/undeclared/male | yes
student c | 82.5 | 57.5 | junior/english/male | yes
student d | 37.5 | 92 | sophomore/actuarial science/female | yes
student e | 65 | 82.5 | junior/psychology/female | yes
student f | 65 | 77.5 | senior/sociology/female | no
student g | 67.5 | 75 | junior/music therapy/female | no
student h | 90 | 82.5 | senior/dance/female | no
student i | 60 | 32.5 | senior/political science/female | no
faculty a | 40 | 87.5 | family & consumer science/female | yes
faculty b | 80 | 60 | english/male | no
faculty c | 60 | 55 | education/male | no
faculty d | 97.5 | 57.5 | english/male | yes
average | 71.5 | 68 | |

discussion task 1: “find a copy of the book the old man and the sea.” all thirteen users had faster success using encore. when using encore, this “known item” is in the top three results. encore definitely performed better than the classic catalog in saving the time of the user. in approaching task 1 from the tabbed layout interface, four out of thirteen users clicked on the books and media tab, changed the drop-down search option to “title,” and were (relatively) quickly successful. the remaining nine who switched to the books and media tab and used the default keyword search for “the old man and the sea” had to scan the results (using this search method, the book is the seventh result in the classic catalog), which took two users almost 50 seconds. this length of time for an “average user” to find a well-known book is not considered to be acceptable to the technology services team at appalachian state university. when using the encore interface, the follow up question for this task was, “would you know where to find this book in the library?” nine out of 13 users did not know where the book would be, or 
further research can determine if students will have a higher level of confidence in their ability to locate a book in the stacks when using encore. figure 3 shows the search as it appeared in december 2012 and figure 4 has the “map it” feature implemented and pointed out with a red arrow. related to this task of searching for a known book, student b commented that in encore, the icons were very helpful in picking out media type. figure 4. book item record in encore. the red arrow indicates the “map it” feature, an add-on to the catalog from the vendor stackmap. browse results are on the right, and only pull from the catalog results. when using the tabbed layout interface (see figure 1), three students typed the title of the book into the “articles” tab first, and it took them a few moments figure out why they had a problem with the results. they were able to figure it out and re-do the search in the “correct” books & usability test results for encore in an academic library | johnson 68 media tab, but student d commented, “i do that every time!” this is evidence that the average user does not closely examine a search box--they simply start typing. task 2: “in your psychology class, your professor has assigned you a five-page paper on the topic of eating disorders and teens. find a scholarly article (or peer-reviewed) that explores the relation between anorexia and self-esteem.” this question revealed, among other things, that seven out of the nine students did not fully understand the term scholarly or peer reviewed article are meant to be synonyms in this context. when asked the follow up question “what does ‘peer reviewed’ mean to you?” student b said, “my peers would have rated it as good on the topic.” this is the kind of feedback that librarians and vendors need to be aware of in meeting students’ expectations. users have become accustom to online ratings by their peers of hotels and restaurants, so the terminology academia uses may need to shift. further discussion on this is in the “changes suggested” section below. figure 5. typical results for task two. figure 5 shows a typical user result for task 2. the follow up question asked users “what does the refine by tag box on the right mean to you?” student g reported they looked like internet ads. other users replied with variations of, “you can click on them to get more articles and stuff.” in fact, the “refine by tag” box in the upper right column top of screen contains only indexed terms from the subject heading of the catalog. this refines the current search results to those with the specific subject term the user clicked on. in this study, no user clicked on these tags. information technology and libraries | september 2013 69 for libraries considering purchasing and implementing encore, a choice of skins is available, and it is possible to choose a skin where these boxes do not appear. in addition to information from innovative interfaces, libraries can check a guide maintained by a librarian at saginaw valley state university17 to see examples of encore synergy sites, and links to how different skins (cobalt, pearl or citrus) affect appearance. appalachian uses the “pearl” skin. figure 6. detail of screenshot in figure 5. figure 6 is a detail of the results shown in the screenshot for average search for task 2. the red arrows indicate where a user can click to just see article results. the yellow arrow indicates where the advanced search button is. six out of thirteen users clicked advanced after the initial search results. 
clicking on the advanced search button brought users to a screen pictured in figure 7. usability test results for encore in an academic library | johnson 70 figure 7. encore's advanced search screen. figure 7 shows the encore’s advanced search screen. this search is not designed to search articles; it only searches the catalog. this aspect of advanced search was not clear to any of the participants in this study. see further discussion of this issue in the “glitches revealed” section. information technology and libraries | september 2013 71 figure 8. the "database portfolio" for arts & humanities. figure 8 shows typical results for task 2 limited just to articles. the folders on the left are basically silos of grouped databases. innovative calls this feature “database portfolios.” in this screen shot, the results of the search narrowed to articles within the “database portfolio” of arts & humanities. clicking on the individual databases return results from that database, and moves the usability test results for encore in an academic library | johnson 72 user to the database’s native interface. for example, in figure 8, clicking on art full text would put the user into that database, and retrieve 13 results. while conducting task 2, faculty member a stressed she felt it was very important students learn to use discipline specific databases, and stated she would not teach a “one box” approach. she felt the tabbed layout was much easier than appsearch and rated the tabbed layout in her sus score with a 87.5 versus the 40 she gave encore. she also wrote on the sus scoring sheet “appsearch is very slow. there is too much to review.” she also said that the small niche showing how to switch results between “books & more” to article was “far too subtle.” she recommended bold tabs, or colors. this kind of suggestion librarians can forward to the vendor, but we cannot locally tweak this layout on a development server to test if it improves the user experience. figure 9. closeup of switch for “books & more” and “articles” options. task 3: “you are studying modern chinese history and your professor has assigned you a paper on foreign relations. find a journal article that discusses relations between china and the us.” most users did not have much difficulty finding an article using encore, though three users did not immediately see a way to limit only to articles. of the nine users who did narrow the results to articles, five used facets to further narrow results. no users moved beyond the first page of results. search strategy was also interesting. all thirteen users appeared to expect the search box to work like google. if there were no results, most users went to the advanced search, and reused the same terms on different lines of the boolean search box. once again, no users intuitively understood that “advanced search” would not effectively search for articles. the concept of changing search terms was not a common strategy in this test group. if very few results came up, none of the users clicked on the “did you mean” or used suggestions for correction in spelling or change in terms supplied by encore. during this task, two faculty members commented on load time. they said students would not wait, results had to be instant. but when working with students, when the author asked how they felt when load time was slow, students almost all said it was fine, or not a problem. 
they could “see it was working.” one student said, “oh, i’d just flip over to facebook and let the search run.” so perhaps librarians should not assume we fully understand student user expectations. it is also information technology and libraries | september 2013 73 worth noting that, for the participant, this is a low-stakes usability study, not crunch time, so attitudes may be different if load time is slow for an assignment due in a few hours. task 4: “what is a topic you have written about this year? search for materials on this topic.” this question elicited the most helpful user feedback, since participants had recently conducted research using the library’s interface and could compare ease of use on a subject they were familiar with. a few specific examples follow. student a, in response to the task to research something she had written about this semester, looked for “elder abuse.” she was a senior who had taken a research methods class and written a major paper on this topic, and she used the tabbed layout first. she was familiar with using the facets in ebsco to narrow by date, and to limit to scholarly articles. when she was using appsearch on the topic of elder abuse, encore held her facets “full text” and “peer reviewed” from the previous search on china and u.s. foreign relations. an example of encore “holding a search” is demonstrated in figures 10 and 11 below. student a was not bothered by the encore holding limits she had put on a previous search. she noticed the limits, and then went on to further narrow within the database portfolio of “health” which limited the results to the database cinahl first. she was happy with being able to limit by folder to her discipline. she said the folders would help her sort through the results. student g’s topic she had researched within the last semester was “occupational therapy for students with disabilities” such as cerebral palsy. she understood through experience, that it would be easiest to narrow results by searching for ‘occupational therapy’ and then add a specific disability. student g was the user who made the most use of facets on the left. she liked encore’s use of icons for different types of materials. student b also commented on “how easy the icons made it.” faculty b, in looking for the a topic he had been researching recently in appsearch, typed in “writing across the curriculum glossary of terms” and got no results on this search. he said, “mmm, well that wasn’t helpful, so to me, that means i’d go through here” and he clicked on the google search box in the browser bar. he next tried removing “glossary of terms” from his search and the load time was slow on articles, so he gave up after ten seconds and clicked on “advanced search” and tried putting “glossary of terms” in the second line. this led to another dead end. he said, “i’m just surprised appalachian doesn’t have anything on it.” the author asked if he had any other ideas about how to approach finding materials on his topic from the library’s homepage and he said no, he would just try google (in other words, navigating to the group of databases for education was not a strategy that occurred to him). usability test results for encore in an academic library | johnson 74 the faculty member d had been doing research on a relatively obscure historical event and was able to find results using encore. 
when asked if he had seen the articles before, he said, “yes, i’ve found these, but it is great it’s all in one search!” glitches revealed it is of concern for the user experience that the advanced search of encore does not search articles; it only searches the catalog. this was not clear to any participant in this study. as noted earlier, encore’s article search is a federated search. this affects load time for article results, and also puts the article results into silos, or to use encore’s terminology, “database portfolios.” encore’s information on their website definitely markets the site as a discovery tool, saying, it “integrates federated search, as well as enriched content—like first chapters—and harvested data… encore also blends discovery with the social web. 18” it is important for libraries considering purchase of encore that while it does have many features of a discovery service, it does not currently have a central index with pre-harvested metadata for articles. if innovative interfaces is going to continue to offer an advanced search box, it needs to be made explicitly clear that the advanced search is not effective for searching for articles, or innovative interfaces needs to make an advanced search work with articles by creating a central index. to cite a specific example from this study, when student e was using appsearch, with all the tasks, after she ran a search, she clicked on the advanced search option. the author asked her, “so if there is an advanced search, you’re going to use it?” the student replied, “yeah, they are more accurate.” another aspect of encore that users do not intuitively grasp is that when looking at the results for an article search, the first page of results comes from a quick search of a limited number of databases (see figure 8). the users in this study did understand that clicking on the folders will narrow by discipline, but they did not appear to grasp that the result in the database portfolios are not included in the first results shown. when users click on an article result, they are taken to the native interface (such as psych info) to view the article. users seemed un-phased when they went into a new interface, but it is doubtful they understand they are entering a subset of appsearch. if users try to add terms or do a new search in the native database they may get relevant results, or may totally strike out, depending on chosen database’s relevance to their research interest. information technology and libraries | september 2013 75 figure 10. changing a search in encore. another problem that was documented was that after users ran a search, if they changed the text in the “search” box, the results for articles did not change. figure six demonstrates the results from task 2 of this study, which asks users to find information on anorexia and self-esteem. the third task asks the user to find information on china and foreign relations. figure 10 demonstrates the results for the anorexia search, with the term “china” in the search box, just before the user clicks enter, or the orange arrow for new search. figure 11. search results for changed search. figure 11 show that the search for the new term, “china” has worked in the catalog, but the results for articles are still about anorexia. 
in this implementation of encore, there is no "new search" button (except on the advanced search page, where there is a "reset search" button; see figure 7), and refreshing the browser had no effect on this problem. this issue was documented in a screencast19 and sent to the vendor. happily, as of april 2013, innovative interfaces appears to have resolved this underlying problem. one purpose of this study was to determine if users had a strong preference for tabs, since the library could choose to implement encore with tabs (one for access to articles, one for the catalog, and other tab options like google scholar). this study indicated users did not like tabs in general; they much preferred a "one box solution" on first encounter. a major concern raised was the users' response to the question, "how much of the library's holdings do you think appsearch/articles quick search is looking across?" twelve out of thirteen users believed that when they were searching for articles from the quick search for articles tabbed layout, they were searching all the library databases. the one exception to this was a faculty member in the english department, who understood that the articles tab searched a small subset of the available resources (seven ebsco databases out of 400 databases the library subscribes to). all thirteen users believed appsearch (encore) was searching "everything the library owned." the discovery service searches far more resources than other federated searches the library has had access to in the past, but it is still only searching 50 out of 400 databases. it is interesting that in the fagan et al. study of ebsco's discovery service, only one out of ten users in that study believed the quick search would search "all" the library's resources.20 a glance at james madison university's library homepage21 suggests wording that may reduce user confusion. figure 12. screenshot of james madison library homepage, accessed december 18, 2012. figure 13. original encore interface as implemented in january 2013. given the results that 100% of the users believed that appsearch looked at all databases the library has access to, the library made changes to the wording in the search box (see figures 13 and 14). future tests can determine if this has any positive effect on the understanding of what appsearch includes. figure 14. encore search box after this usability study was completed. the arrow highlights additions to the page as a result of this study. some other wording changes suggested came from the finding that seven out of nine students did not fully understand that "peer reviewed" limits results to scholarly articles. a suggestion was made to innovative interfaces to change the wording to "scholarly (peer reviewed)" and they did so in early january. although innovative's response on this issue was swift, and may help students, changing the wording does not address the underlying information literacy issue of what students understand about these terms. interestingly, encore does not include any "help" pages. appalachian's liaison with encore has asked about this and been told by encore tech support that innovative feels the product is so intuitive that users will not need any help.
belk library has developed a short video tutorial for users, and local help pages are available from the library’s homepage, but according to innovative, a link to these resources cannot be added to the top right area of the encore screen (where help is commonly located in web interfaces). although it is acknowledged that few users actually read “help” pages, it seems like a leap of faith to think a motivated searcher will understand things like the “database portfolios” (see figures 9) without any instruction at all. after implementation, the usability test results for encore in an academic library | johnson 78 librarians here at appalachian conducted internally developed training for instructors teaching appsearch, and all agreed that understanding what is being searched and how to best perform a task such as an advanced article search is not “totally intuitive,” even for librarians. finally, some interesting search strategy patterns were revealed. on the second and third questions in the script (both having to do with finding articles) five of the thirteen participants had the strategy of putting in one term, then after the search ran, adding terms to narrow results using the advanced search box. although this is a small sample set, it was a common enough search strategy to make the author believe this is not an unusual approach. it is important for librarians and for vendors to understand how users approach search interfaces so we can meet expectations. further research the findings of this study suggest librarians will need to continue to work with vendors to improve discovery interfaces to meet users expectations. the context of what is being searched and when is not clear to beginning users in encore one aspect of this test was it was the participants’ first encounter with a new interface, and even student d, who was unenthused about the new interface (she called the results page “messy, and her sus score was 37.5 for encore, versus 92 for the tabbed layout) said that she could learn to use the system given time. further usability tests can include users who have had time to explore the new system. specific tasks that will be of interest in follow up studies of this report are if students have better luck in being able to know where to find the item in the stacks with the addition of the “map it” feature. locally, librarian perception is that part of the problem with this results display is simply visual spacing. the call number is not set apart or spaced so that it stands out as important information (see figure 5 for a screenshot). another question to follow up on will be to repeat the question, “how much of the library’s holdings do you think appsearch is looking across?” all thirteen users in this study believed appsearch was searching “everything the library owned.” based on this finding, the library made small adjustments to the initial search box (see figures 14 and 15 as illustration). it will be of interest to measure if this tweak has any impact. summary all users in this study recommended that the library move to encore’s “one box” discovery service instead of using a tabbed layout. helping users figure out when they should move to using discipline specific databases will most likely be a long-term challenge for belk library, and for other academic libraries using discovery services, but this will probably trouble librarians more than our users. 
information technology and libraries | september 2013 79 the most important change innovative interfaces could make to their discovery service is to create a central index for articles, which would improve load time and allow for an advanced search feature for articles to work efficiently. because of this study, innovative interfaces made a wording change in search results for article to include the word “scholarly” when describing peer reviewed journal articles in belk library’s local implementation. appalachian state university libraries will continue to conduct usability studies and tailor instruction and e-learning resources to help users navigate encore and other library resources. overall, it is expected users, especially freshman and sophomores, will like the new interface but will not be able to figure out how to improve search results, particularly for articles. belk library & information commons’ instruction team is working on help pages and tutorials, and will incorporate the use of encore into the library’s curricula. references 1 . thomsett-scott, beth, and patricia e. reese. "academic libraries and discovery tools: a survey of the literature." college & undergraduate libraries 19 (2012): 123-43. 2. ibid, 138. 3. hunter, athena. “the ins and outs of evaluating web-scale discovery services” computers in libraries 32, no. 3 (2012) http://www.infotoday.com/cilmag/apr12/hoeppner-web-scalediscovery-services.shtml (accessed march 18, 2013) 4. fagan, jody condit, meris mandernach, carl s. nelson, jonathan r. paulo, and grover saunders. "usability test results for a discovery tool in an academic library." information technology & libraries 31, no. 1 (2012): 83-112. 5. thomas, bob., and buck, stephanie. oclc's worldcat local versus iii's webpac. library hi tech, 28(4) (2010), 648-671. doi: http://dx.doi.org/10.1108/07378831011096295 6. becher, melissa, and kari schmidt. "taking discovery systems for a test drive." journal of web librarianship 5, no. 3: 199-219 [2011]. library, information science & technology abstracts with full text, ebscohost (accessed march 17, 2013). 7. ibid, p. 202 8. ibid p. 203 9. allison, dee ann, “information portals: the next generation catalog,” journal of web librarianship 4, no. 1 (2010): 375–89, http://digitalcommons.unl.edu/cgi/viewcontent.cgi?article=1240&context=libraryscience (accessed march 17, 2013) http://www.infotoday.com/cilmag/apr12/hoeppner-web-scale-discovery-services.shtml http://www.infotoday.com/cilmag/apr12/hoeppner-web-scale-discovery-services.shtml http://dx.doi.org/10.1108/07378831011096295 usability test results for encore in an academic library | johnson 80 10. singley, emily. 2011 “encore synergy 4.1: a review” the cloudy librarian: musings about library technologies http://emilysingley.wordpress.com/2011/09/17/encore-synergy-4-1-areview/ [accessed march 20, 2013]. 11 . nielson, jakob. 2000. “why you only need to test with 5 users” http://www.useit.com/alertbox/20000319.html (accessed december 18, 2012]. 12. fagan et al, 90. 13. dixon, lydia, cheri duncan, jody condit fagan, meris mandernach, and stefanie e. warlick. 2010. "finding articles and journals via google scholar, journal portals, and link resolvers: usability study results." reference & user services quarterly no. 50 (2):170-181. 14. bangor, aaron, philip t. kortum, and james t. miller. 2008. "an empirical evaluation of the system usability scale." international journal of human-computer interaction no. 24 (6):574-594. doi: 10.1080/10447310802205776. 15. sauro, jeff. 2011. 
“measuring usability with the system usability scale (sus)” http://www.measuringusability.com/sus.php. [accessed december 7, 2012]. 16. ibid. 17. mellendorf, scott. “encore synergy sites” zahnow library, saginaw valley state university. http://librarysubjectguides.svsu.edu/content.php?pid=211211 (accessed march 23, 2013). 18. encore overview, “http://encoreforlibraries.com/overview/” (accessed march 21, 2013). 19. johnson, megan. videorecording made with jing on january 30, 2013 http://www.screencast.com/users/megsjohnson/folders/jing/media/0ef8f186-47da-41cf96cb-26920f71014b 20. fagan et al. 91. 21. james madison university libraries, “http://www.lib.jmu.edu” (accessed december 18, 2012). http://emilysingley.wordpress.com/ http://emilysingley.wordpress.com/2011/09/17/encore-synergy-4-1-a-review/ http://emilysingley.wordpress.com/2011/09/17/encore-synergy-4-1-a-review/ http://www.useit.com/alertbox/20000319.html http://www.measuringusability.com/sus.php http://librarysubjectguides.svsu.edu/content.php?pid=211211 http://encoreforlibraries.com/overview/ http://www.screencast.com/users/megsjohnson/folders/jing/media/0ef8f186-47da-41cf-96cb-26920f71014b http://www.screencast.com/users/megsjohnson/folders/jing/media/0ef8f186-47da-41cf-96cb-26920f71014b http://www.lib.jmu.edu/ information technology and libraries | september 2013 81 appendix a pre-purchase usability benchmarking test in april 2012, before the library purchased encore, the library conducted a small usability study to serve as a benchmark. the study outlined in this paper follows the same basic outline, and adds a few questions. the purpose of the april study was to measure student perceived success and satisfaction with the current search system of books and articles appalachian uses compared with use of the implementation of encore discovery services at university of nebraska lincoln (unl). the methodology was four undergraduates completing a set of tasks using each system. two started with unl, and two started at appalachian’s library homepage. in the april 2012 study, the participants were three freshman and one junior, and all were female. all were student employees in the library’s mailroom, and none had received special training on how to use the library interface. after the students completed the tasks, they rated their experience using the system usability scale (sus). in the summary conclusion of that study, the average sus score for the library’s current search box layout was 62, and for unl’s encore search it was 49. even though none of the students was particularly familiar with the current library’s interface, it might be assumed that part of the higher score for appalachian’s site was simply familiarity. student comments from the small april benchmarking study included the following. the junior student said the unl site had "too much going on" and appalachian was "easier to use; more specific in my searches, not as confusing as compared to unl site." another student (a freshman), said she has "never used the library not knowing if she needed a book or an article." in other words, she knows what format she is searching for and doesn’t perceive a big benefit to having them grouped. this same student also indicated she had no real preference between appalachian or the unl. she believed students would need to take time to learn either and that unl is a "good starting place." 
appendix b instructions for conducting the test notes: use firefox for the browser, set to "private browsing" so that no searches are held in the cache (search terms do not pop into the search box from the previous participant's search). in the bookmark toolbar, the only two bookmarks available should be "dev" (which goes to the development server) and "lib" (which goes to the library's homepage). instruct users to begin each search from the correct starting place. identify students and faculty by letter (student a, faculty a, etc.). script hi, ___________. my name is ___________, and i'm going to be walking you through this session today. before we begin, i have some information for you, and i'm going to read it to make sure that i cover everything. you probably already have a good idea of why we asked you here, but let me go over it again briefly. we're asking students and faculty to try using our library's home page to conduct four searches, and then ask you a few other questions. we will then have you do the same searches on a new interface. (note: half the participants start at the development site, the other half start at the current site.) after each set of tasks is finished, you will fill out a standard usability scale to rate your experience. this session should take about twenty minutes. the first thing i want to make clear is that we're testing the interface, not you. you can't do anything wrong here. do you have any questions so far? ok. before we look at the site, i'd like to ask you just a few quick questions. what year are you in college? what are you majoring in? roughly how many hours a week altogether--just a ballpark estimate--would you say you spend using the library website? ok, great. hand the user the task sheet. do not read the instructions to the participant; allow them to read the directions for themselves. allow the user to proceed until they hit a wall or become frustrated. verbally encourage them to talk aloud about their experience.
written instructions for participants.
1. find a copy of the book the old man and the sea.
2. in your psychology class, your professor has assigned you a 5-page paper on the topic of eating disorders and teens. find a scholarly article (or peer-reviewed) that explores the relation between anorexia and self-esteem.
3. you are studying modern chinese history and your professor has assigned you a paper on foreign relations. find a journal article that discusses relations between china and the us.
4. what is a topic you have written about this year? search for materials on this topic.
appendix c follow-up questions for participants (or ask as the subject is working). after the first task (find a copy of the book the old man and the sea), when the user finds the book in appsearch, ask "would you know where to find this book in the library?" how much of the library's holdings do you think appsearch/articles quick search is looking across? does "peer reviewed" mean the same as "scholarly article"? what does the "refine by tag" box on the right mean to you? if you had to advise the library to either stay with a tabbed layout, or move to the one search box, what would you recommend? do you have any questions for me, now that we're done? thank the subject for participating.
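for reference, and not as part of the original test materials, the standard sus scoring rule that produces the whole-number and decimal scores reported in table 1 can be sketched as follows. the python function and the sample ratings below are illustrative assumptions; the study itself reports only the resulting 0-100 scores.

```python
def sus_score(responses):
    """compute a 0-100 sus score from ten 1-5 ratings, in appendix d order."""
    if len(responses) != 10:
        raise ValueError("sus requires exactly ten item ratings")
    total = 0
    for i, r in enumerate(responses):
        if i % 2 == 0:          # odd-numbered items (1, 3, 5, ...): rating minus 1
            total += r - 1
        else:                   # even-numbered items (2, 4, 6, ...): 5 minus rating
            total += 5 - r
    return total * 2.5          # scale the 0-40 sum to a 0-100 score

# a hypothetical set of ratings; the 2.5 multiplier is why individual and
# average scores such as 77.5, 71.5, and 68 can carry decimals.
print(sus_score([4, 2, 4, 1, 4, 2, 5, 2, 4, 3]))  # 77.5
```

appendix d, which follows, reproduces the ten sus items that such ratings come from.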
usability test results for encore in an academic library | johnson 85 appendix d sample system usability scale (sus) strongly strongly disagree agree i think that i would like to use this system frequently 1 2 3 4 5 i found the system unnecessarily complex 1 2 3 4 5 i thought the system was easy to use 1 2 3 4 5 i think that i would need the support of a technical person to be able to use this system 1 2 3 4 5 i found the various functions in this system were well integrated 1 2 3 4 5 i thought there was too much inconsistency in this system 1 2 3 4 5 i would imagine that most people would learn to use this system very quickly 1 2 3 4 5 i found the system very cumbersome to use 1 2 3 4 5 i felt very confident using the system 1 2 3 4 5 i needed to learn a lot of things before i could get going with this system 1 2 3 4 5 comments: 16 information technology and libraries | march 2009 mathew j. miles and scott j. bergstrom classification of library resources by subject on the library website: is there an optimal number of subject labels? the number of labels used to organize resources by subject varies greatly among library websites. some librarians choose very short lists of labels while others choose much longer lists. we conducted a study with 120 students and staff to try to answer the following question: what is the effect of the number of labels in a list on response time to research questions? what we found is that response time increases gradually as the number of the items in the list grow until the list size reaches approximately fifty items. at that point, response time increases significantly. no association between response time and relevance was found. i t is clear that academic librarians face a daunting task drawing users to their library’s web presence. “nearly three-quarters (73%) of college students say they use the internet more than the library, while only 9% said they use the library more than the internet for information searching.”1 improving the usability of the library websites therefore should be a primary concern for librarians. one feature common to most library websites is a list of resources organized by subject. libraries seem to use similar subject labels in their categorization of resources. however, the number of subject labels varies greatly. some use as few as five subject labels while others use more than one hundred. in this study we address the following question: what is the effect of the number of subject labels in a list on response times to research questions? n literature review mcgillis and toms conducted a performance test in which users were asked to find a database by navigating through a library website. they found that participants “had difficulties in choosing from the categories on the home page and, subsequently, in figuring out which database to select.”2 a review of relevant research literature yielded a number of theses and dissertations in which the authors compared the usability of different library websites. jeng in particular analyzed a great deal of the usability testing published concerning the digital library. the following are some of the points she summarized that were highly relevant to our study: n user “lostness”: users did not understand the structure of the digital library. n ambiguity of terminology: problems with wording accounted for 36 percent of usability problems. 
n finding periodical articles and subject-specific databases was a challenge for users.3 a significant body of research not specific to libraries provides a useful context for the present research. miller's landmark study regarding the capacity of human short-term memory showed as a rule that the span of immediate memory is about 7 ± 2 items.4 sometimes this finding is misapplied to suggest that menus with more than nine subject labels should never be used on a webpage. subsequent research has shown that "chunking," which is the process of organizing items into "a collection of elements having strong associations with one another, but weak associations with elements within other chunks,"5 allows human short-term memory to handle a far larger set of items at a time. larson and czerwinski provide important insights into menuing structures. for example, increasing the depth (the number of levels) of a menu harms search performance on the web. they also state that "as you increase breadth and/or depth, reaction time, error rates, and perceived complexity will all increase."6 however, they concluded that a "medium condition of breadth and depth outperformed the broadest, shallow web structure overall."7 this finding is somewhat contrary to a previous study by snowberry, parkinson, and sisson, who found that when testing structures of 2⁶, 4³, 8², and 64¹ (2⁶ means two menu items per level, six levels deep), the 64¹ structure grouped into categories proved to be advantageous in both speed and accuracy.8 larson and czerwinski recommended that "as a general principle, the depth of a tree structure should be minimized by providing broad menus of up to eight or nine items each."9 zaphiris also corroborated that previous research concerning depth and breadth of the tree structure was true for the web. the deeper the tree structure, the slower the user performance.10 he also found that response times for expandable menus are on average 50 percent longer than sequential menus.11 both the research and current practices are clear concerning the efficacy of hierarchical menu structures. thus it was not a focus of our research. the focus instead was on a single-level menu and how the number and characteristics of subject labels would affect search response times. n background in preparation for this study, library subject lists were collected from a set of thirty library websites in the united states, canada, and the united kingdom. we selected twelve lists from these websites that were representative of the entire group and that varied in size from small to large. to render some of these lists more usable, we made slight modifications. there were many similarities between label names. n research design participants were randomly assigned to one of twelve experimental groups. each experimental group would be shown one of the twelve lists that were selected for use in this study. roughly 90 percent of the participants were students. the remaining 10 percent of the participants were full-time employees who worked in these same departments.
the twelve lists ranged in number of labels from five to seventy-two: n group a: 5 subject labels n group b: 9 subject labels n group c: 9 subject labels n group d: 23 subject labels n group e : 6 subject labels n group f: 7 subject labels n group g: 12 subject labels n group h: 9 subject labels n group i: 35 subject labels n group j: 28 subject labels n group k: 49 subject labels n group l: 72 subject labels each participant was asked to select a subject label from a list in response to eleven different research questions. the questions are listed below: 1. which category would most likely have information about modern graphical design? 2. which category would most likely have information about the aztec empire of ancient mexico? 3. which category would most likely have information about the effects of standardized testing on high school classroom teaching? 4. which category would most likely have information on skateboarding? 5. which category would most likely have information on repetitive stress injuries? 6. which category would most likely have information about the french revolution? 7. which category would most likely have information concerning walmart’s marketing strategy? 8. which category would most likely have information on the reintroduction of wolves into yellowstone park? 9. which category would most likely have information about the effects of increased use of nuclear power on the price of natural gas? 10. which category would most likely have information on the electoral college? 11. which category would most likely have information on the philosopher emmanuel kant? the questions were designed to represent a variety of subject areas that library patrons might pursue. each subject list was printed on a white sheet of paper in alphabetical order in a single column, or double columns when needed. we did not attempt to test the subject lists in the context of any web design. we were more interested in observing the effect of the number of labels in a list on response time independent of any web design. each participant was asked the same eleven questions in the same order. the order of questions was fixed because we were not interested in testing for the effect of order and wanted a uniform treatment, thereby not introducing extraneous variance into the results. for each question, the participant was asked to select a label from the subject list under which they would expect to find a resource that would best provide information to answer the question. participants were also instructed to select only a single label, even if they could think of more than one label as a possible answer. participants were encouraged to ask for clarification if they did not fully understand the question being asked. recording of response times did not begin until clarification of the question had been given. response times were recorded unbeknownst to the participant. if the participant was simply unable to make a selection, that was also recorded. two people administered the exercise. one recorded response times; the other asked the questions and recorded label selections. relevance rankings were calculated for each possible combination of labels within a subject list for each question. for example, if a subject list consisted of five labels, for each question there were five possible answers. two library professionals—one with humanities expertise, the other with sciences expertise—assigned a relevance ranking to every possible combination of question and labels within a subject list. 
the rankings were then averaged for each question–label combination. n results the analysis of the data was undertaken to determine whether the average response times of participants, adjusted by the different levels of relevance in the subject list labels that prevailed for a given question, were significantly different across the different lists. in other words, would the response times of participants using a particular list, for whom the labels in the list were highly relevant to the question, be different from students using the other lists for whom the labels in the list were also highly relevant to the question? a separate univariate general linear model analysis was conducted for each of the eleven questions. the analyses were conducted separately because each question represented a unique search domain. the univariate general linear model provided a technique for testing whether the average response times associated with the different lists were significantly different from each other. this technique also allowed for the inclusion of a covariate (relevance of the subject list labels to the question) to determine whether response times at an equivalent level of relevance were different across lists. in the analysis model, the dependent variable was response time, defined as the time needed to select a subject list label. the covariate was relevance, defined as the perceived match between a label and the question. for example, a label of "economics" would be assessed as highly relevant to the question, what is the current unemployment rate? the same label would be assessed as not relevant for the question, what are the names of four moons of saturn? the main factor in the model was the actual list being presented to the participant. there were twelve lists used in this study. the statistical model can be summarized as follows: response time = list + relevance + (list × relevance) + error. the general linear model required that the following conditions be met: first, data must come from a random sample from a normal population. second, all variances within each of the groupings are the same (i.e., they have homoscedasticity). an examination of whether these assumptions were met revealed problems both with normality and with homoscedasticity. a common technique, logarithmic transformation, was employed to resolve these problems. accordingly, response-time data were all converted to common logarithms. an examination of assumptions with the transformed data showed that all questions but three met the required conditions. the three questions (5, 6, and 7) were excluded from subsequent analysis. figure 1. the overall average of average search times for the eight questions for all experimental groups (i.e., lists) [average log response time plotted by list, with a trend line]. n conclusions the series of graphs in the appendix shows the average response times, adjusted for relevance, for eight of the eleven questions for all twelve lists (i.e., experimental groups). three of the eleven questions were excluded from the analysis because of heteroscedasticity. an inspection of these graphs shows no consistent pattern in response time as the number of the items in the lists increases. essentially, this means that, for any given level of relevance, the number of items in the list does not affect response time significantly.
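as a rough, hypothetical illustration of the modeling approach described above, the sketch below fits one univariate general linear model for a single question, with common-log response time as the dependent variable, list as the main factor, and relevance as the covariate. the data, column names, and the use of python's statsmodels package are assumptions made for illustration; the article does not state which statistical software was used.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# hypothetical data for a single question: one row per participant, giving the
# list (group) they saw, the relevance of the available labels to the question,
# and the time taken to select a label, in seconds. all values are invented.
rng = np.random.default_rng(0)
lists = list("abcdefghijkl")                      # the twelve lists, a through l
df = pd.DataFrame({
    "list_id":   np.repeat(lists, 10),            # ten simulated participants per list
    "relevance": rng.uniform(1, 3, 120),          # 1 = not relevant, 3 = highly relevant
    "time_sec":  rng.lognormal(mean=2.0, sigma=0.4, size=120),
})

# convert response time to common logarithms, as the article describes, to
# address the observed problems with normality and homoscedasticity.
df["log_time"] = np.log10(df["time_sec"])

# response time = list + relevance + (list x relevance) + error
model = smf.ols("log_time ~ C(list_id) * relevance", data=df).fit()
print(model.summary())
```

in the study itself, a separate model of this form was fit for each of the eight retained questions.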
it seems that for a single question, characteristics of the categories themselves are more important than the quantity of categories in the list. the response times using a subject list with twenty-eight labels is similar to the response times using a list of six labels. a statistical comparison of the mean response time for each classification of library resources by subject on the library website | miles and bergstrom 19 group with that of each of the other groups for each of the questions largely confirms this. there were very few statistically significant different comparisons. the spikes and valleys of the graphs in the appendix are generally not significantly different. however, when the average response time associated with all lists is combined into an overall average from all eight questions, a somewhat clearer picture emerges (see figure 1). response times increase gradually as the number of the items in the list increase until the list size reaches approximately fifty items. at that point, response time increases significantly. no association was found between response time and relevance. a fast response time did not necessarily yield a relevant response, nor did a slow response time yield an irrelevant response. n observations we observed that there were two basic patterns exhibited when participants made selections. the first pattern was the quick selection—participants easily made a selection after performing an initial scan of the available labels. nevertheless, a quick selection did not always mean a relevant selection. the second pattern was the delayed selection. if participants were unable to make a selection after the initial scan of items, they would hesitate as they struggled to determine how the question might be reclassified to make one of the labels fit. we did not have access to a high-tech lab, so we were unable to track eye movement, but it appeared that the participants began scanning up and down the list of available items in an attempt to make a selection. the delayed selection seemed to be a combination of two problems: first, none of the available labels seemed to fit. second, the delay in scanning increased as the list grew larger. it’s possible that once the list becomes large enough, scanning begins to slow the selection process. a delayed selection did not necessarily yield an irrelevant selection. the label names themselves did not seem to be a significant factor affecting user performance. we did test three lists, each with nine items and each having different labels, and response times were similar for the three lists. a future study might compare a more extensive number of lists with the same number of items with different labels to see if label names have an effect on response time. this is a particular challenge to librarians in classifying the digital library, since they must come up with a few labels to classify all possible subjects. creating eleven questions to span a broad range of subjects is also a possible weakness of the study. we had to throw out three questions that violated the assumptions of the statistical model. we tried our best to select questions that would represent the broad subject areas of science, arts, and general interest. we also attempted to vary the difficulty of the questions. a different set of questions may yield different results. references 1. steve jones, the internet goes to college, ed. 
mary madden (washington, d.c.: pew internet and american life project, 2002): 3, www.pewinternet.org/pdfs/pip_college_report.pdf (accessed mar. 20, 2007).
2. louise mcgillis and elaine g. toms, "usability of the academic library web site: implications for design," college & research libraries 62, no. 4 (2001): 361.
3. judy h. jeng, "usability of the digital library: an evaluation model" (phd diss., rutgers university, new brunswick, new jersey): 38–42.
4. george a. miller, "the magical number seven plus or minus two: some limits on our capacity for processing information," psychological review 63, no. 2 (1956): 81–97.
5. fernand gobet et al., "chunking mechanisms in human learning," trends in cognitive sciences 5, no. 6 (2001): 236–43.
6. kevin larson and mary czerwinski, "web page design: implications of memory, structure and scent for information retrieval" (los angeles: acm/addison-wesley, 1998): 25, http://doi.acm.org/10.1145/274644.274649 (accessed nov. 1, 2007).
7. ibid.
8. kathleen snowberry, mary parkinson, and norwood sisson, "computer display menus," ergonomics 26, no. 7 (1983): 705.
9. larson and czerwinski, "web page design," 26.
10. panayiotis g. zaphiris, "depth vs. breadth in the arrangement of web links," www.soi.city.ac.uk/~zaphiri/papers/hfes.pdf (accessed nov. 1, 2007).
11. panayiotis g. zaphiris, ben shneiderman, and kent l. norman, "expandable indexes versus sequential menus for searching hierarchies on the world wide web," http://citeseer.ist.psu.edu/rd/0%2c443461%2c1%2c0.25%2cdownload/http://coblitz.codeen.org:3125/citeseer.ist.psu.edu/cache/papers/cs/22119/http:zszzszagrino.orgzszpzaphiriszszpaperszszexpandableindexes.pdf/zaphiris99expandable.pdf (accessed nov. 1, 2007).
appendix.
response times by question by group. [the appendix consists of eight charts, one for each analyzed question (questions 1, 2, 3, 4, 8, 9, 10, and 11), each plotting average response time (log scale) for the twelve groups, from grp a (5 items) through grp l (72 items).]
user testing with microinteractions: enhancing a next-generation repository
sara gonzales, matthew b. carson, guillaume viger, lisa o'keefe, norrina b. allen, joseph p. ferrie, and kristi holmes
information technology and libraries | march 2021. https://doi.org/10.6017/ital.v40i1.12341
sara gonzales (sara.gonzales2@northwestern.edu) is data librarian, galter health sciences library & learning center, northwestern university feinberg school of medicine. matthew b. carson (matthew.carson@northwestern.edu) is head, digital systems/senior research data scientist, galter health sciences library & learning center, northwestern university feinberg school of medicine. guillaume viger (guillaume.viger@northwestern.edu) is senior developer, galter health sciences library & learning center, northwestern university feinberg school of medicine. lisa o'keefe (lisa.okeefe@northwestern.edu) is senior program administrator, galter health sciences library & learning center, northwestern university feinberg school of medicine. norrina b.
allen (norrina-allen@northwestern.edu) is associate professor of preventive medicine (epidemiology) and pediatrics, northwestern university feinberg school of medicine. joseph p. ferrie (ferrie@northwestern.edu) is professor and department chair of economics, northwestern university. kristi holmes (kristi.holmes@northwestern.edu) is director, galter health sciences library & learning center, and professor of preventive medicine (health and biomedical informatics) and medical education at northwestern university feinberg school of medicine. © 2021. abstract enabling and supporting discoverability of research outputs and datasets are key functions of university and academic health center institutional repositories. yet adoption rates among potential repository users are hampered by a number of factors, prominent among which are difficulties with basic usability. in their efforts to implement a local instance of inveniordm, a turnkey next generation repository, team members at northwestern university’s galter health sciences library & learning center supplemented agile development principles and methods and a user experience design-centered approach with observations of users’ microinteractions (interactions with each part of the software’s interface that requires human intervention). microinteractions were observed through user testing sessions conducted in fall 2019. the result has been a more user-informed development effort incorporating the experiences and viewpoints of a multidisciplinary team of researchers spanning multiple departments of a highly ranked research university. introduction galter health sciences library & learning center facilitates and supports the discoverability of knowledge for the faculty, students, and staff of the feinberg school of medicine at northwestern university. as an integrated unit in northwestern university’s clinical and translational sciences institute (nucats) and a key partner to other institutes across northwestern university’s two campuses, enabling maximum ease of use of library resources and support for meaningful information discoveries for researchers at all stages has been a prime motivator. these motivators helped drive the selection and development of an upgraded institutional repository infrastructure at galter, a project which began in 2018. discovery of resources through repository tools depends upon many factors: metadata and controlled vocabularies used, storage and retrieval capacity, and familiarity and comfort level with mailto:sara.gonzales2@northwestern.edu mailto:matthew.carson@northwestern.edu mailto:guillaume.viger@northwestern.edu mailto:lisa.okeefe@northwestern.edu mailto:norrina-allen@northwestern.edu mailto:ferrie@northwestern.edu mailto:kristi.holmes@northwestern.edu information technology and libraries march 2021 user testing with microinteractions | gonzales, carson, viger, o’keefe, allen, ferrie, and holmes 2 the tool on the part of researchers and students. is the institutional repository link easy to find on the website? more importantly, is it easy to use? can records be created and files uploaded with ease? do searches bring meaningful results, and can they be filtered and organized for maximum impact? from early on in the repository upgrade project, galter library partnered with northwestern’s institute for innovations in developmental sciences (devsci), both to answer these questions and to find practical ways to serve researchers who aimed to discover relevant datasets through a repository. 
the work of the interdisciplinary devsci group is focused on human development across the lifespan in all areas, including physical, emotional, psychological, and socioeconomic, providing a multidisciplinary perspective for the collaboration. through this partnership, devsci’s goal was to develop a data repository or index through which they could discover the datasets of their fellow researchers and find new collaborators, thus providing an ideal perspective from which to provide critical feedback. galter health sciences library & learning center selected the inveniordm (research data management) extensible institutional repository (ir) platform as its local ir code upgrade. inveniordm is a python-based, modular and scalable ir developed by cern (the european organization for nuclear research) and collaborators.1 the first version of the invenio framework was developed in 2000. in 2018, invenio 3.02 was released with significantly improved software and code rewritten to make it a modular framework. this new framework now serves as a foundation for modern research data management and scholarly communications through a trusted digital repository. inveniordm is being collaboratively developed, with its many partners and robust developer community ensuring the framework’s maintenance, improvement, and preservation capacity into the foreseeable future. to galter library and its partners at devsci, the development timespan required to build a local instance of inveniordm presented the perfect opportunity to address one of the major stumbling blocks of ir adoption: the user experience. if a repository were designed with users’ needs in mind, and took into account their behaviors and interactions with every aspect of the tool, it had the potential to increase adoption and usability far beyond numbers generally observed for university irs. designing for user behaviors was the goal of galter library’s repository development team as we launched a round of user testing of the repository’s alph a version in fall 2019. literature review it is an exciting time for irs serving researchers in the sciences and particularly in the field of biomedical research. new robust repository frameworks capable of storing and preserving data for decades into the future are being developed to meet widely articulated researcher needs, including those user behaviors and technologies highlighted by the confederation of open access repositories (coar) in detailed guidelines for next generation repository features.3 these features include interoperable resource transfer, metadata-enhanced discovery by navigation, and exposure of permanent identifiers. researcher-focused organizations such as coar endorse and promulgate fair principles to make deposited and shared data findable, accessible, interoperable, and reusable.4 meanwhile, federal agencies such as the national institutes of health are increasingly incorporating policies to encourage best practices for data management and data sharing of grant funded research.5 these policies often recommend depositing data in a robust, secure, and accessible ir which can be maintained by the researchers’ own institution. 
in recent years, the majority of the deposited products of research have been stored in subject repositories information technology and libraries march 2021 user testing with microinteractions | gonzales, carson, viger, o’keefe, allen, ferrie, and holmes 3 or made available via social media platforms which carry no guarantee of long-term curation and preservation of shared resources. this happens even though institutional repositories are wellrepresented on the overall repository landscape, suggesting that these institutional assets are underutilized in critical data workflows.6 the reasons for slow adoption and use of institutional repositories (ir) by researchers in routine workflows are many: irs can be perceived as adding to researchers’ administrative work burden through the need to clean, deposit, and catalog data and other research outputs; many feel trepidation about open science practices and their effects on citation counts; and researchers may feel unsure about copyright restrictions on materials they might deposit.7 narayan and luca, in their study of one university’s ir adoption challenges, outline some of the deep-seated motivations behind this trepidation, such as the social and psychological barriers imposed by researchers’ own, and university-encouraged, traditional views of scholarly publishing, as well as the ways in which these views are heavily supported by university systems in their tenure and promotion policies. in addition, many researchers perceive the content contained in irs as restricted or of limited use compared to the volume of resources that can be found through a google search.8 the ir is often seen as a small island within the larger digital research landscape. the degree to which repository managers are attuned to their local users’ professional and personal needs with regard to a repository will have a large impact on adoption rates among hesitant user populations. as witt and betz & hall point out, professional motivators have arisen in multiple disciplines to deposit in irs not only preprints, but datasets, data dictionaries, readme files, and other reproducibility-supporting resources, in order to provide open access to the products of federally funded research.9 building on funder and publisher mandates for making both publications and datasets open access, ir builders and maintainers can employ various methods to increase the motivation momentum towards ir adoption. they can highlight repository champions, faculty users of ir tools who can provide use cases and success stories about the benefits the ir brings to them, as both depositors and searchers.10 they can help to allay fears and confusion around depositors’ rights with regard to deposited materials by carefully explicating license types and definitions and by consulting with researchers on the correct licenses to choose for their deposits. they can work with their repository’s developers to create value-added modifications to the repository, including user-friendly browsing, featured collections, and researcher pages, which highlight the most current research at the institution.11 importantly, if the ir maintainers are able to modify the repository’s interface to suit local needs, they can help ensure that the majority of users have a positive experience with the repository’s interface, one in which every interaction is intuitive and in which there are no wasted steps or unnecessary clutter. 
such usability can be achieved through examining repository users' microinteractions, that is, interactions with each small part of the software's interface that requires human intervention. for those engaged in library technology projects in recent years, user experience (ux) design will be a familiar concept. ux design seeks to make a user's interaction with a product—often a web-based tool—easier and more intuitive, frequently through manipulating the behavior of the user.12 however, in the recent trend toward designing with a focus on microinteractions, software developers are influenced by the data they glean from observing users' interactions with each part of the software's interface that requires human intervention, noting from the users themselves the intuitive and non-intuitive parts of an interaction in order to determine where changes should be made.13 the development team for inveniordm has taken an approach that combines traditional user-experience design based around collaborative, open source code and tools and common dataset metadata standards such as datacite (https://schema.datacite.org); observations from invenio's 20 years of serving as cern's ir; and examination of microinteractions at certain key stages of development. these microinteractions revolve around common user actions within an ir, including depositing items, searching, browsing, and creating a user account.14 the result of combining these approaches has allowed galter health sciences library & learning center to put users' needs at the forefront of its new ir.

inveniordm: a next-generation repository

galter library's selection of a new ir solution was carefully considered and motivated by the organization's need for a robust, forward-looking, and feature-rich repository that could support best practices in research data management and sharing as realized through the invenio framework. the python framework incorporates community-built python libraries, while also leveraging flask, a postgresql or mysql back-end database, a react js user interface, and the extremely fast elasticsearch json-native distributed search engine. the resulting tool is eminently scalable, securely housing petabytes' worth of easily discoverable records. galter library began our collaboration with cern to build a local instance of inveniordm, while contributing to the overall repository source code, in late 2018. since that time, a local developer has worked on the code and contributed to the repository's project roadmap, updating github issues and pushing releases.15 a project manager, data librarian, and the library leadership have also been involved throughout the project in the areas of general guidance and management, oversight, assessment, dissemination and outreach, and requirements gathering. many requirements were gleaned from the devsci community through conversations and informal interviews around data storage practices. in early 2019, to ensure that the repository was meeting the initially envisioned requirements, the galter library repository team analyzed the requirements gathered thus far for the project, which had been translated into github issues and added to by team members and collaborators.
the requirements gathered from devsci collaborators and galter librarians were found to map directly onto key ir functional categories outlined separately by ir stakeholders around the globe, including the national institutes of health, the confederation of open access repositories, the digital repository of ireland, the department of computer and information sciences at covenant university, nigeria, and others. those requirements included record creation and ingest, robust metadata for accessing a record, user account and permissions, user authentication, search functionality, resource access/download, and community pages and features (see fig. 1).16

figure 1. local repository requirements mapped to ir functional categories.

with the repository's key functional requirements defined, the development team needed real-world data to help inform the microinteractions that would bring the functions to life, both for repository managers and users. to acquire microinteraction data, galter library's data librarian designed and organized a round of user testing of the alpha release of the repository in autumn 2019. the alpha release of inveniordm was completed by september 2019, meeting a deadline established by one of the project's key funders, the national center for data to health (cd2h), through a grant funded by the national center for advancing translational sciences (ncats). this early alpha release enabled record creation and file upload, application of seven metadata elements (title, authors, description, resource type, subjects, visibility, and license), user authentication, search, faceting/filtering, and download of resources. to make the experience of searching the alpha release repository as realistic as possible, the data librarian asked colleagues from devsci to provide data for seed records for the repository, based on their own research. colleagues willingly obliged, and over a dozen seed records based on real-world clinical studies and other studies focused on human development were created in early october 2019. while conducting email and word-of-mouth recruitment with members of the devsci community, the data librarian worked on a testing script designed to require the maximum number of microinteractions possible as each user worked with the repository. the script asked users to complete the following list of tasks (see fig. 2), while thinking aloud and noting anything that they found unusual or anything that they would have expected to see in the user interface of the repository.

figure 2. ir user testing script tasks.

the user testing tasks conform to many of the functional requirements for institutional repositories identified from our requirements gathering (fig. 1), including user authentication and account, search functionality, resource access/download, record creation/ingest, and robust metadata for accessing a record (detailed record page). by october 2019, ten northwestern university faculty members, mainly from devsci, and two information professionals had agreed to test the alpha version of inveniordm. the data librarian arranged to securely host testing sessions through web conferencing software.
testers agreed to have the sessions recorded and shared their screens as they worked through the test scenarios, allowing the data librarian to observe their movements through the repository and to review the recordings later in case anything was missed. testing sessions generally lasted between twenty and thirty minutes, although some lasted from forty-five minutes to one hour. after sessions were completed, the data librarian recorded in text documents a description of all microinteractions and verbal observations testers made about the repository. she later transferred that data to a spreadsheet, listing each criterion individually and manually adding a count of how many testers either reported or were observed to experience the same phenomenon (appendix 1). reported phenomena and observed difficulties that users experienced in testing the repository were aggregated and included in the final reported data if at least two testers reported or had the same experience. through these counts the data librarian was able to identify which microinteractions proved most challenging to the testers.

discussion

manual, qualitative analysis of the user testing data revealed challenges that users faced with inveniordm that were best captured and expressed when observed as microinteractions. though user experience design had been employed extensively in the design of the database, it was the nuances of the interactions that showed where improvements in the design could still be made. almost every functional area of the repository demonstrated a need for increased user input in its design. the results of user exercises testing the various functional areas are described below.

user profile screens ease-of-use exercise

while most testers did not experience difficulties in locating the login button on the repository's home page that allowed them to access the user profile portions of the site (9/12 located the button in less than three seconds), most testers (7/12) requested clearer instructions for which username and password to use (e.g., ldap or shibboleth-based credentials), and three-quarters of testers (9/12) inquired about where and how to add information about themselves to their profiles (e.g., professional title, contact information, department and other affiliations, etc.). while the required task consisted simply of logging successfully into one's profile, the testers rightly discovered and acknowledged that the robust, cris (current research information system)-like features of many repositories' user profile pages had not yet been fully implemented in inveniordm.

finding datasets exercise

the next task required testers to perform a search any way they liked in the repository, locate a dataset record, and download the associated data file. whether searching using filters or by entering keywords to find a known item, users were always able to easily identify a data file within a record and download it. a special feature of inveniordm that occasionally made finding a data file to download challenging was that the repository was designed to serve as both a repository for digital files and a data index. the main feature of a data index is that it will store records representing datasets without necessarily storing the datasets themselves.
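in data-model terms, the distinction is a small one: a record's file attachments are simply optional. the sketch below illustrates the idea only; the field names are assumptions loosely mirroring the seven metadata elements of the alpha release, not the actual inveniordm schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RecordFile:
    """a file deposited alongside a record's metadata."""
    filename: str
    size_bytes: int
    checksum: str


@dataclass
class RepositoryRecord:
    """a repository/data-index record: metadata is required, files are not."""
    title: str
    authors: List[str]
    description: str
    resource_type: str
    subjects: List[str]
    visibility: str                         # e.g., "open access"
    license: Optional[str] = None
    files: List[RecordFile] = field(default_factory=list)
    access_instructions: Optional[str] = None  # e.g., "data available on request"

    @property
    def is_metadata_only(self) -> bool:
        """true for data-index entries describing a dataset held elsewhere."""
        return not self.files


# a metadata-only entry describing a clinical study whose data cannot be deposited
study = RepositoryRecord(
    title="example developmental cohort study",
    authors=["j. researcher"],
    description="longitudinal study; de-identified data available on request",
    resource_type="dataset",
    subjects=["child development"],
    visibility="open access",
    access_instructions="contact the study team to request access",
)
assert study.is_metadata_only
```

a flag like is_metadata_only is also the kind of signal testers asked for later in the exercise: a visible cue showing whether a record actually has a file to download.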
the data index option is crucial for health sciences researchers, who are often motivated to share data files as openly as possible for reasons of reproducibility, open access to scientific data, and compliance with funder mandates, but who cannot always safely deposit data files due to the presence of personally identifiable information (pii) or protected health information as defined by hipaa17 of the human subjects who are involved in their studies. this phenomenon spurred us to create seed records in the repository that represent real clinical studies, for which the data could be made available upon request from the researchers, but for which a data file was not uploaded to the repository. by following the protocol of making this clinical data public as safely as possible, these records were created and tagged with a visibility level of open access. two of the testers stated that they believed that a record tagged as open access implied that a data file was available for download, and many others (10/12) expected either a visual cue or another filter to allow users to hone in on only the dataset record results that contained a deposited data file.

filter searching exercise

searching the repository, particularly via the filters, produced some of the most interesting results of the entire testing process. in the testing exercises, users were asked to search for anything they wanted, either narrowing their results from a direct term search or beginning a search from the full record set using the filter options on the left side of the screen. opinions differed among the repository team members and some testers as to whether applying two filters at once to the results of a search would combine the two subsets of results with and (an intersection) or or (a union). another way to phrase this scenario: upon the application of one filter, would the search results and other filters update in real time? for instance, if i filtered my search results to include only the deposits of one particular author, and if the file type filter choices still contained, after the filtering action, types including pdfs, xlsx files, doc files, mov files, etc., is it safe to assume that my chosen author deposited all those types of files? (see fig. 3.) or do the filters behave independently of each other? one-third of the testers (4/12) said they expected the filtering choices to update in real time and that the application of two filters should result in an and (intersection) of results.

figure 3. filters available in inveniordm.

seven of the twelve testers said that the most helpful of the five filters available in the repository at the time were resource type, file type, and subjects, while slightly fewer found the author and license filters helpful. three of the twelve asserted that it would not occur to them to filter on a resource's license.
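the and-versus-or question is ultimately a query-construction choice. in elasticsearch, which backs the repository's search, the two behaviors testers debated correspond to two different query bodies; the sketch below is a generic illustration of that distinction, with made-up index and field names, not the inveniordm query code.

```python
# two ways to combine an author filter and a file-type filter in an
# elasticsearch _search request body; field names are illustrative only.

# 1. "and" behaviour with live facets: both filters sit in a bool/filter
#    clause, so hits are the intersection of the two subsets and the
#    aggregations (facet counts) are computed on the filtered result set.
and_with_live_facets = {
    "query": {
        "bool": {
            "must": {"match": {"description": "development"}},
            "filter": [
                {"term": {"author.keyword": "j. researcher"}},
                {"term": {"file_type.keyword": "xlsx"}},
            ],
        }
    },
    "aggs": {
        "file_types": {"terms": {"field": "file_type.keyword"}},
        "authors": {"terms": {"field": "author.keyword"}},
    },
}

# 2. independent facets: the same conditions applied as a post_filter narrow
#    the hits but leave the aggregations computed over the unfiltered query,
#    so the facet panel keeps listing file types the chosen author never used.
independent_facets = {
    "query": {"match": {"description": "development"}},
    "post_filter": {
        "bool": {
            "filter": [
                {"term": {"author.keyword": "j. researcher"}},
                {"term": {"file_type.keyword": "xlsx"}},
            ]
        }
    },
    "aggs": {
        "file_types": {"terms": {"field": "file_type.keyword"}},
        "authors": {"terms": {"field": "author.keyword"}},
    },
}
```

the one-third of testers who expected both and semantics and live facet updates were, in effect, describing the first form.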
specific record search exercise

in the specific record search, users were asked to find a specific paper by a particular author. none of the testers experienced any difficulties or significant delays in bringing up the requested record, which served as a testament both to their searching abilities and to the robustness of the elasticsearch framework utilized in the repository's infrastructure. ten of the twelve testers mentioned that they found the preformatted citations available with each record helpful, and two of these testers requested an easy way to export the citation in their desired format to endnote. three of the twelve testers experienced a brief initial delay in locating the known record because they still had filters applied in the repository from a previous search. they requested an easy way to clear all previous filters when starting a new search.

creating a record exercise

as one of the larger user testing tasks, users were provided with a dummy file representing a dataset and asked to deposit and describe it with appropriate metadata using the repository's cataloging form. the first part of this task involved finding the button that brings the user to the cataloging form, a button which is available from several places in the inveniordm layout (home page, search results page, and profile page). eight of the twelve testers took longer than three seconds to locate this button, termed the "catalog your research" button. two of these eight reported that "catalog" was not the verb they would associate with depositing and describing a data file; to some, "catalog" seemed a library-centric term. on the repository's record creation page, a space exists to either upload or drag and drop a file, and when this task is done, a large, blue "start upload" button appears that the user must click to begin the file upload (see fig. 4). yet despite its size and color, almost half the testers (5/12) did not notice that they had to click it in order to complete the upload of their file and, worse, they often completed the record creation process and published their record without noticing that the file had not uploaded. visual cues were needed to confirm for the user whether a file was successfully uploaded or not. in addition, automatic upload upon browsing and attaching a file or dragging and dropping a file was reported as an expected behavior by many users.

figure 4. users often missed the blue "start upload" button just beneath the file name.

most users applied descriptive metadata successfully and easily, but some experienced trouble while appending subject metadata to describe the subject matter of their deposits. as the repository is being customized for a health sciences library, subject fields are offered in inveniordm to allow appending both medical subject headings (mesh) terms and terms from the faceted application of subject terminology (fast), a vocabulary derived from the library of congress subject headings (lcsh). since the two vocabularies' terms are offered in separate fields (fast serving as a more universal set of terms to complement the biomedically oriented mesh), users became confused, not knowing which, if either, subject field they should complete.
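a single type-ahead field that queries both vocabularies behind the scenes and labels each suggestion with its source would remove that choice from the depositor. the sketch below shows the shape such a lookup might take; the endpoint urls, parameters, and response handling are placeholders standing in for the mesh and fast lookup services, not documented api calls.

```python
import requests

# placeholder endpoints; the real mesh and fast services differ, and the
# response shapes assumed below (plain lists of term strings) are assumptions.
MESH_LOOKUP = "https://example.org/mesh/lookup"
FAST_LOOKUP = "https://example.org/fast/suggest"


def suggest_subjects(user_text: str, limit: int = 10):
    """return type-ahead suggestions from both vocabularies, tagged by source."""
    suggestions = []

    mesh = requests.get(MESH_LOOKUP, params={"label": user_text, "limit": limit},
                        timeout=5)
    for term in mesh.json():                      # assumed: list of term strings
        suggestions.append({"scheme": "mesh", "term": term})

    fast = requests.get(FAST_LOOKUP, params={"query": user_text, "rows": limit},
                        timeout=5)
    for term in fast.json():                      # assumed: list of term strings
        suggestions.append({"scheme": "fast", "term": term})

    # one ranked list; the scheme tag lets the deposit form store each chosen
    # term in the correct metadata field without asking the user to decide.
    return suggestions[:limit]
```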
a combined field of this kind, a single subject field that queries both the mesh and fast apis, is warranted in order to simplify the subject-tagging experience for the user.

editing a record exercise

editing the record they had just created, the final testing task, proved to be unproblematic for testers. eleven of the twelve found the edit button on their record pages in less than three seconds, and the editing process was reported to be straightforward. one tester observed that they were unable to change their file (i.e., make a version-level change to their record), but this was only because version change functionality had not yet been implemented in the repository.

reflection/unguided feedback

once all the testing exercises outlined above were completed, testers were asked to talk about the uses for which they might employ a tool like inveniordm. without prompting or suggestions of uses, the testers overwhelmingly stated that they would use the completed repository for the very functions for which most irs are built: storing data files (two users), sharing data for open science or to fulfill funders' mandates (two users), searching for others' datasets (four users), creating gray literature collections to showcase their conference presentations and posters (three users), embedding repository-issued dois from their datasets into manuscripts and posters (three users), and storing data in private collections to share with trusted collaborators (four users). these data show that university faculty and researchers have various specific needs for repository solutions, which can be met if repositories are designed with these needs in mind.

after testing: next steps

the user testing experience for inveniordm proved to be a highly enjoyable process for all involved. the tester participants expressed enthusiasm for the process and appreciated the opportunity to share their ideas about the functionality of an ir while it was still in the design stages. the testers' enthusiasm reinforced the notion that many university faculty members are eager for an intuitive, user-friendly tool that will allow them to store, retrieve, and share their research outputs, as long as the tool is designed with their needs in mind. observation of testers' microinteractions with galter library's new institutional repository has helped the local development team to better understand what those needs are. the results of the user testing were presented at the inveniordm product meeting at cern in january 2020. the results were well received and led to immediate adjustments to the repository's development. as development continues into 2021, the repository team at galter health sciences library & learning center will design and manage at least one additional round of user testing to ensure that the repository continues to meet its goals of serving key functional requirements of irs while also providing users the best possible experience through each interaction they have with the tool. as the user testing sessions demonstrate, there is much room for growth in the achievement of a truly intuitive interface design in even some of the seemingly simplest functions of the repository, such as intuitively placing a deposit button or honing in on the right combination and placement of filters.
the galter library development team is committed to continuing to work toward an intuitive and seamless user experience. on this journey the repository team acknowledges and thanks the testers and future users of its repository and the researchers and support staff for whom the tool is being built and without whom it could not be built half as well.

acknowledgements

the project team would like to acknowledge northwestern university's institute for innovations in developmental sciences and the northwestern university clinical and translational sciences institute (nucats). inveniordm project team members sara gonzales, guillaume viger, matthew b. carson, lisa o'keefe, and kristi holmes were partially funded by the ctsa program national center for data to health, grant u24tr002306, and nucats, grant ul1tr001422.

appendix 1. inveniordm user testing aggregate data, divided by task

reporting criteria

a phenomenon or observation was noted if it was reported by, or observed in the behavior of, two or more testers. for four tasks, seconds were counted as a mark of how easy it was to find a repository element that enabled the task:
1. finding the login button to access the user account
2. finding the citation button after the search for a specific record
3. finding the "catalog your research" button
4. finding the "edit record" button
counting of seconds was done with an iphone stopwatch while reviewing the recorded sessions. if finding the required button took the user longer than a generous count of three seconds, it was deemed that the user had a hard time locating the item.

user profile screens results
9/12 testers wanted to add information about themselves and their appointment (department, title, contact information, etc.)
7/12 wanted clearer instructions for the username and password they use to log in
3/12 testers took three seconds or more to locate the user login button on the home page

finding datasets exercise results
10/12 testers expected a (sortable, filterable) cue on the search results screen to show whether a record has a file to download
3/12 testers wanted grayed-out instructions or search tips/suggestions in the search box
2/12 testers believed that the open access pill in search results implied there would be a file to download
2/12 testers believed that the subject pills in full record view should be clickable to enable direct search on the subjects

filter searching exercise results
7/12 testers said the most helpful filters were resource type, file type, and subjects, followed by author, then license
4/12 testers expected filter choices to update in real time based on the initial filter chosen
3/12 testers expected an option to expand beyond the top 10 authors in the authors filter
3/12 testers were not familiar with the choices of mesh and fast terms
3/12 testers would not think of filtering on license
2/12 testers expected guidance on the licenses' meanings if browsing/filtering by license is offered
2/12 testers expected greater filter collapsing/expanding options than what was offered
2/12 testers expected to apply two filters at once
2/12 testers wanted to filter on sample or demographic information of study subjects

specific record search exercise results
10/12 testers found the preformatted citations helpful
7/12 testers found the citation button in less than three seconds
3/12 testers had trouble with their known-record searching because filters were on when they started; they needed an easy way to clear all filters
2/12 testers expected an option to download the found record's citation to endnote

creating a record exercise results
8/12 testers took longer than three seconds to find the "catalog your research" button; of those, two would have used a different phrase ["'catalog' is too library-centric"]
5/12 testers did not see the "start upload" button after dropping their files, and an additional two said they expected auto-upload immediately upon dropping their files, with no "start upload" button necessary
3 of the 5 testers who missed the "start upload" button did not notice that their file did not get saved to their record
5/12 testers did not notice at first that the "save draft" step was needed before clicking publish, and one additional tester said they expected record auto-save, which would help in filling out a longer record
5/12 testers wanted guidance on which license to choose
4/12 testers expected some kind of instructions for filling out the cataloging page, even if only for specific fields like description or title
3/12 testers found the resource type interface intuitive
3/12 testers thought the arrow in the mesh (subject) field implied the availability of a drop-down list of options
2/12 testers did not see the drop-down choices under resource type umbrella categories at first
2/12 testers expected terms entered in the mesh (subject) fields to stay there, or a warning that they will disappear if there is no match
2/12 testers wanted more guidance on choosing a visibility level (private, public, etc.)
2/12 testers wanted more definitions/assistance about the difference between medical and topical subject terms
2/12 testers wanted more definitions/assistance with fast terms
2/12 testers said they would prefer a default license option

editing a record exercise results
11/12 testers found the edit button in less than three seconds

reflection/unguided feedback results
4/12 testers would use the repository to search for data
4/12 testers would store data files in private collections to be shared only with trusted collaborators
3/12 would embed the repository-issued dois from their datasets into their manuscripts and papers
3/12 testers would create their own grey literature collections of conference abstracts and posters
2/12 testers would use the repository for storing data files
2/12 testers would use the repository for open access/open science/data sharing compliance motivations

references

1 "inveniordm: the turn-key research data management repository," cern (european organization for nuclear research), accessed march 11, 2020, https://invenio-software.org/products/rdm/.
2 lars holm nielsen, "invenio v3.0.0 released," invenio blog (blog), invenio, june 7, 2018, https://invenio-software.org/blog/invenio-v300-released/.
3 "next generation repositories: behaviours and technical recommendations of the coar next generation repositories working group," confederation of open access repositories (coar), november 28, 2017, https://www.coar-repositories.org/files/ngr-final-formatted-report-cc.pdf.
4 "the fair data principles," force11, accessed march 11, 2020, https://www.force11.org/group/fairgroup/fairprinciples.
5 national institutes of health, "final nih policy for data management and sharing," nih office of extramural research, accessed january 15, 2021, https://grants.nih.gov/grants/guide/notice-files/not-od-21-013.html.
6 gary e. gorman, jennifer rowley, and stephen pinfield, "making open access work: the 'state-of-the-art' in providing open access to scholarly literature," online information review 39, no. 5 (september 2015): 604–36.
7 bhuva narayan and edward luca, "issues and challenges in researchers' adoption of open access and institutional repositories: a contextual study of a university repository," in proceedings of rails – research applications, information and library studies, 2016, school of information management, victoria university of wellington, new zealand, 6–8 december (2016); information research: an international electronic journal 22, no. 4 (december 2017), http://hdl.handle.net/10453/121438.
8 beth st. jean, soo young rieh, elizabeth yakel, and karen markey, "unheard voices: institutional repository end-users," college & research libraries 72, no. 1 (january 2011): 21–42.
9 michael witt et al., "connecting researchers to data repositories in the earth, space, and environmental sciences," in digital libraries: supporting open science, ircdl 2019, ed. leonardo candela and gianmaria silvello (2019); communications in computer and information science 988, 86–96; sonya betz and robyn hall, "self-archiving with ease in an institutional repository: microinteractions and the user experience," information technology and libraries 34, no. 3 (september 2015): 43–58.
10 betz and hall, "self-archiving with ease," 43–58.
11 st. jean, rieh, yakel, and markey, "unheard voices," 21–42.
12 "user experience design," wikipedia, last modified january 12, 2021, https://en.wikipedia.org/wiki/user_experience_design.
13 betz and hall, "self-archiving with ease," 43–58.
14 a. o. adewumi, n. a. omoregbe, and sanjay misra, "usability evaluation of mobile access to institutional repository," international journal of pharmacy and technology 8, no. 4 (december 2016): 22892–905.
15 "inveniordm project roadmap," cern (european organization for nuclear research), accessed march 17, 2020, https://invenio-software.org/products/rdm/roadmap/.
16 national institutes of health, office of the director, "supplemental information to the nih policy for data management and sharing: selecting a repository for data resulting from nih-supported research," last modified october 29, 2020, https://grants.nih.gov/grants/guide/notice-files/not-od-21-016.html; "coar community framework for good practices in repositories, public version 1," confederation of open access repositories (coar), last modified october 8, 2020, https://www.coar-repositories.org/coar-community-framework-for-good-practices-in-repositories/; sharon webb and charlene mcgoohan, the digital repository of ireland: requirements specification (national university of ireland maynooth, 2015), https://doi.org/10.3318/dri.2015.6; adewumi, omoregbe, and misra, "usability evaluation of mobile access to institutional repository," 22892–905; suntae kim, "functional requirements for research data repositories," international journal of knowledge content development & technology 8, no. 1 (march 2018): 25–36.
17 u.s. department of health and human services, "summary of the hipaa security rule," last modified july 26, 2013, https://www.hhs.gov/hipaa/for-professionals/security/laws-regulations/index.html.

rural public libraries and digital inclusion: issues and challenges
brian real, john carlo bertot, and paul t. jaeger
information technology and libraries | march 2014

brian real (breal@umd.edu) is a phd candidate in the college of information studies, john carlo bertot (jbertot@umd.edu) is co-director of the information policy and access center and professor in the college of information studies, and paul t. jaeger (pjaeger@umd.edu) is co-director of the information policy and access center and associate professor and diversity officer of the college of information studies, university of maryland, college park, maryland.

abstract

rural public libraries have been relatively understudied when compared to public libraries as a whole. data are available to show that rural libraries lag behind their urban and suburban counterparts in technology service offerings, but the full meaning and effect of such disparities is unclear. the authors combine data from the public library funding and technology access study with data from smaller studies to provide greater insight into these issues. by filtering these data through the digital inclusion framework, it becomes clear that disparities between rural and nonrural libraries are not merely a problem of weaker technological infrastructure. instead, rural libraries cannot reach their full customer service potential because of lower staffing (but not lower staff dedication) and funding mechanisms that rely primarily on local monies. the authors suggest possible solutions to these disparities while also discussing the barriers that must be overcome before such solutions can be implemented.
introduction

the large number of rural public libraries in the united states is surprisingly understudied, particularly in terms of technology access. the american library association (ala) and other professional organizations consider a public library to be small or rural if its population of legal service area is 25,000 or less. when viewed through this lens, rural public libraries1
• have on average less than one (0.75) librarian with a master's degree from an ala-accredited institution;
• have an average of 1.9 librarians, defined as employees holding the title of librarian;
• have an average total of 4.0 staff, including both full- and part-time employees;
• have a median annual income (from all sources) of $118,704.50;
• have an average of 41,425 visits annually; and
• typically have one building or branch that is open an average of 40 hours/week.
while these data suggest rural libraries operate on a smaller and less financially robust scale than their suburban and urban counterparts, the full effect of these discrepancies on service levels is unclear. this article uses various information sources to analyze the effect of these discrepancies on the ability of rural libraries to offer technology-based services. since the advent of the internet in the mid-1990s, public libraries have been key internet-access and technology-training providers for their communities. the ability to offer internet access, alongside support and training for patrons using such technology, is a primary indicator of libraries' value to their communities. by analyzing data from the 2012 public library funding and technology access survey (plftas), the authors found that rural libraries, on average, have weaker technological infrastructure (such as fewer computers and slower broadband connections) and are able to offer fewer support services, such as training classes, than urban and suburban public libraries. with public libraries being the only source of broadband access for many patrons in rural communities, limitations for rural libraries may affect patrons' ability to fully participate in employment, education, government, and other central aspects of society. through analysis of the plftas data2 about technology access in rural public libraries in conjunction with other studies of rural libraries and librarians, this article explores the causes and effects of the relatively more limited technological and support infrastructures for rural patrons and communities.

method

as documented since 1994,3 public libraries were early adopters of internet-based technologies.
the purpose of the plftas survey, and its previous iterations, is to identify public library internet connectivity; propose and promote public library internet policies at the federal level; maintain selected longitudinal data as to the connectivity, services, and deployment of the internet in public libraries; and provide national estimates regarding public library internet connectivity. through changes in funding sources and frequency of administration over the past two decades, the survey has maintained core longitudinal questions (e.g., numbers of public access workstations, bandwidth) but consistently explored a range of emerging topics (e.g., jobs assistance, e-government, emergency roles). the survey's method has evolved over time to meet changing survey data goals. the 2012 survey provides both national and state estimates of public library internet connectivity, public access technologies, and internet-enabled services and resources. the survey used a stratified "proportionate to size" sample to ensure a proportionate national sample, using the fy2009 imls public library dataset (formerly maintained by the us national center for education statistics) to draw its sample. strata included the states in which libraries resided and metropolitan status (urban, suburban, rural) designations. bookmobile and books-by-mail service outlets were removed from the file, leaving 16,776 library outlets. the study team drew a sample with replacement of 8,790 outlets, stratified and proportionate by state and metropolitan status.4 the survey received 7,252 responses, for a response rate of 82.5%. using weighted analysis to generate national and state data estimates, the analysis uses the responses to produce estimates for all public library outlets (minus bookmobiles and books by mail) in the aggregate as well as by metropolitan status designations. unless otherwise noted, all data discussed in the article are from the 2012 study. that study, along with all previous public libraries and the internet and public library funding and technology access studies, additional analysis, and data products, are available at http://www.plinternetsurvey.org.
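as a rough illustration of that sampling design, a proportionate-to-size stratified draw with replacement, and the stratum weights used to expand responses to population estimates, might be computed as follows. the file name, column names, and allocation details here are assumptions for illustration, not the study's actual procedure or the imls dataset layout.

```python
import pandas as pd

# hypothetical frame of library outlets; the real study drew from the fy2009
# imls public library dataset after removing bookmobiles and books-by-mail outlets.
frame = pd.read_csv("library_outlets.csv")   # assumed columns: outlet_id, state, metro_status, outlet_type
frame = frame[~frame["outlet_type"].isin(["bookmobile", "books_by_mail"])]

TARGET = 8790
total = len(frame)

samples = []
for (state, metro), stratum in frame.groupby(["state", "metro_status"]):
    # allocate the stratum's share of the target sample, proportionate to its size
    n_h = max(1, round(TARGET * len(stratum) / total))
    drawn = stratum.sample(n=n_h, replace=True, random_state=42)
    # weight each sampled outlet by stratum size / stratum sample size, so
    # responses can be expanded to national and state estimates
    drawn = drawn.assign(weight=len(stratum) / n_h)
    samples.append(drawn)

sample = pd.concat(samples, ignore_index=True)
```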
digital inclusion and the value of public libraries

digital inclusion is a useful framework through which one can understand the importance of ensuring individuals have access to digital technologies as well as the means to learn how to use them.5 digital inclusion comprises policies and actions that mitigate the significant, interrelated problems of the digital divide and digital literacy:
• digital divide implies the gap—whether based in socioeconomic status, education, geography, age, ability, language, or other factors—between individuals for whom internet access is readily available and those for whom it is not. indeed, even those with basic, dial-up internet access are losing ground as internet and computer technologies continue to advance, using increasing bandwidth and demanding high-speed ("broadband") internet access.
• digital literacy encompasses the skills and abilities necessary for access once the technology is available, including understanding the language and component hardware and software required to successfully navigate the technology.
• digital inclusion refers to policies developed to close the digital divide and promote digital literacy. it marries high-speed internet access (as dial-up access is no longer sufficient) and digital literacy in ways that reach various audiences, many of whom parallel those mentioned within the digital divide debate. to match the current policy language, digital inclusion will signify outreach to unserved and underserved populations.
since virtually every public library in the united states offers public internet access, these institutions are invaluable in promoting digital inclusion. however, the plftas data shows that not all libraries are equal, with rural public libraries lagging behind libraries in more populated areas in providing technology services. therefore this article focuses on the following issues and questions:
• digital divide: why do rural individuals have less access to broadband technologies than their suburban and urban counterparts? how are rural libraries currently compensating for this deficit?
• digital literacy: why do rural libraries offer less digital literacy training and patron support? how do rural libraries compare to libraries in more populated areas on key issues in digital literacy, such as employment and government information?
• digital inclusion: what policies have been developed to help rural libraries close the digital divide and promote digital literacy, and what policies—including funding structures and decisions—hinder these libraries from adequately addressing these concerns? what governmental and extra-governmental policies can be enacted to help rural libraries to better promote digital inclusion?
the following section describes the differences between rural libraries and their urban and suburban counterparts, combining plftas data with information from other studies to demonstrate how rural libraries are more essential in bridging the digital divide yet are seemingly doing less to promote digital literacy. following this, the authors discuss why rural libraries trail suburban and urban libraries in these areas, with studies suggesting the issue is a result of inadequate resources, not a lack of staff dedication. finally, the authors present a review of some of the initiatives that are attempting to bridge these divides, including suggestions that may help rural librarians to act as better advocates for their patrons' needs.

rural challenges to digital inclusion

numerous studies, including plftas, show that rural libraries offer less technology access with slower connection speeds than libraries in more populated areas. these libraries also offer comparatively less formalized digital literacy training, although rural libraries still provide invaluable informal training in this area. this section highlights discrepancies between rural libraries and those in more populated areas.

technology and service disparities between rural and nonrural libraries

while almost every public library offers patrons internet access, 70.3% of rural libraries are the only free internet and computer terminal access providers in their service communities, compared to 40.6% of urban and 60.0% of suburban libraries.6 the disparity between these categories becomes more striking when one considers the difference between home broadband adoption in rural and nonrural areas.
according to the pew research center's home broadband 2010 survey, only 50% of rural homes have broadband internet access, compared to 70% of nonrural homes.7 this disparity is due in large part to the greater difficulty and cost of creating the infrastructure to support broadband internet access in more sparsely populated areas.8 with broadband access provided primarily by for-profit companies, little profit motive exists to expand services to areas where the infrastructure cost would not allow for a quick and efficient recouping of costs. the us government has attempted to address this problem in numerous ways, including dedicating $7.2 billion to improving broadband access throughout the country through grants (broadband technology opportunities program; btop) and loans (broadband infrastructure projects; bip) as part of the american recovery and reinvestment act (arra) of 2009.9 expanding this infrastructure will take time, and at this time it is unknown to what extent broadband access in rural communities, both in general and for public libraries, will increase. as the arra projects near completion, it will be important to conduct follow-up analysis of the effect in terms of access to broadband in the home and in anchor institutions such as public libraries, as well as the extent to which broadband subscriptions increased. at present, however, public libraries—and rural public libraries in particular—are still the primary source of broadband access for many americans, and this will likely remain true for large portions of the population for the foreseeable future. individuals in need of internet access have few options in many communities. though there are increasing numbers of free wireless (wi-fi) internet access sources in communities (e.g., coffee shops, food outlets), one needs to have a device (e.g., tablet, laptop) to use these options. in two-thirds of american communities, the public library is the only source of freely available public internet access inclusive of public access computers.10 specific government efforts to increase internet access, broadband networks, and digital literacy of the population, however, fail to involve public libraries in a meaningful way, if at all.11 to be fair, public libraries were eligible to compete for the grants or submit loan applications for the arra broadband funding initiatives, and public libraries in states such as alaska, arizona, colorado, idaho, maine, montana, nebraska, and others have benefited from this, primarily through inclusion in applications with multiple beneficiaries.12 since btop works as a grantmaking process, relatively few us public libraries (approximately 20%) have benefited from btop funding, but the results have been encouraging. for example, 85 libraries in mostly rural nebraska have upgraded their broadband capacity using btop funds, with broadband capacity for these locations increasing from an average of 2.9 mbps to an average of 18.2 mbps. other states have tried innovative ideas, such as the idaho department of labor's youth corps program to train high school and college students to work as digital literacy coaches, and then deploy them to libraries around the state.
indeed, the btop program has certainly created some encouraging results, but it is not a permanently funded program and it targets a limited number of libraries, so it cannot be considered a primary, widespread solution to the digital inclusion gap between rural and more populated areas. the authors of a recent btop report note, "unless strategic investments in u.s. public libraries are broadened and secured, libraries will not be able to provide the innovative and critical services their communities need and demand."13 thus btop may provide a good model for addressing gaps in digital inclusion, but it was never designed to be a permanent solution. this role of ensuring digital inclusion in communities has accelerated at a time of unprecedented austerity nationally and at the state and local levels of government in particular. based on bureau of labor statistics (http://www.bls.gov) data, the united states lost 584,000 public-sector jobs between june 2009 and april 2012, or 2.5% of the local, state, and federal government jobs that existed before the prolonged economic downturn began. according to the center on budget and policy priorities, state budget shortfalls ranged from $107 billion to $191 billion between 2009 and 2012, and current projections place state budget shortfalls at $55 billion for 2013.14 the prolonged economic downturn, in part, has driven up library usage in some communities.15 even before the downturn began, public libraries in rural areas typically had the oldest computing equipment, the slowest internet access speeds, and the lowest support levels from the federal government.16 as a part of becoming the main source of digital literacy training and digital inclusion, public libraries have also become a primary training provider for in-demand, technology-based job skills.17 the resulting situation forces public libraries to balance reduced support, increased demand, and a growing centrality in helping their communities recover from the economic downturn. at the center of both increased demand and increased support of digital literacy and inclusion lies sufficient internet access. in a survey of rural librarians in tennessee, respondents reported that their patrons' most critical information need was broadband internet access.18 the respondents also ranked access to recent hardware technology and software, technology training, and help with specific tasks like applying for jobs or government benefits as highly critical. by comparison, the respondents ranked traditional services such as book loaning as the least critical duty, significantly trailing the abovementioned and other technology services. despite rural librarians viewing technology-based services as their most important function, however, rural libraries lack the resources to meet the same service quality as nonrural libraries. the ensuing section discusses the nature of those disparities.

technology infrastructure and technology training

virtually all public libraries offer their patrons access to the internet.
there is no statistical difference between rural, suburban, and urban libraries in this regard.19 likewise, rural libraries lag only slightly in wireless internet availability, which is becoming increasingly important with the ubiquity of mobile technology devices; 86.3% of rural libraries have wireless access available for patrons, compared to an average of 90.5% across all three categories.20 and, in one of the few technological areas where rural libraries lead their nonrural counterparts, 42.3% of rural libraries reported they had sufficient public access computer terminals at all times, compared to 33.5% of suburban and 12.9% of urban libraries. while the number of rural library computer terminals may be adequate in many locations, hardware quality suffers; 69.5% of rural libraries replace their public access computer terminals as needed, while 66.4% of urban libraries have a technology replacement schedule.21 for many small libraries with only a single full-time librarian, that employee also serves as the it specialist for the location.22 therefore many rural libraries have less up-to-date technologies and less technical support than their nonrural counterparts. even if the librarians who also provide it support for their locations are qualified to fulfill this role, the greater issue is the limited time librarians have to work on these issues in addition to other duties. in addition to less recent hardware, rural libraries also have limited bandwidth; 31.1% of rural libraries operate on bandwidths of 1.5 mbps (t1) or less, compared to only 18.3% of suburban libraries and a mere 9.7% of urban libraries.23 the greatest issues facing rural libraries are not well represented by the broader categories of internet access but instead lie in the implementation of services to make these technologies highly useful and effective for patrons. only 31.8% of rural libraries offer formal technology training classes, as compared to 63.2% of urban and 54.0% of suburban libraries.24 this comparison alone does not present a problem, since more populated areas have larger customer bases that justify training patrons in groups rather than in one-on-one sessions. however, rural libraries also trail significantly in offering one-on-one technology training, with only 30.1% of rural libraries providing such programs, compared to 43.4% of urban and 37.9% of suburban libraries. only 21.9% of rural libraries have online training materials, compared to 36.3% and 33.7% of urban and suburban libraries, respectively. in fact, 12.5% of rural libraries do not offer planned technology training at all, compared to a mere 5.1% of urban libraries and 8.0% of suburban libraries. therefore, while most patrons in nonrural areas who have limited technology skills can go to their local library and acquire such skills for free, such access to the resources for personal advancement is drastically limited by comparison in rural areas. since many rural residents do not have internet access in their homes, many of these individuals do not own computers and have limited technology skills resulting from limited technology exposure. this makes the technology training disparity between rural and nonrural libraries quite problematic, since most americans need these skills to maintain a high standard of living and employment.
employment assistance

while public libraries in all areas saw adequate staffing as a statistically similar problem for helping patrons find jobs—51.9% of rural librarians agreed this was a challenge, only slightly exceeding the overall average of 49.8%—the greater issue is the disparity in confidence levels in assisting patrons in employment matters.25 nearly half (48.3%) of rural survey respondents agreed a lack of staff expertise was a challenge to helping patrons find and apply for jobs online, compared to 27.9% of urban and 37.7% of suburban libraries. the internet has become essential for many people who wish to gain employment, thus rural public librarians' inability to support rural residents with limited technology skills is problematic. many government agencies, hospitals, and private employers—including walmart, the largest employer in the united states—will no longer accept paper applications, but instead insist potential employees submit applications via the internet.26 this can be especially challenging for individuals who have recently lost jobs they have held for decades, as they simultaneously need to refresh basic application and interviewing skills while learning how to use unfamiliar information technologies to find and apply for jobs. librarians can offer critical assistance in these cases, especially for individuals who do not own a computer or have internet access in their homes.
important services such as voter registration, motor vehicle services, payment of taxes, and school enrollment for children can now be done either only or much more efficiently online.28 these online services are more convenient for many americans, but “while many members of the public may struggle with accessing or using egovernment information and services, government agencies have come to focus on it as a means of cost savings rather than increasing access for members of the public.”29 government agencies have for the most part not taken many americans’ lack of digital literacy into account when shifting their primary means of service to the digital realm, nor have they considered the effect this shift has on public libraries as the primary internet provider for many americans. this has led to extra responsibilities for rural public libraries but not a direct increase in resources. one might consider that rural libraries offer fewer of these services, or have less expertise in providing digital government services, in part because such services are not in demand by patrons. however, government services have steadily moved online and the pace is accelerating towards an e-only means of interacting with government. the open government movement,30 combined with the federal government’s release of the technology and services blueprint, signals the further use of technologies to offer innovative and operational digital government services—both through more traditional web-based services and mobile applications.31 and state and local governments are increasingly engaging in e-government services such as unemployment and social service benefits, taxation, licensing, and more. in short, federal, state, and local governments are moving rapidly to a range of e-services that will necessitate facility by librarians with technologies, government services, and government information to better help their communities navigate the challenges of e-government. government intervention in digital literacy although most government agencies have not considered the effect their shift to primarily digital services has on individuals who lack basic digital literacy, the federal communications commission launched two programs that could help with the digital literacy problem. the first of these, digitalliteracy.gov, is designed to provide individuals with tools to facilitate digital inclusion, helping users to acquire skills that will make them more capable in the modern information environment. the challenge with this approach is that many resources on the website are designed for individuals who need such skills and who therefore probably do not have access to the internet or possess the skills to fully engage the resources. moreover, most of these resource links point to external sites, which are organized by arbitrary user ratings rather than skill level and relevance.32 likewise, educator resources – which should be most valuable in helping librarians to educate patrons – are presented as links to external sites with limited information about each resource. these resources may be able to help patrons, but a collaborative effort that includes public librarians in creating resources could better target particular patron needs in a public access setting. a newer project, connect2compete, demonstrated more promising progress in this area.
connect2compete is a partnership between the fcc and private businesses to provide low-cost internet and computers to low-income families, digital literacy training, and other services.33 they also publicize the digital literacy divide, working with the ad council and other organizations to promote this issue.34 the website allows users to search for places where they can receive digital literacy training, with the search results primarily displaying local public library branches. however, despite pointing users to public libraries for such training, connect2compete currently only helps to fund such training in limited cases. while this program provides a strong model for raising awareness about digital inclusion, it is unlikely to provide infrastructure resources to fully bridge the gap between rural and nonrural communities in the near future. while the fcc has been innovative by soliciting private funds to prevent connect2compete from using any taxpayer funds, these private funds will not replace the need for government funds for public libraries throughout the nation, nor is private funding likely to continue indefinitely. indeed, “while governments at all levels are relying on public libraries to ensure digital inclusion, the same governments are reducing the funding of the very libraries that are being relied on.”35 the following section will detail how decreasing funding and limited resources have contributed to the digital divide between rural and nonrural libraries. rural libraries and barriers to promoting digital inclusion when the internet was emerging in the 1990s, “public libraries essentially experimented with public-access internet and computer services, largely absorbing this service into existing service and resource provision without substantial consideration of the management, facilities, staffing, and other implications of public-access technology services and resources.”36 while some libraries have increased their funding levels to match these challenges, most funding agencies have not recognized the costs or value of additional services that public libraries now offer in a wired nation. this section discusses the reasons why rural libraries have not been able to offer the same level of service nonrural library patrons routinely expect. funding inadequacies for rural libraries rural libraries face challenges from their problematic funding structure. sin noted that for public libraries, “on average, the local government provided 76.6% of the funding; the state, 8.7%; the federal government, 0.7%; and other sources, 13.9%.”37 this is a particular problem for rural libraries since, as holt explained, “if cities and suburbs had to survive on the extraordinarily low taxes on agricultural property, the urban/suburban public sector would have service levels so low that most officials would turn away in disgust.”38 this lack of local revenue for all public services—including libraries—in rural areas is exacerbated by the continuing population decrease in small towns and the desirability of such locales for retiring seniors, who prefer to live in areas with low taxes because of limited incomes.39 in other words, public library funding structures that place local governments at the forefront of budgeting plans put rural libraries at a serious disadvantage and promote a digital divide between rural and nonrural areas. holt notes, “it is the legitimate function of state government to make things right.
state governments, after all, are of a size and scale that historically allows them to perform as equity agencies for locales.”40 indeed, the averages for funding sources cited above can vary, and state and federal governments have attempted to dampen the funding inequities between rural and nonrural libraries. one example is the federal e-rate program, established under the telecommunications act of 1996 to provide schools, libraries, and healthcare providers with a discounted “education rate” for communication technologies, including internet technologies.41 while this has subsidized part of the internet service costs for libraries throughout the nation, many libraries do not apply because they do not know they are eligible or because the application process is too complicated. some rural libraries have had the advantage of their state library systems applying on their behalf, but even when funding is provided this only covers parts of the libraries’ connection and equipment costs. and, according to the plftas survey, only 61.5% of rural libraries received e-rate funds, compared to 75.0% of urban libraries, showing the program does not favor the class of libraries with the greatest connectivity issues.42 likewise, as noted above, the federal government designated $7.2 billion from the american recovery and reinvestment act of 2009 for improving broadband access throughout the nation, with funding designated for rural areas and public libraries in general. these improvements will take time, though, and will not fully compensate for the lack of local funds for rural libraries or for the fact that rural libraries do not receive nearly as much in nongovernmental funds as nonrural libraries.43 additionally, while local governments in some areas have created their own broadband infrastructure to compensate for corporate providers’ unwillingness to expand to some areas due to inadequate predicted profits, nineteen state governments banned such practices due to lobbying efforts from the broadband industry.44 the corporations that lobbied for these laws feared that if this became common practice, local governments could offer low enough pricing to compete against for-profit services. while this may be a legitimate concern, the end result of this legislation is local governments—including rural governments—in some states being legally blocked from allocating funds to solve the market failure that has prevented corporate providers from adequately expanding into rural areas. therefore, public libraries’ funding and resource structures are inherently stacked against rural institutions. while e-rate and other federal and state programs may mitigate the problem, the ultimate solution needs to be a restructuring of library funding models that takes the primary burden off struggling local governments or at least increases state and federal contributions. in a seminal article on rural libraries and technology written in 1995, vavrek noted that “public libraries cannot survive by only appealing to those who are least likely to be able to pay to support the library.
while visions of the homeless person using the internet to locate information is both compassionate and within the social role of the public library, can the library afford to provide this access?”45 beyond patrons not being open to assisting less fortunate individuals, vavrek suggested attempts to diversify library services—including introducing internet technology services, which was novel at the time—could distract resources from libraries’ established services that have traditionally appealed to all income classes and, with this, erode public support for these institutions. the pew home broadband 2010 survey shows vavrek’s thoughts on this matter were prescient, as 53% of survey respondents believed the government either should not support broadband expansion or that this should not be a very important priority.46 the benefits of greater broadband access and relevant service support may seem obvious to those who are intimate with this matter, but much of the public does not see the importance of expanding such services. if rural librarians cannot fight these perceptions and convince traditional library users and the general public of the importance of these services, then they will probably not be able to reverse these negative trends. unfortunately, rural libraries lack the time, resources, and data to lobby the public on these matters. staffing and training problems for rural librarians a lack of funding and resources affects not only rural public libraries, but also rural public librarians. in a study that illustrated such issues, flatley and wyman surveyed a random sample of libraries in extremely rural areas, with their service population baseline being 2,500 as opposed to the 25,000 threshold noted above.47 while the data they collected are somewhat dated (the survey was conducted in 2007), this study still deserves special attention because similar data have not been collected more recently or by other authors. the authors found that 80% of rural libraries have only a single full-time employee, and 50% have two or fewer paid employees when full- and part-time employees are considered.48 these employees are underpaid compared to the national average, with 72% reporting they earned $12.99 or less per hour.49 when asked why they believed their pay was relatively low, more than half (53%) of rural librarians responded it was because their communities lacked funds, demonstrating that the structure of local funding matters more to librarians’ salaries than state and federal funding.50 flatley and wyman also found that only 14% of these employees held mls degrees, with 32% having achieved bachelor’s degrees and 37% having completed only a high school diploma.51 as one would expect given that most rural librarians lack professional training before entering the field, many of these individuals applied for their first library position because they saw a position advertised for their local library and it offered better pay than most other local jobs. while many rural librarians entered the profession because of reasons other than a desire to become librarians, the data suggest these individuals are capable and enthusiastic about their jobs.
almost half (47%) of rural librarians had worked in the field for more than a decade, with an additional 22% having been librarians for six to ten years.52 two-thirds (66%) of survey respondents stated they intended to remain librarians until retirement age, and 97% responded they were very satisfied or somewhat satisfied with their careers.53 additionally, despite the relatively low pay for library positions, this was not the most common complaint rural librarians had about their jobs. instead, while 27% found low pay to be the greatest issue they faced, 29% felt a lack of funds for new materials was a greater problem.54 therefore, while certain technological issues in rural public libraries—such as the lack of technological training courses for patrons—can be framed accurately as a problem involving rural librarians, these problems should not be framed as the librarians’ fault. with current staffing levels, rural librarians do not have as much available staff time to provide training courses and one-on-one training as their suburban and urban counterparts. these librarians may also lack the knowledge and experience to train others in technological skills, and their libraries may lack the funds to help them acquire these abilities. these factors are outside of these librarians’ control, however, and “no matter how hard lis professionals try, one cannot expect public library systems (especially those in less-advantaged neighborhoods) to bridge the information gap when the libraries are themselves underfunded and understaffed.”55 considering typical rural librarians’ high dedication levels, one can assume they would be willing to remedy information gaps if they first had the resources to fix their libraries’ skill, funding, and staffing gaps. possible solutions rural libraries face the dual issue of a lack of resources to allow librarians enough time to advocate for their branches and a lack of data that advocates can use to show funders these libraries’ value to their communities. as a solution for the latter problem, sin suggested that library and information science (lis) scholars and other prominent figures in the field begin a dialogue with underfunded libraries—including rural institutions—to work with librarians to gather, process, and interpret data on libraries’ needs and libraries’ effects on their communities.56 this would have the dual benefit of giving librarians better information with which they could focus their services for maximum value and providing graduate-student and professional-level researchers with a stronger understanding of their field. the authors of this article would like to expand on this slightly to suggest that any researchers who draft scholarly papers and presentations from data collected from work with underfunded libraries should feel obligated to assist libraries in using this data for their own benefit. scholars are likely to be in a better position to advocate for libraries with which they collaborate than time- and resource-strapped librarians, and they should feel an ethical responsibility to do so after reaping the benefits of research. more rural librarians also need the skills to empower them to lead technological training courses for patrons, gather data to better understand how to best optimize their services, and lobby for greater funding at the local level. mehra et al.
of the school of information sciences at the university of tennessee attempted to remedy this problem to a limited degree with a program they launched in june 2010 with funding from the imls laura bush 21st century librarian program.57 the researchers used this funding to provide full scholarships—including laptop computers and funds for books—to sixteen rural librarians already working in the south and central appalachia regions, allowing them to earn an mls degree in two years of part-time study. the researchers had previously conducted a qualitative survey of rural librarians in tennessee to determine the training and resource needs of rural librarians,58 and they used these data to form a customized mls program for the scholarship students. this included courses focusing on strong technical competencies, service evaluation, grant writing, and other courses of particular relevance to the rural environment. likewise, georgia uses state funds to pay the salaries of many experienced librarians with mls degrees throughout the state, thereby lifting the burden of affording such individuals off cash-strapped counties and municipalities.59 however, as this system develops in georgia, state funding is still limited and there have been state funding cuts to other areas, such as materials and infrastructure, to allow for an increase in state-funded professional librarians.60 therefore, while this appears to be a promising model that can be of particular benefit to rural residents of the state, further study is needed to determine its overall effects. with an estimated more than 8,000 rural public libraries operating in the united states,61 it would be impossible to find the resources to provide the large majority of librarians without an mls at these locations with the full training needed to earn the degree. even if such funding were available, a large portion—if not the majority—of these resources could be put to better use by improving rural libraries’ technological infrastructure, increasing salaries, and growing collections. therefore, while the mls may remain the gold standard for library professionalism, it is not a realistic goal for many experienced and dedicated librarians throughout the country. instead, a more realistic program on a larger scale may be to provide rural librarians with targeted online and in-person training to enhance the skills they feel they need to be more successful. faculty and graduate students in lis academic programs are perhaps the most capable people to lead such training, and they are likely more capable of writing grant proposals to cover the costs of such programs than the rural librarians they could assist. mehra et al. have shown promising progress in this direction,62 and by removing the mls goal (or only expecting it in limited cases), their work could easily be emulated to help lis educators empower librarians throughout the nation. connect2compete, as detailed above, also has the potential to provide a training model for public librarians. the organization plans to create a “digital literacy corps,” comprising individuals who will help train portions of the public in basic digital literacy skills.63 while this program is still in its early phases, the organization plans to include librarians among this corps, training them to be better able to train others. once again, this will be achieved through private funds donated by corporate partners.
this is certainly a noble effort and will likely benefit many libraries and their patrons, but “having access to training and being able to take advantage of training are two separate things.”64 connect2compete, digitalliteracy.gov, and other organizations already provide some resources to help rural librarians understand digital literacy issues and provide better training, but librarians have limited time to familiarize themselves with these sources when dealing with their daily duties. for librarians to use current, future, or more refined training resources, the problem of low staffing—and its cause, low funding—must be addressed. since many rural librarians lack the skill or, more importantly, time to lobby for their own libraries, this is a significant area where partner organizations can help. whether these partners are university departments as envisioned by mehra et al. and sin or individuals funded by private donations in the connect2compete model is inconsequential. the important issue is that if these partner groups want to truly help rural libraries bridge the digital divide, these groups will have to contribute a significant portion of their efforts to lobbying to increase library funding enough to improve infrastructure and increase staffing—and, through this, staff time—for training and assisting patrons. as discussed above, the btop program has had success both in increasing technological infrastructure and human infrastructure, with grant funding being used in some cases to bring in temporary staff that is capable of training patrons in digital literacy and to increase training opportunities for patrons using existing staff. given the information above, btop’s holistic approach is certainly encouraging, and the program’s use of federal funds has shown how resources from above the local level can serve as an equalizing force. the temporary nature and limited funding of this program, however, make it important to remember this cannot be considered the primary solution to the digital inclusion problem. conclusion many rural public libraries are the only providers of free broadband internet service and computer terminals for their communities, with these communities having the lowest average proportion of homes with broadband connections. with the internet being essential to receive important government services and to apply for jobs with some of the largest and most ubiquitous employers throughout the nation, the value of the services offered by these libraries cannot be overstated. the basic public library funding structure needs to be modified to close the digital inclusion gap between rural and more populated areas. even if local governments remain the primary funding source for public libraries, this contribution cannot remain grossly disproportionate when compared to state and federal support. state and federal governments are already seeing savings by moving access to government services and information online, and these governments will benefit from the better employment rates and better employee competency that come with a digitally inclusive society. since these governments share in the benefits of digital inclusion, they must also share in the costs. some programs have shown promising results in bolstering rural public libraries and, through this, improving this nation’s digital inclusion.
these results range from large-scale programs such as btop to smaller programs such as the mls education program initiated by mehra et al. a common element of many of these programs, though, is their temporary nature, showing that funders are not recognizing that as technological innovation continues, new problems in digital inclusion will emerge. for government decision makers to understand the ongoing nature of the digital inclusion problem, rural public librarians and their allies—including academics and other stakeholders—will need to gather better data and provide better advocacy. references 1. “fy2011 public library (public use) data files,” institute of museum and library services, http://www.imls.gov/research/pls_data_files.aspx. 2. john carlo bertot et al., 2011–2012 public library funding and technology access survey: survey findings and results (college park, md: information policy and access center, 2012), http://ipac.umd.edu/sites/default/files/publications/2012_plftas.pdf. 3. the studies originally began as the public libraries and the internet survey series until 2006 through various funding sources, at which time they became part of the public library funding and technology access study (http://www.ala.org/plinternetfunding), funded by the american library association and the bill & melinda gates foundation. 4. john carlo bertot et al., “public libraries and the internet: an evolutionary perspective,” library technology reports 47, no. 6 (2011): 7–8. 5. paul t. jaeger et al., “the intersection of public policy and public access: digital divides, digital literacy, digital inclusion, and public libraries,” public library quarterly 31, no. 1 (2012): 1–20. 6. bertot et al., 2011–2012 public library funding and technology access survey. 7. aaron smith, home broadband 2010 (washington, dc: pew research center, 2010): 8, http://www.pewinternet.org/~/media/files/reports/2010/home%20broadband%202010.pdf. 8. federal communications commission, connecting america: the national broadband plan (washington, dc: federal communications commission, 2009): xi–xiii, http://download.broadband.gov/plan/national-broadband-plan.pdf. 9. aaron smith, home broadband 2010, 5. 10. john carlo bertot, charles r. mcclure, and paul t. jaeger, “the impacts of free public internet access on public library patrons and communities,” library quarterly 78, no. 3 (2008): 286; bertot et al., “public libraries and the internet,” 12–13. 11. jaeger et al., “the intersection of public policy and public access,” 1–20. 12. us public libraries and the broadband technology opportunities program (btop) (washington, dc: american library association, 2013): 1–2, http://www.districtdispatch.org/wp-content/uploads/2013/02/ala_btop_report.pdf. 13. ibid., 18. 14. “states continue to feel recession’s impact,” center on budget and policy priorities, last modified june 27, 2012, http://www.cbpp.org/cms/index.cfm?fa=view&id=711. 15. deanne w. swan et al., public libraries survey: fiscal year 2010 (imls-2013–pls-01) (washington, dc: institute of museum and library services, 2010). 16. paul t. jaeger et al., “public libraries and internet access across the united states: a comparison by state from 2004 to 2006,” information technology & libraries 26, no.
2 (2007): 4–14, http://dx.doi.org/10.6017/ital.v26i2.3277. 17. natalie greene taylor et al., “public libraries in the new economy: 21st century skills, the internet, and community needs,” public library quarterly 31, no. 3 (2012): 191–219. 18. bharat mehra et al., “what is the value of lis education? a qualitative study of the perspectives of tennessee’s rural librarians,” journal of education for library & information science 52, no. 4 (2011): 272. 19. bertot et al., 2011–2012 public library funding and technology access survey, 15. 20. ibid., 22. 21. ibid., 46. 22. bertot, “public access technologies in public libraries,” 88. 23. bertot et al., 2011–2012 public library funding and technology access survey, 21. 24. ibid., 29. 25. ibid., 42–45. 26. mehra et al., “what is the value of lis education?” 271–72. 27. bertot et al., 2011–2012 public library funding and technology access survey, 36. 28. paul t. jaeger and john carlo bertot, “responsibility rolls down: public libraries and the social and policy obligations of ensuring access to e-government and government information,” public library quarterly 30, no. 2 (2011): 91–116. 29. ibid., 100. 30. the obama administration’s commitment to open government: a status report (washington: government printing office, 2013): 4–7, http://www.whitehouse.gov/sites/default/files/opengov_report.pdf. 31. barack obama, digital government: building a 21st century platform to better serve the american people (washington, dc: office of management and budget, 2012), http://www.wh.gov/digitalgov/pdf. 32. “find educator tools,” digitalliteracy.gov, http://www.digitalliteracy.gov/content/educator. 33. “about us,” everyoneon, http://www.everyoneon.org/c2c. 34. ad council, “ad council & connect2compete launch nationwide psa campaign to increase digital literacy for 62 million americans,” press release, march 21, 2013, http://www.adcouncil.org/news-events/press-releases/ad-council-connect2compete-launch-nationwide-psa-campaign-to-increase-digital-literacy-for-62-million-americans. 35. jaeger et al., “public libraries and internet access,” 14. 36. bertot, “public access technologies in public libraries,” 81. 37. sei-ching joanna sin, “neighborhood disparities in access to information resources: measuring and mapping u.s. public libraries’ funding and service landscapes,” library & information science research 33, no. 1 (2011): 45. 38. glenn e. holt, “a viable future for small and rural libraries,” public library quarterly 28, no. 4 (2009): 288. 39. ibid., 288–89. 40. ibid., 289. 41. paul t. jaeger, charles r. mcclure, and john carlo bertot, “the e-rate program and libraries and library consortia, 2000–2004: trends and issues,” information technology & libraries 24, no. 2 (2005): 57–67. 42. bertot et al., 2011–2012 public library funding and technology access survey, 61. 43. sin, “neighborhood disparities in access,” 51.
44. olivier sylvain, “broadband localism,” ohio state law journal 73, no. 4 (2012): 20–24. 45. bernard vavrek, “rural information needs and the role of the public library,” library trends 44, no. 1 (1995): 26. 46. aaron smith, home broadband 2010, 2. 47. robert flatley and andrea wyman, “changes in rural libraries and librarianship: a comparative survey,” public library quarterly 28, no. 1 (2009): 25–26. 48. ibid., 34. 49. ibid., 35. 50. ibid., 28. 51. ibid., 33. 52. ibid., 26. 53. ibid., 29. 54. ibid., 30. 55. sin, “neighborhood disparities in access,” 50. 56. ibid., 51. 57. bharat mehra et al., “collaborations between lis education and rural libraries in the southern and central appalachia: improving librarian technology literacy and management training,” journal of education for library & information science 52, no. 3 (2011): 238–47. 58. mehra et al., “what is the value of lis education?” 59. “state paid position guidelines,” last updated august 2013, http://www.georgialibraries.org/lib/stategrants_accounting/official_state_paid_position_guidelines-updated-august-2013.pdf. 60. bob warburton, “georgia tweaks state funding formula to prioritize librarians,” library journal, february 2, 2014, http://lj.libraryjournal.com/2014/02/budgets-funding/georgia-tweaks-state-funding-formula-to-prioritize-librarians. 61. bertot et al., 2011–2012 public library funding and technology access survey, 14. 62. mehra et al., “collaborations between lis education and rural libraries”; mehra et al., “what is the value of lis education?” 63. institute of museum and library services, “imls announces grant to support libraries’ roles in national broadband adoption efforts,” press release, june 14, 2012, http://www.imls.gov/imls_announces_grant_to_support_libraries_roles_in_national_broadband_adoption_efforts.aspx. 64. bertot, “public access technologies in public libraries,” 88. text analysis and visualization research on the hetu dangse during the qing dynasty of china zhiyu wang, jingyu wu, guang yu, and zhiping song information technology and libraries | september 2021 https://doi.org/10.6017/ital.v40i3.13279 zhiyu wang (mikemike248@gmail.com) is phd candidate, school of management, harbin institute of technology and associate professor, school of history, liaoning university.
jingyu wu (734665532@qq.com) is graduate student, school of history, liaoning university. guang yu (yug@hit.edu.cn) is professor, school of management, harbin institute of technology. zhiping song (1367123893@qq.com) is graduate student, school of history, liaoning university. © 2021. abstract in traditional historical research, interpreting historical documents subjectively and manually causes problems such as one-sided understanding, selective analysis, and one-way knowledge connection. in this study, we aim to use machine learning to automatically analyze and explore historical documents from a text analysis and visualization perspective. this technology solves the problem of analyzing large-scale historical data that is difficult for humans to read and intuitively understand. in this study, we use the historical documents of the qing dynasty hetu dangse, preserved in the archives of liaoning province, as data analysis samples. china’s hetu dangse is the largest qing dynasty thematic archive with manchu and chinese characters in the world. through word frequency analysis, correlation analysis, co-word clustering, the word2vec model, and svm (support vector machines) algorithms, we visualize historical documents, reveal the relationships between functions of the government departments in the shengjing area of the qing dynasty, achieve the automatic classification of historical archives, improve the efficient use of historical materials, and build connections between historical knowledge. through this, archivists can receive practical guidance in the management and compilation of historical materials. introduction china has a long history documented in numerous archives. at present, various local archive departments preserve large numbers of historical documents from different periods. owing to the development of china’s archive digitization, archive management departments at all levels have established digital archive abstracts, catalogs, and subject indexes of historical documents in their collections, realizing online retrieval of historical archives. with in-depth research on chinese history, simple catalog retrieval cannot satisfy researchers’ demand for related knowledge in historical archives. owing to the limitations of the catalog retrieval system, complex catalog data still need to be read manually. however, it is difficult to view the overall picture of the recorded content and impossible to easily distinguish important information in historical materials; this creates various difficulties for chinese historical researchers, such as in the compilation of historical materials. thus, in this study, we aim to use text analysis and visualization methods in machine learning to conduct data mining analysis of historical document data. these methods will help us discover the logical relationships of historical records and their purposes, accomplish visual presentations of historical entities and knowledge discovered in historiography, improve knowledge representation and automatic classification of historical data, and provide valuable information for historical archive researchers.
during the process of analyzing traditional manual methods for interpreting historical documents, we find the following phenomena: macro description, single angle, selective analysis, and one-way knowledge connection, among others. for example, the hetu dangse preserved in the liaoning archives contains a total of 1,149 volumes and 127,000 pages, making it difficult to fully grasp and understand the overall content of such documents. relying on manual reading and analysis of entire archives is an unrealistic task. therefore, this paper proposes using machine learning, natural language processing (nlp), and other technologies to address various problems from traditional manual reading. first, information from historical documents can be revealed from different angles, and this allows the content of the documents to be displayed more comprehensively and scientifically through visual charts. second, use of objective quantitative analysis methods, such as text analysis and nlp, prevents subjective interpretations of the same content. third, nlp and other technologies can solve the problem of calculating massive text training data sets while forming systematic knowledge that avoids the omission and one-sided understanding of knowledge in the historical archive. the application of machine learning in historical data analysis has attracted the attention of researchers in management, history, and computer science. tao used the latent dirichlet allocation (lda) topic modeling algorithm to analyze the themes of documents from 1700 to 1800 included in the german archives, providing a more three-dimensional interpretation and explanation of the spiritual world of germany during the eighteenth century.1 chinese scholars kaixu et al. proposed a method of automatic sentence punctuation based on conditional random fields in ancient chinese.2 this method was shown to better solve the problem of automatic punctuation processing compared with the single-layer conditional random field strategy in ancient chinese, as tested on the two corpora of the analects and records of the grand historian. swiss and south african scholars stauffer, fischer, and riesen, and chinese scholars wu, wang, and ma used kws technology and deep reinforcement learning to automatically recognize handwritten pictures in historical documents.3 solar and radovan used the national and university library of slovenia’s historical pictures and maps as research data. using gis technology, they created a novel display method and an interdisciplinary data resource web application to access and research the data.4 chinese scholars dong et al. and polish scholars kuna and kowalski used webgis technology to conduct efficient management and visualization research on historical data of natural disasters in ancient china and russia.5 meanwhile, latvian scholars ivanovs and varfolomeyev and dutch scholars schreiber et al. used web technology to develop a web service platform and explored the intelligent environment of cultural heritage service utilization.6 korean scholars kim et al.
used machine learning technology to determine the complex relationships between tasks of various classes in a specific historical period through the network of historical figures.7 judging from results in related fields, the semantic analysis and visualization of historical archives in an intelligent way are gradually moving from statistical description to knowledge mining. these results provide theoretical feasibility and practical technical experience for this study. at present, research on historical documents mainly focuses on the retrieval and utilization of historical material databases. since the words, semantics, grammar, and sentence patterns recorded in historical materials differ from modern texts, using data mining technologies such as machine learning and nlp to intelligently identify historical documents and organize historical data will help us more than traditional methods. this requires the cooperation of artificial intelligence and historical researchers to establish an effective method of historical big data analysis to achieve the transformation from traditional manual historical document analysis to automatic artificial intelligence analysis methods. in this paper, we use machine learning and data visualization as tools to examine the content of historical documents differently from traditional literature reading, reveal valuable information in the content of historical documents, and promote more systematic, efficient, and detailed understanding of the literature. related technology definition to perform text analysis and visualization of the hetu dangse, we use machine learning technology such as word vector processing, the svm (support vector machines) model, and network analysis. a word vector is a numerical vector representation of a word’s literal and implicit meaning.8 we segmented the hetu dangse’s catalog data and used the word2vec model to transform the segmented data into a set of 50-dimensional numerical vectors representing a catalog’s vector data set. to accurately visualize historical document records’ relationship features, we reduced the vector data set’s dimensionality. dimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional into a low-dimensional space so that the representation retains some of the original data’s meaningful properties, ideally close to its intrinsic dimension.9 after dimensionality reduction, each catalog record in the vector data set is reduced from 50 to 2 dimensions to facilitate flat display. we used the svm model and network analysis technology to analyze the vector data set. the svm model is a set of supervised learning methods used for classification, regression, and outlier detection.10 it is trained on a vector data set that represents historical document records as points in space and learns a separating boundary through a kernel algorithm; new records are mapped into the same space, and their category is predicted based on which side of the margin they fall. network analysis techniques derive from network theory, a computer science system demonstrating social networks’ powerful influences.
network analysis technology’s characteristics determine that it is suitable for the visualization of books and historical archives in the library and information science field, because the visualization technique involves mapping entities’ relationships based on the symmetry or asymmetry of their relative proximity.11 thus, it helps to discover historical documents’ knowledge relevance. for example, citation network analysis can identify emerging relationships in healthcare domain journals.12 sample data preprocessing and classification this study uses the catalog of the qing dynasty historical archives from the hetu dangse collected by the liaoning archives as the research sample to conduct text analysis and visualization research. china’s hetu dangse is the largest qing dynasty thematic archive with manchu and chinese characters both domestically and internationally. the hetu dangse comprises the official documents of communication between the shengjing general yamen, the wubu of shengjing, and the fengtian office, as well as the documents communicated between the beijing internal affairs office in charge and the liubu of beijing during the qing dynasty. the hetu dangse was published from 2015 to 2018, including the hetu dangse·kangxi period (56 volumes), hetu dangse·yongzheng period (30 volumes), hetu dangse·qianlong period (24 volumes), hetu dangse·qianlong period (17 volumes), hetu dangse·daoguang period (52 volumes), hetu dangse·jiaqing period (58 volumes), hetu dangse·qianlong period official documents (46 volumes), hetu dangse·qianlong period official documents (46 volumes), and hetu dangse·general list (16 volumes).13 the hetu dangse is an important document for studying the history of the qing dynasty. owing to the special status of shengjing in the qing dynasty, it has a unique historical significance as the companion capital of beijing and the hometown of the qing royal family. this provides original evidence from this time for studying politics, economy, culture, history, and natural ecology in northeast china. in this study, we preprocess the catalog data of the hetu dangse by performing text segmentation, creating a corpus, and labeling data before using text analysis and visualization technology to analyze it. first, we use word frequency analysis and statistics to study the functions of institutions. second, we use the co-word clustering algorithm to quantify and visualize the institutional relationships. finally, we use the svm model to automatically classify and explore the catalog data of the hetu dangse. figure 1 illustrates this process. figure 1. text analysis flowchart. data preparation and preprocessing we collected 95,680 catalog data items in the hetu dangse of the liaoning archives, including 25,148 items from the kangxi period; 1,096 items from the yongzheng period; 23,819 items from the qianlong period; 20,730 items from the jiaqing period; and 15,887 items from the daoguang period. the content of each catalog data item includes three parts: title information, time of publication (chinese lunar calendar), and responsible agency. the proportion for each period was not evenly distributed in the catalog data of the hetu dangse, with the kangxi period catalog data having the highest proportion (26.2%).
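as a rough illustration of this data preparation step, the proportions above can be reproduced with a few lines of pandas. this is a minimal sketch, assuming the catalog has been exported to tabular form; the rows and column names below ("title", "period") are synthetic stand-ins rather than the paper's actual export, and on the full 95,680-item catalog the same computation yields the proportions reported in the text.

```python
# minimal sketch of the period-proportion calculation (synthetic rows,
# hypothetical column names; in practice the exported catalog would be loaded
# here instead)
import pandas as pd

catalog = pd.DataFrame({
    "title": ["盛京掌关防佐领为缉拿逃人事咨盛京刑部",
              "正白旗佐领呈为交纳壮丁银两事",
              "盛京户部为进送银两事咨盛京内务府"],
    "period": ["kangxi", "qianlong", "jiaqing"],
})

counts = catalog["period"].value_counts()                 # items per reign period
proportions = (counts / len(catalog) * 100).round(1)      # kangxi ≈ 26.2% on the full data
print(proportions)
```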
through the catalog data information, we can perform an in-depth analysis of the content of the hetu dangse from three perspectives: institutional functions, institutional relationships, and topic classification. data cleaning as the texts recorded in the archives of the hetu dangse are in manchu and ancient chinese, using chinese word segmentation tools (jieba, snownlp, thulac, etc.) based on modern chinese will cause errors. therefore, it is necessary to construct a special text corpus for word segmentation. first, we construct a stop vocabulary list to remove words with little impact on semantics in the hetu dangse, such as for (为), please (请), and of (之). second, we use the word segmentation tools mentioned above for preliminary word segmentation and then perform part-of-speech tagging and word segmentation corrections based on the word segmentation results. the title part of the catalog data of the hetu dangse mainly contains three dimensions of information: the record title of the catalog, issuing institution, and receiving institution. accordingly, we set a total of four types of tags in the text corpus: issuing institution, receiving institution, record type, and keywords. the receiving institution and the issuing institution correspond to the institutions at the beginning and the end of the catalog, respectively, such as the words shengjing zhangguan fang zuoling and shengjing ministry of justice. the record type is the word immediately preceding the receiving institution, such as counseling (咨) and please (请). the keywords are words that can represent the overall semantics in the record title of the catalog, such as arrest (缉拿) and advance (进送). table 1 presents the corpus we developed.
table 1. hetu dangse corpus
num | word | property 1 | property 2
1 | 盛京掌关防佐领 | organization | noun
2 | 为 | stop_words | preposition
3 | 缉拿 | keywords | verb
4 | 逃人 | keywords | noun
5 | 舒廷 | name | noun
6 | 官事 | stop_words | noun
7 | 咨 | keywords | verb
8 | 盛京刑部 | organization | noun
9 | 正白旗佐领 | organization | noun
10 | 兆麟 | name | noun
11 | 呈 | stop_words | preposition
12 | 为 | stop_words | preposition
13 | 交纳 | keywords | verb
14 | 壮丁 | keywords | noun
15 | 银两事 | keywords | noun
… | … | … | …
61047 | 收讫事 | keywords | noun
61048 | 盛京佐领 | organization | noun
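a minimal sketch of the segmentation step just described follows, assuming the jieba library; the registered terms and the example title come from table 1, while the exact user dictionary and correction rules used by the authors are not reproduced here.

```python
# minimal segmentation sketch: register archive-specific terms so the
# modern-chinese model does not split them, then remove stop words
import jieba

for term in ["盛京掌关防佐领", "盛京刑部", "缉拿", "逃人"]:
    jieba.add_word(term)          # custom vocabulary drawn from the corpus

stop_words = {"为", "请", "之"}    # stop vocabulary list described above

title = "盛京掌关防佐领为缉拿逃人舒廷官事咨盛京刑部"
tokens = [w for w in jieba.lcut(title) if w not in stop_words]
print(tokens)                      # segmented catalog title without stop words
```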
label data to improve the utilization efficiency of the hetu dangse and show the document content information from multiple angles, we use a supervised machine learning method to automatically classify the catalog data of the hetu dangse. therefore, the original catalog data set must be labeled. we determine the classification and label of the hetu dangse catalog according to the chinese archives classification law, chapter 12. table 2 presents the 11 categories of the catalog. with this, we complete the hetu dangse catalog sampling classification and labeling, laying the foundation for automatic catalog classification. the hetu dangse has a total of 95,680 catalog records involving five periods: kangxi, yongzheng, qianlong, jiaqing, and daoguang. we randomly select 500 records from each period and manually label these 2,500 records as the sample data set. the data classification after manual labeling is shown in figure 2. the overall distribution is relatively even, making it suitable for machine learning processing.
table 2. data labels
num | category
1 | type of official documents (政务种类)
2 | palace, royal family, and eight banners affairs (宫廷、皇族及八旗事务)
3 | bureaucracy, officials (职官、吏役)
4 | military (军事)
5 | politics and law (政法)
6 | sino-foreign relations (中外关系)
7 | culture, education, health, and scientific cultural study (文化、教育、卫生及科学文化研究)
8 | finance (财政)
9 | agriculture, water conservancy, animal husbandry (农业、水利、畜牧业)
10 | building (建筑)
11 | transportation, post and telecommunication (交通、邮电)
figure 2. percentage of the hetu dangse catalog data label chart. results in this study, we used the catalog data of the hetu dangse as a sample to analyze and reveal the hetu dangse catalog data from three perspectives: institutional function, institutional relationship, and automatic classification. this will improve usage efficiency of the hetu dangse, thus improving researchers’ mastery of relevant information about the document. to achieve the functional requirements of text analysis, we adopted four methods: word vector conversion, word frequency analysis, co-word clustering, and the svm model. word vector conversion of text catalog data the automatic classification of machine-learning technology is based on vector data sets. thus, the hetu dangse text catalog data set must be vectorized before automatic classification. currently, word vector conversion technology mainly includes methods such as one-hot, word2vec, and glove. the hetu dangse records the history of the qing dynasty for more than 200 years. there are inevitable relationships among the contents recorded in the documents, indicating that they are not isolated from each other. the word2vec model provides an efficient implementation of the cbow and skip-gram architectures for computing vector representations of words, both of which are simple neural network models with one hidden layer. the word2vec model produces word vectors as outputs from inputting the text corpus. this method generates a vocabulary from the input words and then learns the word vectors via backpropagation and stochastic gradient descent.14 this makes the word2vec model more suitable for catalog data from the hetu dangse. word2vec includes the cbow model and the skip-gram model, which can enrich the semantic relevance depending on the context, and it is more suitable for the semantic relevance of historical documents such as the hetu dangse. therefore, we adopt the skip-gram model to analyze the catalog data of the hetu dangse. we extracted the features of word vectors in catalog data from the corpus, input them into the word2vec model, imported the gensim library in python, trained the vector embeddings, and obtained the htd.model.bin vector file and htd.text.model model file. the correlation between each word in the hetu dangse catalog can be found by implementing the model. for example, if the word bannerman (旗人) is input into the model, the most relevant words are minren (民人, with 0.84726 relevance), accused (被控, with 0.812017), and robbery (抢劫, with 0.795359). to visualize the ethnic relationships recorded in the hetu dangse catalog, we input the first 300 words of the word vector into the trained word2vec model and performed dimensionality reduction to realize a planar graph. to understand the structure of the data intuitively, we used the t-sne algorithm to reduce the dimensions of the word vector.
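before turning to the t-sne settings described next, the word2vec training step just outlined can be sketched as follows. this is a minimal sketch, assuming gensim 4.x and a list of segmented catalog titles; the toy corpus and query word are illustrative, and the similarity figures quoted in the text (for example, for 旗人) come from the full corpus, not from this example.

```python
# minimal skip-gram training sketch with gensim (toy corpus, illustrative files)
from gensim.models import Word2Vec

sentences = [["盛京掌关防佐领", "缉拿", "逃人", "咨", "盛京刑部"],
             ["正白旗佐领", "交纳", "壮丁", "银两事"]]

model = Word2Vec(sentences, vector_size=50, sg=1,   # sg=1 selects skip-gram
                 window=5, min_count=1, epochs=50)

model.wv.save_word2vec_format("htd.model.bin", binary=True)  # vector file
model.save("htd.text.model")                                  # model file

# nearest neighbours of a query word, analogous to the 旗人 example in the text
print(model.wv.most_similar("缉拿", topn=3))
```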
t-sne is a type of nonlinear dimensionality reduction used to ensure that similar data points in high-dimensional space remain as close as possible in low-dimensional space. we set the embedded space dimension parameter of t-sne to 2 and the initialization parameter as pca. this makes it more globally stable than random initialization. the maximum number of optimization iterations is 5,000. figure 3 presents the results. in figure 3, the terms sanling, yongling, zhaoling, prime minister, and fuling form clusters. in shengjing, the qing set up the sanling prime minister’s office, and the prime minister’s mausoleum affairs minister was appointed concurrently by general shengjing. near fujinmen, the sanling prime minister’s office was established. in the 30th year of guangxu, the government office was changed to the prime minister’s office of shengjing mausoleum affairs, and the governor of the three provinces concurrently served. under the sanling prime minister’s office, the sanling office was set up to undertake the sacrifice and repair affairs of the three tombs (xinbin yongling, shenyang fuling, and zhaoling).15 therefore, the clustering in figure 3 verifies the close relationship between the sanling prime minister’s office and the tombs. figure 3. 2d t-sne visualization of word2vec vectors. analysis of the relationship between the documents received and sent by the institution with the statistics of the text data obtained after word segmentation, we can find the quantitative relationship between the documents received and sent by the institution, using the pearson correlation coefficient to judge whether there is a correlation between the number of documents received and the number of documents sent by the same institution. \( \rho(r,s) = \frac{\mathrm{cov}(r,s)}{\sigma_r \sigma_s} \) (3.1) we suppose that the pearson correlation coefficient between the number of documents received and the number of documents sent is ρ(r,s). here, r = {r1, r2, r3, …, r11} is the variable set of documents received by the institutional sample, and s = {s1, s2, s3, …, s11} is the variable set of documents sent by the institutional sample. by dividing the covariance of r and s by the product of their respective standard deviations, we can obtain the value of the correlation coefficient of the documents sent and received by the same institution. mining the relationship between institutions’ sending and receiving documents based on co-word clustering to mine the relationship between the institutions’ sending and receiving documents, we adopt a co-word clustering algorithm to generate a visualized network map of institutional relationships. the global co-occurrence rate represents the probability of two words appearing together in all the data sets. in large-scale data sets, if two words often appear together in the text, these two words are considered to be strongly related to the semantics.16 clustering is a method that places objects into a group by similarity or dissimilarity. thus, keywords with high correlation to each other tend to be placed in the same cluster. social network analysis, which evaluates the unique structure of interrelationships among individuals, has been extensively used in social science, psychological science, management science, and scientometrics.17
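a minimal sketch of the two computations just described (the t-sne layout behind figure 3 and the pearson coefficient of equation 3.1), assuming scikit-learn and scipy; the word vectors and the received/sent counts below are synthetic placeholders rather than the actual hetu dangse figures.

```python
# minimal sketch: 2-d t-sne layout of word vectors plus the pearson
# correlation between received and sent document counts (synthetic data)
import numpy as np
from sklearn.manifold import TSNE
from scipy.stats import pearsonr

vectors = np.random.default_rng(0).normal(size=(300, 50))   # top 300 word vectors
# the paper sets the maximum optimization iterations to 5,000; the parameter
# name (n_iter / max_iter) depends on the scikit-learn version, so it is
# omitted here for portability
points = TSNE(n_components=2, init="pca",
              random_state=0).fit_transform(vectors)         # layout for figure 3

received = np.array([120, 95, 80, 60, 55, 40, 35, 30, 25, 20, 15])  # r1..r11
sent = np.array([110, 100, 70, 65, 50, 45, 30, 28, 22, 18, 12])     # s1..s11
rho, p_value = pearsonr(received, sent)    # cov(r,s) / (σ_r · σ_s), equation 3.1
print(rho, p_value)
```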
the main purpose of the sociogram is to provide information information technology and libraries september 2021 text analysis and visualization research on the hetu dangse | wang, wu, yu, and song 9 about the relationship between institutions’ sending and receiving documents. in the sociogram, each member of a network is described by a “vertex” or “node.” vertices represent high-frequency words, and the sizes of the nodes indicate the occurrence frequency. the smaller the size of a node, the lower the occurrence frequency. lines depict the relationships between two institutions. they exist between two keywords, indicating that they received or sent documents to each other. the thickness is proportional to the correlation between the keywords. the thicker the line between the two keywords, the stronger the connection. using this rationale, the map visualization and network characteristics (centrality, density, core-periphery structure, strategic diagram, and network chart) were obtained by analyzing pearson’s correlation matrix or other similarity matrices.18 in this study, we conducted network analysis on a binary matrix to display the relationships between the documents sent and received by the institutions in the shengjing area during the qing dynasty recorded in the hetu dangse. further, we extracted the receiving institution and issuing institution from each record of catalog data in the hetu dangse, and then we composed a new data set with the following data from the receiving institution: issuing institution and title content. we used python to convert the new data set to endnote format and import it into vosviewer1.6.15 to calculate and draw a visual map of the new data set. van eck and waltman of the netherlands’ leiden university developed vosviewer, a metrological analysis software used for constructing and visualizing network graphs.19 although the software’s development principle is based on documents’ co-citation principles, it can be applied to the construction of data network knowledge graphs in various fields. combined with the co -word clustering algorithm, we can create an entity connection network map for historical documents through vosviewer software to reflect the recorded content. automatic classification method of historical archives catalog based on the svm model we used the svm model in machine learning for automatic classification. the svm model has the advantages of strong generalization, low error rate, strong learning ability, and support for small sample data sets, making it suitable for historical archive catalog data samples with small sample characteristics. therefore, we attempted to classify the catalog data set of hetu dangse using the svm model. first, we divided the vectorized labeled data set into a training set and a testing set. the training set accounts for 70% of the data, and the testing set accounts for 30%. to ensure the accuracy of the model prediction, we adopted a random division method to avoid overfitting. second, we used a linear kernel in the svm model and grid search to find the best parameter. various combinations of the penalty coefficient (c) and gamma parameter in the svm model were tested based on their accuracy ranked from high to low. we then determined the best parameter combination. after the model was established, we validated the predictive performance of the model from multiple perspectives such as precision, recall, and f1 score to ensure the generalization ability and availability of the model. 
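the tuning and validation steps just described can be sketched with scikit-learn as below. two assumptions are worth flagging: in scikit-learn the gamma parameter has no effect on a purely linear kernel, so this sketch uses the default rbf kernel to make the gamma grid meaningful, which may differ from the authors' exact configuration; and macro averaging of the metrics is likewise an assumption, since the paper does not state how the per-class scores were aggregated.

```python
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import SVC
from sklearn.metrics import precision_score, recall_score, f1_score

def tune_and_validate(X, y):
    """X: vectorized catalog records; y: category labels (1-11)."""
    # 70% training set, 30% testing set, randomly divided
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.30, random_state=42, stratify=y)

    # grid of penalty coefficients (c) and gamma values reported in the paper
    param_grid = {"C": [10, 100, 200, 300], "gamma": [0.1, 0.25, 0.5, 0.75]}
    search = GridSearchCV(SVC(kernel="rbf"), param_grid,
                          scoring="precision_macro", cv=5)
    search.fit(X_train, y_train)

    # validate the tuned model on the held-out 30%
    y_pred = search.predict(X_test)
    return {
        "best_params": search.best_params_,
        "precision": precision_score(y_test, y_pred, average="macro"),
        "recall": recall_score(y_test, y_pred, average="macro"),
        "f1": f1_score(y_test, y_pred, average="macro"),
    }
```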
we set the penalty coefficients to 10, 100, 200, and 300, while the gamma parameters are set to 0.1, 0.25, 0.5, and 0.75. we used the precision evaluation criteria to find the optimal parameter combination of the model and then imported them. the penalty coefficient is set to the x-axis, the gamma parameter set to the y-axis, and the precision set to the z-axis. we implemented the model to obtain the visualization that is shown in figure 4. clearly, the optimal parameter combination is a penalty coefficient of 10 and a gamma parameter of 0.075. information technology and libraries september 2021 text analysis and visualization research on the hetu dangse | wang, wu, yu, and song 10 figure 4. svm grid search parameter tuning diagram. discussion the history of a nation is the foundation on which it is built. historical documents are the witnesses and recorders of history. through the study of historical documents, we can go back to the past, cherish the present, and look forward to the future. an increasing number of scholars have studied these documents in recent years due to their importance. the hetu dangse records the document communications between institutions in shengjing (now shenyang) and beijing during the qing dynasty. it is an important historical document that cannot be ignored when information technology and libraries september 2021 text analysis and visualization research on the hetu dangse | wang, wu, yu, and song 11 studying the history of northeast china during the qing dynasty. here, we use the catalog data of the hetu dangse as the sample data to test the machine learning methods previously mentioned. we explore the results from the perspectives of institutional function, institutional relationship, and automatic classification to determine the feasibility of our methods. functions of institutions the number of institutions involved in the hetu dangse is over 150. these functional departments formed the governance system of the shengjing area during the qing dynasty. to gain a deeper understanding of the qing dynasty’s ruling system in the shengjing area, the functions of these institutions should be examined. this study analyzes and studies the functions of the institutions in the shengjing area through the number of documents and the frequency of content of the sending and receiving institutions. analysis of the number of documents received and sent by institutions by sorting and statistically analyzing the catalog data of hetu dangse, we obtained data on the number of documents received and sent by institutions in the shengjing area recorded in the hetu dangse. we set the vertical axis as the total number of communicated documents, number of issued documents, and number of received documents. we set the horizontal axis as the names of the institutions and then drew a histogram. this study analyzes the number of institutional archives of the hetu dangse catalog from three perspectives: total number of sent and received documents, number of received documents, and number of issued documents to find the institutions with the highest research value in the shengjing area. in the histogram shown in figure 5(a), the top three institutions in total number of communicated documents are shengjing internal affairs office, shengjing zuoling, and shengjing ministry of revenue. we can also observe that the top 10 institutions have different volumes of their respective documents received and sent by institutions. 
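the per-institution counts behind figure 5 can be sketched with pandas as follows; the file and column names ("htd_catalog.csv", "receiving", "issuing") are hypothetical placeholders for the actual fields of the catalog data set.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("htd_catalog.csv")             # one catalog record per row

received = df["receiving"].value_counts()       # documents received per institution
sent = df["issuing"].value_counts()             # documents sent per institution
summary = pd.DataFrame({"received": received, "sent": sent}).fillna(0)
summary["total"] = summary["received"] + summary["sent"]

# top institutions by total number of communicated documents (cf. figure 5a)
top = summary.sort_values("total", ascending=False).head(10)
print(top)
top.plot.bar()
plt.show()
```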
therefore, the ranking of the total number of communicated documents is not directly related to the respective rankings of the number of documents received and the number of documents sent. in figure 5(b), we can observe that the top three institutions in number of documents received in the hetu dangse are shengjing internal affairs office, shengjing ministry of revenue, and shengjing general yamen. figure 5(c) shows the top three institutions in number of documents sent in the hetu dangse are shengjing internal affairs office, shengjing zuoling, and shengjing general yamen. the total number of communicated documents, number of documents sent, and number of documents received by the shengjing internal affairs office all rank first; this indicates that the shengjing internal affairs office is the most important department of the ruling system in the qing dynasty during the shengjing area. information technology and libraries september 2021 text analysis and visualization research on the hetu dangse | wang, wu, yu, and song 12 figure 5. number of documents received and sent by institutions. a b c information technology and libraries september 2021 text analysis and visualization research on the hetu dangse | wang, wu, yu, and song 13 by using the number of documents received and sent by the institutions, we calculated the pearson correlation coefficient to determine if the number of documents received and sent by the same institution is relevant. as institutional samples, we selected the shengjing internal affairs office, shengjing ministry of revenue, (beijing) internal affairs office in charge, shengjing zuoling, shengjing ministry of works, shengjing ministry of justice, shengjing general yamen, shengjing close defense zuoling, shengjing ministry of war, fengtian general yamen, and shengjing ministry of rites. through calculation, the result of pearson correlation coefficient is 0.69 (save two decimal places), so there is a correlation between the number of sent and received documents, as shown in figure 6. figure 6. scatter plot of pearson correlation coefficient. the hetu dangse is a copy of official documents dealing with the royal affairs of the shengjing internal affairs office during the qing dynasty. it contains the official documents between the shengjing internal affairs office and the beijing internal affairs office in charge, the liubu, etc. and the local shengjing general yamen, fengtian office, the wubu of shengjing, and other yamens.16 thus, there exist a large stock of documents with the shengjing internal affairs office as the sending and receiving agency. the wubu of shengjing, shengjing general yamen, shengjing zuoling, and other institutions are important hubs for the operation of institutions in shengjing. they played an important role in maintaining and stabilizing the society of shengjing. the number of documents is second in importance only to the shengjing internal affairs office. analysis of the frequency of documents received and sent by institutions to further explore the functions of institutions with research value, we extracted the contents of the catalogs from the top three institutions in total number of documents sent and received: shengjing internal affairs office, shengjing ministry of revenue, and shengjing zuoling. we then classified the catalogs of the aforementioned institutions according to receipts and postings. 
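the pearson check of equation (3.1) reported above can be reproduced with scipy, reusing the per-institution summary table from the previous sketch; the english institution names below stand in for whatever form the names take in the actual data set.

```python
from scipy.stats import pearsonr

institutions = [
    "shengjing internal affairs office", "shengjing ministry of revenue",
    "(beijing) internal affairs office in charge", "shengjing zuoling",
    "shengjing ministry of works", "shengjing ministry of justice",
    "shengjing general yamen", "shengjing close defense zuoling",
    "shengjing ministry of war", "fengtian general yamen",
    "shengjing ministry of rites",
]

r_values = summary.loc[institutions, "received"]   # R = {r1, ..., r11}
s_values = summary.loc[institutions, "sent"]       # S = {s1, ..., s11}
rho, p_value = pearsonr(r_values, s_values)
print(f"pearson correlation coefficient: {rho:.2f}")   # the paper reports 0.69
```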
subsequently, we used word segmentation and word frequency statistics to process the two types information technology and libraries september 2021 text analysis and visualization research on the hetu dangse | wang, wu, yu, and song 14 of catalog information and draw comparison diagrams to explore their specific functions in the hetu dangse. as shown in figure 7, we can roughly divide the obtained segmentation words into two categories. one is the name of the communicated official document institutions, such as the ministry of revenue, the ministry of justice, and the ministry of rites on the side of the word frequency (see fig. 7[a]). the other is the name of the official document content and the words zhuangtou (庄头), dimu (地亩), and zhuangding (壮丁) on the side of the frequency of the words in the documents sent. through a comparative analysis of the top 10 words received and sent by the same institution, we conclude that the institutions with a close relationship between receiving and sending documents are not the same. for example, the ministry of revenue of shengjing internal affairs office ranks first in the frequency of documents sent by institutions, while the shengjing zuoling ranks first for receiving institutions (see fig. 7[b]). the contents of documents sent and received by the same institution are different. figure 7(c) shows how the affairs sent by shengjing zuoling to ula (乌拉), forage (粮草), and license (执照) differ from those represented by the zhuangtou (庄头), accounting (会计), and close defense (关防) in the frequency of documents sent and frequency of receipts, respectively. based on previous research on the functions of shengjing’s institutions, the shengjing internal affairs office was set up in the companion capital of shengjing during the qing dynasty to be in charge of shengjing cemetery, sacrifice, organization of staff transfer, and other matters. 20 this relates to the meaning of words such as sacrifice (祭祀) in figure 7(a). the functions of the shengjing ministry of revenue were represented in guangxu’s great qing huidian. the cashiers in charge of taxation in shengjing, number of annual losses in official villages, and banner land were carefully recorded. the expenditures were distinguished and the accounting obeyed the regulations according to the beijing ministry of revenue at the end of the year.21 this is related to the meaning of words, such as dimu (地亩), land sale (卖地), and money and grain (钱粮) in figure 7(b). in fu yonggong and guan jialu’s research of shengjing zuoling’s functions, shengjing zuoling handled the transfer communicated documents; supervised and urged the various departments of guangchu, duyu, zhangyi, accounting, construction, and qingfeng to undertake matters; managed officials and various people; maintained the shengjing palace and the warehouse; selected women to send to beijing inspect; heard all types of cases; undertook the emperor’s general letter; managed the ula people and tributes; and accepted the emperor or the internal affairs office in charge, among other tasks.22 this is connected to the meaning of words such as ula (乌拉), close defense (关防) and license (执照) in figure 7(c). information technology and libraries september 2021 text analysis and visualization research on the hetu dangse | wang, wu, yu, and song 15 figure 7. word frequency comparison of documents received (in blue) and sent (in orange) by institutions. 
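the segmentation and word-frequency step behind figure 7 can be sketched as follows; jieba is used here only as an illustrative segmenter (the paper does not name its segmentation tool in this section), and the column names and the example institution value are hypothetical placeholders reused from the earlier sketch.

```python
import jieba
from collections import Counter

def top_words(titles, stopwords=frozenset(), n=10):
    """segment catalog titles and return the n most frequent words."""
    counter = Counter()
    for title in titles:
        counter.update(w for w in jieba.lcut(str(title))
                       if w.strip() and w not in stopwords)
    return counter.most_common(n)

office = "盛京内务府"   # illustrative value for the shengjing internal affairs office
received_titles = df.loc[df["receiving"] == office, "title"]   # receipts
sent_titles = df.loc[df["issuing"] == office, "title"]         # postings
print("received:", top_words(received_titles))
print("sent:    ", top_words(sent_titles))
```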
institutional relationship analysis to further study the governance structure of the shengjing area, we not only need to understand the functions of each institution but also to explore the overlap between the functions of institutions. the catalog data of the hetu dangse consist of three parts: receiving institutions, issuing institutions, and the record title of the catalog. a document usually involves two institutions, the receiving institution and the issuing institution, and the content of a document certainly relates closely to the functions of the two institutions. by visualizing the closeness between institutions, we conducted a quantitative analysis of the catalog data paired by receiving and issuing institution in the hetu dangse to provide reliable data for further research on the intersection of institutional functions in the shengjing area. results of institutional connection analysis using the co-word clustering algorithm, we counted the number of archive catalog records shared by each receiving and issuing institution. we set the vertical axis as the issuing institution and the horizontal axis as the receiving institution to obtain figure 8. the numbers inside the boxes represent the quantity of catalog records for each issuing and receiving institution pair. to facilitate measurements in the statistical process, pairs with 50 or fewer communicated documents between the receiving institution and the issuing institution have been zeroed out. as shown in figure 8, the institutions having close relations in the documents recorded in the hetu dangse are concentrated in the issuing institutions shengjing zuoling and shengjing internal affairs office and the receiving institutions shengjing internal affairs office and shengjing zuoling. among the receiving institutions, the number of documents received by the shengjing internal affairs office from shengjing general yamen reached as high as 11,936. the top three sources of documents received by shengjing zuoling were fengtian general yamen (2,265 pieces), shengjing ministry of revenue (1,527 pieces), and shengjing ministry of justice (1,520 pieces). it is worth noting that the shengjing internal affairs office received fewer than 50 documents from shengjing zuoling. the overlapping functions of the institutions in the shengjing area enabled individual offices to play bureaucratic games, passing responsibility to other offices and leading to low efficiency in handling affairs. for example, the military and political power in the shengjing area was jointly controlled by the shengjing general office and the shengjing ministry of war, and the shengjing area's tax power was controlled by the shengjing ministry of revenue, the fengtian office, and their subordinate offices. this phenomenon ran through the entire qing dynasty. research on the cross-functionality of institutions has always been a hot topic in qing historiography. by analyzing the official documents exchanged between institutions, we can further explore the overlap as well as the advantages and disadvantages of the qing dynasty shengjing ruling system, study the history of shengjing institutions in the qing dynasty more thoroughly, and provide a reference for the design of current institutions.
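the receiving-by-issuing matrix behind figure 8 can be sketched in the same hypothetical pandas setup used above; pairs with 50 or fewer documents are zeroed out, as described in the text.

```python
import pandas as pd

# rows: issuing institution, columns: receiving institution (cf. figure 8)
matrix = pd.crosstab(df["issuing"], df["receiving"])
matrix = matrix.where(matrix > 50, 0)   # zero out pairs with 50 or fewer documents

# strongest issuing-receiving links
print(matrix.stack().sort_values(ascending=False).head(10))
```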
information technology and libraries september 2021 text analysis and visualization research on the hetu dangse | wang, wu, yu, and song 17 figure 8. relationship of communicated documents by the hetu dangse institutions diagram. visualization of institutional network map we used the hetu dangse catalog as sample data and the co-word clustering algorithm to obtain the close relationship between institutions and the appearance frequency of institutions. we drew a visual network diagram by virtue of vosviewer1.6.15 to obtain figure 9. in figure 9, institutions are represented by default as a circle with their names. the size of the label and the circle of an institution are determined by the weight of the item. the higher the weight of an item, the larger the label and the circle of the item. for some items, labels may not be displayed to avoid overlapping labels. the color of an institution is determined by the cluster the institutions belong to, and lines between items represent links. as shown in figure 9, the relationships between the institutions and departments in the hetu dangse form three core groups: the shengjing internal affairs office (in charge), shengjing zuoling, and beijing internal affairs office in charge. however, the relationships between the three groups are not similar; the distance between the group (beijing) internal affairs office in charge and the two other groups is relatively large. the group at the core of shengjing internal affairs office and the group at the core of shengjing zuoling are closely connected to each other through the wubu of information technology and libraries september 2021 text analysis and visualization research on the hetu dangse | wang, wu, yu, and song 18 shengjing (shengjing ministry of revenue, shengjing ministry of rites, shengjing ministry of war, shengjing ministry of justice, and shengjing ministry of works). further, there are two larger individuals: fengtian general yamen and shengjing general yamen. fengtian general yamen and shengjing zuoling are closely related to each other, and the relationship between shengjing general yamen and shengjing internal affairs office is relatively close. figure 9. co-occurrence of institutions network map. the city of shengjing was the companion capital of the qing dynasty. the qing government implemented special governance measures in these areas that differed greatly from those of direct inland provinces.23 to ensure the stable rule of the shengjing area, the qing dynasty performed the following tasks. first, the qing dynasty set up a general garrison as the highest military and political chief in the shengjing area to be responsible for all military and political affairs within its jurisdiction. second, they established the fengtian office, a capital of the same level as the shuntian office, to rule the common people of the shengjing area. the states and counties, as well as the garrison banner officer, which was under the rule of general garrison, were local administrative institutions under the fengtian office. these institutions implemented the dual management rule of the bannerman and common people. third, as the companion capital, the shengjing area followed the ming dynasty companion capital system to set up the wubu of shengjing to maintain power. in addition, the shengjing internal affairs office, which was in charge of palace affairs, communicated with the beijing internal affairs office in charge. 
results of automatic classification analysis catalogs are important information resources in the field of historical archives. the classification of archival catalogs can not only link relevant information in archives or archive fonds, improve researchers' utilization efficiency, and save the time spent searching for required archives, but it can also present materials to readers in clusters. as the hetu dangse catalog is a series of historical documents stored over a long period of time, its original classification system does not fit existing archival management methods well. the hetu dangse has a total of 1,149 volumes and 127,000 pages. each volume contains a different number of documents, and the ink characters on chinese art paper are in manchu and chinese. reading and categorizing the full text of the hetu dangse not only requires a great deal of manpower, material, and financial resources but also places extremely high demands on the classification staff, who need a good knowledge of manchu, archival science, document taxonomy, and other related disciplines. therefore, sorting and organizing the content of the hetu dangse through manual reading and comprehension alone is impractical. to address this problem, we used the svm model of machine learning to automatically classify and explore the catalog data of the hetu dangse. this model further demonstrates the relevance of the knowledge between documents in the hetu dangse and facilitates in-depth analysis. we imported the vectorized labeled data set into the svm model and selected the optimal parameter combination to run the model. to visualize the results, the 50-dimensional word vectors were reduced to 2-dimensional vectors using the t-distributed stochastic neighbor embedding algorithm, and the svm model was used to establish a hyperplane visualized in 2-dimensional form. owing to the large number of categories, the legend in figure 10 shows only the six categories with the highest proportion of the data. to test the classification effect of the svm model, we used precision and recall as metrics and calculated the f1 score to validate the model. the results are presented in table 3. based on the created svm model, 95,680 catalog records of the hetu dangse were predicted and classified. the results are shown in figure 11. although certain deficiencies remain in accuracy and other aspects, the model has a positive impact on the content research, management, utilization, and retrieval discovery of the hetu dangse. table 3. svm model validation. precision: 0.736; recall: 0.717; f1 score: 0.716. figure 10. svm decision region boundary. figure 11. hetu dangse catalog data prediction classification. conclusion in this study, we used machine learning to analyze and visualize the catalog data of the hetu dangse, revealing the functional relationships of the qing dynasty shengjing regional institutions recorded in this historical document and showing their communication relationships. using the svm model, we achieved automatic classification of the hetu dangse catalog from the category perspective.
owing to the massive archives of historical materials in ancient china, the information technology and libraries september 2021 text analysis and visualization research on the hetu dangse | wang, wu, yu, and song 21 fonts of many historical materials cannot be recognized by computers or humans. the digitization of catalogs has become a digital bridge between researchers and historical documents. this not only achieves the concise summary and refinement of them but also greatly improves the utilization efficiency by researchers. the svm model can “learn” through the labeled sample data and realize automatic classification of large amounts of unlabeled catalog data. by automatic classification of catalog data, historical data researchers and archive managers can use and manage a large number of historical documents and catalog data more effectively, greatly increasing their utilization. the co-occurrence algorithm can reveal the rules written by the catalog data itself, discover the distance between the catalog data, and form clusters providing a clearer direction for researchers to use historical documents. the algorithm also saves time for researchers to identify documents without purpose, making content presentation of historical documents to readers clearer. this paper improves archivists’ awareness of archive data compilation and management. first, data is observed, topics are identified, and potential relationships between these are found and established to improve historical archives’ compilation. second, the visual presentation method and carrier is chosen, and via the web browser established relationships are visualized for the users to access and utilize. it can be said that scientometric research method can promote the transformation of historical research and archives management and compilation research from traditional explanatory scholarship to truth-seeking scholarship. currently, the application of machine learning technology has gradually extended from applied disciplines to traditional fields of literature, art, and sociology. however, there are still many opportunities in the field of historical research. this study used methods in the field of artificial intelligence to conduct text mining and visualize the presentation of historical archive document catalog data and proposes a new digital and intelligent solution for researching chinese historical documents. with the development of science and technology, research methods for historical documents are undergoing constant changes from the traditional manual subjective analysis of historical data to relying on quantitative analysis represented by deep learning and data mining technology. it is an irreversible trend to research historical documents more comprehensively, accurately, and scientifically by means of artificial intelligence and other technologies on the scientific frontier. for future work, we plan to conduct research on the qing dynasty historical documents from a deeper semantic analysis level, construct a knowledge graph through the method of named entity recognition, and construct an ontological model transforming historical documents into a structured knowledge base to discover new knowledge from historical documents in an automated manner. 
acknowledgments funding statement this work was supported by the general program of the national natural science foundation of china [grant number 72074060], the research foundation of the ministry of education of china [grant number 20jhq012], and the national social science fund of china [grant number 16btq089]. data accessibility the data sets supporting this article have been uploaded as part of the supplementary material. https://drive.google.com/drive/folders/1bzs17otruyva_qkbshmf836ygdti40y0?usp=sharing https://drive.google.com/drive/folders/1bzs17otruyva_qkbshmf836ygdti40y0?usp=sharing information technology and libraries september 2021 text analysis and visualization research on the hetu dangse | wang, wu, yu, and song 22 competing interests we have no competing interests. endnotes 1 wang tao, “data mining of german historical documents in the 18th century, taking topic models as examples,” xuehai 1, no. 20 (2017): 206–16, https://doi.org/10.16091/j.cnki.cn321308/c.2017.01.021. 2 kaixu zhang and yunqing xia, “crf-based approach to sentence segmentation and punctuation for ancient chinese prose,” journal of tsinghua university (science and technology) 10, no. 27 (2009): 39–49, https://doi.org/10.16511/j.cnki.qhdxxb.2009.10.027. 3 michael stauffer, andreas fischer, and kaspar riesen, “keyword spotting in historical handwritten documents based on graph matching,” pattern recognition 81 (2018): 240–53, https://doi.org/10.1016/j.patcog.2018.04.001; wu sihang et al., “precise detection of chinese characters in historical documents with deep reinforcement learning,” pattern recognition 107 (2020): 107503, https://doi.org/10.1016/j.patcog.2020.107503. 4 renata solar and dalibor radovan, “use of gis for presentation of the map and pictorial collection of the national and university library of slovenia,” information technology and libraries 24, no. 4 (2005): 196–200, https://doi.org/10.6017/ital.v24i4.3385. 5 shaochun dong et al., “semantic enhanced webgis approach to visualize chinese historical natural hazards,” journal of cultural heritage 14, no. 3 (2013): 181–89, https://doi.org/10.1016/j.culher.2012.06.009; jakub kuna and łukasz kowalski, “exploring a non-existent city via historical gis system by the example of the jewish district ‘podzamcze’ in lublin (poland),” journal of cultural heritage 46 (2020): 328–34, https://doi.org/10.1016/j.culher.2020.07.010. 6 aleksandrs ivanovs and aleksey varfolomeyev, “service-oriented architecture of intelligent environment for historical records studies,” procedia computer science 104 (2017): 57–64, http://doi.org/10.1016/j.procs.2017.01.062; guus schreiber et al., “semantic annotation and search of cultural-heritage collections: the multimedian e-culture demonstrator,” journal of web semantics 6, no. 4 (2008): 243–49, https://doi.org/10.1016/j.websem.2008.08.001. 7 m kim et al., “inference on historical factions based on multi-layered network of historical figures,” expert systems with applications 161 (2020): 113703, http://doi.org/10.1016/j.eswa.2020.113703. 8 hobson lane, cole howard, hannes hapke, natural language processing in action: understanding, analyzing, and generating text with python (new york: manning publications, 2019), 165. 9 laurens van der maaten, eric postma, and jaap van den herik, “dimensionality reduction: a comparative review,” tilburg university technical report, ticc-tr 2009-005 (2009), https://lvdmaaten.github.io/publications/papers/tr_dimensionality_reduction_review_200 9.pdf. 
https://doi.org/10.16091/j.cnki.cn32-1308/c.2017.01.021 https://doi.org/10.16091/j.cnki.cn32-1308/c.2017.01.021 https://doi.org/10.16511/j.cnki.qhdxxb.2009.10.027 https://doi.org/ https://doi.org/10.1016/j.patcog.2018.04.001 https://doi.org/10.1016/j.patcog.2020.107503 https://doi.org/10.6017/ital.v24i4.3385 https://doi.org/10.1016/j.culher.2012.06.009 https://doi.org/10.1016/j.culher.2020.07.010 http://doi.org/10.1016/j.procs.2017.01.062 https://doi.org/10.1016/j.websem.2008.08.001 http://doi.org/ https://doi.org/10.1016/j.eswa.2020.113703 https://lvdmaaten.github.io/publications/papers/tr_dimensionality_reduction_review_2009.pdf https://lvdmaaten.github.io/publications/papers/tr_dimensionality_reduction_review_2009.pdf information technology and libraries september 2021 text analysis and visualization research on the hetu dangse | wang, wu, yu, and song 23 10 gavin hackeling, mastering machine learning with scikit-learn (birmingham: packt publishing, 2017). 11 richard smiraglia, domain analysis for knowledge organization: tools for ontology extraction (oxford: chandos publishing, 2015). 12 kuo-chung chu, hsin-ke lu, and wen-i liu, “identifying emerging relationship in healthcare domain journals via citation network analysis,” information technology and libraries 37, no. 1 (2018): 39–51, https://doi.org/10.6017/ital.v37i1.9595. 13 archives of liaoning province in china, “the hetu dangse series archives publication,” qing history research 6, no. 2 (2009): 1. 14 amit kumar sharma, sandeep chaurasia, and devesh kumar srivastava, “sentimental short sentences classification by using cnn deep learning model with fine tuned word2vec,” procedia computer science 167 (2020): 1139–47, https://doi.org/10.1016/j.procs.2020.03.416. 15 b hongxi, “research on the sanling management institutions of the qing dynasty outside the pass,” manchu minority research 4, no. 12 (1997): 38–56. 16 guangli zhu et al., “building multi-subtopic bi-level network for micro-blog hot topic based on feature co-occurrence and semantic community division,” journal of network and computer applications 170 (2020): 102815, https://doi.org/10.1016/j.jnca.2020.102815. 17 s. ravikumar, ashutosh agrahari, and s. n. singh, “mapping the intellectual structure of scientometrics: a co-word analysis of the journal scientometrics (2005–2010),” scientometrics 102 (2015): 929–55, https://doi.org/10.1007/s11192-014-1402-8. 18 jiming hu and yin zhang, “research patterns and trends of recommendation system in china using co-word analysis,” information processing and management 51, no. 4 (2015): 329–39, https://doi.org/10.1016/j.ipm.2015.02.002. 19 nees jan van eck and ludo waltman, “software survey: vosviewer, a computer program for bibliometric mapping, scientometrics, 84, no. 2 (2010): 523–38, https://doi.org/10.1007/s11192-009-0146-3. 20 z yanchang and l xinzhu, “the study of the function of shengjing office from the use of the official communication — an academic investigation based on hetu dangse,” shanxi archives 8, no. 12 (2020): 179–88. 21 shengjing ministry of revenue, guangxu's great qing huidian volume 25 (zhonghua book company, 1991), 211–12. 22 f yonggong and g jialu, “brief introduction of shengjing upper three banners baoyi zuoling,” historical archives 9, no. 30 (1992): 93–7. 23 wangyue, “research on the yamens and their affair relationships in shengjing area,” shenyang palace museum journal 1, no. 31 (2011): 67–77. 
https://doi.org/10.6017/ital.v37i1.9595 https://doi.org/10.1016/j.procs.2020.03.416 https://doi.org/10.1016/j.jnca.2020.102815 https://doi.org/ https://doi.org/10.1007/s11192-014-1402-8 https://doi.org/10.1016/j.ipm.2015.02.002 https://doi.org/10.1007/s11192-009-0146-3 abstract introduction related technology definition sample data preprocessing and classification data preparation and preprocessing data cleaning label data results word vector conversion of text catalog data analysis of the relationship between the documents received and sent of the institution mining the relationship between institutions’ sending and receiving documents based on co-word clustering automatic classification method of historical archives catalog based on the svm model discussion functions of institutions analysis of the number of documents received and sent by institutions analysis of the frequency of documents received and sent by institutions institutional relationship analysis results of institutional connection analysis visualization of institutional network map results of automatic classification analysis conclusion acknowledgments funding statement data accessibility competing interests endnotes the next generation library catalog | yang and hofmann 141 sharon q. yang and melissa a. hofmann the next generation library catalog: a comparative study of the opacs of koha, evergreen, and voyager open source has been the center of attention in the library world for the past several years. koha and evergreen are the two major open-source integrated library systems (ilss), and they continue to grow in maturity and popularity. the question remains as to how much we have achieved in open-source development toward the next-generation catalog compared to commercial systems. little has been written in the library literature to answer this question. this paper intends to answer this question by comparing the next-generation features of the opacs of two open-source ilss (koha and evergreen) and one proprietary ils (voyager’s webvoyage). m uch discussion has occurred lately on the nextgeneration library catalog, sometimes referred to as the library 2.0 catalog or “the third generation catalog.”1 different and even conflicting expectations exist as to what the next-generation library catalog comprises: in two sentences, this catalog is not really a catalog at all but more like a tool designed to make it easier for students to learn, teachers to instruct, and scholars to do research. it provides its intended audience with a more effective means for finding and using data and information.2 such expectations, despite their vagueness, eventually took concrete form in 2007.3 among the most prominent features of the next-generation catalog are a simple keyword search box, enhanced browsing possibilities, spelling corrections, relevance ranking, faceted navigation, federated search, user contribution, and enriched content, just to mention a few. over the past three years, libraries, vendors, and open-source communities have intensified their efforts to develop opacs with advanced features. the next-generation catalog is becoming the current catalog. the library community welcomes open-source integrated library systems (ilss) with open arms, as evidenced by the increasing number of libraries and library consortia that have adopted or are considering opensource options, such as koha, evergreen, and the open library environment project (ole project). 
librarians see a golden opportunity to add features to a system that will take years for a proprietary vendor to develop. open-source opacs, especially that of koha, seem to be more innovative than their long-established proprietary counterparts, as our investigation shows in this paper. threatened by this phenomenon, ils vendors have rushed to improve their opacs, modeling them after the next-generation catalog. for example, ex libris pushed out its new opac, webvoyage 7.0, in august of 2008 to give its opac a modern touch. one interesting question remains. in a competition for a modernized opac, which opac is closest to our visions for the next-generation library catalog: opensource or proprietary? the comparative study described in this article was conducted in the hope of yielding some information on this topic. for libraries facing options between open-source and proprietary systems, “a thorough process of evaluating an integrated library system (ils) today would not be complete without also weighing the open source ils products against their proprietary counterparts.”3 ■■ scope and purpose of the study the purpose of the study is to determine which opac of the three ilss—koha, evergreen, or webvoyage—offers more in terms of services and is more comparable to the next-generation library catalog. the three systems include two open-source and one proprietary ilss. koha and evergreen are chosen because they are the two most popular and fully developed open-source ilss in north america. at the time of the study, koha had 936 implementations worldwide; evergreen had 543 library users.4 we chose webvoyage for comparison because it is the opac of the voyager ils by ex libris, the biggest ils vendor in terms of personnel and marketplace.5 it also is one of the more popular ilss in north america, with a customer base of 1,424 libraries, most of which are academic.6 as the sample only includes three ilss, the study is very limited in scope, and the findings cannot be extrapolated to all open-source and proprietary catalogs. but, hopefully, readers will gain some insight into how much progress libraries, vendors, and open-source communities have achieved toward the next-generation catalog. ■■ literature review a review of the library literature found two relevant studies on the comparison of opacs in recent years. the first study was conducted by two librarians in slovenia investigating how much progress libraries had made toward the next-generation catalog.7 six online catalogs sharon q. yang (yangs@rider.edu) is systems librarian and melissa a. hofmann (mhofmann@rider.edu) is bibliographic control librarian, rider university. 142 information technology and libraries | september 2010 were examined and evaluated, including worldcat, the slovene union catalog cobiss, and those of four public libraries in the united states. the study also compared services provided by the library catalogs in the sample with those offered by amazon. the comparison took place primarily in six areas: search, presentation of results, enriched content, user participation, personalization, and web 2.0 technologies applied in opacs. the authors gave a detailed description of the research results supplemented by tables and snapshots of the catalogs in comparison. 
the findings indicated that “the progress of library catalogues has really been substantial in the last few years.” specifically, the library catalogues have made “the best progress on the content field and the least in user participation and personalization.” when compared to services offered by amazon, the authors concluded that “none of the six chosen catalogues offers the complete package of examined options that amazon does.”8 in other words, library catalogs in the sample still lacked features compared to amazon. the other comparative study was conducted by linda riewe, a library school student, in fulfillment for her master’s degree from san jose university. the research described in her thesis is a questionnaire survey targeted at 361 libraries that compares open-source (specifically, koha and evergreen) and propriety ilss in north america. more than twenty proprietary systems were covered, including horizon, voyager, millennium, polaris, innopac, and unicorn.9 only a small part of her study was related to opacs. it involved three questions about opacs and asked librarians to evaluate the ease of use of their ils opac’s search engines, their opac search engine’s completeness of features, and their perception of how easy it is for patrons to make self-service requests online for renewals and holds. a scale of 1 to 5 was used (1 = least satisfied; 5= very satisfied) regarding the three aspects of opacs. the mean and medium satisfaction ratings for open-source opacs were higher than those of proprietary ones. koha’s opac was ranked 4.3, 3.9, and 3.9, respectively in mean, the highest on the scale in all three categories, while the proprietary opacs were ranked 3.9, 3.6, and 3.6.10 evergreen fell in the middle, still ahead of proprietary opacs. the findings reinforced the perception that open-source catalogs, especially koha, offer more advanced features than proprietary ones. as riewe’s study focused more on the cost and user satisfaction with ilss, it yielded limited information about the connected opacs. no comparative research has measured the progress of open-source versus proprietary catalogs toward the next-generation library catalog. therefore the comparison described in this paper is the first of its kind. as only koha, everygreen, and voyager’s opacs are examined in this paper, the results cannot be extrapolated. studies on a larger scale are needed to shed light on the progress librarians have made toward the next-generation catalog. ■■ method the first step of the study was identifing and defining of a set of measurements by which to compare the three opacs. a review of library literature on the next-generation library catalog revealed different and somewhat conflicting points of views as to what the nextgeneration catalog should be. as marshall breeding put it, “there isn’t one single answer. we will see a number of approaches, each attacking the problem somewhat differently.”11 this study decided to use the most commonly held visions, which are summarized well by breeding and by morgan’s lita executive summary.12 the ten parameters identified and used in the comparison were taken primarily from breeding’s introduction to the july/ august 2007 issue of library technology reports, “nextgeneration library catalogs.”13 the ten features reflect some librarians’ visions for a modern catalog. they serve as additions to, rather than replacements of, the feature sets commonly found in legacy catalogs. 
the following are the definitions of each measurement: ■■ a single point of entry to all library information: “information” refers to all library resources. the next-generation catalog contains not only bibliographical information about printed books, video tapes, and journal titles but also leads to the full text of all electronic databases, digital archives, and any other library resources. it is a federated search engine for one-stop searching. it not only allows for one search leading to a federation of results, it also links to full-text electronic books and journal articles and directs users to printed materials. ■■ state-of-the-art web interface: library catalogs should be “intuitive interfaces” and “visually appealing sites” that compare well with other internet search engines.14 a library’s opac can be intimidating and complex. to attract users, the next-generation catalog looks and feels similar to google, amazon, and other popular websites. this criterion is highly subjective, however, because some users may find google and amazon anything but intuitive or appealing. the underlying assumption is that some internet search engines are popular, and a library catalog should be similar to be popular themselves. ■■ enriched content: breeding writes, “legacy catalogs tend to offer text-only displays, drawing only on the marc record. a next-generation catalog might bring in content from different sources to strengthen the visual appeal and increase the amount of information presented to the user.”15 the enriched content the next generation library catalog | yang and hofmann 143 includes images of book covers, cd and movie cases, tables of contents, summaries, reviews, and photos of items that traditionally are not present in legacy catalogs. ■■ faceted navigation: faceted navigation allows users to narrow their search results by facets. the types of facets may include subjects, authors, dates, types of materials, locations, series, and more. many discovery tools and federated search engines, such as villanova university’s vufind and innovative interface’s encore, have used this technology in searches.16 auto-graphics also applied this feature in their opac, agent iluminar.17 ■■ simple keyword search box: the next-generation catalog looks and feels like popular internet search engines. the best example is google’s simple user interface. that means that a simple keyword search box, instead of a controlled vocabulary or specific-field search box, should be presented to the user on the opening page with a link to an advanced search for user in need of more complex searching options. ■■ relevancy: traditional ranking of search results is based on the frequency and positions of terms in bibliographical records during keyword searches. relevancy has not worked well in opacs. in addition, popularity is another factor that has not been taken into consideration in relevancy ranking. for instance, “when ranking results from the library’s book collection, the number of times that an item has been checked out could be considered an indicator of popularity.”18 by the same token, the size and font of tags in a tag cloud or the number of comments users attach to an item may also be considered relevant in ranking search results. so far, almost no opacs are capable of incorporating circulation statistics into relevancy ranking. ■■ “did you mean . . . 
?”: when a search term is not spelled correctly or nothing is found in the opac in a keyword search, the spell checker will kick in and suggest the correct spelling or recommend a term that may match the user’s intended search term. for example, a modern catalog may generate a statement such as “did you mean . . . ?” or “maybe you meant . . . .” this may be a very popular and useful service in modern opacs. ■■ recommendations and related materials: the nextgeneration catalog is envisioned as promoting reading and learning by making recommendations of additional related materials to patrons. this feature is an imitation of amazon and websites that promote selling by stating “customers who bought this item also bought . . . .” likewise, after a search in the opac, a statement such as “patrons who borrowed this book also borrowed the following books . . .” may appear. ■■ user contribution—ratings, reviews, comments, and tagging: legacy catalogs only allow catalogers to add content. in the next-generation catalog, users can be active contributors to the content of the opac. they can rate, write reviews, tag, and comment on items. user contribution is an important indicator for use and can be used in relevancy ranking. ■■ rss feeds: the next-generation catalog is dynamic because it delivers lists of new acquisitions and search updates to users through rss feeds. modern catalogs are service-oriented; they do more than provide a simple display search results. the second step is to apply these ten visions to the opacs of koha, evergreen, and webvoyage to determine if they are present or absent. the opacs used in this study included three examples from each system. they may have been product demos and live catalogs randomly chosen from the user list on the product websites. the latest releases at the time of the study was koha 3.0, evergreen 2.0, webvoyage 7.1. in case of discrepancies between product descriptions and reality, we gave precedence to reality over claims. in other words, even if the product documentation lists and describes a feature, this study does not include it if the feature is not in action either in the demo or live catalogs. despite the fact that a planned future release of one of those investigated opacs may add a feature, this study only recorded what existed at the time of the comparison. the following are the opacs examined in this paper. koha ■■ koho demo for academic libraries: http://academic .demo.kohalibrary.com/ ■■ wagner college: http://wagner.waldo.kohalibrary .com/ ■■ clearwater christian college: http://ccc.kohalibrary .com/ evergreen ■■ evergreen demo: http://demo.gapines.org/opac/ en-us/skin/default/xml/index.xml ■■ georgia pines: http://gapines.org/opac/en-us/ skin/default/xml/index.xml ■■ columbia bible college at http://columbiabc .evergreencatalog.com/opac/en-ca/skin/default/ xml/index.xml webvoyage ■■ rider university libraries: http://voyager.rider.edu ■■ renton college library: http://renton.library.ctc .edu/vwebv/searchbasic 144 information technology and libraries | september 2010 ■■ shoreline college library: http://shoreline.library .ctc.edu/vwebv/searchbasic the final step includes data collection and compilation. a discussion of findings follows. the study draws conclusions about which opac is more advanced and has more features of the next-generation library catalog. ■■ findings each of the opacs of koha, evergreen, and webvoyage are examined for the presence of the ten features of the next-generation catalog. 
single point of entry for all library information none of the opacs of the three ilss provides true federated searching. to varying degrees, each is limited in access, showing an absence of contents from electronic databases, digital archives, and other sources that generally are not located in the legacy catalog. of the three, koha is more advanced. while webvoyage and evergreen only display journal-holdings information in their opacs, koha links journal titles from its catalog to proquest’s serials solutions, thus leading users to fulltext journals in the electronic databases. the example in figure 1 (koha demo) shows the journal title unix update with an active link to the full-text journal in the availability field. the link takes patrons to serials solutions, where full text at the journal-title level is listed for each database (see figure 2). each link will take you into the full text in each database. state-of-the-art web interface as beauty is in the eye of the beholder, the interface of a catalog can be appealing to one user but prohibitive to another. with this limitation in mind, the out-of-thebox user interface at the demo sites was considered for each opac. all the three catalogs have the google-like simplicity in presentation. all of the user interfaces are highly customizable. it largely depends on the library to make the user interface appealing and welcoming to users. figures 3–5 show snapshots from each ilss demo sites and have not been customized. however, there are a few differences in the “state of the art.” for one, koha’s navigation between screens relies solely on the browser’s forward and back buttons, while webvoyage and evergreen have internal navigation buttons that more efficiently take the user between title lists, headings lists, and record displays, and between records in a result set. while all three opacs offer an advanced search page with multiple boxes for entering search terms, only webvoyage makes the relationship between the terms in different boxes clear. by the use of a drop-down box, it makes explicit that the search terms are by default anded and also allows for the selection of or and not. in koha’s and evergreen’s advanced search, however, the terms are anded only, a fact that is not at all obvious to the user. in the demo opacs examined, there is no option to choose or or not between rows, nor is there any indication that the search is anded. the point of providing multiple search boxes is to guide users in constructing a boolean search without their having to worry about operators and syntax. in koha, however, users have to type an or or not statement themselves within the text box, thus defeating the purpose of having multiple boxes. while evergreen allows for a not construction within a row (“does not contain”), it does not provide an option for or (“contains” and “matches exactly” are the other two options available). see figures figure 1. link to full-text journals in serials solutions in koha figure 2. links to serials solutions from koha the next generation library catalog | yang and hofmann 145 6–8. thus koha’s and evergreen’s advanced search is less than intuitive for users and certainly less functional than webvoyage’s. enriched content to varying degrees, enriched content is present in all three catalogs, with koha providing the most. while all three catalogs have book covers and movie-container art, koha has much more in its catalog. for instance, it displays tags, descriptions, comments, and amazon reviews. 
webvoyage displays links to google books for book reviews and content summaries but does not have tags, descriptions, and comments in the catalog. see figures 9–11. faceted navigation the koha opac is the only catalog of the three to offer faceted navigation. the “refine your search” feature allows users to narrow search results by availability, places, libraries, authors, topics, and series. clicking on a term within a facet adds that term to the search query and generates a narrower list of results. the user may then choose another facet to further refine the search. while evergreen appears to have faceted navigation upon first glance, it actually does not possess this feature. the following facets appear after a search generates hits: “relevant subjects,” “relevant authors,” and “relevant series.” but choosing a term within a facet does not narrow down the previous search. instead, it generates an entirely new search with the selected term; it does not add the new term to the previous query. users must manually combine the terms in the simple search box or through the advanced search page. webvoyage also does not offer faceted navigation—it only provides an option to “filter your search” by format, language, and date when a set of results is returned. see figures 12–14. keyword searching koha, evergreen, and webvoyage all present a simple keyword search box with a link to the advanced search (see figures 3–5). relevancy neither koha, evergreen, nor webvoyage provide any evidence for meeting the criteria of the next-generation catalog’s more inclusive vision of relevancy ranking, such as accounting for an item’s popularity or allowing user tags. koha uses index data’s zebra program for its relevance ranking, which “reads structured records in a variety of input formats . . . and allows access to them through exact boolean search figure 3. koha: state-of-the-art user interface figure 5. voyager: state-of-the-art user interface figure 4. evergreen: state-of-the-art user interface 146 information technology and libraries | september 2010 user contributions koha is the only system of the three that allows users to add tags, comments, descriptions, and reviews. in koha’s opac, user-added tags form tag clouds, and the font and size of each keyword or tag indicate that keyword or figure 6. voyager advanced search figure 7. koha advanced search figure 8. evergreen advanced search expressions and relevance-ranked free-text queries.19 evergreen’s dokuwiki states that the base relevancy score is determined by the cover density of the searched terms. after this base score is determined, items may receive score bumps based on word order, matching on the first word, and exact matches depending on the type of search performed.20 these statements do not indicate that either koha or evergreen go beyond the traditional relevancy-ranking methods of legacy systems, such as webvoyage. did you mean . . . ? only evergreen has a true “did you mean . . . ?” feature. when no hits are returned, evergreen provides a suggested alternate spelling (“maybe you meant . . . ?”) as well as a suggested additional search (“you may also like to try these related searches . . .”). koha has a spell-check feature, but it automatically normalizes the search term and does not give the option of choosing different one. this is not the same as a “did you mean . . . ?” feature as defined above. 
did you mean . . . ?
only evergreen has a true "did you mean . . . ?" feature. when no hits are returned, evergreen provides a suggested alternate spelling ("maybe you meant . . . ?") as well as a suggested additional search ("you may also like to try these related searches . . ."). koha has a spell-check feature, but it automatically normalizes the search term and does not give the user the option of choosing a different one. this is not the same as a "did you mean . . . ?" feature as defined above. while the normalizing process may be seamless, it takes the power of choice away from the user and may be problematic if a particular alternative spelling or misspelling is searched purposefully, such as "womyn." (when "womyn" is searched as a keyword in the koha demo opac, 16,230 hits are returned. this catalog does not appear to contain the term as spelled, which is why it is normalized to "women." the fact that the term does not appear as is may not be transparent to the searcher.) with normalization, the user may also be unaware that any mistake in spelling has occurred, and the number of hits may differ between the correct spelling and the normalized spelling, potentially affecting discovery. the normalization feature also only works with particular combinations of misspellings, where letter order affects whether a match is found. otherwise the system returns a "no result found!" message with no suggestions offered. (try "homoexuality" vs. "homoexsuality." in koha's demo opac, the former, with a missing "s," yields 553 hits, while the latter, with a misplaced "s," yields none.) however, koha is a step ahead of webvoyage, which has no built-in spell checker at all. if a search fails, the system returns the message "search resulted in no hits." see figures 15–17. [figure 15. evergreen: did you mean . . . ? figure 16. koha: did you mean . . . ? figure 17. voyager: did you mean . . . ?]

recommendations/related materials
none of the three online catalogs can recommend materials for users.

user contributions
koha is the only system of the three that allows users to add tags, comments, descriptions, and reviews. in koha's opac, user-added tags form tag clouds, and the font and size of each keyword or tag indicate that keyword or tag's frequency of use. all the tags in a tag cloud serve as hyperlinks to library materials. users can write their own reviews to complement the amazon reviews. all user-added reviews, descriptions, and comments have to be approved by a librarian before they are finalized for display in the opac. nevertheless, the user contribution features in the koha opac are not easy to use. it may take many clicks before a user can figure out how to add or edit text. it requires user login, and the system cannot keep track of the search hits after a login takes place. therefore the user contribution features of koha need improvement. see figure 18. [figure 18. koha user contributions.]

rss feeds
koha provides rss feeds, while evergreen and webvoyage do not.

conclusion
table 1 is a summary of the comparisons in this paper. these comparisons show that the koha opac has six out of the ten compared features for the next-generation catalog, plus two halves. its full-fledged features include a state-of-the-art web interface, enriched content, faceted navigation, a simple keyword search box, user contribution, and rss feeds. the two halves indicate the existence of a feature that is not fully developed. for instance, "did you mean . . . ?" in koha does not work the way the next-generation catalog is envisioned. in addition, koha has the capability of linking journal titles to full text via serials solutions, while the other two opacs only display holdings information. evergreen falls into second place, providing four out of the ten compared features: a state-of-the-art interface, enriched content, a keyword search box, and "did you mean . . . ?" webvoyage, the voyager opac from ex libris, comes in third, providing only three out of the ten features for the next-generation catalog.
based on the evidence, koha's opac is more advanced and innovative than evergreen's or voyager's. among the three catalogs, the open-source opacs compare more favorably to the ideal next-generation catalog than the proprietary opac. however, none of them is capable of federated searching. only koha offers faceted navigation. webvoyage does not even provide a spell checker. the ils opac still has a long way to go toward the next-generation catalog. though this study samples only three catalogs, hopefully the findings will provide a glimpse of the current state of open-source versus proprietary catalogs. ils opacs are not comparable in features and functions to stand-alone opacs, also referred to as "discovery tools" or "layers." some discovery tools, such as ex libris' primo, also are federated search engines and are modeled after the next-generation catalog. recently they have become increasingly popular because they are bolder and more innovative than ils opacs. two of the best stand-alone open-source opacs are villanova university's vufind and oregon state university's libraryfind.21 both boast eight out of ten features of the next-generation catalog.22 technically it is easier to develop a new stand-alone opac with all the next-generation catalog features than to mend old ils opacs. as more and more libraries grow disappointed with their ils opacs, more discovery tools will be implemented. vendors will stop improving ils opacs and concentrate on developing better discovery tools. the fact that ils opacs are falling behind current trends may eventually bear no significance for libraries—at least for the ones that can afford the purchase or implementation of a more sophisticated discovery tool or stand-alone opac. certainly small and public libraries that cannot afford a discovery tool or a programmer for an open-source opac overlay will suffer, unless market conditions change.

references
1. tanja mercun and maja žumer, "new generation of catalogues for the new generation of users: a comparison of six library catalogues," program: electronic library & information systems 42, no. 3 (july 2008): 243–61.
2. eric lease morgan, "a 'next-generation' library catalog—executive summary (part #1 of 5)," online posting, july 7, 2006, lita blog: library information technology association, http://litablog.org/2006/07/07/a-next-generation-library-catalog-executive-summary-part-1-of-5/ (accessed nov. 10, 2008).
3. marshall breeding, introduction to "next generation library catalogs," library technology reports 43, no. 4 (july/aug. 2007): 5–14.
4. ibid.
5. marshall breeding, "library technology guides: key resources in the field of library automation," http://www.librarytechnology.org/lwc-search-advanced.pl (accessed jan. 23, 2010).
6. marshall breeding, "investing in the future: automation marketplace 2009," library journal (apr. 1, 2009), http://www.libraryjournal.com/article/ca6645868.html (accessed jan. 23, 2010).
7. marshall breeding, "library technology guides: company directory," http://www.librarytechnology.org/exlibris.pl?sid=20100123734344482&code=vend (accessed jan. 23, 2010).
8. mercun and žumer, "new generation of catalogues."
9. ibid.
10. linda riewe, "integrated library system (ils) survey: open source vs. proprietary-tables" (master's thesis, san jose state university, 2008): 2–5, http://users.sfo.com/~lmr/ils-survey/tables-all.pdf (accessed nov. 4, 2008).
11. ibid., 26–27.
12. breeding, introduction.
13. ibid.; morgan, "a 'next-generation' library catalog."
14. breeding, introduction.
15. ibid.
16. ibid.
17. villanova university, "vufind," http://vufind.org/ (accessed june 10, 2010); innovative interfaces, "encore," http://encoreforlibraries.com/ (accessed june 10, 2010).
18. auto-graphics, "agent illuminar," http://www4.auto-graphics.com/solutions/agentiluminar/agentiluminar.htm (accessed june 10, 2010).
19. breeding, introduction; morgan, "a 'next-generation' library catalog."
20. index data, "zebra," http://www.indexdata.dk/zebra/ (accessed jan. 3, 2009).
21. evergreen dokuwiki, "search relevancy ranking," http://open-ils.org/dokuwiki/doku.php?id=scratchpad:opac_demo&s=core (accessed dec. 19, 2008).
22. villanova university, "vufind"; oregon state university, "libraryfind," http://libraryfind.org/ (accessed june 10, 2010).
23. sharon q. yang and kurt wagner, "open source standalone opacs" (microsoft powerpoint presentation, 2010 virtual academic library environment annual conference, piscataway, new jersey, jan. 8, 2010).

table 1. summary of features of the next-generation catalog (yes = feature present; no = feature absent; partial = feature present but not fully developed)
feature | koha | evergreen | voyager
single point of entry for all library information | partial | no | no
state-of-the-art web interface | yes | yes | yes
enriched content | yes | yes | yes
faceted navigation | yes | no | no
keyword search | yes | yes | yes
relevancy | no | no | no
did you mean . . . ? | partial | yes | no
recommended/related materials | no | no | no
user contribution | yes | no | no
rss feed | yes | no | no

highlights of isad board meeting
1974 annual meeting, new york, new york
monday, july 8, 1974
the meeting was called to order by president frederick kilgour at 4:45 p.m. the following were present: board: frederick g. kilgour, lawrence w. s. auld, paul j. fasana, susan k. martin, ralph m. shoffner, donald p. hammer (isad executive secretary), and berniece coulter, secretary, isad. guests: henriette d. avram, roberto esteves, stephen salmon, merry sue smoller, and ruth l. tighe. additions to the agenda. mrs. martin requested that the matter of commercial brochures being included in isad mailings be added to the agenda. midwinter minutes approved. motion. it was moved by paul fasana that the minutes of the isad 1974 chicago midwinter meeting be approved. seconded by ralph shoffner. carried. introduction of new officers. mr. kilgour introduced to the board henriette avram, vice-president/president-elect, and ruth tighe, member-at-large of the isad board of directors, who would assume office at the close of the new york conference. policy concerning materials used in isad dissemination or displays. motion. it was moved by susan martin that the isad board establish a policy that only material produced by ala units or related professional organizations be included in its disseminations or displays. seconded by paul fasana. carried. video/cable section. mr. roberto esteves, chairman of the ala video/cable ad hoc study committee, solicited the interest of and activity by the isad board in getting video/cable incorporated into the isad structure. he reported that his committee had considered three alternatives as to where video/cable concerns could be situated within ala: (1) it could remain as a task force in srrt; (2) a separate round table on video/ 216 journal of library automation vol.
7 i 3 september 197 4 that had been before the evaluation, when it was his belief that isa was a cunent awareness service. mr. fasana recommended that the executive secretary write isa a letter informing them that the board cannot consider becoming a sponsor at this time. asidic. mr. hammer informed the board that peter watson had talked with him about asidic liaison, and they had concluded that asidic is primarily interested in having an observer at isad board meetings. to accomplish this requires no action from the board. wednesday, july 10, 1974 the meeting was called to order at 4:40 p.m. by president frederick kilgour. those present were: board-frederick g. kilgour, lawrence w. s. auld, paul j. fasana, susan k. martin, ralph m. shoffner, donald p. hammer (isad executive secretary), and berniece coulter, secretary, isad. committee chairmen-brian aveney, brett butler, helen schmierer, velma veneziano. guests-henriette d. avram, gerald lazorick, ruth l. tighe. sdi service for ala members. mr. lazorick (ohio state university mechanized information center) discussed the advantages to ala members if the osu selective dissemination of information service were available to them by subscription. the center would charge $50 per year for a profile, as opposed to the standard $300. the contract for sdi and retrospective searches (two services) would require ala to guarantee $17,000 per year ( $10,000 for sdi and $7,000 for retrospective searches). also, mr. fasana estimated that advertising and publicity costs might be as much as $5,000. mr. lazorick further explained that the printing of the necessary materials and the mailing would be handled by the center. ala would be responsible for advertising, marketing, and billing. it was relayed to the board that mr. wedgeworth did not feel that ala would profit enough for the amount it would have to pay for the service. he felt that osu could provide the service directly to individuals without the intervention of ala. mr. kilgour said that he felt a need to know that the money paid would indeed return to ala. the board had in the past expressed an interest in this type of service for ala members, and mr. kilgour asked if this feeling still existed. there was agreement among the board that it would be a desirable service for ala members. mr. kilgour stated that it will be necessary to: ( 1) determine the actual costs; ( 2) find the least expensive way of informing the members of this opportunity; and ( 3) obtain a commitment from the membership. he further said he would talk with mr. wedgeworth to see if agreement could highlights of meetings 215 cable could be formed; or ( 3) a section on video i cable could be formed in isad. the committee had favored an isad section. a round table might have more appeal to members, but would be outside the ala divisional and political structure. he further made known the desire of the committee for coordination of the forty-nine existing groups involved with audiovisual in ala. he noted that it would be possible to create a committee on a v within the isad section on video/ cable if there were interest in that approach to solving the problem. motion. mr. fasana moved that the isad board endorse in principle the ala video/cable ad hoc study committee's suggestion to create within isad a section devoted to video/ cable. seconded by susan martin. carried. misleading claims. mr. 
salmon indicated that some advertising over the last few years had appeared to be misleading, and that in some cases librarians' and libraries' names had been incorporated into advertising literature without their knowledge. two rtsd committees touch upon these problems as they relate to technical processing products and services: the bookdealer-library relations committee and the micropublishing projects committee. with adequate care, mr. salmon suggested, such a committee could be used by isad to ensure that its members are adequately informed. isad board members indicated an interest in and a need for a service of this nature, but reflected a hesitancy regarding the sensitivity of the issue. mr. kilgour asked that the matter be deferred until the wednesday board meeting, after the function statements of the two rtsd committees had been distributed to the board. isad historian. mr. hammer reviewed the action of the board previously in deciding to eliminate the history committee and appoint a historian if ala were going to publish a history of the association for the 1976 centennial celebration. since it has since been determined that ala will not produce such a publication, isad has no need to appoint a historian. isa (infonnation science abstmcts). mr. kilgour felt that there were now two obvious avenues open to the board at this time: ( 1) to pursue the evaluation of isa, or ( 2) to drop it altogether. mr. fasana said he believed that if ala were a sponsor, isa would abstract more library literature. because chemical information journals are among their sponsors, they cover chemical literature heavily. he felt that isad should look seriously into isa sponsorship, as there is nothing comparable to it in the united states. the isa board is interested, he said, because it would increase' their subscriptions and also increase their scope if they obtained subscriptions from ala members. mr. hammer explained that isad at one time had attempted to organize a subscription campaign, but the response was poor. mr. kilgour said that his reaction had been in favor of sponsorship, but highlights of meetings 217 be reached after additional points were discussed. we could then determine the answers to the three questions mentioned above. bylaws and organization committee. the newly appointed chairman of the isad bylaws and organization committee, helen schmierer, explained that she had found two versions of the isad bylaws extant, and there was a question as to which was current. minutes of the division did not reveal that any actual vote by the membership concerning various changes in the bylaws ever took place. she suggested that her committee use the original ( 1968) version of the bylaws as the basis on which to present all subsequent changes to the membership for a vote. she told the board she would have a new version ready by midwinter, and that it could then be published in ]ola and voted on at the san francisco annual conference. in answer to a question from ms. schmierer, mr. kilgour explained that it was the intent of the board that the bylaws and organization committees be combined, and the resulting committee should provide guidelines for each new committee established subsequently within isad. he also stated that it was necessary that a change be made in the present bylaws so that if a president did not complete a term, there would be a special election in order to elect another vice-president to take over the following year. 
for other charges to the bylaws and organization committee, mr. kilgour referred ms. schmierer to the minutes of the 197 4 midwinter meeting. telecommunications committee report. mr. kilgour annouced that david waite had resigned as chairman of the telecommunications committee and that he had appointed philip long as new chairman. mr. long presented a report of the committee (exhibit 1). he said that the areas of interest of the committee were networking, protocol, and standards. the following resolution was passed at their meeting: "that ala, via isad, join the committee of corporate telephone users ( cctu) and thus support the effort to combat the at&t attempt to adversely modify the current w ats tariff; should it not be legally or financially feasible for isad i ala to join cctu the committee will nonetheless attempt to follow and rep01t on this and related regulatory items." mr. kilgour called for a motion recommending that ala become a member of the committee of corporate telephone users, an organization to combat the cunent revision of the w ats tariff, providing money is available and no legal problems are connected with ala's so doing. however, several members of the board wanted further information as to what would be the position of ala with regard to the organization and in what sense would that position be an advantage to the members of ala. copies of the document produced by the cctu were also requested. mr. long said 218 journal of libmry automation vol. 7/3 september 1974 that he would contact a member of that committee in new york and get copies to the board members. mr. long requested that his committee be enlarged. mr. kilgour told him to appoint as many members as he needed. program planning committee report. chairman brett butler reported on the new orleans institute on networking which he felt was very successful both topically and financially. he said smaller libraries are beginning to consider automation, and therefore are sending staff to these institutes. he reported that $9,300 was received from registration fees, and expenses were approximately $6,100. in addition, $1,800 in expenses were paid by slice. mr. hammer will send a report to the board. mr. butler told of the committee's meeting in may in chicago. the minutes of that meeting, written by mr. hammer, had been approved by the committee and could be distributed to the board. he further related that the program at the new york annual conference had gone well, with approximately 400 in attendance. there were no plans for publication of the proceedings of the program, although it had been taped for sale by ala. mr. butler said the program planning committee desired liaison with each isad operating committee. they had appointed someone to tesla and hoped to do likewise with the telecommunications committee. at the suggestion of ms. avram, a serials institute has been planned for atlanta in october, preceding the asis meeting. josh smith (asis) and mr. hammer are the coordinators. mr. butler also announced that another institute on networking would be held in the spring in new orleans. with more advance publicity he felt there would be a greater response than the institute of march 197 4, which had an attendance of over 125. the 1975 institute will be a basic tutorial; james rizzolo is responsible for the content. plans for a series of cooperative programs with asis were laid out by the committee. this had been discussed with josh smith and had received his approval. mr. 
butler said he would prepare a statement which would describe the fiscal organization to be sent to the board for a mail vote. mr. kilgour expressed his opinion that with the new dues structure, the board must look at the financial gain involved in the institutes. in fact, any money-making venture must be considered at this time due to the dues structure change. plans for the cable tv preconference at san francisco ( 1975) were dropped. a program for san francisco would center around reactions to the document produced by the national commission on libraries and information science, the final draft of which is to be published in january 1975. this program is to be analytical in nature. mr. butler explained that there is possible cosponsorship interest. also at the san francisco annual conference, the office of intellectual highlights of meetings 219 freedom will cosponsor with isad a panel on various aspects of privacy and data file security. mr. butler announced that fifteen deans of library schools had attended that morning's meeting of the committee. there is interest in cosponsorship of continuing education programs, but nothing has been made definite at this point. the committee will explore this further. committee on representation in machine readable form of bibliographic information (marbi) report. (exhibit 2). velma veneziano requested that mr. fasana report to isad, as he had prepared a summary of the meeting for the rtsd board of directors. she asked mr. kilgour if the board would approve her writing the canadian library association to grant permission to send an official observer to marbi, as requested. mr. kilgour suggested that the letter definitely state that this representative would be a nonvoting participant. cola. (exhibit 3 ). thursday, july 11, 1974 the meeting was called to order by president frederick kilgour at 4:30p.m. those present were: board-frederick g. kilgour, lawrence w. s. auld, paul p. fasana, susan k. martin, ralph m. shoffner, donald p. hammer ( isad executive secretary), and berniece coulter, secretary, isad. guests-henriette d. avram, william summers. report of lola editor. copies of the ]ola annual report were distributed to the board (exhibit 4). ms. martin requested board reaction to changes suggested by the isad editorial board: ( 1) incorporate the issn on the cover of the journal, and drop the coden; (2) change the color of the cover of lola for each volume, beginning with the march 1975 issue; and ( 3) consider changing the title of the journal, in the light of possible incorporation of information technologies into isad, to the journal of library technology (jolt). the consensus of the board was that: ( 1) coden should remain on the cover; ( 2) a change in cover stock was quite appropriate; and (3) ]ola is a long-established title, and should remain. committee reports. mr. kilgour suggested that committee reports to the board be discontinued to save time and that written reports be submitted in the future. motion. it was moved by ralph shoffner that all isad committee reports be submitted to the board in writing and that the chairman appear before the board only if the committee desired some board action, and that the board has previously received this request in writing. seconded by larry auld. carried. mr. kilgour suggested that committee appointments be sent to the board by carbon copies of letters rather than reported as an agenda item. 220 ]oumal of library automation vol. 7/3 september 1974 representative to ansi x-4. motion. 
it was moved by ralph shoffner that mr. hammer explore and obtain, if possible, ala representation to ansi x-4 committee and that the board conditionally appoint arthur brody to be that representative. seconded by larry auld. carried. committee on technical standards for library automation (tesla) report. (exhibit 5). motion. it was moved by paul fasana to turn over to helen schmierer, chairman of the bylaws and organization committee, the matter of a revised charge to tesla. seconded by susan martin. carried. membership survey committee report. the final report of the membership survey ad hoc committee was distributed to the board members. this completed the work of the committee and the committee was therefore disbanded. mr. william summers, a member of the committee, appeared before the board. he stated that they had the computer capability to run any data correlations desired by the board. mr. kilgour asked the board members to request any correlations they would want from don hammer by october 15. he will forward them to ms. pope by mid-november, and she will have the correlations ready by the midwinter meeting in 1975. the board noted that the survey showed that 25 percent of !sad members are library directors, and that the most frequent age is over fifty. the number of people belonging to !sad who have no contact with library automation was surprising to some. a significant number of !sad members responded to the questionnaire. misleading claims. it was the sense of the board that the establishment of a committee in !sad to investigate misleading claims be referred to the bylaws and organization committee. the chairman is to contact william north, the ala attorney, concerning legal implications, and also steve salmon, who had shown interest in these problems, should be approached concerning the chairmanship. general discussion. most of the discussion centered around the new dues structure of the association. there was a question of how funds would be distributed to the divisions from institutional membership dues. ms. martin said that she would send the board an analysis of the expenditures and income for ]ola. the need for cash capital should be considered for the continuing publication of the journal despite advertising fluctuations. mr. shoffner stated that he favored using lola funds to sponsor !sad institutes and that there is a need for more introductory and elementary education in the !sad institutes. participants in the institutes had shown interest in basic knowledge of automation in order to make decisions in their work even though not necessarily involved directly with automation. highlights of meetings 221 exhibit 1 telecommunications committee report progress report of activities to date: 1. committee decided to maintain an awareness of future possibilities of two-way cable for data transmission, but not to continue active role in broadcast cable area in view of ongoing work in the area elsewhere. 2. committee members extensively debated the directions to which its future efforts would be actively directed; these included education, network protocol standards, etc. 3. committee accepted reports from messrs. randel and long on current suppliers of bibliographic services via star networking, and on current ansi and eia (plus iso) standards activities related to present and future bibliographic data transmission. 4. 
committee resolved: to attempt to formulate methods for computer-to-computer interaction (protocols) by telecommunication links, such that a single terminal of arbitrary characteristics could access a variety of host services in a "user-transparent" fashion. 5. various members of the committee accepted assignments in gathering data and protocols in use in such networks as arpa, tym-share, ncic, etc. it was recognized that the membership of the committee must be enlarged and that more than two ala meeting forums yearly are needed for the task. recommendations for division board action: the committee moved and unanimously passed a resolution that ala, via !sad, join the committee of corporate telephone users (cctu) and thus support the effort to combat the at&t attempt to adversely modify the current wats tariff; should it not be legally or financially feasible for !sad/ ala to join cctu the committee will nonetheless attempt to follow and report on this and related regulatory items. exhibit 2 committee on representation in machine readable form of bibliographic information (marbi) report following is a summary of deliberations and actions of the committee: 1. jola editorial, vol. 7, no. 2, 1974. the chairperson was asked to send a letter to the editor correcting the erroneous/ ambiguous reference to marbi and its relationship (formal and otherwise) to lc, clr cembi, etc. 2. conser. the committee took note of and discussed recent developments of the conser project. the committee will review and comment on formal recommendations of conser affecting marc serials format when they are submitted through lc. 3. clrjnsf sponsored conference on national bibliographic control. formal "conclusions and recommendations" of the conference have been distributed. the committee decided to take note of this document, ask each member to comment on the substance, and to prepare a formal critique/reaction of the conclusions for clr. 4. character sets. a progress report (by h. avram) was presented of international activities. extended character sets for latin (i.e., roman), cyrillic, and greek have been agreed to by iso working group on character sets. a draft standard i's being prepared. further work is being done on character sets for mathematical symbols and african languages. 5. iso 2709, format structure for marc records. progress report given. no action taken. 6. content designators. a progress report on international activities was given (by 222 i oumal of library automation vol. 7 i 3 september 197 4 h. avram) as well as a summary of some working papers prepared to date. copies will be submitted to the marbi members. 7. iso filing standards. a progress report was given. discussion but no action. 8. authority record formats. copies of the lc proposal for "authorities: a marc format" were distributed. a description of the work in progress at lc was given. lc tentatively plans to initiate a service for authorities in machine-readable form in 1975. the service probably will include names new to lc with cross-references and names new to marc with cross-references. 9. microform experiment. lc representative described a com microform experiment currently being defined/ set up at lc. the experiment will focus on lcsh 8th ed. in com format. 10. isbd-serials. the first formal publication of isbd-s was available at this conference. it was decided that each member would review the document and send comments to mr. fasana by august 15. mr. fasana was instructed to prepare a summary of the comments supplied. 11. 
catalog code revision committee. the need to establish liaison and input to this committee was discussed. arrangements were made with the chairperson of the ccrc (j. byrum) to establish input and liaison between the two committees. exhibit 3 cola discussion group report the isad cola discussion group met on july 7, 1974. brian aveney, chairman, mentioned that the subject of merger of cola with the marc users' discussion group had been informally raised, and invited comment from any members of the groups. discussion centered around the time needs and a suggestion was made that cola and mudg meet back to back. further discussion was deferred for later informal contacts. the program divided into two different sessions. the first consisted of a series of independent presentations on library automation activities around the counhy. those who reported were: helena rivoire (bucknell university); ron miller and bill mathews (nelinet); ann ekstrom (oclc); richard de gennaro (university of pennsylvania); james sokoloski (university of massachusetts); james dolby (r&d associates); howard harris (university of chicago); and stephen silberstein (university of california, berkeley). the second half of the program consisted of a panel presentation about the use of microform catalogs in libraries. richard jensen (university of texas, permian basin) described the use during the last year of a divided microfiche catalog produced under contract by richard abel co. no other form of access to the collection is provided for public use. a brief questionnaire about patron response indicated no great difficulties in use. some complaints about readers and filing were noted. mary fischer (los angeles public library) discussed the transition to com fiche for internal reports, for reasons of cost. a variety of reports can now be distributed to all branches which formerly did not have access to this information except at the central library. james rizzolo (new york public library) mentioned the dance collection catalog now available on film. user response has been very positive, but the fact that this is the first time any form of catalog has been available is probably a large factor in this response. a com marc character set has been developed with a new york vendor for use in internal fiche files, and samples were made available to the group. highlights of meetings 223 exhibit 4 journal of library automation annual report this report covers the eighteen months between january 1973 and june 1974. during this period nine issues of ]ola appeared, from the june 1972 to the june 1974 issues. these issues contained thirty-nine articles and twenty-three book reviews. in addition, lola/technical communications was incorporated into the journal with the march 1973 issue. with volume 7 (1974), an editorial or guest editorial appears in each issue. in january 1973 the journal was eight months behind. ala's central production unit was to have taken over the technical editing with the 1973 volume, but due to the unforeseen delay in publication the staff was not familiar with the journal, or the printer. by march all the major problems had been sorted out, and the june 1972 issue was sent to the printer. at that time there was a bacldog of thirty-five manuscripts, of which twenty were eventually published, nine were rejected, and six are still pending (either sent back to the author for revision, or still in the process of locating or identifying the author). 
with volume 6 (1973), the contract for printing was given to the ovid bell press, inc. spencer-walker did not bid on a contract renewal. because of the increasing cost of paper and the narrower selection offered by paper manufacturers, the editorial board determined that at the same time it would be reasonable to change from use of permalife to another cheaper but acid-free stock. warren old english was selected; at the time (june 1973) it was $25.10 per hundredweight. since february 1973, ]ola has received fifty manuscripts for consideration: published 18 rejected 11 accepted 7 in review 4 pending 9 sent to tc editor 1 it is difficult to summarize the content of these nine issues. when categorized very broadly, the thirty-nine articles covered the following topics: aspects of cataloging 7 search keys and file structure 7 national automation and standards 7 isad topics 5 circulation 5 acquisitions 2 serials 2 information reh·ieval 2 administration 1 other 1 don bosseau continues, i am pleased to say, as editor of technical communications. peter simmons (university of british columbia) accepted the position of book review editor, and is also doing an excellent job. he reports that, in addition to the reviews already published, eleven reviews have been submitted and are awaiting publication, and six books are in the hands of reviewers. the central production unit has been of invaluable assistance in bringing the journal up to date, in negotiating with the post office on our behalf, and in continuing to provide technical editing support. lola is now completely up to date; i hope that we shall continue to improve the 224 journal of libmry automation vol. 7/3 september 197 4 standards for acceptance of articles, and that time will now permit us to examine the journal critically to determine where improvements could or should be made. exhibit 5 committee on technical standards for library automation report recommendations for division board action: i. nominate mr. arthur brody as isad representative to ansi-x4. 2. approve revised charge to tesla. the tesla met in three sessions. i. minutes of previous meeting. approved. 2. charge to the committee. the charge to the committee had been revised and the · reasons for each revision documented, and the revisions reviewed. it was voted that the charge as revised be approved by the isad board. 3. draft procedure. the tesla procedure for handling standards proposals was reviewed and the following changes recommended: a. proposal outline item viii be made optional. b. reactor ballot include three responses, e.g., for/ against-need for standard; for/against-specification of standard; yes/no-available to work on specification. these changes will be made and published in the next issue of jola-tc. 4. publication of materials relating to standards. the article describing the committee's procedures and role and outlining the standards organization potentially impacting libraries was published in jola. the committee discerned that standards exist which would be of importance to the library community and that these be identified and reviewed in terms of their impact on libraries. as a first step, a listing of those standards will be drawn up and, on review of the committee, published in jola-tc. 5. representative to ansi-x4. the ala is currently not represented on ansi-x4. the committee recommends that the isad board nominate mr. arthur brody as the ala representative to ansi-x4. 6. metrication. the current movement to metric measure may impact libraries. 
a subcommittee of ms. madeline henderson (chairperson) and dr. ed bowles was formed to develop a position paper on the impact of metrication. 7. standards program at san francisco. the committee will present a 1½-hour program on standards at the next annual convention, in san francisco. 8. open meeting. reactor ballot responses to the potential standards areas and a general review of the committee's activities were held in its third session. 9. next meeting. tentatively the committee will meet at the asis conference in atlanta. time and date to be announced.

exhibit 6
isad/led committee on education for information science report
discussion: directions of committee. need for visibility at ala and follow-up to the denver (1971) meeting. possible tutorial or institute topics-cosponsors.
action: 1. plan program for san francisco 1975. speaker: ph.d. student from syracuse to design guidelines for module development. panel: two to three modules presented. reactors: discussion. 2. work out subject outline based on questionnaire for distribution at san francisco for possible module development, ready for committee approval by midwinter.
recommendations for division board action: program slot for san francisco.
highlights: serious concern about lack of member participation. isad and led may want to reexamine purpose-need for committee and/or reorganization.

mobile website use and advanced researchers: understanding library users at a university marine sciences branch campus
mary j. markland, hannah gascho rempel, and laurie bridges
mary j. markland (mary.markland@oregonstate.edu) is head, guin library; hannah gascho rempel (hannah.rempel@oregonstate.edu) is science librarian and coordinator of graduate student success services; and laurie bridges (laurie.bridges@oregonstate.edu) is instruction and outreach librarian, oregon state university libraries and press.

abstract
this exploratory study examined the use of the oregon state university libraries website via mobile devices by advanced researchers at an off-campus branch location. branch campus–affiliated faculty, staff, and graduate students were invited to participate in a survey to determine what their research behaviors are via mobile devices, including the frequency of their mobile library website use and the tasks they were attempting to complete. findings showed that while these advanced researchers do periodically use the library website via mobile devices, mobile devices are not the primary mode of searching for articles and books or for reading scholarly sources. mobile devices are most frequently used for viewing the library website when these advanced researchers are at home or in transit. results of this survey will be used to address knowledge gaps around library resources and research tools and to generate more ways to study advanced researchers' use of library services via mobile devices.

introduction
as use of mobile devices has expanded in the academic environment, so has the practice of gathering data from multiple sources about what mobile resources are and are not being used. this data informs the design decisions and resource investments libraries make in mobile tools. web analytics is one tool that allows researchers to discover which devices patrons use to access library webpages. but web analytics data do not show what patrons want to do and what hurdles they face when using the library website via a mobile device. web analytics also lacks nuance in that it cannot distinguish user characteristics, such as whether users are novice or advanced researchers, which may affect how these users interact with a mobile device.
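to make this limitation concrete, the sketch below (invented for illustration, not code from the study or from any particular analytics package) shows the kind of device classification web analytics performs on user-agent strings taken from server logs. the classify function, the sample log hits, and the deliberately naive matching rules are all hypothetical; real analytics tools use far more complete user-agent databases. the point is that a log line reveals the device and the page requested but says nothing about the visitor's expertise or the task they were trying to complete.

    # hypothetical device classification from web server log user agents.
    from collections import Counter

    def classify(user_agent):
        ua = user_agent.lower()
        if "ipad" in ua or "tablet" in ua:
            return "tablet"
        if "mobi" in ua or "iphone" in ua or "android" in ua:
            return "phone"
        return "desktop"

    # invented sample hits: (page requested, user-agent string)
    hits = [
        ("/library/hours", "Mozilla/5.0 (iPhone; CPU iPhone OS 15_0 like Mac OS X)"),
        ("/search?q=salmon+migration", "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"),
        ("/databases", "Mozilla/5.0 (Linux; Android 11; Pixel 5) Mobile Safari"),
    ]
    print(Counter(classify(agent) for _, agent in hits))
    # -> Counter({'phone': 2, 'desktop': 1})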
user surveys are another tool for gathering data on mobile behaviors. user surveys help overcome some of the limitations of web analytics data by directly asking users about their perceived research skills and the resources they use on a mobile device. as is the case at most libraries, oregon state university libraries serves a diverse range of users. we were interested in learning whether advanced researchers—particularly advanced researchers who work at a branch campus—use the library's resources differently than main campus users. we were chiefly interested in these advanced researchers because of the mobile nature of their work. they are graduate students and faculty in the field of marine science who work in a variety of locations, including their offices, labs, and in the field (which can include rivers, lakes, and the ocean). we focused on the use of the library website via mobile devices as one way to determine whether specific library services should be adapted to best meet the needs of this targeted user community. oregon state university (osu) is oregon's land-grant university; its home campus is in corvallis, oregon. hatfield marine science center (hmsc) in newport is a branch campus that includes a branch library. guin library at hmsc serves osu students and faculty from across the osu colleges along with the co-located federal and state agencies of the national oceanic and atmospheric administration (noaa), us fish and wildlife service, environmental protection agency (epa), united states geological survey (usgs), united states department of agriculture (usda), and the oregon department of fish and wildlife. the guin library is in newport, which is forty-five miles from the main campus. like many other branch libraries, guin library was established at a time when providing a print collection close to where researchers and students work was paramount, but today it must adapt its services to meet the changing information needs of its user base. branch libraries are typically designed to serve a clientele or subject area, which can create a different institutional culture from the main library. guin library serves advanced undergraduates, graduate students, and scientific researchers. hmsc's distance from corvallis, the small size of the researcher community, and the shared focus on a research area—marine sciences—create a distinct culture. while guin library is often referred to as the "heart of hmsc," the number of in-person library users is decreasing. this decline is not unexpected, as numerous studies have shown that faculty and graduate students have fewer needs that require an in-person trip to the library.1 studies have also shown that faculty and graduate students can be unaware of the services and resources that libraries provide, thereby continuing the cycle of underuse.2 to learn more about the needs of hmsc's advanced researchers, this exploratory study examined their research behaviors via mobile devices.
the goals of this study were to
• determine if and with what frequency advanced researchers at hmsc use the osu libraries website via mobile devices;
• gather a list of tasks advanced users attempt to accomplish when they visit the osu libraries website on a mobile device; and
• determine whether the mobile behaviors of these advanced researchers are different from those of researchers from the main osu campus (including undergraduate students), and if so, whether these differences warrant alternative modes of design or service delivery.

literature review
the conversation about how best to design mobile library websites has shifted over the past decade. early in the mobile-adoption process some libraries focused on creating special websites or apps that worked with mobile devices.3 while libraries globally might still be creating mobile-specific websites and apps,4 us libraries are trending toward responsively designed websites as a more user-friendly option and a simpler solution for most libraries with limited staff and budgets.5 most of the literature on mobile-device use in higher education is focused on undergraduates across a wide range of majors who are using a standard academic library.6 to help provide context for how libraries have designed their websites for mobile users, some of those specific findings will be shared later. but because our study focused on graduate students and faculty in a science-focused branch library, we will begin with a discussion of what is known about more advanced researchers' use of library services and their mobile-device habits. several themes emerged from the literature on graduate students' relationships with libraries. in an ironic twist, faculty think graduate students are being assisted by the library, while librarians think faculty are providing graduate students with the help they need to be successful.7 as a result, many graduate students end up using their library's resources in an entirely disintermediated way. graduate students, especially those in the sciences, visit the physical library less often and use online resources more than undergraduate students.8 most graduate students start their research process with assistance from academic staff, such as advisors and committee members,9 and are unaware of many library services and resources.10 as frequent virtual-library users who receive little guidance on how to use the library's tools, graduate students need a library website that is clear in scope and purpose, offers help, and has targeted services.11 compared to reports on undergraduate use of mobile devices to access their library's website, relatively few studies have focused on graduate-student or faculty mobile behaviors. a recent survey of japanese library and information science (lis) students compared undergraduate and graduate students' usage of mobile devices to access library services and found slight differences.
however, both groups reported accessing libraries as last on their list of preferred smartphone uses.12 aharony examined the mobile use behaviors of israeli lis graduate students and found approximately half of these graduate students used smartphones and perceived them to be useful and easy tools for use in their everyday life, and could transfer those habits to library searching behaviors.13 when looking specifically at how patrons use library services via a mobile device, rempel and bridges found the top reason graduate students at their main campus used the osu libraries website via mobile devices was to find information on library hours, followed by finding a book and researching a topic.14 barnett-ellis and vann surveyed their small university and found that both undergraduate and graduate students were more than twice as likely to use mobile devices as are their faculty and staff; a majority of students also indicated they were likely to use mobile devices to conduct research.15 finally, survey results showed graduate students in hofstra university’s college of education reported accessing library materials via a mobile device twice as often as other student groups. in addition, these graduate students reported being comfortabl e mobile website use and advanced researchers | markland, rempel, and bridges doi:10.6017/ital.v36i4.9953 10 reading articles up to five pages long on their mobile devices. graduate students were also more likely to be at home when using their mobile device to access the library, a finding the authors attributed to education graduate students frequently being employed as full-time teachers.16 research on how faculty members use library resources characterizes a population that is confident in their literature-searching skills, prefers to search on their own, and has little direct contact with the library.17 faculty researchers highly value convenience;18 they rely primarily on electronic access to journal articles but prefer print access to monographs.19 faculty tend to be self-trained at using search tools, such as pubmed or other online databases, and therefore are not always aware of the more in-depth functionality of these tools.20 in contrast to graduate students, rempel and bridges found that faculty using the library website via mobile devices were less interested in information about the physical library, such as library hours, and were more likely to be researching a topic.21 medical faculty are one of the few faculty groups whose mobile-research behaviors have been specifically examined. a survey administered by bushhousen et al. at a medical university revealed that a third of respondents used mobile apps for research-related activities.22 findings by boruff and storie indicate that one of the biggest barriers to mobile use in health-related academic settings was wireless access.23 thus apps that did not require the user to be connected to the internet were highly desired. faculty and graduate students in health-related academic settings saw a role for the library in advocating for better wireless infrastructure, providing access to a targeted set of heavily used resources, and providing online guides or in-person tutorials on mobile apps or procedures specific to their institution. 24 according to the literature, most design decisions for library mobile sites have been made on the basis of information collected about undergraduate students’ behavior at main-branch campuses. 
to help inform our understanding of how recent decisions have been made, the remainder of the literature review focuses on what is known about undergraduate students’ mobile behavior. undergraduate students are very comfortable using mobile technologies and perceive themselves to be skilled with these devices. according to the 2015 educause center for research and analysis’ (ecar) study of undergraduate students and information technology, most undergraduate students consider themselves sophisticated technology users who are engaged with information technologies.25 undergraduate students mainly use their smartphones for nonclass activities. but students indicate they could be more effective technology users if they were more skilled at tools such as the learning management system, online collaboration tools, e-books, or laptops and smartphones in class. of interest to libraries is the ecar participants’ top area of reported interest, “search tools to find reference or other information online for class work.”26 however, when a mobile library site is in place, usage rates have been found to be lower than anticipated. in a study of undergraduate science students, salisbury et al. found only 2 percent of respondents reported using their cell phones to access library databases or the library’s catalog every hour or daily, despite 66 percent of the students browsing the internet using their mobile information technology and libraries | december 2017 11 phone hourly or daily. salisbury et al. speculated that users need to be told about mobileoptimized library resources if libraries want to increase usage. 27 rempel and bridges used a pop-up interrupt survey while users were accessing the osu libraries mobile site.28 this approach allowed a larger cross-section of library users to be surveyed. it also reduced memory errors by capturing their activities in real time. activities that had been included in the mobile site because of their perceived usefulness in a mobile environment, such as directions, asking a librarian a question, and the coffee shop webcam, were rarely cited as a reason for visiting the mobile site. the osu libraries branch at hmsc is entering a new era. a marine studies initiative will result in the building of a new multidisciplinary research campus at hmsc that aims to serve five hundred undergraduate students. the change in demographics and the increase in students who will need to be served has prompted guin library staff to explore how the current population of advanced researchers interact with library resources. in addition, examining the ways undergraduate students at the main campus use these tools will help with planning for the upcoming changes in the user community. methods this study used an online qualtrics survey to gather information about how frequently advanced researchers (graduate students, faculty, and affiliated scientists at a branch library for marine science) use the osu libraries website via mobile devices, what they search for, and other ways they use mobile devices to support their research behaviors. a recruitment email with a link to the survey was sent to three discussion lists used by hmsc community in spring 2016. the survey was available for four weeks, and a reminder email was sent one week before the survey closed. the invitation email included a link to an informedconsent document. once the consent document had been reviewed, users were taken to the survey via a second link. 
respondents could provide an email address to receive a three-dollar coffee card for participating in the study, but their email address was recorded in a separate survey location to preserve their anonymity. the invitation email indicated that this survey was about using the website via a mobile device, and the first survey question asked users if they had ever accessed the library website on a mobile device. if they answered “no,” they were immediately taken to the end of the survey and were not recorded as a participant in the study. a similar survey was conducted with users from osu’s main campus in 2012–13 and again in 2015. the results from 2012–13 have been published previously,29 but the results from 2015 have not. while the focus of the present study is on the mobile behaviors of advanced researchers in the hmsc community, data from the 2015 main-campus study is used to provide a comparison to the broader osu community. osu main-campus respondents in 2015 and hmsc participants in 2016 both answered closedand open-ended questions that explored participants’ general mobiledevice behaviors and behaviors specific to using the osu libraries website via mobile devices. mobile website use and advanced researchers | markland, rempel, and bridges doi:10.6017/ital.v36i4.9953 12 however, the hmsc survey also asked questions about behaviors related to using the osu (nonlibrary) website via a mobile device and participants’ mobile scholarly reading and writing behaviors. the survey concluded with several demographic questions. the survey data was analyzed using qualtrics’ cross-tab functionality and microsoft excel to observe trends and potential differences between user groups. open-ended responses were examined for common themes. twenty-three members of the hmsc community completed the survey, whereas one hundred participants responded to the 2015 main campus survey. participation in the 2015 survey was capped at one hundred respondents because limited incentives were available. the participation difference between the two surveys reflects several differences between the two sampled communities. the most obvious difference is size. the osu community comprises more than thirty-six thousand students, faculty, and staff; the hmsc community is approximately five hundred students, researchers, and faculty—some of whom are also included as part of the larger osu community. the second factor influencing response rates relates to the difference in size between the two communities, but is more striking in the hmsc community: the survey relied on a self-selected group of users who indicated they had a history using the library website via a mobile device. therefore, it is not possible to estimate the population size of mobile-device library-website users specific to the branch library or the main campus library. this limitation means that the results from this study cannot be used to generalize findings to all users who visit a library website via mobile devices; instead the results are intended to present a case that other libraries may compare with behaviors observed on their own campuses. sharing the behaviors of advanced researchers at a branch campus is particularly valuable as this population has historically been understudied. 
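the cross-tabulations described above can also be reproduced outside of qualtrics with a few lines of pandas; the sketch below is illustrative only, and the data frame it builds contains invented responses rather than the study's actual data.

    # illustrative cross-tab of survey responses; the responses are invented.
    import pandas as pd

    responses = pd.DataFrame({
        "affiliation": ["graduate student", "faculty", "graduate student",
                        "faculty", "graduate student"],
        "visit_frequency": ["less than once a month", "at least once a month",
                            "at least once a week", "less than once a month",
                            "at least once a month"],
    })

    # counts of visit frequency broken out by affiliation, the same shape of
    # table that qualtrics' cross-tab feature produces.
    print(pd.crosstab(responses["affiliation"], responses["visit_frequency"]))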
results and discussion

participant demographics and devices used

of the twenty-three respondents to the hmsc mobile behaviors survey, 13 (62 percent) were graduate students, 7 (34 percent) were faculty (this category includes faculty researchers and courtesy faculty), and one respondent was a noaa employee. two participants declined to declare their affiliation. of the 97 respondents to the 2015 osu main-campus survey who shared their affiliation, 16 (16 percent) were graduate students, 5 (5 percent) were faculty members, and 69 (71 percent) were undergraduates. respondents varied in the types of mobile devices they used when doing library research. smartphones were used by 78 percent (18 respondents), and 22 percent (5 respondents) used a tablet. apple (15 respondents) was the most common device brand used, although six of the respondents used an android phone or tablet. compared to the general population's device ownership, these respondents are more likely to own apple devices, but the two major device types owned (apple and android) match market trends.30

frequency of library site use on mobile devices

most of the hmsc respondents are infrequent users of the library website via mobile devices: 50 percent (11 respondents) did so less than once a month, 41 percent (9 respondents) did so at least once a month, and 9 percent (2 respondents) did so at least once a week. the low level of library website usage via mobile devices was especially notable as this population reports being heavy users of the library website via laptops or desktop computers, with 82 percent (18 respondents) visiting the library website via those tools at least once a week. researchers at hmsc used the library website via mobile devices much less often than the 2015 main-campus respondents (undergraduates, graduate students, and faculty). no hmsc respondents visited the mobile site daily compared to 10 percent of main-campus users, and only 9 percent of hmsc respondents visited weekly compared to 28 percent of main-campus users (see figure 1).

figure 1. 2016 hmsc participants vs. 2015 osu main-campus participants reported frequency of library website visits via a mobile device by percent of responses.

while hmsc advanced researchers share some mobile behaviors with main-campus students, this exploratory study demonstrates they do not use the library website via mobile devices as frequently. some possible reasons for this are that researchers rarely spend time coming and going to and from classes and therefore do not have small gaps of time to fill throughout their day. instead, their daily schedule involves being in the field or in the lab collecting and analyzing data. alternatively, they are frequently involved in writing-intensive projects such as drafting journal articles or grant proposals. they carve out specific periods to do research and do not appear to be filling time with short bursts of literature searching. they can work on laptops and do not need to multitask on a phone or tablet between classes or in other situations.
mobile-device ownership among hmsc graduate students might also be limited because of personal budgets that do not allow for owning multiple mobile devices or for having the most recent model. in addition, this group of scientists may not be on the front edge of personal technologies, especially compared to medical researchers, because few mobile apps are designed specifically for the research needs of marine scientists.

where researchers are when using mobile devices for library tasks

because mobile devices facilitate connecting to resources from many locations, and because advanced researchers conduct research in a range of settings, including the field, the office, and home, we asked respondents where they were most likely to use the library website via a mobile device. thirty-two percent were most likely to be at home, 27 percent in transit, 18 percent at work, and 9 percent in the field. the popularity of using the library website via mobile devices while in transit was somewhat unexpected, but perhaps should not have been, because many people try to maximize their travel time by multitasking on mobile devices. the distance from the main campus might explain this finding: a local bus service provides an easy way to travel to and from the main campus, and the hour-long trip would provide opportunities for multitasking via a mobile device.

relatively few respondents used mobile devices to access the library website while at work. previous studies show that a lack of reliable campus wireless internet access can affect students' ability to use mobile technology.31 hmsc also struggles to provide consistent wireless access, and signals are spotty in many areas of our campus. despite signal boosters in guin library, wireless access is still limited at times. in addition, cell phone service is equally spotty both at hmsc and up and down the coast of oregon. it is much less frustrating to work on a device that has a wired connection to the internet while at hmsc. these respondents did use mobile devices while at home, which might indicate they had a better wireless signal there. alternatively, working from home on a mobile device might indicate that they compartmentalize their library-research time as an activity to do at home instead of in the office. researchers used their mobile devices to access the library while in the field less than originally expected, but upon further reflection it made sense that researchers would be less likely to use library resources during periods of data collection for oceanic or other water-based research projects because of their focused involvement during that stage. the water-based research also increases the risk of losing mobile devices.

library resources accessed via mobile devices

to learn more about how these respondents used the library website, we asked them to choose what they were searching for from a list of options. respondents could choose as many options as applied to their searching behaviors. hmsc respondents' primary reason for visiting the library's site via a mobile device was to find a specific source: 68 percent looked for an article, 45 percent for a journal, 36 percent for a book, and 14 percent for a thesis.
many of the hmsc respondents also looked for procedural or library-specific information: 36 percent looked for hours, 32 percent for my account information, 18 percent for interlibrary loan, 14 percent for contact information, 9 percent for how to borrow and request books, 9 percent for workshop information, and 9 percent for oregon estuaries bibliographies, a unique resource provided by the hmsc library. fifty-five percent of searches were for a specific source and 43 percent were for procedural or library-specific information. notably missing from this list were respondents who reported searching via their mobile device for directions to the library.

compared to the 2015 osu libraries main-campus survey respondents, hmsc respondents were much more likely to visit the library website via a mobile device to look for an article (68 percent vs. 37 percent), find a journal (45 percent vs. 23 percent), access my account information (32 percent vs. 7 percent), use interlibrary loan (18 percent vs. 5 percent), or find contact information (14 percent vs. 1 percent). however, unlike hmsc participants, who do not have access to course reserves at the branch library, 7 percent of osu main-campus respondents used their mobile devices to find course reserves on the library website. see figure 2.

figure 2. 2016 hmsc vs. 2015 osu main-campus participants reported searches while visiting the library website via a mobile device by percent of responses.

it is possible that hmsc users with different affiliations might use the library site via a mobile device differently. these exploratory findings show that graduate students used the greatest variety of content via mobile devices. graduate students as a group reported using 11 of the 14 provided content choices via a mobile device, while faculty reported using 8 of the 14. graduate students were the largest group (62 percent of respondents), which might explain why as a group they searched for more types of content via mobile devices. interestingly, faculty members and faculty researchers reported looking for a thesis via a mobile device, but no graduate students did. perhaps these graduate students had not yet learned about the usefulness of referencing past theses as a starting point for their own thesis writing, or perhaps they were only familiar with searching for journal articles on a topic. in contrast, faculty members might have been searching for specific theses for which they had provided advising or mentoring support.

to help us make decisions about how to best direct users to library content via mobile devices, we asked respondents to indicate their searching behaviors and preferences. of the 16 hmsc respondents who answered this question, 12 (75 percent) used our web-scale discovery search box via mobile devices; 4 (25 percent) reported that they did not. presumably these latter searchers were navigating to another database to find their sources. of 16 respondents, only 6 (38 percent) indicated that they looked for a specific library database (as opposed to the discovery tool) when using a mobile device. those respondents who were looking for a database tended to be looking for the web of science database, which makes sense for their field of study.
when conducting searches for sources on their mobile devices, hmsc respondents employed a variety of search strategies: the 12 respondents who replied used a combination of author (75 percent), journal title (67 percent), keyword (67 percent), and book title (50 percent) searches when starting at the mobile version of the discovery tool. when asked about their preferred way to find sources, a majority of hmsc respondents reported that they tended to prefer a combination of searching and menu navigation while using the library website from mobile devices, while the remainder were evenly divided between preferring menu-driven and search-driven discovery.

while osu libraries does not currently provide links to any specific apps for source discovery, such as pubmed mobile or jstor browser, 13 (62 percent) of the hmsc respondents indicated they would be somewhat or very likely to use an app to access and use library services. this finding connects to the issue of reliable wireless access. medical graduate students had a wider array of apps available to them, but the primary reason they wanted to use these apps was that they provided a better searching experience in hospitals that had intermittent wireless access, an experience to which researchers at hmsc could relate.32

university website use behaviors on mobile devices

to help situate respondents' library use behaviors on mobile devices in comparison to the way they use other academic resources on mobile devices, we asked hmsc respondents to describe their visits to resources on the osu (nonlibrary) website via mobile devices. compared to their use of the library site on a mobile device, respondents' use of university services was higher: 43 percent (9 respondents) visited the university's website via a mobile device at least once a week compared to only 9 percent (2 respondents) who visited the library site with that frequency. this makes sense because of the integral function many of these university services play in most university employees' regular workflow. respondents indicated visiting key university sites including myosu (a portal webpage, visited by 60 percent of respondents), the hmsc webpage (55 percent), canvas (the university's learning management system, visited by 50 percent of respondents), and webmail (45 percent). see figure 3.

figure 3. university webpages hmsc respondents access on a mobile device by percent of responses.

university resources such as campus maps, parking locations, and the graduate school website were frequently used by this population. the use of the first two makes sense, as hmsc users are located off-site and need to use maps and parking guidance when they visit the main campus. the use of the graduate school website makes sense because the respondents were primarily graduate students and graduate school guidelines are a necessary source of information. interestingly, our advanced users are similar to undergraduates in that they primarily read email, information from social networking sites, and news on their mobile devices.33

other research behaviors on mobile devices

we wanted to know what other research-related behaviors the hmsc respondents are engaged in via mobile devices to determine if there might be additional ways to support researchers' workflows.
we specifically asked about respondents' reading, writing, and note-taking behaviors to learn how well these respondents have integrated them with their mobile usage behaviors. all respondents reported reading on their mobile device (see figure 4). email represented the most common reading activity (95 percent), followed by "quick reading" activities, such as reading social networking posts (81 percent), current news (81 percent), and blog posts (62 percent). smaller numbers used their mobile devices for academic or long-form reading, such as reading scholarly articles (33 percent) or books (19 percent). of those respondents who read articles and books on their mobile devices, only a few highlighted or took notes using their mobile device. seven respondents used a citation manager on their mobile device: three used endnote, one used mendeley, one used pages, and one used zotero. one respondent used evernote on their mobile device, and one advanced user reported using specific data and database management software, websites, and apps related to their projects. more advanced and interactive mobile-reading features, such as online spatial landmarks, might be needed before reading scholarly articles on mobile devices becomes more common.34

figure 4. what hmsc respondents reported reading on a mobile device by percent of responses.

limitations

this exploratory study had several limitations, most of which reflect the nature of doing research with a small population at a branch campus. this study had a small sample size, which limited observations of this population; however, future studies could use research techniques such as interviews or ethnographic studies to gather deep qualitative information about mobile-use behaviors in this population. a second limitation was that previous studies of the osu libraries mobile website used google analytics to compare survey results with what users were actually doing on the library website. unfortunately, this was not possible for this study. because of how hmsc's network was set up, anyone at hmsc using the osu internet connections is assigned an ip address that shows a corvallis, oregon, location rather than a newport, oregon, location, which rendered parsing hmsc-specific users in google analytics impossible. the research behaviors of advanced researchers at a branch campus have not been well examined; despite its limitations, this study provides beneficial insights into the behaviors of this user population.

conclusion

focusing on how advanced researchers at a branch campus use mobile devices while accessing library and other campus information provides a snapshot of key trends among this user group. these exploratory findings show that these advanced researchers are infrequent users of library resources via mobile devices and, contrary to our initial expectations, are not using mobile devices as a research resource while conducting field-based research. findings showed that while these advanced researchers do periodically use the library website via mobile devices, mobile devices are not the primary mode of searching for articles and books or for reading scholarly sources. mobile devices are most frequently used for viewing the library website when these advanced researchers are at home or in transit.
the results of this survey will be used to address the hmsc knowledge gaps around use of library resources and research tools via mobile devices. both graduate students and faculty lack awareness of library resources and services and have unsophisticated library research skills.35 while the osu main campus has library workshops for graduate students and faculty, these workshops have been inconsistently duplicated at the guin library. because the people working at hmsc come from such a wide variety of departments across osu that focus on marine sciences, hmsc has never had a library orientation. the results indicate possible value in devising ways to promote guin library's resources and services locally, which could include highlighting the availability of mobile library access. while several participants mentioned using research tools like evernote, pages, or zotero on their mobile devices, most participants did not report enhancing their mobile research experience with these mobile-friendly tools. workshops specifically modeling how to use mobile-friendly tools and apps such as dropbox, evernote, goodreader, or browzine could help introduce the benefits of these tools to these advanced researchers. because wireless access is even more of a concern for researchers at this branch location than for researchers at the main campus, database-specific apps will be explored to determine whether searching apps could help alleviate inconsistent wireless access. if database apps that are appropriate for marine science researchers are available, these will be promoted to this user population.

future research might involve follow-up interviews, focus groups, or ethnographic studies, which could expand our knowledge of these researchers' mobile-device behaviors and their perceptions of mobile devices. exploring the technology usage by these advanced researchers in their labs, including electronic lab notebooks or other tools, might be an interesting contrast to their use of mobile devices. in addition, as the hmsc campus grows with the expansion of the marine studies initiative, increasing numbers of undergraduates will use guin library. the ecar 2015 statistics show that current undergraduates own multiple internet-capable devices.36 presumably, these hmsc undergraduates will be likely to follow the trends seen in the ecar data. certainly, the plans to expand hmsc's internet and wireless infrastructure will affect all its users. our mobile survey gave us insights into how a sample of the hmsc population uses the library's resources and services. these observations will allow guin library to expand its services for the hmsc campus. we encourage other librarians to explore their unique user populations when evaluating services and resources.

references

1. maria anna jankowska, "identifying university professors' information needs in the challenging environment of information and communication technologies," journal of academic librarianship 30, no. 1 (2004): 51–66, https://doi.org/10.1016/j.jal.2003.11.007; pali u. kuruppu and anne marie gruber, "understanding the information needs of academic scholars in agricultural and biological sciences," journal of academic librarianship 32, no. 6 (2006): 609–23; lotta haglund and per olsson, "the impact on university libraries of changes in information behavior among academic researchers: a multiple case study," journal of academic librarianship 34, no. 1 (2008): 52–59, https://doi.org/10.1016/j.acalib.2007.11.010; nirmala gunapala, "meeting the needs of the 'invisible university': identifying information needs of postdoctoral scholars in the sciences," issues in science and technology librarianship, no. 77 (summer 2014), https://doi.org/10.5062/f4b8563p.
2. tina chrzastowski and lura joseph, "surveying graduate and professional students' perspectives on library services, facilities and collections at the university of illinois at urbana-champaign: does subject discipline continue to influence library use?," issues in science and technology librarianship no. 45 (winter 2006), https://doi.org/10.5062/f4dz068j; kuruppu and gruber, "understanding the information needs of academic scholars in agricultural and biological sciences"; haglund and olsson, "the impact on university libraries of changes in information behavior among academic researchers."

3. ellyssa kroski, "on the move with the mobile web: libraries and mobile technologies," library technology reports 44, no. 5 (2008): 1–48, https://doi.org/10.5860/ltr.44n5.

4. paula torres-pérez, eva méndez-rodríguez, and enrique orduna-malea, "mobile web adoption in top ranked university libraries: a preliminary study," journal of academic librarianship 42, no. 4 (2016): 329–39, https://doi.org/10.1016/j.acalib.2016.05.011.

5. david j. comeaux, "web design trends in academic libraries—a longitudinal study," journal of web librarianship 11, no. 1 (2017), 1–15, https://doi.org/10.1080/19322909.2016.1230031; zebulin evelhoch, "mobile web site ease of use: an analysis of orbis cascade alliance member web sites," journal of web librarianship 10, no. 2 (2016): 101–23, https://doi.org/10.1080/19322909.2016.1167649.

6. barbara blummer and jeffrey m. kenton, "academic libraries' mobile initiatives and research from 2010 to the present: identifying themes in the literature," in handbook of research on mobile devices and applications in higher education settings, ed. laura briz-ponce, juan juanes méndez, and josé francisco garcía-peñalvo (hershey, pa: igi global, 2016), 118–39.

7. jankowska, "identifying university professors' information needs in the challenging environment of information and communication technologies."

8. chrzastowski and joseph, "surveying graduate and professional students' perspectives on library services, facilities and collections at the university of illinois at urbana-champaign."

9. carole a. george et al., "scholarly use of information: graduate students' information seeking behaviour," information research 11, no. 4 (2006), http://www.informationr.net/ir/11-4/paper272.html.
10. kristin hoffman et al., "library research skills: a needs assessment for graduate student workshops," issues in science and technology librarianship 53 (winter-spring 2008), https://doi.org/10.5062/f48p5xfc; hannah gascho rempel and jeanne davidson, "providing information literacy instruction to graduate students through literature review workshops," issues in science and technology librarianship 53 (winter-spring 2008), https://doi.org/10.5062/f44x55rg.

11. jankowska, "identifying university professors' information needs in the challenging environment of information and communication technologies."

12. ka po lau et al., "educational usage of mobile devices: differences between postgraduate and undergraduate students," journal of academic librarianship 43, no. 3 (may 2017), 201–8, https://doi.org/10.1016/j.acalib.2017.03.004.

13. noa aharony, "mobile libraries: librarians' and students' perspectives," college & research libraries 75, no. 2 (2014): 202–17, https://doi.org/10.5860/crl12-415.

14. hannah gascho rempel and laurie m. bridges, "that was then, this is now: replacing the mobile-optimized site with responsive design," information technology and libraries 32, no. 4 (2013): 8–24, https://doi.org/10.6017/ital.v32i4.4636.

15. paula barnett-ellis and charlcie pettway vann, "the library right there in my hand: determining user needs for mobile services at a medium-sized regional university," southeastern librarian 62, no. 2 (2014): 10–15.

16. william t. caniano and amy catalano, "academic libraries and mobile devices: user and reader preferences," reference librarian 55, no. 4 (2014), 298–317, https://doi.org/10.1080/02763877.2014.929910.

17. haglund and olsson, "the impact on university libraries of changes in information behavior among academic researchers."

18. kuruppu and gruber, "understanding the information needs of academic scholars in agricultural and biological sciences."

19. christine wolff, alisa b. rod, and roger c. schonfeld, "ithaka s+r us faculty survey 2015," ithaka s+r, april 4, 2016, http://www.sr.ithaka.org/publications/ithaka-sr-us-faculty-survey-2015/.

20. m. macedo-rouet et al., "how do scientists select articles in the pubmed database? an empirical study of criteria and strategies," revue européenne de psychologie appliquée/european review of applied psychology 62, no. 2 (2012): 63–72.

21. rempel and bridges, "that was then, this is now."

22. ellie bushhousen et al., "smartphone use at a university health science center," medical reference services quarterly 32, no. 1 (2013): 52–72, https://doi.org/10.1080/02763869.2013.749134.

23. jill t. boruff and dale storie, "mobile devices in medicine: a survey of how medical students, residents, and faculty use smartphones and other mobile devices to find information," journal of the medical library association 102, no. 1 (2014): 22–30, https://doi.org/10.3163/1536-5050.102.1.006.
24. bushhousen et al., "smartphone use at a university health science center"; boruff and storie, "mobile devices in medicine."

25. eden dahlstrom et al., "ecar study of students and information technology, 2015," research report, educause center for analysis and research, 2015, https://library.educause.edu/~/media/files/library/2015/8/ers1510ss.pdf?la=en.

26. ibid., 24.

27. lutishoor salisbury, jozef laincz, and jeremy j. smith, "science and technology undergraduate students' use of the internet, cell phones and social networking sites to access library information," issues in science and technology librarianship 69 (spring 2012), https://doi.org/10.5062/f4sb43pd.

28. rempel and bridges, "that was then, this is now."

29. ibid.

30. "mobile/tablet operating system market share," netmarketshare, march 2017, https://www.netmarketshare.com/operating-system-market-share.aspx?qprid=8&qpcustomd=1.

31. boruff and storie, "mobile devices in medicine"; patrick lo et al., "use of smartphones by art and design students for accessing library services and learning," library hi tech 34, no. 2 (2016): 224–38, https://doi.org/10.1108/lht-02-2016-0015.

32. boruff and storie, "mobile devices in medicine."

33. dahlstrom et al., "ecar study of students and information technology, 2015."

34. caroline myrberg and ninna wiberg, "screen vs. paper: what is the difference for reading and learning?" insights 28, no. 2 (2015): 49–54, https://doi.org/10.1629/uksg.236.

35. barnett-ellis and vann, "the library right there in my hand"; haglund and olsson, "the impact on university libraries of changes in information behavior among academic researchers"; hoffman et al., "library research skills"; kuruppu and gruber, "understanding the information needs of academic scholars in agricultural and biological sciences"; lau et al., "educational usage of mobile devices"; macedo-rouet et al., "how do scientists select articles in the pubmed database?"

36. dahlstrom et al., "ecar study of students and information technology, 2015."

monitoring network and service availability with open-source software

t. michael silver (michael.silver@ualberta.ca) is an mlis student, school of library and information studies, university of alberta, edmonton, alberta, canada.

silver describes the implementation of a monitoring system using an open-source software package to improve the availability of services and reduce the response time when troubles occur.
he provides a brief overview of the literature available on monitoring library systems, and then describes the implementation of nagios, an open-source network monitoring system, to monitor a regional library system's servers and wide area network. particular attention is paid to using the plug-in architecture to monitor library services effectively. the author includes example displays and configuration files.

editor's note: this article is the winner of the lita/ex libris writing award, 2009.

library it departments have an obligation to provide reliable services both during and after normal business hours. the it industry has developed guidelines for the management of it services, but the library community has been slow to adopt these practices. the delay may be attributed to a number of factors, including a dependence on vendors and consultants for technical expertise, a reliance on librarians who have little formal training in it best practices, and a focus on automation systems instead of infrastructure. larger systems that employ dedicated it professionals to manage the organization's technology resources likely implement best practices as a matter of course and see no need to discuss them within the library community.

in the practice of system and network administration, thomas a. limoncelli, christina j. hogan, and strata r. chalup present a comprehensive look at best practices in managing systems and networks. early in the book they provide a short list of first steps toward improving it services, one of which is the implementation of some form of monitoring. they point out that without monitoring, systems can be down for extended periods before administrators notice or users report the problem.1 they dedicate an entire chapter to monitoring services. in it, they discuss the two primary types of monitoring: real-time monitoring, which provides information on the current state of services, and historical monitoring, which provides long-term data on uptime, use, and performance.2 while the software discussed in this article provides both types of monitoring, i focus on real-time monitoring and the value of problem identification and notification.

service monitoring does not appear frequently in library literature, and what is written often relates to single-purpose custom monitoring. an article in the september 2008 issue of ital describes the development and deployment of a wireless network, including a perl script written to monitor the wireless network and associated services.3 the script updates a webpage to display the results and sends an e-mail notifying staff of problems. an enterprise monitoring system could perform these tasks and present the results within the context of the complete infrastructure. it would require using advanced features because of the segregation of networks discussed in their article, but would require little more effort than it took to write the single-purpose script. dave pattern at the university of huddersfield shared another perl script that monitors opac functionality.4 again, the script provided a single-purpose monitoring solution that could be integrated within a larger model. below, i discuss how i modified his script to provide more meaningful monitoring of our opac than the stock webpage monitoring plug-in included with our open-source network monitoring system, nagios.

service monitoring can consist of a variety of tests.
in its simplest form, a ping test will verify that a host (server or device) is powered on and successfully connected to the network. feher and sondag used ping tests to monitor the availability of the routers and access points on their network, as do i for monitoring connectivity to remote locations.5 a slightly more meaningful check would test for the establishment of a connection on a port. feher and sondag used this method to check the daemons in their network.6 a step further would be to evaluate a service response, for example, checking the status code returned by a web server. evaluating content forms the next level of meaning. limoncelli, hogan, and chalup discuss end-to-end monitoring, where the monitoring system actually performs meaningful transactions and evaluates the results.7 pattern's script, mentioned above, tests opac functionality by submitting a known keyword search and evaluating the response.8 i implemented this after an incident where nagios failed to alert me to a problem with the opac. the web server returned a status code of 200 to the request for the search page. users, however, want more from an opac, and attempts to search were unsuccessful because of problems with the index server. modifying pattern's original script, i was able to put together a custom check command that verifies a greater level of functionality by evaluating the number of results for the known search.

software selection

limoncelli, hogan, and chalup do not address specific how-to issues and rarely mention specific products. their book provides the foundational knowledge necessary to identify what must be done. in terms of monitoring, they leave the selection of an appropriate tool to the reader.9 myriad monitoring tools exist, both commercial and open-source. some focus on network analysis, and some even target specific brands or model lines. the selection of a specific software package should depend on the services being monitored and the goals for the monitoring. wikipedia lists thirty-five different products, of which eighteen are commercial (some with free versions with reduced functionality or features); fourteen are open-source projects under a general public license or similar license (some with commercial support available but without different feature sets or licenses); and three offer different versions under different licenses.10 von hagen and jones suggest two of them: nagios and zabbix.11

i selected the nagios open-source product (http://www.nagios.org). the software has an established history of active development, a large and active user community, a significant number of included and user-contributed extensions, and multiple books published on its use. commercial support is available from a company founded by the creator and lead developer as well as other authorized solution providers. monitoring appliances based on nagios are available, as are sensors designed to interoperate with nagios. because of the flexibility of a software design that uses a plug-in architecture, service checks for library-specific applications can be implemented. if a check or action can be scripted using practically any protocol or programming language, nagios can monitor it. nagios also provides a variety of information displays, as shown in appendixes a–e.
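to make the idea of a content-level check concrete, the listing below is a minimal sketch in the spirit of pattern's approach and of the custom check described above. it is not the check_hip_search plug-in from appendix k; the url, the search pattern, and the expected result count are hypothetical placeholders that would need to be adapted to a local opac. the exit codes follow the standard nagios plug-in convention (0 = ok, 1 = warning, 2 = critical, 3 = unknown).

#!/usr/bin/perl
# minimal sketch of a content-level opac check (hypothetical values).
# nagios plug-in exit codes: 0 = ok, 1 = warning, 2 = critical, 3 = unknown.
use strict;
use warnings;
use LWP::UserAgent;

my $url      = 'http://opac.example.org/search?term=salmon';   # hypothetical known-search url
my $expected = 42;                                              # known result count for that search

my $ua       = LWP::UserAgent->new( timeout => 10 );
my $response = $ua->get($url);

if ( !$response->is_success ) {
    print "OPAC CRITICAL: search page not returned (", $response->status_line, ")\n";
    exit 2;
}

# assume the result page reports "NNN titles matched" somewhere in the html
if ( $response->decoded_content =~ /(\d+)\s+titles matched/i ) {
    my $hits = $1;
    if ( $hits == $expected ) {
        print "OPAC OK: known search returned $hits titles\n";
        exit 0;
    }
    print "OPAC WARNING: known search returned $hits titles, expected $expected\n";
    exit 1;
}

print "OPAC CRITICAL: page returned but no result count found (index server down?)\n";
exit 2;

a command definition pointing at a script like this can then be referenced by a service definition in the same way as the stock plug-ins shown in the appendixes.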
installation

the nagios system provides an extremely flexible solution for monitoring hosts and services. the object orientation and use of plug-ins allow administrators to monitor any aspect of their infrastructure or services using standard plug-ins, user-contributed plug-ins, or custom scripts. additionally, the open-source nature of the package allows independent development of extensions to add features or integrate the software with other tools. community sites such as monitoringexchange (formerly nagios exchange), nagios community, and nagios wiki provide repositories of documentation, plug-ins, extensions, and other tools designed to work with nagios.12 but that flexibility comes at a cost: nagios has a steep learning curve, and user-contributed plug-ins often require the installation of other software, most notably perl modules.

nagios runs on a variety of linux, unix, and berkeley software distribution (bsd) operating systems. for testing, i used a standard linux server distribution installed on a virtual machine. virtualization provides an easy way to test software, especially if an alternate operating system is needed. if given sufficient resources, a virtual machine is capable of running the production instance of nagios. after installing and updating the operating system, i installed the following packages:

- apache web server
- perl
- gd development library, needed to produce graphs and status maps
- libpng-devel and libjpeg-devel, both needed by the gd library
- gcc and gnu make, which are needed to compile some plug-ins and perl modules

most major linux and bsd distributions include nagios in their software repositories for easy installation using the native package management system. although the software in the repositories is often not the most recent version, using these repositories simplifies the installation process. if a reasonably recent version of the software is available from a repository, i will install from there. some software packages are either outdated or not available, and i manually install these. detailed installation instructions are available on the nagios website, in several books, and on the previously mentioned websites.13 the documentation for version 3 includes a number of quick-start guides.14 most package managers will take care of some of the setup, including modifying the apache configuration file to create an alias available at http://server.name/nagios. i prepared the remainder of this article using the latest stable versions of nagios (3.0.6) and the plug-ins (1.4.13) at the time of writing.

configuration

nagios configuration relies on an object model, which allows a great deal of flexibility but can be complex. planning your configuration beforehand is highly recommended. nagios has two main configuration files, cgi.cfg and nagios.cfg. the former is primarily used by the web interface to authenticate users and control access, and it defines whether authentication is used and which users can access what functions. the latter is the main configuration file and controls all other program operations. the cfg_file and cfg_dir directives allow the configuration to be split into manageable groups using additional resource files and the object definition files (see figure 1). the flexibility offered allows a variety of different structures. i group network devices into groups but create individual files for each server. nagios uses an object-oriented design.

figure 1. nagios configuration relationships. copyright © 2009 ethan galstead, nagios enterprises. used with permission.
the objects in nagios are displayed in table 1.

table 1. nagios objects

object        | used for
hosts         | servers or devices being monitored
hostgroups    | groups of hosts
services      | services being monitored
servicegroups | groups of services
timeperiods   | scheduling of checks and notifications
commands      | checking hosts and services; notifying contacts; processing performance data; event handling
contacts      | individuals to alert
contactgroups | groups of contacts

a complete review of nagios configuration is beyond the scope of this article. the documentation installed with nagios covers it in great detail. special attention should be paid to the concepts of templates and object inheritance, as they are vital to creating a manageable configuration. the discussion below provides a brief introduction, while appendixes f–j provide concrete examples of working configuration files.

cgi.cfg

the cgi.cfg file controls the web interface and its associated cgi (common gateway interface) programs. during testing, i often turn off authentication by setting use_authentication to 0 if the web interface is not accessible from the internet. there also are various configuration directives that provide greater control over which users can access which features. the users are defined in the /etc/nagios/htpasswd.users file. a summary of commands to control entries is presented in table 2.

table 2. sample commands for managing the htpasswd.users file

create or modify an entry, with password entered at a prompt: htpasswd /etc/nagios/htpasswd.users
create or modify an entry using a password from the command line: htpasswd -b /etc/nagios/htpasswd.users
delete an entry from the file: htpasswd -d /etc/nagios/htpasswd.users

the web interface includes other features, such as sounds, status map displays, and integration with other products. discussion of these directives is beyond the scope of this article. the cgi.cfg file provided with the software is well commented, and the nagios documentation provides additional information. a number of screenshots from the web interface are provided in the appendixes, including status displays and reporting.

nagios.cfg

the nagios.cfg file controls the operation of everything except the web interface. although it is possible to have a single monolithic configuration file, organizing the configuration into manageable files works better. the two main directives of note are cfg_file, which defines a single file that should be included, and cfg_dir, which includes all files in the specified directory with a .cfg extension. a third type of file that gets included is resource.cfg, which defines various macros for use in commands. organizing the object files takes some thought. i monitor more than one hundred services on roughly seventy hosts, so the method of organizing the files was of more than academic interest. i use the following configuration files:

- commands.cfg, containing command definitions
- contacts.cfg, containing the list of contacts and associated information, such as e-mail addresses (see appendix h)
- groups.cfg, containing all groups: hostgroups, servicegroups, and contactgroups (see appendix g)
- templates.cfg, containing all object templates (see appendix f)
- timeperiods.cfg, containing the time ranges for checks and notifications

all devices and servers that i monitor are placed in directories using the cfg_dir directive:

- servers, containing server configurations. each file includes the host and service configurations for a physical or virtual server.
- devices, containing device information. i create individual files for devices with service monitoring that goes beyond simple ping tests for connectivity.
devices monitored solely for connectivity are grouped logically into a single file. for example, we monitor connectivity with fifty remote locations, and all fifty of them are placed in a single file.

the resource.cfg file uses two macros to define the path to plug-ins and event handlers. thirty other macros are available. because the cgi programs do not read the resource file, restrictive permissions can be applied to it, enabling some of the macros to be used for usernames and passwords needed in check commands. placing sensitive information in service configurations exposes that information to the web server, creating a security issue.

configuration

the appendixes include the object configuration files for a simple monitoring situation. a switch is monitored using a simple ping test (see appendix j), while an opac server on the other side of the switch is monitored for both web and z39.50 operations (see appendix i). note that the opac configuration includes a parents directive that tells nagios that a problem with the gateway switch will affect connectivity with the opac server. i monitor fifty remote sites. if my router is down, a single notification regarding my router provides more information if it is not buried in a storm of notifications about the remote sites.

the web port, web service, and opac search services demonstrate different levels of monitoring. the web port check simply attempts to establish a connection to port 80 without evaluating anything beyond a successful connection. the web service check requests a specific page from the web server and evaluates only the status code returned by the server. it displays a warning because i configured the check to download a file that does not exist; the web server is running because it returns an error code, hence the warning status. the opac search uses a known search to evaluate the result content, specifically whether the correct number of results is returned for a known search.

i used a number of templates in the creation of this configuration. templates reduce the amount of repetitive typing by allowing the reuse of directives. templates can be chained, as seen in the host templates. the opac definition uses the linux-server template, which in turn uses the generic-host template. the host definition inherits the directives of the template it uses, overriding any elements in both and adding new elements. in practical terms, generic-host directives are read first. linux-server directives are applied next. if there is a conflict, the linux-server directive takes precedence. finally, opac is read. again, any conflicts are resolved in favor of the last configuration read, in this case opac.

plug-ins and service checks

the nagios plugins package provides numerous plug-ins, including the check-host-alive, check_ping, check_tcp, and check_http commands. using the plug-ins is straightforward, as demonstrated in the appendixes. most plug-ins will provide some information on use if executed with --help supplied as an argument to the command. by default, the plug-ins are installed in /usr/lib/nagios/plugins. some distributions may install them in a different directory. the plugins folder contains a subfolder with user-contributed scripts that have proven useful. most of these plug-ins are perl scripts, many of which require additional perl modules available from the comprehensive perl archive network (cpan). the check_hip_search plug-in (appendix k) used in the examples requires additional modules.
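as an aside, contributed plug-ins sometimes fail with an opaque perl error when one of those modules is missing. the short fragment below is a generic illustration, not part of check_hip_search, of how a plug-in can instead report the problem through the nagios unknown state; the module names are examples only.

#!/usr/bin/perl
# generic illustration: report a missing perl module as the nagios
# unknown state (exit code 3) instead of dying with a raw perl error.
use strict;
use warnings;

my @required = ( 'LWP::UserAgent', 'XML::Simple' );    # example module names only

for my $module (@required) {
    eval "require $module";
    if ($@) {
        print "UNKNOWN: required perl module $module is not installed\n";
        exit 3;
    }
}

print "OK: all required modules are available\n";
exit 0;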
installing perl modules is best accomplished using the cpan perl module. detailed instructions on module installation are available online.15 some general tips:

- gcc and make should be installed before trying to install perl modules, regardless of whether you are installing manually or using cpan. most modules are provided as source code, which may require compiling before use. cpan automates this process but requires the presence of these packages.
- alternatively, many linux distributions provide perl module packages. using repositories to install usually works well, assuming the repository has all the needed modules. in my experience, that is rarely the case.
- many modules depend on other modules, sometimes requiring multiple install steps. both cpan and distribution package managers usually satisfy dependencies automatically. manual installation requires the installer to satisfy the dependencies one by one.
- most plug-ins provide information on required software, including modules, in a readme file or in the source code for the script. in the absence of such documentation, running the script on the command line usually produces an error containing the name of the missing module.
- testing should be done using the nagios user. using another user account, especially the root user, to create directories, copy files, and run programs creates folders and files that are not accessible to the nagios user. the best practice is to use the nagios user for as much of the configuration and testing as possible. the lists and forums frequently include questions from new users who have successfully installed, configured, and tested nagios as the root user and are confused when nagios fails to start or function properly.

advanced topics

once the system is running, more advanced features can be explored. the documentation describes many such enhancements, but the following may be particularly useful depending on the situation.

- nagios provides access control through the combination of settings in the cgi.cfg and htpasswd.users files. library administration and staff, as well as patrons, may appreciate the ability to see the status of the various systems. however, care should be taken to avoid disclosing sensitive information regarding the network or passwords, or allowing access to cgi programs that perform actions.
- nagios permits the establishment of dependency relationships. host dependencies may be useful in some rare circumstances not covered by the parent–child relationships mentioned above, but service dependencies provide a method of connecting services in a meaningful manner. for example, certain opac functions are dependent on ils services. defining these relationships takes both time and thought, which may be worthwhile depending on any given situation.
- event handlers allow nagios to initiate certain actions after a state change. if nagios notices that a particular service is down, it can run a script or program to attempt to correct the problem.
care should be taken when creating these scripts as service restarts may delete or overwrite information critical to solving a problem, or worsen the actual situation if an attempt to restart a service or reboot a server fails. n nagios provides notification escalations, permitting the automatic notification of problems that last longer than a certain time. for example, a service escalation could send the first three alerts to the admin group. if properly configured, the fourth alert would be sent to the managers group as well as the admin group. in addition to escalating issues to management, this feature can be used to establish a series of responders for multiple on-call personnel. n nagios can work in tandem with remote machines. in addition to custom scripts using secure shell (ssh), the nagios remote plug-in executor (nrpe) add-on allows the execution of plug-ins on remote machines, while the nagios service check acceptor (nsca) add-on allows a remote host to submit check results to the nagios server for processing. implementing nagios on the feher and sondag wireless network mentioned earlier would require one of these options because the wireless network is not accessible from the external network. these add-ons also allow for distributed monitoring, sharing the load among a number of servers while still providing the administrators with a single interface to the entire monitored network. the nagios exchange (http://exchange.nagios .org/) contains similar user-contributed programs for windows. n nagios can be configured to provide redundant or failover monitoring. limoncelli, hogan, and chalup call this metamonitoring and describe when it is needed and how it can be implemented, suggesting self-monitoring by the host or having a second monitoring system that only monitors the main system.16 nagios permits more complex configurations, allowing for either two servers operating in parallel, only one of which sends notifications unless the main server fails, or two servers communicating to share the monitoring load. n alternative means of notification increase access to information on the status of the network. i implemented another open-source software package, quickpage, which allows nagios text messages to be sent from a computer to a pager or cell phone.17 appendix l shows a screenshot of a firefox extension that displays host and service problems in the status bar of my browser and provides optional audio alerts.18 the nagios community has developed a number of alternatives, including specialized web interfaces and rss feed generators.19 monitoring network and service availability with open-source software | silver 13 n appropriate use monitoring uses bandwidth and adds to the load of machines being monitored. accordingly, an it department should only monitor its own servers and devices, or those for which it has permission to do so. imagine what would happen if all the users of a service such as worldcat started monitoring it! the additional load would be noticeable and could conceivably disrupt service. aside from reasons connected with being a good “netizen,” monitoring appears similar to port-scanning, a technique used to discover network vulnerabilities. an organization that blithely monitors devices without the owner’s permission may find their traffic is throttled back or blocked entirely. if a library has a definite need to monitor another service, obtaining permission to do so is a vital first step. 
if permission is withheld, the service level agreement between the library and its service provider or vendor should be reevaluated to ensure that the provider has an appropriate system in place to respond to problems. n benefits the system-administration books provide an accurate overview of the benefits of monitoring, but personally reaping those benefits provides a qualitative background to the experience. i was able to justify the time spent on setting up monitoring the first day of production. one of the available plug-ins monitors sybase database servers. it was one of the first contributed plug-ins i implemented because of past experiences with our production database running out of free space, causing the system to become nonfunctional. this happened twice, approximately a year apart. each time, the integrated library system was down while the vendor addressed the issue. when i enabled the sybase service checks, nagios immediately returned a warning for the free space. the advance warning allowed me to work with the vendor to extend the database volume with no downtime for our users. that single event convinced the library director of the value of the system. since that time, nagios has proven its worth in alerting it staff to problem situations, providing information on outage patterns both for in-house troubleshooting and discussions with service providers. n conclusion monitoring systems and services provides it staff with a vital tool in providing quality customer service and managing systems. installing and configuring such a system involves a learning curve and takes both time and computing resources. my experiences with nagios have convinced me that the return on investment more than justifies the costs. references 1. thomas a. limoncelli, christina j. hogan, and strata r. chalup, the practice of system and network administration, 2nd ed. (upper saddle river, n.j.: addison-wesley, 2007): 36. 2. ibid., 523–42. 3. james feher and tyler sondag, “administering an opensource wireless network,” information technology & libraries 27, no. 3 (sept. 2008): 44–54. 4. dave pattern, “keeping an eye on your hip,” online posting, jan. 23, 2007, self-plagiarism is style, http://www.daveyp .com/blog/archives/164 (accessed nov. 20, 2008). 5. feher and sondag, “administering an open-source wireless network,” 45–54. 6. ibid., 48, 53–54. 7. limoncelli, hogan, and chalup, the practice of system and network administration, 539–40. 8. pattern, “keeping an eye on your hip.” 9. limoncelli, hogan, and chalup, the practice of system and network administration, xxv. 10. “comparison of network monitoring systems,” wikipedia, the free encyclopedia, dec. 9, 2008, http://en.wikipedia .org/wiki/comparison_of_network_monitoring_systems (accessed dec. 10, 2008). 11. william von hagen and brian k. jones, linux server hacks, vol. 2 (sebastopol, calif.: o’reilly, 2005): 371–74 (zabbix), 382–87 (nagios). 12. monitoringexchange, http://www.monitoringexchange. org/ (accessed dec. 23, 2009); nagios community, http:// community.nagios.org (accessed dec. 23, 2009); nagios wiki, http://www.nagioswiki.org/ (accessed dec. 23, 2009). 13. “nagios documentation,” nagios, mar. 4, 2008, http:// www.nagios.org/docs/ (accessed dec. 8, 2008); david josephsen, building a monitoring infrastructure with nagios (upper saddle river, n.j.: prentice hall, 2007); wolfgang barth, nagios: system and network monitoring, u.s. ed. (san francisco: open source press; no starch press, 2006). 14. 
ethan galstead, “nagios quickstart installation guides,” nagios 3.x documentation, nov. 30, 2008, http://nagios.source forge.net/docs/3_0/quickstart.html (accessed dec. 3, 2008). 15. the perl directory, (http://www.perl.org/) contains complete information on perl. specific information on using cpan is available in “how do i install a module from cpan?” perlfaq8, nov. 7, 2007, http://perldoc.perl.org/perlfaq8.html (accessed dec. 4, 2008). 16. limoncelli, hogan, and chalup, the practice of system and network administration, 539–40. 17. thomas dwyer iii, qpage solutions, http://www.qpage .org/ (accessed dec. 9, 2008). 18. petr šimek, “nagioschecker,” google code, aug. 12, 2008, http://code.google.com/p/nagioschecker/ (accessed dec. 8, 2008). 19. “notifications,” monitoringexchange, http://www .monitoringexchange.org/inventory/utilities/addon-projects/notifications (accessed dec. 23, 2009). 14 information technology and libraries | march 2010 appendix a. service detail display from test system appendix b. service details for opac (hip) and ils (horizon) servers from production system appendix c. sybase freespace trends for a specified period appendix d. connectivity history for a specified period appendix e. availability report for host shown in appendix d appendix f. templates.cfg file ############################################################################ # templates.cfg sample object templates ############################################################################ ############################################################################ # contact templates ############################################################################ monitoring network and service availability with open-source software | silver 15 # generic contact definition template this is not a real contact, just # a template! define contact{ name generic-contact service_notification_period 24x7 host_notification_period 24x7 service_notification_options w,u,c,r,f,s host_notification_options d,u,r,f,s service_notification_commands notify-service-by-email host_notification_commands notify-host-by-email register 0 } ############################################################################ # host templates ############################################################################ # generic host definition template this is not a real host, just # a template! define host{ name generic-host notifications_enabled 1 event_handler_enabled 1 flap_detection_enabled 1 failure_prediction_enabled 1 process_perf_data 1 retain_status_information 1 retain_nonstatus_information 1 notification_period 24x7 register 0 } # linux host definition template this is not a real host, just a template! define host{ name linux-server use generic-host check_period 24x7 check_interval 5 retry_interval 1 max_check_attempts 10 check_command check-host-alive notification_period workhours notification_interval 120 notification_options d,u,r contact_groups admins register 0 } appendix f. templates.cfg file (cont.) 
16 information technology and libraries | march 2010 # define a template for switches that we can reuse define host{ name generic-switch use generic-host check_period 24x7 check_interval 5 retry_interval 1 max_check_attempts 10 check_command check-host-alive notification_period 24x7 notification_interval 30 notification_options d,r contact_groups admins register 0 } ############################################################################ # service templates ############################################################################ # generic service definition template this is not a real service, # just a template! define service{ name generic-service active_checks_enabled 1 passive_checks_enabled 1 parallelize_check 1 obsess_over_service 1 check_freshness 0 notifications_enabled 1 event_handler_enabled 1 flap_detection_enabled 1 failure_prediction_enabled 1 process_perf_data 1 retain_status_information 1 retain_nonstatus_information 1 is_volatile 0 check_period 24x7 max_check_attempts 3 normal_check_interval 10 retry_check_interval 2 contact_groups admins notification_options w,u,c,r notification_interval 60 notification_period 24x7 register 0 } appendix f. templates.cfg file (cont.) monitoring network and service availability with open-source software | silver 17 # define a ping service. this is not a real service, just a template! define service{ use generic-service name ping-service notification_options n check_command check_ping!1000.0,20%!2000.0,60% register 0 } appendix f. templates.cfg file (cont.) appendix g. groups.cfg file ############################################################################ # contact group definitions ############################################################################ # we only have one contact in this simple configuration file, so there is # no need to create more than one contact group. define contactgroup{ contactgroup_name admins alias nagios administrators members nagiosadmin } ############################################################################ # host group definitions ############################################################################ # define an optional hostgroup for linux machines define hostgroup{ hostgroup_name linux-servers ; the name of the hostgroup alias linux servers ; long name of the group } # create a new hostgroup for ils servers define hostgroup{ hostgroup_name ils-servers ; the name of the hostgroup alias ils servers ; long name of the group } # create a new hostgroup for switches define hostgroup{ hostgroup_name switches ; the name of the hostgroup alias network switches ; long name of the group } ############################################################################ # service group definitions ############################################################################ 18 information technology and libraries | march 2010 # define a service group for network connectivity define servicegroup{ servicegroup_name network alias network infrastructure services } # define a servicegroup for ils define servicegroup{ servicegroup_name ils-services alias ils related services } appendix g. groups.cfg file (cont.) appendix h. 
contacts.cfg ############################################################################ # contacts.cfg sample contact/contactgroup definitions ############################################################################ # just one contact defined by default the nagios admin (that’s you) # this contact definition inherits a lot of default values from the # ‘generic-contact’ template which is defined elsewhere. define contact{ contact_name nagiosadmin use generic-contact alias nagios admin email nagios@localhost } appendix i. opac.cfg ############################################################################ # opac server ############################################################################ ############################################################################ # host definition ############################################################################ # define a host for the server we’ll be monitoring # change the host_name, alias, and address to fit your situation define host{ use linux-server host_name opac parents gateway-switch alias opac server monitoring network and service availability with open-source software | silver 19 appendix i. opac.cfg (cont.) address 192.168.1.123 } ############################################################################ # service definitions ############################################################################ # create a service for monitoring the http port define service{ use generic-service host_name opac service_description web port check_command check_tcp!80 } # create a service for monitoring the web service define service{ use generic-service host_name opac service_description web service check_command check_http!-u/bogusfilethatdoesnotexist.html } # create a service for monitoring the opac search define service{ use generic-service host_name opac service_description opac search check_command check_hip_search } # create a service for monitoring the z39.50 port define service{ use generic-service host_name opac service_description z3950 port check_command check_tcp!210 } appendix j. switches.cfg ############################################################################ # switch.cfg sample config file for monitoring switches ############################################################################ ############################################################################ # host definitions ############################################################################ 20 information technology and libraries | march 2010 appendix k. check_hip_search script #!/usr/bin/perl -w ######################### # check horizon information portal (hip) status. # hip is the web-based interface for dynix and horizon # ils systems by sirsidynix corporation. # # this plugin is based on a standalone perl script written # by dave pattern. please see # http://www.daveyp.com/blog/index.php/archives/164/ # for the original script. # # the original script and this derived work are covered by # http://creativecommons.org/licenses/by-nc-sa/2.5/ ######################### use strict; use lwp::useragent; # note the requirement for perl module lwp::useragent! 
use lib “/usr/lib/nagios/plugins”; use utils qw($timeout %errors); # define the switch that we’ll be monitoring define host{ use generic-switch host_name gateway-switch alias gateway switch address 192.168.0.1 hostgroups switches } ############################################################################ ### # service definitions ############################################################################ ### # create a service to ping to switches # note this entry will ping every host in the switches hostgroup define service{ use ping-service hostgroups switches service_description ping normal_check_interval 5 retry_check_interval 1 } appendix j. switches.cfg monitoring network and service availability with open-source software | silver 21 ### some configuration options my $hipserverhome = “http://ipac.prl.ab.ca/ipac20/ipac. jsp?profile=alap”; my $hipserversearch = “http://ipac.prl.ab.ca/ipac20/ipac.jsp?menu=se arch&aspect=subtab132&npp=10&ipp=20&spp=20&profile=alap&ri=&index=.gw&term=li nux&x=18&y=13&aspect=subtab132&getxml=true”; my $hipsearchtype = “xml”; my $httpproxy = ‘’; ### check home page is available... { my $ua = lwp::useragent->new; $ua->timeout( 10 ); if( $httpproxy ) { $ua->proxy( ‘http’, $httpproxy ) } my $response = $ua->get( $hipserverhome ); my $status = $response->status_line; if( $response->is_success ) { } else { print “hip_search critical: $status\n”; exit $errors{‘critical’}; } } ### check search page is returning results... { my $ua = lwp::useragent->new; $ua->timeout( 10 ); if( $httpproxy ) { $ua->proxy( ‘http’, $httpproxy ) } my $response = $ua->get( $hipserversearch ); my $status = $response->status_line; if( $response->is_success ) { my $results = 0; my $content = $response->content; if( lc( $hipsearchtype ) eq ‘html’ ) { if ( $content =~ /\(\d+?)\<\/b\>\ \;titles matched/ ) { $results = $1; appendix k. check_hip_search script (cont.) 22 information technology and libraries | march 2010 } } if( lc( $hipsearchtype ) eq ‘xml’ ) { if( $content =~ /\(\d+?)\<\/hits\>/ ) { $results = $1; } } ### modified section original script triggered another function to ### save results to a temp file and email an administrator. unless( $results ) { print “hip_search critical: no results returned|results=0\n”; exit $errors{‘critical’}; } if ( $results ) { print “hip_search ok: $results results returned|results=$results\n”; exit $errors{‘ok’}; } } } appendix k. check_hip_search script (cont.) appendix l. nagios checker display persistent urls and citations offered for digital objects by digital libraries article persistent urls and citations offered for digital objects by digital libraries nicholas homenda information technology and libraries | june 2021 https://doi.org/10.6017/ital.v40i2.12987 abstract as libraries, archives, and museums make unique digital collections openly available via digital library platforms, they expose these resources to users who may wish to cite them. often several urls are available for a single digital object, depending on which route a user took to find it, but the chosen citation url should be the one most likely to persist over time. catalyzed by recent digital collections migration initiatives at indiana university libraries, this study investigates the prevalence of persistent urls for digital objects at peer institutions and examines the ways their platforms instruct users to cite them. 
this study reviewed institutional websites from the digital library federation’s (dlf) published list of 195 members and identified representative digital objects from unique digital collections navigable from each institution’s main web page in order to determine persistent url formats and citation options. findings indicate an equal split between offering and not offering discernible persistent urls with four major methods used: handle, doi, ark, and purl. significant variation in labeling persistent urls and inclusion in item-specific citations uncovered areas where the user experience could be improved for more reliable citation of these unique resources.
nicholas homenda (nhomenda@indiana.edu) is digital initiatives librarian, indiana university bloomington. © 2021.
introduction
libraries, archives, and museums often make their unique digital collections openly available in digital library services and in different contexts, such as digital library aggregators like the digital public library of america (dpla, https://dp.la/) and hathitrust digital library (https://www.hathitrust.org/). as a result, there can be many urls available that point to digital objects within these collections. take, for example, image collections online (http://dlib.indiana.edu/collections/images) at indiana university (iu), a service launched in 2007 featuring open access iu image collections. users discover images on the site through searching and browsing and its collections are also shared with dpla. the following urls exist for the digital object shown in figure 1, an image from the building a nation: indiana limestone photograph collection:
• the url as it appears in the browser in image collections online: https://webapp1.dlib.indiana.edu/images/item.htm?id=http://purl.dlib.indiana.edu/iudl/images/vac5094/vac5094-01446
• the persistent url on that page (“bookmark this page at”): http://purl.dlib.indiana.edu/iudl/images/vac5094/vac5094-01446
• the url pasted from the browser for the image in dpla: https://dp.la/item/eb83ff0a6ae507e2ba441634f7eb0f18?q=indiana%20limestone
as a digital library or collection manager, which url would you prefer to see cited for this object?
figure 1. an example of a digital object with multiple urls. mcmillan mill, ilco id in2288_1. courtesy, indiana geological and water survey, indiana university, bloomington, indiana. retrieved from image collections online at http://purl.dlib.indiana.edu/iudl/images/vac5094/vac5094-01446.
citation instructions given to authors in major style guides explicitly mention using the best possible form of a resource’s url: “[i]t is important to choose the version of the url that is most likely to continue to point to the source cited.”1 of the three urls above, the second is a purl, or persistent url (https://archive.org/services/purl/), which is why both image collections online and dpla instruct users to bookmark or cite it.
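to make the preference concrete: a citation helper can favor the persistent url whenever one is recorded for an object and fall back to the browser url only when it is not. the short python sketch below is purely illustrative and is not the behavior of image collections online, dpla, or any other platform; the record fields, the function name, and the fallback rule are assumptions made for the example, and the sample values are taken from the figure 1 object above.

from dataclasses import dataclass
from typing import Optional

@dataclass
class DigitalObjectRecord:
    # hypothetical item record; the field names are not taken from any platform
    title: str
    collection: str
    institution: str
    display_url: str
    persistent_url: Optional[str] = None

def recommended_citation(obj: DigitalObjectRecord) -> str:
    """build a simple citation string, preferring the persistent url when one exists."""
    url = obj.persistent_url or obj.display_url
    return f"{obj.title}. {obj.collection}, {obj.institution}. retrieved from {url}"

item = DigitalObjectRecord(
    title="mcmillan mill, ilco id in2288_1",
    collection="image collections online",
    institution="indiana university",
    display_url="https://webapp1.dlib.indiana.edu/images/item.htm?id=...",  # truncated browser url
    persistent_url="http://purl.dlib.indiana.edu/iudl/images/vac5094/vac5094-01446",
)
print(recommended_citation(item))

running the sketch prints a citation ending in the purl rather than the longer browser url, which is the outcome the style-guide advice above asks for.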
other common methods for issuing and maintaining persistent urls include digital object identifiers (doi, https://www.doi.org/), handles (http://handle.net/), and archival resource keys (ark, https://n2t.net/e/ark_ids.html). all of those have been around since the late 1990s to early 2000s. at indiana university libraries, recent efforts have focused on migrating digital collections to new digital library platforms, mainly based on the open source samvera repository software (https://samvera.org/). as part of these efforts, we wanted to survey how peer institutions were employing persistent, citable urls for digital objects to determine if a prevailing approach had emerged since indiana university libraries’ previous generation of digital library services were developed in the early- to mid-2000s. besides having the capability of creating and reliably serving these urls, our digital library platforms need to make these urls easily accessible to users, preferably along with some assertion that the urls should be used when citing digital objects and collections instead of the many non-persistent urls also directing to those same digital objects and collections. although libraries, archives, and museums have digitized and made digital objects in digital collections openly accessible for decades using several methods for providing persistent, citable urls, how do institutions now present digital object urls to people who encounter, use, and cite them? by examining digital collections within a large population of digital library institutions’ websites, this study aims to discover
1. what methods of url persistence are being employed for digital objects by digital library institutions?
2. how do these institutions’ websites instruct users to cite these digital objects?
literature review
the study of digital objects in the literature often takes a philosophical perspective in attempting to define them. moreover, practical accounts of digital object use and reuse note the challenges associated with infrastructure, retrieval, and provenance. much of the literature about common methods of persistent url resolution comes from individuals and entities who developed and maintain these standards, as well as overviews of the persistent url resolution methods available. finally, several studies have investigated the problem of “link rot” by tracking the availability of web-hosted resources over time.
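the link-rot studies reviewed later in this section all come down to one operational question: does a previously published url still resolve, and if so, where does it land after redirects? a minimal availability check can be scripted with python's standard library alone, as in the sketch below; the function name, the ten-second timeout, and the decision to record the final redirect target are assumptions of the example rather than the method of any study cited here.

import urllib.error
import urllib.request

def check_url(url: str, timeout: float = 10.0) -> dict:
    """report whether a url still resolves and the final location after redirects."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # urllib follows http redirects automatically; geturl() reports the final address
            return {"url": url, "ok": True, "status": response.status, "resolved_to": response.geturl()}
    except (urllib.error.URLError, ValueError) as exc:
        return {"url": url, "ok": False, "error": str(exc)}

# the purl from the introduction; whether it still resolves depends on the live service
print(check_url("http://purl.dlib.indiana.edu/iudl/images/vac5094/vac5094-01446"))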
allison notes the generations of philosophical thought that it took to recognize common characteristics of physical objects and the difficulty in understanding an authentic version of a digital object, especially with different computer hardware and software changing the way digital objects appear.2 hui also investigates the philosophical history of physical objects to begin to define digital objects through his methods of datafication of objects and objectification of data, noting that digital objects can be approached in three phases: objects, data, and networks, in order to define them.3 lynch is also concerned with determining the authenticity of digital objects and challenges inherent in the digital realm. in describing digital objects, he creates a hierarchy with raw data at the bottom, elevated to interactive experiential works at the top which elicit the fullest emotional connection contributing to the authentic experience of the work.4 the literature often examines digital objects from the practitioner’s perspective, such as the publishing industry’s difficulty in repurposing digital objects for new publishing products. publishers in benoit and hussey’s 2011 case study note the tension between managers and technical staff concerning assumptions about what their computer system could automatically do with their digital objects; their digital objects always require some human labor and intervention to be accurately described and retrievable later. 5 dappert et al. note the need to describe a digital object’s environment in order to be able to reproduce it in their work with the premis data dictionary for preservation metadata (https://www.loc.gov/standards/premis/).6 strubulis et al. provide a model for digital object provenance using inference and resource description framework (rdf) triples (https://w3.org/rdf/) since storing full provenance information for https://www.loc.gov/standards/premis/ https://w3.org/rdf/ information technology and libraries june 2021 persistent urls and citations offered for digital objects by digital libraries | homenda 4 complex digital objects, such as the large amount of mars rover data they offer as an example, would be cost prohibitive.7 in 2001, arms describes the landscape of persistent uniform resource names (urn) of handles, purls, and dois near the latter’s inception.8 recent work by koster explains the persistent identifier methods most in use today and examines current infrastructure practices for maintaining them.9 the persistent link resolution method most prominently featured in the literature is the digital object identifier (doi). beginning in 1999, those behind developing and implementing doi have explained its inception, development, and trajectory, continuing with paskin’s deep explanation in 2002 of the reasons why doi exist and the technology behind the service. 10 discipline-specific research notes the utility of doi. sidman and davidson and weissberg studied doi for the purposes of automating the supply chain in the publishing industry.11 derisi, kennison, and twyman, on behalf of the public library of science (plos) announced their 2003 decision to broadly implement doi, followed by additional disciplinespecific encouragement of the practice by skiba in nursing education and neumann and brase in molecular design.12 the archival resource key (ark) is an alternative permanent link resolution scheme. 
since 2001, the open-source ark identifier offers a self-hosted solution for providing persistent access to digital objects, their metadata, and a maintenance commitment.13 recently, duraspace working groups have planned for further development and expansion of ark with the arks in the open project (https://wiki.lyrasis.org/display/arks/arks+in+the+open+project). persistent urls (purls) have been used to provide persistent access to digital objects for nearly 20 years, and their use in the library community is well documented. shafer, weibel, and jul anticipate uniform resource names becoming a web standard and offer purls as an intermediate step to aid in urn development.14 shafer also explained how oclc uses purls and alternate routing methods (arms) to properly direct global users to oclc resources.15 purls are also used to provide persistent access to government information and were seen by the cendi persistent identification task group as essential to their early efforts to implement the federal enterprise architecture (fea) and a theoretical federal persistent identification resolver.16 digital objects and collections should ideally be accessible via urls that work beyond the life of any one platform, lest the materials be subjected to “link rot,” or the process of decay when previously working links no longer correctly resolve. ducut et al. investigated 1994–2006 medline abstracts for the presence of persistent link resolution services such as handle, purl, doi, and webcite and found 20% of the links were inaccessible in 2008.17 mcmurry et al. investigated link rot in life sciences data and suggested practices for formatting links for increased persistence and approaches for versioning.18 the topic of link rot has been examined as early as 2003, in markwell and brooke’s “broken links: just how rapidly do science education hyperlinks go extinct,” cited by multiple link rot studies. ironically, this article is no longer accessible at the cited url.19 methodology this study sought a set of digital objects within library institutions’ digital collections websites. to locate examples of publicly accessible digital objects in digital collections, this study collected institutional websites from the digital library federation’s (dlf) published list of 195 members https://wiki.lyrasis.org/display/arks/arks+in+the+open+project information technology and libraries june 2021 persistent urls and citations offered for digital objects by digital libraries | homenda 5 as of august 2019.20 subsequent investigation aimed to find one representative digital object from unique digital collections navigable from each institution’s main web page. this study aimed to locate digital collections that met the following criteria: 1. collections are openly available. 2. collections are in a repository service, as opposed to highlighted content visible on an informational web page or blog. 3. collections are gathered within a site or service that contains multiple collections, as opposed to individual digital project websites, when possible. 4. collections are unique to an institution, as opposed to duplicated or licensed content. these criteria were developed in an effort to find unique, publicly accessible digital objects within each institution’s digital collections. to be sure, users search for and discover materials in a variety of ways and in numerous services, but studying the information-seeking behavior of users looking for digital objects or digital collections is outside the scope of this study. 
ultimately, digital collections indexed by search engines or available in aggregator services like dpla often contain links to collections and objects in their institutionally hosted platforms. users who discover these materials are likely to be directed to the sites this study investigated. for the purposes of this study, at least one digital collection was investigated from each dlf institution. multiple sites for an institution were investigated when more than one publicly accessible site or service met the above criteria. when digital collections at an institution were delivered only through the library catalog discovery service, reasonable attempts were made to delimit discoverable digital collections content. in total, 183 digital collections were identified for this study. once digital collections were located, subsequent investigation aimed to locate individual digital objects within them. while digital objects represent diverse materials available in a variety of formats, for ease of comparing approaches between institutions, a mixture of individual digital images, multipage digital items, and audiovisual materials were examined. objects for this study were primarily available in websites containing a variety of collections and format types with common display characteristics despite format differences, and no additional efforts were made to locate equal or proportional digital object formats at each institution. one representative digital object was identified per digital collection, totaling 183 digital objects. once a digital object was located at an institution, the object’s unique identifier, format, persistent url, persistent url label, method of link resolution (if identifiable), and citation were collected with particular focus on the object’s persistent url, if available. commonly used persistent url types and their url components can be identified, as seen in table 1; however, any means of persistence was collected if clearly identified. after examining initial results, the object’s provided citation, if available, was added to the list of data collected since many digital collection platforms provide recommended citations for individual objects.
table 1. commonly used persistent url methods and corresponding url components (persistent url type: url component)
archival resource key (ark): ark:/
digital object identifier (doi): doi.org/ (or doi:)
handle: hdl.handle.net
persistent url (purl): purl.
results
most institutions have a single digital collection site or service that met the selection criteria for this study. some appear to have multiple digital collection repositories, often separated by digital object format or library department, and many institutions have collections that are only publicly accessible through discrete project web sites, such as digital exhibits or focused digital humanities research projects. out of 195 dlf member institutions, 171 had publicly accessible digital collections. of these 171 institutions, 153 had digital collections services/sites that adhered to the criteria of this study, while 21 had only project-focused digital collections sites. since several institutions had more than one digital collection platform accessible via their main institutional website, a population of 183 digital collections was investigated.
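the url components in table 1 are what make this part of the data collection scriptable. the sketch below shows one way the scheme behind a collected url might be guessed automatically; the component strings come directly from table 1, while the function name, the handling of the bare doi: form, and the 'unknown' fallback are assumptions of the example, not a description of how the study actually coded its data.

# substring patterns taken from table 1; everything else here is illustrative
PERSISTENT_URL_PATTERNS = {
    "ark": "ark:/",
    "doi": "doi.org/",
    "handle": "hdl.handle.net",
    "purl": "purl.",
}

def classify_persistent_url(url: str) -> str:
    """return the persistent-identifier scheme suggested by a url, or 'unknown'."""
    lowered = url.lower()
    if lowered.startswith("doi:"):  # table 1 also allows the bare doi: form
        return "doi"
    for scheme, component in PERSISTENT_URL_PATTERNS.items():
        if component in lowered:
            return scheme
    return "unknown"

print(classify_persistent_url("http://purl.dlib.indiana.edu/iudl/images/vac5094/vac5094-01446"))  # purl
print(classify_persistent_url("https://hdl.handle.net/1234/example"))  # handle; the identifier is made up
print(classify_persistent_url("https://www.example.edu/items/42"))  # unknown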
one representative digital object from each collection was gathered, consisting of 107 digital images, 73 multipage items, and 3 audiovisual items (totaling 183).
table 2. number of instances of digital collection platforms identified (platform: number, percentage of total of 183)
custom or unidentifiable: 53 (29%)
contentdm: 46 (25%)
islandora: 19 (10%)
dspace: 11 (6%)
samvera: 11 (6%)
omeka: 10 (5%)
internet archive: 7 (4%)
digital commons: 6 (3%)
fedora custom: 4 (2%)
luna: 3 (2%)
xtf: 3 (2%)
artstor: 2 (1%)
iiif server: 2 (1%)
primo: 2 (1%)
aspace: 1 (1%)
elevator: 1 (1%)
knowvation: 1 (1%)
veridian: 1 (1%)
as seen in table 2, almost a third of digital collection platforms encountered appear to be custom-developed or customized to not reveal the software platform upon which they were based. of the platform-based services encountered where software was identifiable, 17 different platforms were used and the top five were contentdm, islandora, dspace, samvera (hyrax, avalon, curation concerns, etc.), and omeka.
table 3. occurrence of persistent links in surveyed digital collections, method of link persistence, and persistent link labels
persistent links? (number, percentage of total of 183)
no/unknown: 93 (51%)
yes / persistence claimed: 90 (49%)
persistent link method (number, percentage of total of 90)
unknown: 33 (37%)
handle: 27 (30%)
ark: 19 (21%)
doi: 6 (7%)
purl: 5 (6%)
persistent link label (number, percentage of total of 90)
other (a): 24 (26.7%)
permalink: 22 (24.4%)
identifier: 13 (14.4%)
[no label given]: 10 (11.1%)
permanent link: 7 (7.8%)
uri: 5 (6%)
persistent link: 3 (3.3%)
handle: 2 (2.2%)
link to the book: 2 (2.2%)
persistent url: 2 (2.2%)
(a) twenty-four other persistent link labels were reported,21 each occurring only once.
as seen in table 3, the numbers of digital objects with and without publicly accessible persistent (or seemingly persistent) links were nearly equal. among the digital objects with persistent links, the majority claimed persistence without a discernible resolution method, with the rest divided between handle, ark, doi, and purl. these objects also had 33 different labels for these links in the public-facing interface. the top five labels were: permalink (22), identifier (13), permanent link (7), uri (5), and persistent link (3). as seen in table 4, the majority of digital objects surveyed had a unique item identifier in their publicly viewable item record. the majority did not offer a citation in the item’s publicly viewable record. among items that offered citations, the majority contained a link to the item, and three offered downloadable citation formats only, such as endnote, zotero, and mendeley.
table 4. various digital object characteristics surveyed
unique item identifier in item record (percentage of total of 183): yes: 132 (72%); no: 51 (28%)
citation in item record (percentage of total of 183): yes: 65 (36%); no: 118 (64%)
citations containing links to item (percentage of total of 65): yes: 39 (60%); downloadable citation format only: 3 (5%); no: 23 (35%)
discussion
since proper citation practice dictates choosing the url most likely to provide continuing access to a resource, it follows that providing persistent urls to resources such as digital objects or digital collections is also a good practice.
it is encouraging to see a large number of institutions surveyed providing urls that persist (or claim to persist). providing persistent access to a unique digital resource implies a level of commitment to maintaining its url into the future, requiring policies, technology, and labor resources, further augmented by costs associated with registering certain types of identifiers like doi.22 it is likely that institutions not providing persistent (or not obviously persistent) urls are either internally committing to preserving their objects, collections, and services through means not known to end users; are constrained by technological limitations of their digital collection platforms; hope to develop or adopt new digital library services that offer these capabilities; or lack the resources to offer persistent urls. the four commonly used methods of persistent link resolution—doi, handle, ark, and purl— have been used for nearly 20 years, and it is not surprising that alternative observable methods were seldom encountered in this study. handles were the most common persistent url method, which seems related to the digital library platform used by an institution. dspace distributions are pre-bundled with handle server software, for example, and 12 out of 27 platforms serving digital objects with handles were based on dspace (https://duraspace.org/dspace/). when choosing to implement or upgrade a digital library platform, institutions often consider several available options. choosing a platform that offers the ability to easily create and maintain persistent urls might be less burdensome than making urls persist via independent or alternative means. thirty-three digital objects offered links that had labels implying some sort of persistence but lacked information describing the methods used or url components consistent with commonly used methods, as seen in table 1. to achieve persistence, there might be a combination of url rewriting, locally implemented solutions, or nonpublic persistent urls existing. it would benefit users, increasingly aware of the need to cite digital objects using persistent links, for digital object platforms that offer persistent linking to explicitly state that fact and ideally offer some evidence of the resolution method used. researchers will be looking for citable persistent links that offer https://duraspace.org/dspace/ information technology and libraries june 2021 persistent urls and citations offered for digital objects by digital libraries | homenda 9 some cues signifying their persistence, whether it is clearly indicated language on the website or a url pattern consistent with the four major methods commonly used. the amount of variation in labeling persistent links was surprising. commonly used digital library software platforms have default ways of labeling these fields. nearly all of the “reference url” labels encountered are in contentdm sites, for example. since the concept of offering a persistent link to a digital object is not uncommon, perhaps there can be a more consistent approach to choosing the label for this content. when a researcher finds a digital object in an institutional digital library service, they might want to cite that object. accurately citing resources in all formats is an essential research skill, and digital library platforms often try to aid users by providing dynamically generated or pre-populated citations based on unique metadata associated with that object. 
it was somewhat surprising to encounter these types of citation helpers that did not include persistent links. since a digital object’s preferred persistent link is often different than the url visible in the browser, efforts should be made to make citations available containing persistent links. there are institutions with digital collections that were not examined in this study due to a number of factors. first, this study examined the 195 institutions who were members of the digital library federation, and there are 2,828 four-year postsecondary institutions in the united states as of 2018.23 additional study could expand perceptions about persistent links for digital objects when looking beyond the dlf member institutions, which are predominantly four-year postsecondary institutions but also contain museums, public libraries, and other cultural heritage organizations. an alternative approach to collecting this data would be to conduct user testing focused on finding and citing digital objects from a number of institutions. this approach was not used, however, since the initial goal of this study was to see how peer digital library institutions have employed persistent links and citations across a broad yet contained spectrum. as one librarian with extensive digital library experience, my approach to locating these platforms and resources is subject to subconscious bias i may have accumulated over my professional career, but i would hope that my experience makes me more able to locate these platforms and materials than the average user. digital library platforms are numerous, and often institutions have several of them with varying degrees of public visibility or connectivity to their institution’s main library website. this study’s findings for any particular institution are not as authoritative as self-reported information from the institution itself. while a survey aimed at collecting direct responses from institutions might have yielded more accuracy, a potentially low response rate would also make it difficult to truly know what methods of persistent linking peer institutions are employing, especially with the majority of these resources being openly findable and accessible. still, further study with self reported information could shed more light on the decisions to provide certain methods of persistent links to objects within their chosen digital collection platforms. moreover, it is possible that some digital object formats are more likely to have persistent urls than others. newer formats such as three-dimensional digital objects, commonly cited resources like data sets, and scholarship held in institutional repositories could be available in digital library services similar to those surveyed in this study with different persistent url characteristics. additional study could aim to survey populations of digital objects by format across multiple institutions to investigate any correlation between persistent urls and object format. information technology and libraries june 2021 persistent urls and citations offered for digital objects by digital libraries | homenda 10 conclusion unique digital collections at digital library institutions are made openly accessible to the pu blic in a variety of ways, including digital library software platforms and digital library aggregator services. regardless of how users find these materials, best practices require users to cite urls for these materials that are most likely to continue to provide access to them. 
persistent urls are a common way to ensure cited urls to digital objects remain accessible. commonly used methods of issuing and maintaining persistent urls can be identified in digital object records within digital collection platforms available at these institutions. this study identified characteristics about these digital objects, their platforms, prevalence of persistent urls in their records, and the way these urls are presented to users. findings indicate that dlf member institutions are split evenly between providing and not providing publicly discernible persistent urls with wide variation on how these urls are presented and explained to users. decisions made in developing and maintaining digital collection platforms and the types of urls made available to users impact which urls users cite and the possibility of others encountering these resources through these citations. embarking on this study also was prompted by digital collection migrations at indiana university, and these findings provide us interesting examples of persistent url usage at other institutions and ways to improve the user experience in digital collection platforms. endnotes 1 the chicago manual of style online (chicago: university of chicago press, 2017), ch. 14, sec. 7. 2 arthur allison et al., “digital identity matters,” journal of the american society for information science & technology 56, no. 4 (2005): 364–72, https://doi.org/10.1002/asi.20112. 3 yuk hui, “what is a digital object?” metaphilosophy 43, no. 4 (2012): 380–95, https://doi.org/10.1111/j.1467-9973.2012.01761.x. 4 clifford lynch, “authenticity and integrity in the digital environment: an exploratory analysis of the central role of trust” council on library and information resources (clir), 2000, https://www.clir.org/pubs/reports/pub92/lynch/. 5 g. benoit and lisa hussey, “repurposing digital objects: case studies across the publishing industry,” journal of the american society for information science & technology 62, no. 2 (2011): 363–74, https://doi.org/10.1002/asi.21465. 6 angela dappert et al., “describing and preserving digital object environments,” new review of information networking 18, no. 2 (2013): 106–73, https://doi.org/10.1080/13614576.2013.842494. 7 christos strubulis et al., “a case study on propagating and updating provenance information using the cidoc crm,” international journal on digital libraries 15, no. 1 (2014): 27–51, https://doi.org/10.1007/s00799-014-0125-z. 8 william y. arms, “uniform resource names: handles, purls, and digital object identifiers,” communications of the acm 44, no. 5 (2001): 68, https://doi.org/10.1145/374308.375358. https://doi.org/10.1111/j.1467-9973.2012.01761.x https://www.clir.org/pubs/reports/pub92/lynch/ https://doi.org/10.1002/asi.21465 https://doi.org/10.1080/13614576.2013.842494 https://doi.org/10.1007/s00799-014-0125-z https://doi.org/10.1145/374308.375358 information technology and libraries june 2021 persistent urls and citations offered for digital objects by digital libraries | homenda 11 9 lukas koster, “persistent identifiers for heritage objects,” code4lib journal 47 (2020), https://journal.code4lib.org/articles/14978. 10 albert w. simmonds, “the digital object identifier (doi),” publishing research quarterly 15, no. 2 (1999): 10, https://doi.org/10.1007/s12109-999-0022-2; norman paskin, “digital object identifiers,” information services & use 22, no. 2/3 (2002): 97, https://doi.org/10.3233/isu2002-222-309. 
11 david sidman and tom davidson, “a practical guide to automating the digital supply chain with the digital object identifier (doi),” publishing research quarterly 17, no. 2 (2001): 9, https://doi.org/10.1007/s12109-001-0019-y; andy weissberg, “the identification of digital book content,” publishing research quarterly 24, no.4 (2008): 255–60, https://doi.org/10.1007/s12109-008-9093-8. 12 susanne derisi, rebecca kennison, and nick twyman, “the what and whys of dois,” plos biology 1, no. 2 (2003): 133–34, https://doi.org/10.1371/journal.pbio.0000057; diane j. skiba, “digital object identifiers: are they important to me?,” nursing education perspectives 30, no. 6 (2009): 394–95, https://doi.org/10.1016/j.lookout.2008.06.012; janna neumann and jan brase, “datacite and doi names for research data,” journal of computer-aided molecular design 28, no. 10 (2014): 1035–41, https://doi.org/10.1007/s10822-014-9776-5. 13 john kunze, “towards electronic persistence using ark identifiers,” california digital library, 2003, https://escholarship.org/uc/item/3bg2w3vs. 14 keith e. shafer, stuart l. weibel, and erik jul, “the purl project,” journal of library administration 34, no. 1–2 (2001): 123, https://doi.org/10.1300/j111v34n01_19. 15 keith e. shafer, “arms, oclc internet services, and purls,” journal of library administration 34, no. 3–4 (2001): 385, https://doi.org/10.1300/j111v34n03_19. 16 cendi persistent identification task group, “persistent identification: a key component of an egovernment infrastructure,” new review of information networking 10, no. 1 (2004): 97–106, https://doi-org/10.1080/13614570412331312021. 17 erick ducut, fang liu, and paul fontelo, “an update on uniform resource locator (url) decay in medline abstracts and measures for its mitigation,” bmc medical informatics & decision making 8, no. 1 (2008): 1–8, https://doi.org/10.1186/1472-6947-8-23. 18 julie a. mcmurry et al., “identifiers for the 21st century: how to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data,” plos biology 15, no. 6 (2017): 1–18, https://doi.org/10.1371/journal.pbio.2001414. 19 john markwell and david brooks, “broken links: just how rapidly do science education hyperlinks go extinct?” (2003), cited by many and previously available from: http://wwwclass.unl.edu/biochem/url/broken_links.html [currently non-functional]. 20 “our member institutions,” digital library federation (2020), https://www.diglib.org/about/members/. 
https://journal.code4lib.org/articles/14978 https://doi.org/10.1007/s12109-999-0022-2 https://doi.org/10.3233/isu-2002-222-309 https://doi.org/10.3233/isu-2002-222-309 https://doi.org/10.1007/s12109-001-0019-y https://doi.org/10.1007/s12109-008-9093-8 https://doi.org/10.1371/journal.pbio.0000057 https://doi.org/10.1016/j.lookout.2008.06.012 https://doi.org/10.1007/s10822-014-9776-5 https://escholarship.org/uc/item/3bg2w3vs https://doi.org/10.1300/j111v34n01_19 https://doi.org/10.1300/j111v34n03_19 https://doi-org/10.1080/13614570412331312021 https://doi.org/10.1186/1472-6947-8-23 https://doi.org/10.1371/journal.pbio.2001414 http://www-class.unl.edu/biochem/url/broken_links.html http://www-class.unl.edu/biochem/url/broken_links.html https://www.diglib.org/about/members/ information technology and libraries june 2021 persistent urls and citations offered for digital objects by digital libraries | homenda 12 21 twenty-four labels used only once: archival resource key; ark; bookmark this page at; citable link; citable link to this page; citable uri; copy; copy and paste this url; digital object url; doi; identifier (hdl); item; link; local identifier; permanent url; permanently link to this resource; persistent link to this item; persistent link to this record; please use this identifier to cite or link to this item; related resources; resource identifier; share; share link/location; to cite or link to this item, use this identifier. 22 one of the frequently asked questions (https://www.doi.org/faq.html) states that doi registration fees vary. 23 national center for education statistics, “table 317.10. degree-granting postsecondary institutions, by control and level of institution: selected years, 1949–50 through 2017–18,” in digest of education statistics, 2018, https://nces.ed.gov/programs/digest/d18/tables/dt18_317.10.asp. https://www.doi.org/faq.html https://nces.ed.gov/programs/digest/d18/tables/dt18_317.10.asp abstract introduction literature review methodology results discussion conclusion endnotes let’s get virtual: examination of best practices to provide public access to digital versions of three-dimensional objects tanya m. johnson information technology and libraries | june 2016 39 abstract three-dimensional objects are important sources of information that should not be ignored in the increasing trend towards digitization. previous research has not addressed the evaluation of digitized versions of three-dimensional objects. this paper first reviews research concerning such digitization, in both two and three dimensions, as well as public access in this context. next, evaluation criteria for websites incorporating digital versions of three-dimensional objects are extrapolated from previous research. finally, five websites are evaluated, and suggestions for best practices to provide public access to digital versions of three-dimensional objects are proposed. introduction much of the literature surrounding the increased efforts of libraries and museums to digitize content has focused on two-dimensional forms, such as books, photographs, or paintings. however, information does not only come in two dimensions; there are sculptures, artifacts, and other three-dimensional objects that have been unfortunately neglected by this digital revolution. as one author stated, “while researchers do not refer to three-dimensional objects as commonly as books, manuscripts, and journal articles, they are still important sources of information and should not be taken for granted” (jarrell 1998, 32). 
the importance of three-dimensional objects as information that can and should be shared is not a new phenomenon; indeed, as early as 1887, museologists and educators forwarded the view that “museums were in effect libraries of objects” that provided information not supplied by books alone (given and mctavish 2010, 11). however, it is only recently, with the advent of newer technological mechanisms, that such objects could be shared with the public on a larger scale. no longer do people need to physically visit museums to experience and learn from threedimensional objects. rather, various techniques have been utilized to place digital versions of such objects on the websites of museums and archives, and projects have been created by various universities in order to enhance that digital experience. nevertheless, as newell (2012) states: collections-holding institutions increasingly regard digital resources as additional objects of significance, not as complete replacements for the original. digital technologies work best when they enable people who feel connected to museum objects to have the freedom to deepen these tanya m. johnson (tmjohnso@gmail.com), a recent mlis degree graduate from the school of communication & information, rutgers, the state university of new jersey, is winner of the 2016 lita/ex libris student writing award. mailto:tmjohnso@gmail.com let’s get virtual: examination of best practices to provide public access to digital versions of three-dimensional objects | johnson | doi:10.6017/ital.v35i2.9343 40 relationships and, where appropriate, to extend outsiders’ understandings of the objects’ cultural contexts. the raison d’être of museums and other cultural institutions remains centred on the primacy of the object and in this sense continues to privilege material authenticity. (303) in this regard, three-dimensional visualization of physical objects can be seen as the next step for museums and cultural heritage institutions that seek to further patrons’ connection to such objects via the internet. indeed, in this digital age, the goals of museums and archives are changing, converging with those of libraries to focus more efforts on providing information to the public, and, along with the growing trend to digitize information contained within libraries, there has been a concomitant trend to digitize the contents of museums in order to provide greater public access to collections (given and mctavish 2010). in light of this progress, this paper will review various methods of presenting three-dimensional objects to the public on the internet and, based on an evaluation of five digital collections, attempt to provide some advice as to best practices for museums or institutions seeking to digitize such objects and present them to the public via a digital collection. literature review two-dimensional digitization there are many ways to present digital versions of three-dimensional objects on a webpage, ranging from simple two-dimensional photography to complicated three-dimensional scanning and rendering. beginning on the simpler end of the scale, bincsik, maezaki, and hattori (2012) describe the process of photographing japanese decorative art objects in order to create an image database of objects from multiple museums. specifically, the researchers explain that they need high quality photographs showing each object in all directions, as well as close-up images of fine details, in order to recreate the physical research experience as closely as possible. 
they also note that, for the same reason, the context of each object must be recorded, including photographs of any wrapping or storage materials and accompanying documentation. for this project, the researchers utilized nikon professional or semi-professional cameras, with zoom and macro lenses, and often used small apertures to increase depth-of-field. at times, they also took measurements of the objects in order to assist museums in maintaining accurate records. the raw image files were then processed with programs such as adobe photoshop, saved as original tif files, and converted into jpeg format for upload. despite the success of the project, the researchers also noted the limitations of digitizing three-dimensional objects: with decorative art objects some information is inevitably lost, such as the weight of the object, the feeling of its surface texture or the sense of its functionality in terms of proportions and balance. digital images clearly can fulfill many research objectives, but in some cases they can only be used as references. one objective of the decorative arts database is to advise the researcher in selecting which objects should be examined in person. (bincsik, maezaki, and hattori 2012, 46) one difficulty with photography, particularly when digitizing artwork, is that color is a function of light. thus, a single object will often appear to be different colors when photographed in different lighting conditions using conventional digital cameras, which process images using rgb filters. information technology and libraries | june 2016 41 more accurate representations of objects can be acquired using multispectral imaging, which uses a higher number of parameters (the international standard is 31, compared to rgb’s 3) in order to obtain more information about the reflectance of an object at any particular point in space (novati, pellegri, and schettini 2005). multispectral imaging, however, is very expensive and, despite some researchers’ attempts to create affordable systems (e.g., novati, pellegri, and schettini 2005), the acquisition of multispectral images is generally limited to large institutions with considerable funding (chane et al. 2013). the use of two-dimensional photography to digitize objects is not limited to the arts; in the natural sciences, different types of photographic equipment have been developed to document existing collections and enhance scientific observation. gigapixel imaging, for example, has been utilized to allow museum visitors to virtually explore large petroglyphs located in remote locations as well as for documentation and viewing of dinosaur bone specimens that are not on public display (louw and crowley 2013). this technology consists of taking many, very high resolution photographs that are then, via computer software, “aligned, blended, and stitched” together to create one extremely detailed composite image (louw and crowley 2013, 89–90). robotic systems, such as gigapan, have been developed to speed up the process and permit rapid recording and processing of the necessary area. once the gigapixel image is created, it can then be uploaded and displayed on the web in dynamic form, including spatial navigation of the image with embedded text, audio, or video at specific locations and zoom levels to provide further information (louw and crowley 2013). 
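gigapixel composites like these are normally delivered to the browser as pyramids of small tiles at several zoom levels, so a viewer only fetches the region and resolution currently on screen. the sketch below is a generic, minimal tiler written with the pillow imaging library; it is not the gigapan toolchain or any particular viewer's tile format, and the tile size, number of levels, file naming, and jpeg quality are arbitrary choices made for illustration.

import os
from PIL import Image  # pillow

Image.MAX_IMAGE_PIXELS = None  # lift pillow's size guard; assumes the composite is a trusted file

def build_tile_pyramid(src_path: str, out_dir: str, tile: int = 256, levels: int = 4) -> None:
    """cut a large composite into jpeg tiles at several zoom levels for web delivery."""
    full = Image.open(src_path).convert("RGB")
    for level in range(levels):
        scale = 2 ** (levels - 1 - level)  # level 0 is the most reduced overview
        img = full if scale == 1 else full.resize((full.width // scale, full.height // scale))
        for y in range(0, img.height, tile):
            for x in range(0, img.width, tile):
                patch = img.crop((x, y, min(x + tile, img.width), min(y + tile, img.height)))
                path = os.path.join(out_dir, str(level), f"{x // tile}_{y // tile}.jpg")
                os.makedirs(os.path.dirname(path), exist_ok=True)
                patch.save(path, "JPEG", quality=85)

# build_tile_pyramid("stitched_composite.tif", "tiles")  # file names are placeholders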
various types of gigapixel imaging, including the gigapan system, have also been used to digitize important collections of biological specimens, particularly insects, which are often stored in large drawers. one study examined the documentation of entomological specimens by “whole-drawer imaging” using various gigapixel imaging technologies (holovachov, zatushevsky, and shydlovsky 2014). the researchers explained that different gigapixel imaging systems (many of which are commercial and proprietary) utilize different types of cameras and lenses, as well as different types of software for processing. however, despite the expensive cost of some commercially available systems, it is possible for museums and other institutions to create their own, economically viable versions. the system created by holovachov, zatushevsky, and shydlovsky utilized a standard slr camera, fitted with a macro lens and attached to an immovable stand. the researchers manually set up lighting, focus, aperture, and other settings, and moved the insect drawer along a pre-determined grid pattern in order to obtain the multiple overlapping photographs necessary to create a large gigapixel image. they used a freely available stitching software program and manually corrected stitching artifacts and color balance issues that resulted from the use of a non-telecentric lens.1 despite the lower cost of their individualized system, however, the researchers noted that the process was much more time-consuming and necessitated more labor from workers digitizing the collection. moreover, technologically speaking, the researchers emphasized the limits of two-dimensional imaging, given that the 1the difference between telecentric and non-telecentric lenses is explained by the researchers: “contrary to ordinary photographic lenses, object-space telecentric lenses provide the same object magnification at all possible focusing distances. an object that is too close or too far from the focus plane and not in focus, will be the same size as if it were in focus. there is no perspective error and the image projection is parallel. therefore, when such a lens is used to take images of pinned insects in a box, all vertical pins will appear strictly vertical, independent of their position within the camera’s field of view” (holovachov, zatushevsky, and shydlovsky 2014, 7). let’s get virtual: examination of best practices to provide public access to digital versions of three-dimensional objects | johnson | doi:10.6017/ital.v35i2.9343 42 “diagnostic characteristics of three-dimensional insects,” as well as the accompanying labels, are often invisible when a drawer is only photographed from the top. thus, the researchers concluded that, ultimately, “the whole-drawer digitizing of insect collections needs to be transformed from two-dimensions to three-dimensions by employing complex imaging techniques (simultaneous use of multiple cameras positioned at different angles) and a digital workflow” (holovachov, zatushevsky, and shydlovsky 2014, 7). three-dimensional digitization given the goal of obtaining as accurate a representation as possible when digitizing objects, many researchers have turned to the use of various techniques in order to obtain three-dimensional data. acquiring a three-dimensional image of an object takes place in three steps: 1. 
preparation, during which certain preliminary activities take place that involve the decision about the technique and methodology to be adopted as well as the place of digitization, security planning issues, etc. 2. digital recording, which is the main digitization process according to the plan from phase 1. 3. data processing, which involves the modeling of the digitized object through the unification of partial scans, geometric data processing, texture data processing, texture mapping, etc. (pavlidis et al. 2007, 94) steps 2 and 3 have been more technically described as (2) obtaining data from an object to create point clouds (from thousands to billions of x,y,z coordinates representing loci on the object); and (3) processing point clouds into polygon models (creating a surface on top of the points), which can then be mapped with textures and colors (metallo and rossi 2011). there are several techniques that can be utilized to acquire three-dimensional data from a physical object. table 1 explains the four general methods most commonly used by museums. information technology and libraries | june 2016 43 type description positives negatives approx. price range laser scanning a laser source emits light onto the object’s surface, which is detected by a digital camera; geometry of the object is extracted by triangulation or time of flight calculations high accuracy in capturing geometry; can capture small objects and entire buildings (using different hardware) limited texture and color captured; shiny surfaces refract the laser $3,000– $200,000 white light (structured light) scanning a pattern of light is projected onto the object’s surface, and deformations in that pattern are detected by a digital camera; geometry is extracted by triangulation from deformations captures texture details, making it very accurate; can capture color dark, shiny, or translucent objects are problematic $15,000– $250,000 photogrammetry three-dimensional data is extracted from multiple twodimensional pictures can capture small objects and mountain ranges; good color information need either precise placement of cameras or more precise software to obtain accurate data cameras: $500– $50,000; software: free– $40,000 volumetric scanning magnetic resonance imaging (mri) uses a strong magnetic field and radio waves to detect geometric, density, volume and location information; computed tomography (ct) uses rotating x-rays to create twodimensional slices, which can then be reconstructed into three-dimensional images both types can view the interior and exterior of an object; ct can be used for reflective or translucent objects; mri can image soft tissues no color information; mri requires object to have high water content $200,000– $2,000,000 table 1. description of four general methods of acquiring three-dimensional data about physical objects (table information compiled by reference to pavlidis et al. 2007; metallo and rossi 2011; abel et al. 2011; and berquist et al. 2012). the type of three-dimensional digitization used can ultimately depend upon the types of objects to be imaged or the type of data needed. for example, in digitizing human skeletal collections, one study explained that three-dimensional laser scanning was an advantageous technique to create models of bones for preservation and analysis, but cautioned that ct scans would be needed to examine the internal structures of such specimens (kuzminsky and gardiner 2012). 
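as a concrete illustration of the geometry extraction named in table 1, the short sketch below shows how a depth measurement might be computed by triangulation or by a time-of-flight calculation. it uses a simplified, hypothetical setup (a single laser spot, a known laser-to-camera baseline, angles measured from that baseline, consistent units) and is not the internal algorithm of any particular scanner.

```python
# illustrative sketch of the two geometry calculations named in table 1.
# the flat 2d setup and all values are simplifying assumptions.
import math

def triangulate(baseline_m, laser_angle_deg, camera_angle_deg):
    """locate a laser spot by triangulation.

    the laser sits at the origin and the camera sits baseline_m to its right;
    each measures the angle (from the baseline) at which it sees the spot.
    returns the (x, z) position of the spot in meters.
    """
    a = math.radians(laser_angle_deg)
    b = math.radians(camera_angle_deg)
    # law of sines in the laser-camera-spot triangle:
    # distance from laser to spot = baseline * sin(camera angle) / sin(a + b)
    d = baseline_m * math.sin(b) / math.sin(a + b)
    return d * math.cos(a), d * math.sin(a)

def time_of_flight(round_trip_seconds, c=299_792_458.0):
    """distance to a surface from the round-trip travel time of a light pulse."""
    return c * round_trip_seconds / 2.0

if __name__ == "__main__":
    print(triangulate(0.5, 70.0, 80.0))   # spot roughly 0.9 m in front of the rig
    print(time_of_flight(6.67e-9))        # about 1 m for a ~6.7 ns round trip
```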
another study utilized several techniques in an attempt to decipher graffiti inscriptions on ancient roman pottery shards, ultimately concluding that high-resolution photography (similar to gigapixel imaging) and three-dimensional laser scanning both provided detailed and helpful data (montani et al. 2012). additionally, sometimes multiple types of digitization can be used for the same objects with similar results. one study, for example, obtained virtually equivalent three-dimensional models of the same object using laser scanning and two types of photogrammetry (lerma and muir 2014). most recently, researchers have been utilizing combinations of digitization techniques to obtain the most accurate representations possible. chane et al. (2013), for example, examined methods of combining three-dimensional digitization with multispectral photography in order to obtain enhanced information concerning the physical object in question. the researchers explained that combining the two processes is difficult because, in order to obtain multispectral textural data that is mapped to geometric positions, the object must be imaged from identical locations by multiple scanners/cameras or else the data processing that combines the two types of data becomes extremely complex. as a compromise, the researchers created a system of optical tracking based on photogrammetry techniques that permits the collection and integration of geometric positioning data and multispectral textures utilizing precise targeting procedures. however, the researchers noted that most systems integrating multispectral photography with three-dimensional digitization tended to be quite bulky, did not adapt easily to different types of objects, and needed better processing algorithms for more complex three-dimensional objects (chane et al. 2013). public access to three-dimensionally digitized objects despite museums' growing focus on increasing public access to collections via digitization (given and mctavish 2010), there is very little literature addressing public access to three-dimensionally digitized objects. indeed, studies in this realm tend to focus on the technological aspects of either the modeling of specific objects or collections or website viewing of three-dimensional models. for example, abate et al. (2011) described the three-dimensional digitization of a particular statue from the scanning process to its ultimate depiction on a website. the researchers explained in detail the particular software architecture utilized in order to permit the remote rendering of the three-dimensional model on users' computers via a java applet without compromising quality or necessitating download of potentially copyrighted works. by contrast, literature concerning the digital michelangelo project, during which researchers three-dimensionally digitized various michelangelo works, focused on the method used to create an accurate three-dimensional model, complete with color and texture mapping, and a visualization tool (dellepiane et al. 2008). one study did describe a project that was designed to place three-dimensional data about various cultural artifacts in an online repository for curators and other professionals (hess et al. 2011).
this repository was contained within database management software, a web-based interface was designed for searching, and user access to three-dimensional images and models was provided via an activex plugin. despite the potential of the prototype, however, it appears that the project has ceased (see http://www.ucl.ac.uk/museums/petrie/research/research-projects/3dpetrie/3d_projects/3d-projects-past/e-curator), and the institution's current three-dimensional imaging project is focused on the design of a traveling exhibition incorporating, among other things, three-dimensional models of artifacts and physical replicas created from such models (see http://www.3dencounters.com). studies that do address public access directly tend to focus on the improvement of museum websites generally. for example, in terms of user expectations of museum websites, one study found that approximately 63 percent of visitors to a museum's website did so in order to search the digital collection (kravchyna and hastings 2002). another study found four types of museum website users, who each had different needs and expectations of sites. relevantly, educators sought collections that were "the more realistic the better," including suggestions like incorporating three-dimensional simulations of physical objects so that students could "explore the form, construction, texture and use of objects" (cameron 2003, 335). further, non-specialist users "value free choice learning" and "access online collections to explore and discover new things and build on their knowledge base as a form of entertainment" (cameron 2003, 335). similarly, some studies have addressed the incorporation of web 2.0 technologies into museum websites. srinivasan et al. (2009), for example, argue that web 2.0 technologies must be integrated into museum catalogs rather than simply layered over existing records because users' interest in objects is increased by participation in the descriptive practice. an implementation of this concept is found in hunter and gerber's (2010) system of social tagging attached to three-dimensional models. this paper is an effort to address the gap between the technical process of digitizing and presenting three-dimensional objects on the web and the user experience of such. through the evaluation of five websites, this paper will provide some guidance for the digitization of three-dimensional objects and their presentation in digital collections for public access. methodology and evaluative criteria evaluations of digital museums are not as prevalent as evaluations of digital libraries. however, given the similar purposes of digital museums and digital libraries, it is appropriate to utilize similar criteria.
for digital libraries, saracevic (2000) synthesized evaluation criteria into performance questions in two broad areas: (a) user-centered questions, including how well the digital library supports the society or community served, how well it supports institutional or organizational goals, how well it supports individual users' information needs, and how well the digital library's interface provides access and interaction; and (b) system-centered questions, including hardware and network performance, processing and algorithm performance, and how well the content of the collection is selected, represented, organized, and managed. xie (2008) focused on user-centered evaluation and found five general criteria that exemplified users' own evaluations of digital libraries: interface usability, collection quality, service quality, system performance, and user satisfaction. parandjuk (2010) used information architecture to construct criteria for the evaluation of a digital library, including the following:
• uniformity of standards, including consistency among webpages and individual records;
• findability, including ease of use and multiple ways to access the same information;
• sub-navigation, including indexes, sitemaps, and guides;
• contextual navigation, including simplified searching and co-location of different types of resources;
• language, including consistency in labeling across pages and records and appropriateness for the audience; and
• integration of searching and browsing.
this system is particularly appropriate in the context of digital museums, as it emphasizes the curatorial or organizational aspect of the collection in order to support learning objectives. in one comprehensive evaluation of the websites of art museums, pallas and economides (2008) created a framework for such evaluation, incorporating six dimensions: content, presentation, usability, interactivity and feedback, e-services, and technical. each dimension then contained several specific criteria. many of the criteria overlapped, however, and three-dimensional imaging, for example, was placed within the e-services dimension, under virtual tours, although it could have been placed within presentation, with other multimedia criteria, or even within interactivity, with interactive multimedia applications. the problem in trying to evaluate a particular part of a museum's website, namely, the way it presents three-dimensional objects in digital form, is that the level of specificity almost renders many of the evaluation criteria from previous studies irrelevant. as hariri and norouzi (2011) suggest, evaluation criteria should be based on the objective of the evaluation. hence, based on portions of the above-referenced studies, this author has created a more focused evaluation framework, concentrating on criteria that are particularly relevant to museums' digital presentations of three-dimensional objects. this framework is detailed in table 2, below.
functionality: what technology is used to display the object? how well does it work? must programs or files be downloaded? are the loading times of displays acceptable?
usability: how easy is the site to use? what is the navigation system? are there searching and browsing functions, and how well does each work? how findable are individual objects?
presentation: how does the display of the object look? what is the context in which the object is presented? are there multiple viewing options? is there any interactivity permitted?
content: does the site provide an adequate collection of objects? for individual objects, is there sufficient information provided? is there additional educational content?
table 2. summary of evaluative criteria
five digital collections, specified below, will be evaluated based on these criteria. this will be done in a case study manner, describing each website based on the above criteria and then using those evaluations to make suggestions for best practices. results it is difficult to compare different types of digital collections, particularly when the focus is on different types of technology utilized to display similar objects. however, because the goal here is to determine the best practices for the digital presentation of three-dimensional objects, it is important to evaluate a variety of techniques in a variety of fields. thus, the following digital collections have been chosen to illustrate different ways in which such objects can be displayed on a website. museum of fine arts, boston (mfa) (http://www.mfa.org/collections) the mfa, both in person and online, boasts a comprehensive and extensive collection of art and historical artifacts of varying forms. the website is very easy to navigate, with well-defined browsing options and easy search capabilities, allowing for refinement of results by collection or type of item. there are many collections, which are well organized and curated into separate exhibits and galleries. in addition, when viewing each gallery, suggestions are linked for related online exhibitions as well as tours and exhibits at the physical museum. each item record contains a detailed description of the item as well as its provenance. thus, the mfa website attains a very high rating for usability and content. however, individual items are represented by only single pictures of varying quality. some pictures are color, some are black and white, and no two pictures appear to have the same lighting. additionally, despite being slow to load, even the pictures that appear to be of the best quality cannot be of high resolution, as zooming in makes them slightly blurry. accordingly, the mfa website receives a medium rating for functionality and a low rating for presentation. digital fish library (dfl) (http://www.digitalfishlibrary.org/index.php) the dfl project is a comprehensive program that utilizes mri scanning to digitize preserved biological fish samples from a particular collection housed at the scripps institution of oceanography. after mri scans of a specimen are taken, the data is processed and translated into various views that are placed on the website, accompanied by information about each species (berquist et al. 2012). navigating the dfl website is very intuitive, as the individual specimen records are organized by taxonomy. it is easy to search for particular species or browse through the clickable, pictorial interface. records for each species include detailed information about the individual specimen, the specifics of the scans used to image each, and broader information about the species. individual records also provide links to other species within the taxonomic family. thus, the dfl website attains high ratings in both usability and content.
for functionality and presentation, however, the ratings are medium. although for each item there are videos and still images obtained from threedimensional volume renderings and mri scans, they are small in size and have low resolution. there is no interactive component, with the possible exception of the “digital fish viewer” that supposedly requires java, but this author could not get it to work despite best efforts. one nice feature, shown in figure 1 below, is that some of the specimen records have three-dimensional renderings showing and explaining the internal structures of the species. http://www.mfa.org/collections http://www.digitalfishlibrary.org/index.php let’s get virtual: examination of best practices to provide public access to digital versions of three-dimensional objects | johnson | doi:10.6017/ital.v35i2.9343 48 figure 1. annotated three-dimensional rendering of internal structures of hammerhead shark, from the digital fish library (http://www.digitalfishlibrary.org/library/viewimage.php?id=2851) the eton myers collection (http://etonmyers.bham.ac.uk/3d-models.html) the eton myers collection of ancient egyptian art is housed at eton college, and a project to threedimensionally digitize the items for public access was undertaken via collaboration between that institution and the university of birmingham. digitization was accomplished with threedimensional laser scanners, data was then processed with geomagic software to produce point cloud and mesh forms, and individual datasets were reduced in size and converted into an appropriate file type to allow for public access (chapman, gaffney, and moulden 2010). usability of the eton myers collection website is extremely low. the initial interface is simply a list of three-dimensional models by item number with a description of how to download the appropriate program and files. another website from the university of birmingham (http://mimsy.bham.ac.uk/info.php?f=option8&type=browse&t=objects&s=the+eton+myers+col lection) contains a more museum-like interface, but contains many more records for objects than are contained on the initial list of three-dimensional models. moreover, most of the records do not even include pictures of the items, let alone links to the three-dimensional models, and the records that do include pictures do not necessarily include such links. even when a record has a link to the three-dimensional model, it actually redirects to the full list of models rather than to the individual item. there is no search functionality from the initial list of three-dimensional models, and no way to browse other than to, colloquially speaking, poke and hope. individual items are only identified by item number, and, aside from the few records that have accompanying pictures on the university of birmingham site, there is no way to know to what item any given number refers. the http://www.digitalfishlibrary.org/library/viewimage.php?id=2851 http://etonmyers.bham.ac.uk/3d-models.html http://mimsy.bham.ac.uk/info.php?f=option8&type=browse&t=objects&s=the+eton+myers+collection http://mimsy.bham.ac.uk/info.php?f=option8&type=browse&t=objects&s=the+eton+myers+collection information technology and libraries | june 2016 49 website attains only a low rating for content; although it seems that there may be a decent number of items in the collection, it is impossible to know for certain given the problems with the interface and the fact that individual items are virtually unidentified. 
the eton myers collection website also receives a low rating for functionality. in order to access three-dimensional models of items, users must download and install a program called meshlab, then download individual folders of compressed files, then unzip those files, and finally open the appropriate file in meshlab. despite compression, some of the file folders are still quite large and take some time to download. presentation of the items is also rated low. even for the high resolution versions of the three-dimensional renderings, viewed in meshlab, the geometry of the objects seems underdeveloped (e.g., hieroglyphics are illegible) and surface textures are not well mapped (e.g., colors are completely off). this is evident from a comparison of the three-dimensional rendering with a two-dimensional photograph of the same item, as in figure 2, below. figure 2. comparison of original photograph (left) and three-dimensional rendering (right) of item number ecm 361, from the eton myers collection (http://mimsy.bham.ac.uk/detail.php?t=objects&type=ext&f=&s=&record=0&id_number=ecm+361&op-earliest_year=%3d&op-latest_year=%3d). notably, chapman, gaffney, and moulden (2010) indicate that the detailed three-dimensional imaging enabled them to identify tooling marks and read previously unclear hieroglyphics on certain items. thus, it is possible that the problems with the renderings may be a result of a loss in quality between the original models and the downloaded versions, particularly given that the files were reduced in size and converted prior to being made available for download. epigraphia 3d project (http://www.epigraphia3d.es) the epigraphia 3d project was created to present an online collection of various historical roman epigraphs (also known as inscriptions) that were discovered and excavated in spain and italy; the physical collection is housed at the museo arqueológico nacional (madrid). digital imaging was accomplished using photogrammetry, free software was utilized to create three-dimensional object models and renderings, and photoshop was used to obtain appropriate textures. finally, the three-dimensional model was published on the web using sketchfab, a web service similar to flickr that allows in-browser viewing of three-dimensional renderings in many different formats (ramírez-sánchez et al. 2014). the epigraphia 3d website is intuitive and informative. browsing is simple because there are not many records, but, although it is possible to search the website, there is no search function specifically directed to the collection. thus, usability is rated as medium. despite the fact that the website provides descriptions of the project and the collection, as well as information about epigraphs generally, the website attains a medium rating for content in light of the small size of the collection and the limited information given for each individual item. however, the epigraphia 3d website receives very high ratings for functionality and presentation. the individual three-dimensional models are detailed, legible, and interactive.
individual inscriptions are transcribed for each item. the use of sketchfab to display the models is effective; no downloading is necessary, and it takes an acceptable amount of time to load. when viewing the item, users can rotate the object in either "orbit" or "first person" mode, as well as view it full-screen or within the browser window. users can also display the wireframe model and the textured or surfaced rendering, as shown in figure 3 below. figure 3. three-dimensional textured (left) and wireframe (middle) renderings from the epigraphia 3d project (http://www.epigraphia3d.es/3d-01.html), as compared to an original two-dimensional photograph of the same object (right) (http://eda-bea.es/pub/record_card_1.php?refpage=%2fpub%2fsearch_select.php&quicksearch=dapynus&rec=19984). smithsonian x 3d (http://3d.si.edu) the smithsonian x 3d project, although affiliated with all of the smithsonian's varying divisions, was created to test the application of three-dimensional digitization techniques to "iconic collection objects" (http://3d.si.edu/about). the website provides significant detail concerning the project itself, mostly in the form of videos, and individual items, many of which are linked to "tours" that incorporate a story about the object. content is rated as medium because, despite the depth of information provided about individual items, there are still very few items within the collection. the website also receives a medium rating for usability, given the simple browsing structure, easy navigation, and lack of a search feature (all likely due at least in part to the limited content). functionality and presentation, however, are rated high. the x3d explorer in-browser software (powered by autodesk) does more than simply display a three-dimensional rendering of an object; it also permits users to edit the model by changing color, lighting, texture, and other variables as well as incorporates detailed information about each item, both as an overall description and as a slide show, where snippets of information are connected to specific views of the item. the individual three-dimensional models are high resolution, detailed, and well-rendered, with very good surface texture mapping. however, it must be noted that the x3d explorer tool is in beta and, as such, still has some bugs; for example, this author has observed a model disappear while zooming in on the rendering. table 3, below, summarizes the results of the evaluation.
mfa: functionality medium; usability very high; presentation low; content very high
dfl: functionality medium; usability high; presentation medium; content high
eton myers: functionality low; usability low; presentation low; content low
epigraphia 3d: functionality very high; usability medium; presentation very high; content medium
smithsonian x 3d: functionality high; usability medium; presentation high; content medium
table 3. summary of evaluation results for each website by individual criteria
discussion based on the evaluation of the five websites described above, some suggested best practices for the digitization and presentation of three-dimensional objects become apparent. when digitizing, the museum should utilize the method that best suits the object or collection.
for example, while mri scanning is likely the best method for three-dimensionally digitizing biological fish specimens, it is not going to be effective or feasible for digitizing artwork or artifacts (abel et al. 2011; berquist et al. 2012). regardless of the method of digitization used, however, the people conducting the imaging and processing should fully comprehend the hardware and software necessary to complete the task. additionally, although financial restraints must be considered, museums should note that some three-dimensional scanning equipment is just as economically feasible as standard digital cameras (metallo and rossi 2011). however, if a museum chooses to utilize only two-dimensional imaging, http://3d.si.edu/ http://3d.si.edu/about let’s get virtual: examination of best practices to provide public access to digital versions of three-dimensional objects | johnson | doi:10.6017/ital.v35i2.9343 52 each item should be photographed from multiple angles in high resolution, to avoid creating a website, like the mfa’s, on which everything other than the object itself is presented outstandingly. further, museums deciding on two-dimensional imaging should explore the possibility of utilizing photogrammetry to create three-dimensional models from their twodimensional photographs, like the epigraphia 3d project. there is free or inexpensive software that functions to permit the creation of three-dimensional object maps from very few photographs (ramírez-sánchez et al. 2014). finally, compatibility is a key issue when conducting threedimensional scans; the museum should ensure that the software used for rendering models is compatible with the way in which users will be viewing the models. in the context of public access to the museum’s digital collections, the website should be easy and intuitive to navigate. the mfa website is an excellent example; browsing and search functions should both be present, and reorganization of large numbers of objects into separate collections may be necessary. where searching is going to be the primary point of entry into the collection, it is important to have sufficient metadata and functional search algorithms to ensure that item records are findable. furthermore, remember that the website is simply a way to access the museum itself. hence, the collections on the website, like the collections in the physical museum, should be curated; there should be a logical flow to accessing object records. the museum may also want to have sections that are similar to virtual exhibitions, like the “tours” provided by the smithsonian x 3d project. finally, museums should ensure that no additional technological know-how (beyond being able to access the internet) is required to access the three-dimensional content in object records. users should not be required to download software or files to view records; epigraphia 3d’s use of sketchfab and the smithsonian’s x 3d explorer tool are both excellent examples of ways in which three-dimensional content can be viewed on the web without the need for extraneous software. museums and cultural heritage institutions are increasing the focus on providing public access to collections via digitization and display on websites (given and mctavish 2010). in order to do this effectively, this paper has attempted to provide some guidance as to best practices of presenting digital versions of three-dimensional objects. in closing, however, it must be noted that this author is not a technician. 
although this paper has tried to contend with the issues from the perspective of a librarian, there are complicated technical concerns behind any digitization project that have not been adequately addressed. in addition, this paper has not examined the role of budgetary constraints on digitization or the concomitant issues of creating and maintaining websites. moreover, because this paper has been treated as a broad overview of the digitization and presentation for public access of three-dimensional objects, the five websites evaluated were from varying fields of study. museums should look to more specific comparisons in order to appropriately digitize and present their collections on the web. conclusion there may not be a direct substitute for encountering an object in person, but for people who cannot obtain physical access to three-dimensional objects, the digital realm can serve as an adequate proxy. this paper has demonstrated, through an evaluation of five distinct digital collections, that utilizing three-dimensional imaging and presenting three-dimensional models of physical objects on the web can serve the important purpose of increasing public access to otherwise unavailable collections. information technology and libraries | june 2016 53 references abate, d., r. ciavarella, g. furini, g. guarnieri, s. migliori, and s. pierattini. “3d modeling and remote rendering technique of a high definition cultural heritage artefact.” procedia computer science 3 (2011): 848–52. http://dx.doi.org/10.1016/j.procs.2010.12.139. abel, r. l., s. parfitt, n. ashton, simon g. lewis, beccy scott, and c. stringer. “digital preservation and dissemination of ancient lithic technology with modern micro-ct.” computers and graphics 35, no. 4 (august 2011): 878–84. http://dx.doi.org/10.1016/j.cag.2011.03.001. berquist, rachel m., kristen m. gledhill, matthew w. peterson, allyson h. doan, gregory t. baxter, kara e. yopak, ning kang, h.j. walker, philip a. hastings, and lawrence r. frank. “the digital fish library: using mri to digitize, database, and document the morphological diversity of fish.” plos one 7, no. 4: (april 2012). http://dx.doi.org/10.1371/journal.pone.0034499. bincsik, monika, shinya maezaki, and kenji hattori. “digital archive project to catalogue exported japanese decorative arts.” international journal of humanities and arts computing 6, no. 1– 2 (march 2012): 42–56. http://dx.doi.org/10.3366/ijhac.2012.0037. cameron, fiona. “digital futures i: museum collections, digital technologies, and the cultural construction of knowledge.” curator: the museum journal 46, no. 3 (july 2003): 325–40. http://dx.doi.org/10.1111/j.2151-6952.2003.tb00098.x. chane, camille simon, alamin mansouri, franck s. marzani, and frank boochs. “integration of 3d and multispectral data for cultural heritage applications: survey and perspectives.” image and vision computing 31, no. 1 (january 2013): 91–102. http://dx.doi.org/10.1016/j.imavis.2012.10.006. chapman, henry p., vincent l. gaffney, and helen l. moulden. “the eton myers collection virtual museum.” international journal of humanities and arts computing 4, no. 1–2 (october 2010): 81–93. http://dx.doi.org/10.3366/ijhac.2011.0009. dellepiane, m., m. callieri, f. ponchio, and r. scopigno. “mapping highly detailed colour information on extremely dense 3d models: the case of david's restoration.” computer graphics forum 27, no. 8 (december 2008): 2178–87. http://dx.doi.org/10.1111/j.14678659.2008.01194.x. given, lisa m., and lianne mctavish. 
“what’s old is new again: the reconvergence of libraries, archives, and museums in the digital age.” library quarterly 80, no. 1 (january 2010): 7– 32. http://dx.doi.org/10.1086/648461. hariri, nadjla, and yaghoub norouzi. “determining evaluation criteria for digital libraries’ user interface: a review.” the electronic library 29, no. 5 (2011): 698–722. http://dx.doi.org/10.1108/02640471111177116. hess, mona, francesca simon millar, stuart robson, sally macdonald, graeme were, and ian brown. “well connected to your digital object? e-curator: a web-based e-science platform for museum artefacts.” literary and linguistic computing 26, no. 2 (2011): 193– 215. http://dx.doi.org/10.1093/llc/fqr006. http://dx.doi.org/10.1016/j.cag.2011.03.001 http://dx.doi.org/10.1371/journal.pone.0034499 http://dx.doi.org/10.3366/ijhac.2012.0037 http://dx.doi.org/10.1111/j.2151-6952.2003.tb00098.x http://dx.doi.org/10.1016/j.imavis.2012.10.006 http://dx.doi.org/10.3366/ijhac.2011.0009 http://dx.doi.org/10.1111/j.1467-8659.2008.01194.x http://dx.doi.org/10.1111/j.1467-8659.2008.01194.x http://dx.doi.org/10.1086/648461 http://dx.doi.org/10.1108/02640471111177116 http://dx.doi.org/10.1093/llc/fqr006 let’s get virtual: examination of best practices to provide public access to digital versions of three-dimensional objects | johnson | doi:10.6017/ital.v35i2.9343 54 holovachov, oleksandr, andriy zatushevsky, and ihor shydlovsky. “whole-drawer imaging of entomological collections: benefits, limitations and alternative applications.” journal of conservation and museum studies 12, no. 1 (2014): 1–13. http://dx.doi.org/10.5334/jcms.1021218. hunter, jane, and anna gerber. 2010. “harvesting community annotations on 3d models of museum artefacts to enhance knowledge, discovery and re-use.” journal of cultural heritage 11, no. 1 (2010): 81–90. http://dx.doi.org/10.1016/j.culher.2009.04.004. jarrell, michael c. “providing access to three-dimensional collections.” reference & user services quarterly 38, no. 1 (1998): 29–32. kravchyna, victoria, and sam k. hastings. “informational value of museum web sites.” first monday 7, no. 4 (february 2002). http://dx.doi.org/10.5210/fm.v7i2.929. kuzminsky, susan c. and megan s. gardiner. “three-dimensional laser scanning: potential uses for museum conservation and scientific research.” journal of archaeological science 39, no. 8 (august 2012): 2744–51. http://dx.doi.org/10.1016/j.jas.2012.04.020. lerma, josé luis, and colin muir. “evaluating the 3d documentation of an early christian upright stone with carvings from scotland with multiples images.” journal of archaeological science 46 (june 2014): 311–18. http://dx.doi.org/10.1016/j.jas.2014.02.026. louw, marti, and kevin crowley. “new ways of looking and learning in natural history museums: the use of gigapixel imaging to bring science and publics together.” curator: the museum journal 56, no. 1 (january 2013): 87–104. http://dx.doi.org/10.1111/cura.12009. metallo, adam, and vince rossi. “the future of three-dimensional imaging and museum applications.” curator: the museum journal 54, no. 1 (january 2011): 63–69. http://dx.doi.org/10.1111/j.2151-6952.2010.00067.x. montani, isabelle, eric sapin, richard sylvestre, and raymond marquis . “analysis of roman pottery graffiti by high resolution capture and 3d laser profilometry.” journal of archaeological science 39, no. 11 (2012): 3349–53. http://dx.doi.org/10.1016/j.jas.2012.06.011. newell, jenny. 
“old objects, new media: historical collections, digitization and affect.” journal of material culture 17, no. 3 (september 2012): 287–306. http://dx.doi.org/10.1177/1359183512453534. novati, gianluca, paolo pellegri, and raimondo schettini. “an affordable multispectral imaging system for the digital museum.” international journal on digital libraries 5, no. 3 (may 2005): 167–78. http://dx.doi.org/10.1007/s00799-004-0103-y. pallas, john, and anastasios a. economides. “evaluation of art museums' web sites worldwide.” information services and use 28, no. 1 (2008): 45–57. http://dx.doi.org/10.3233/isu2008-0554. parandjuk, joanne c. “using information architecture to evaluate digital libraries.” the reference librarian 51, no. 2 (2010): 124–34. http://dx.doi.org/10.1080/02763870903579737. http://dx.doi.org/10.5334/jcms.1021218 http://dx.doi.org/10.1016/j.culher.2009.04.004 http://dx.doi.org/10.5210/fm.v7i2.929 http://dx.doi.org/10.1016/j.jas.2012.04.020 http://dx.doi.org/10.1016/j.jas.2014.02.026 http://dx.doi.org/10.1111/cura.12009 http://dx.doi.org/10.1111/j.2151-6952.2010.00067.x http://dx.doi.org/10.1016/j.jas.2012.06.011 http://dx.doi.org/10.1177/1359183512453534 http://dx.doi.org/10.1007/s00799-004-0103-y http://dx.doi.org/10.3233/isu-2008-0554 http://dx.doi.org/10.3233/isu-2008-0554 http://dx.doi.org/10.1080/02763870903579737 information technology and libraries | june 2016 55 pavlidis, george, anestis koutsoudis, fotis arnaoutoglou, vassilios tsioukas, and christodoulos chamzas. “methods for 3d digitization of cultural heritage.” journal of cultural heritage 8, no. 1 (2007): 93–98, http://dx.doi.org/10.1016/j.culher.2006.10.007. ramírez-sánchez, manuel, josé-pablo suárez-rivero, and maría-ángeles castellano-hernández. “epigrafía digital: tecnología 3d de bajo coste para la digitalización de inscripciones y su acceso desde ordenadores y dispositivos móviles.” el profesional de la información 23, no. 5 (2014): 467–74. http://dx.doi.org/10.3145/epi.2014.sep.03. saracevic, tefko. “digital library evaluation: toward an evolution of concepts.” library trends 49, no. 3 (2000): 350–69. srinivasan, ramesh, robin boast, jonathan furner, and katherine m. becvar. “digital museums and diverse cultural knowledges: moving past the traditional catalog.” the information society 25, no. 4 (2009): 265–78, http://dx.doi.org/10.1080/01972240903028714. xie, hong iris. “users’ evaluation of digital libraries (dls): their uses, their criteria, and their assessment.” information processing and management 44, no. 3 (may 2008): 1346–73, http://dx.doi.org/10.1016/j.ipm.2007.10.003. http://dx.doi.org/10.1016/j.culher.2006.10.007 http://dx.doi.org/10.3145/epi.2014.sep.03 http://dx.doi.org/10.1080/01972240903028714 http://dx.doi.org/10.1016/j.ipm.2007.10.003 introduction current trends and goals in the development of makerspaces at new england college and research libraries ann marie l. davis information technology and libraries | june 2018 94 ann marie l. davis (davis.5257@osu.edu) is faculty librarian of japanese studies at the ohio state university. abstract this study investigates why and which types of college and research libraries (crls) are currently developing makerspaces (or an equivalent space) for their communities. based on an online survey and phone interviews with a sample population of crls in new england, i found that 26 crls had or were in the process of developing a makerspace in this region. in addition, several other crls were actively promoting and diffusing the maker ethos. 
of these libraries, most were motivated to promote open access to new technologies, literacies, and stem-related knowledge. introduction and overview makerspaces, alternatively known as hackerspaces, tech shops, and fab labs, are trendy new sites where people of all ages and backgrounds gather to experiment and learn. born of a global community movement, makerspaces bring the do-it-yourself (diy) approach to communities of tinkerers using technologies including 3d printers, robotics, metaland woodworking, and arts and crafts.1 building on this philosophy of shared discovery, public libraries have been creating free programs and open makerspaces since 2011.2 given their potential for community engagement, college and research libraries (crls) have also been joining the movement in growing numbers.3 in recent years, makerspaces in crls have generated positive press in popular and academic journals. despite the optimism, scholarly research that measures their impact is sparse. for example, current library and information science literature overlooks why and how various crls choose to create and maintain their respective makerspace. likewise, there is scant data on the institutional objectives, frameworks, and experiences that characterize current crl makerspace initiatives.4 this study begins to fill this gap by investigating why and which types of crls are creating makerspaces (or an equivalent room or space) for their library communities. specifically, it focuses on libraries at four-year colleges and research universities in new england. throughout this study, makerspace is used interchangeably with other terms, including maker labs and innovation spaces, to reflect the variation in names and objectives that underlie the current trends. in exploring their motives and experiences, this article provides a snapshot of the current makerspace movement in crls. mailto:davis.5257@osu.edu current trends and goals in the development of makerspaces | davis 95 https://doi.org/10.6017/ital.v37i2.9825 the study finds that the number of crls actively involved in the makerspace movement is growing. in addition to more than two dozen that have or are in the process of developing a makerspace, another dozen crls have staff who support the diffusion of maker technologies, such as 3d printing and crafting tools that support active learning and discovery, in the campus library and beyond.5 comprising research and liberal arts schools, public and private, and small and large, the crls involved with makerspaces are strikingly diverse. despite these differences, this population is united by common objectives to promote new literacies, provide open access to new technologies, and foster a cooperative ethos of making. literature review the body of literature on library makerspaces is brief, descriptive, and often didactic. given the newness of the maker movement in public and academic libraries, many articles focus on early success stories and defining the movement vis-à-vis the mission of the library. for instance, laura britton, known for having created the first makerspace in a public library (the fayetteville free library’s fabulous laboratory), defines a makerspace as “a place where people come together to create and collaborate, to share resources, knowledge, and stuff.”6 this definition, she determines, is strikingly similar to that of the library. most literature on makerspaces appears in academic blogs, professional websites, and popular magazines. 
among the most frequently cited is tj mccue’s article, which celebrates britton’s (née smedley) fablab while distilling the intellectual underpinnings of the makerspace ethos.7 phillip torrone, editor of make: magazine, supports smedley’s project as an example of “rebuilding” or “retooling” our public spaces.8 within this camp, david lankes, professor of information studies at syracuse university, applauds such work as activist and community-oriented librarianship.9 many authors emphasize the philosophical “fit,” or intersection, of public makerspaces with the principles of librarianship. building on torrone’s work, j. l. balas claims that creating access to resources for learning and making is in keeping with the “library’s historical role of providing access to the ‘tools of knowledge.’”10 others emphasize the hands-on, participatory, and intergenerational features of the maker movement, which has the potential to bridge the digital divide.11 still others identify areas of literacy, innovation, and ste(a)m skills where library makerspaces can have a broad impact. while public libraries often focus on early childhood or adult education, crls adopt separate frameworks for information literacy. like public libraries, they aim to build (meta)literacies and ste(a)m skills. nevertheless, their programs often tailor to curricular goals in the arts and sciences or specialized degrees in engineering, education, and business. this is especially true of crls situated within large, research-intensive universities. considering their specific missions and aims, this study seeks to identify the goals and challenges that reinforce the development of makerspaces in undergraduate and research environments. research design and method data presented in this study was gathered from library directors (or their designees) through an online survey and oral telephone interviews. after choosing a sampling frame of crls in new england, i developed a three-path survey, sent invitations, and collected and analyzed data using the online platform surveymonkey. the survey was distributed following review by the information technology and libraries | june 2018 96 institutional review board (irb) at southern connecticut state university, where i completed a master of library science (mls) degree. survey population to assess generalized findings for the larger population in north america, i chose a clustersampling approach that limited the survey population to the crls in new england. in generating the sampling frame, i included four-year and advanced-degree institutions based on the assumption that libraries at these schools supported specialized, research, or field-specific degrees. i omitted for-profit and two-year institutions, based on the assumption that they are driven by separate business models. this process generated a contact list of 182 library directors at the designated crls in connecticut, maine, massachusetts, new hampshire, rhode island, and vermont. survey design the purpose of the survey was to gather basic data about the size and structure of the respondents’ institutions and to gain insights on their views and practices regarding makerspaces (the survey is reproduced in the appendix). the first page of the survey contained a statement of consent, including my contact information and that of my irb. after a short set of preliminary questions, the survey branched into one of three paths based on respondents’ answers about makerspaces. 
the respondents were thus categorized into one of three groups: path one (p1) for those with no makerspace and no plans to create one, path two (p2) for those with plans to develop a makerspace in the near future, and path three (p3) for those already running a makerspace in their libraries. p3 was the longest section of the survey, containing several questions about p3 experiences with makerspaces such as staffing, programing, and objectives. data collection in summer 2015, brief email invitations and two reminders were sent to the targeted population.12 to increase the participation rate, i sometimes wrote personal emails and made direct phone calls to crls known to have makerspace. for cold-call interviews, i developed a script explaining the nature of the online survey. after obtaining informed consent, i proceeded to ask the questions in the online survey and manually enter the participants’ responses at the time of the interview. on a few occasions, online respondents followed up with personal emails volunteering to discuss their library’s experiences in more detail. i took advantage of these invitations, which often provided unique and welcome insights. in analyzing the responses, i used tabulated frequencies for quantitative results and sorted qualitative data into two different categories. the first category was identified as “short and objective” and coded and analyzed numerically. the longer, more “subjective and value-driven” data was analyzed for common trends, relationships, and patterns. within this second category, i also identified outlier responses that suggested possible exceptions to common experiences. results the survey closed after one month of data collection. at this time, 55 of 182 potential respondents had participated, yielding a response rate of 30.2%. among these participants, the survey achieved a 100.0% response rate (9 completed surveys of 9 targeted crls) among libraries that were current trends and goals in the development of makerspaces | davis 97 https://doi.org/10.6017/ital.v37i2.9825 currently operating makerspaces. i created a list of all known crl makerspaces in new england based on an exhaustive website search of all crls in this region. subsequent interviews with the managers of the makerspaces on this list revealed no other hidden or unknown makerspaces in this region. of the 55 respondents, 29 (52.7%) were in p1, 17 (30.9%) were in p2, and 9 (16.4%) were in p3. (see figure 1.) figure 1. survey participants’ (n = 55) current crl efforts and plans to develop and operate a makerspace. among respondents in p2 and p3, the majority (13 of 23) indicated that they were from libraries that served a student population of 4,999 people or fewer, while only one library served a population of 30,000 or more (see figure 2). in terms of sheer numbers, makerspaces might seem to be gaining traction at smaller crls, but proportionally, one cannot say that smaller crls are adopting makerspaces at a higher rate because the majority of survey participants have student populations of 19,999 or less (51, or 91.1%). the number of institutions with populations over 20,000 were in a clear minority (5, or 8.9%). (see figure 3.) information technology and libraries | june 2018 98 figure 2. p2 and p3 crls with makerspaces or concrete plans to develop a makerspace. figure 3. the majority of crls (67.2%) that participated in the survey had a population of 4,999 students or less. only 1.8% of schools that participated had a population of 30,000 students or more. 
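the path percentages reported in this section can be reproduced with a few lines of code; the sketch below is only an illustration of the tabulation described under data collection, and the list literal stands in for the actual surveymonkey export, which is not reproduced here.

```python
# minimal sketch of tabulating survey paths; the list literal stands in for
# the real exported responses.
from collections import Counter

responses = ["p1"] * 29 + ["p2"] * 17 + ["p3"] * 9   # 55 completed surveys
invited = 182

counts = Counter(responses)
print(f"response rate: {len(responses) / invited:.1%}")   # 30.2%
for path in ("p1", "p2", "p3"):
    share = counts[path] / len(responses)
    print(f"{path}: {counts[path]} ({share:.1%})")        # 52.7%, 30.9%, 16.4%
```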
current trends and goals in the development of makerspaces | davis 99 https://doi.org/10.6017/ital.v37i2.9825 crls with no makerspace (p1 = 29) in the first part of the survey, the majority of p1 respondents demonstrated positive views toward makerspaces despite having no plans to create one in the near future. budgetary and space limitations aside, many were relatively open to the possibility of developing a makerspace in a more distant future. in the words of one respondent, “we have several areas within the library that present a heavy demand on our budget. in [the] future, we would love to consider a makerspace, and whether it would be a sensible and appropriate investment that would benefit our students.” when asked what their reasons were for not having a makerspace, some respondents (8, or 27.6%) said they had not given it much thought, but most (21, or 72.4%) offered specific answers. among these, the most frequently cited reason (11, or 37.8%) was that a library makerspace would be redundant: such spaces and labs were already offered in other departments within the institution or in the broader community. at one crl, for example, the respondent said the library did not want to compete with faculty initiatives elsewhere on campus. other reasons included that makerspaces were expensive and not a priority. some (5, or 17.2%) libraries preferred to allocate their funds to different types of spaces such as “a very good book arts studio/workshop” or “simulation labs.” some (6, or 20.6%) shared concerns about a lack of space, staff, or simply “a good culture of collaboration [on campus].” merging these sentiments, one respondent concluded, “people still need the library to be fairly quiet. . . . having makerspace equipment in our library would be too distracting.” while some were skeptical (sharing concerns about potential hazards or that makerspaces were simply “the flavor of the month”), the majority (roughly 60%) were open and enthusiastic. one respondent, in fact, held a leadership position in a community makerspace beyond campus. according to this librarian, 3d printers, scanners, and laser cutters were sure to become more common, and crls would no doubt eventually develop “a formal space for making stuff.” crls with plans for a makerspace in the near future (p2 = 17) the second section of the survey (p2) focused primarily on the motivations and means by which this cohort planned to develop a makerspace. when asked why they were creating a makerspace, the most common response was to promote learning and literacy (15 respondents, or 88.2%). in addition, a large majority (12 respondents, or 70.6%) felt that makerspaces helped to promote the library as relevant, particularly in the digital age. three more reasons that earned top scores (10 respondents each, or 58.2%) were being inspired by the ethos of making, creating a complement to digital repositories and scholarship initiatives, and providing access to expensive machines or tools. additional reasons included building outreach and responding to community requests.13 (see figure 4.) information technology and libraries | june 2018 100 figure 4. rationale behind p2 respondents’ decision to plan a makerspace (n = 17). while p2 respondents indicated a clear decision to create a makerspace, their timeframes were noticeably different. 
i categorized their open responses into one of six timeframes: “within six months,” “within one year,” “within two years,” “within four years,” “within six years,” and “unknown.” the result presented a clear trimodal distribution with three subgroups: six crls with plans to open within 18 months, five with plans to open within the next two years, and six with plans to open after three or more years (see figure 5). in addition to their timeframe, p2 respondents were also asked about their plans for financing their future makerspaces. based on their open responses, the following six funding sources emerged: • the library budget, including surplus moneys or capital project funds • internal funding, including from campus constituents • donations and gifts • external grants • cost recovery plans, including small charges to users • not sure/in progress current trends and goals in the development of makerspaces | davis 101 https://doi.org/10.6017/ital.v37i2.9825 figure 5. p2 respondents’ timeframe for developing the makerspace (n = 17). with seven mentions, the most common of the above funding was the “library budget.” with two mentions each, the least common sources were “cost recovery” and “not sure/in progress.” among those who mentioned external grant applications, one respondent mentioned a focus on women and stem opportunities, and another specifically discussed attempts at grants from the institute of museum and library services. (see figure 6.) figure 6. p2respondents’ plans for gathering and financing makerspace (n = 17). regarding target user groups, some respondents focused on opportunities to enhance specific disciplinary knowledge, while others emphasized a general need for creating a free and open environment. one respondent mentioned that at her state-funded library, the space would be “geared to younger [primary and secondary school] ages,” “student teachers,” and “librarians on practicum assignments.” by contrast, another respondent at a large, private, carnegie r1 information technology and libraries | june 2018 102 university emphasized that the space was earmarked for the undergraduate and graduate students. in contrast to the cohort in p1, a notable number in p2 chose to create a makerspace despite the existence of maker-oriented research labs elsewhere on campus. as one respondent noted, the university was still “lacking a physical space where people could transition between technologies” and an open environment “where students doing projects for faculty” could come, especially later in the evenings. another respondent at a similarly large, private institution explained that his colleagues recognized that most labs at their university were earmarked for specific professional schools. as a result, his colleagues came up with a strategy to provide self-service 3d printing stations at the media center, located in the library at the heart of campus. crls with operating makerspaces (p3 = 9) the final section of the survey (p3) focused on the motivations and means by which crls with makerspaces already in operation chose to develop and maintain their sites. in addition, this section gathered information on p3 crl funding decisions, service models, and types of users in their makerspaces. of the nine respondents in this path, all had makerspaces that had opened within the last three years. among these, roughly a third (4) had been in operation from one to two years; another third (3) had operated for two to three years; and two had opened within the last year. (see table 1.) 
table 1. length of time the crl makerspace has been in operation for p3 respondents (n = 9). age of crl makerspace or lab—p3 answer options responses % less than 6 months 1 11.1 6–12 months 1 11.1 1–2 years 4 44.4 2–3 years 3 33.3 more than 3 years 0 0.0 total responses 9 100.0 priorities and rationale the reasons behind p3 decisions to make a makerspace were slightly different from those of p2. while “promoting literacy and learning” was still a top priority, two other reasons, “promoting the maker culture of making” and “providing access to expensive machinery,” were deemed equally important (6 respondents, or 66.7%, for each). other significant priorities included “promoting community outreach” (4 respondents, or 44.4%), “promoting the library as relevant” and in “direct response to community requests” (3 respondents, or 33.3%, for each). (see figure 7.) current trends and goals in the development of makerspaces | davis 103 https://doi.org/10.6017/ital.v37i2.9825 figure 7. rationale behind p3 respondents’ decision to develop and maintain a makerspace (n = 9). the answer of “other” was also given top priority (5 respondents, or 55.6%). i conclude that this indicated a strong desire among respondents to express in their own words their library’s unique decisions and circumstances. (their free responses to this question are discussed below.) a familiar theme in the responses of the five respondents who elaborated on their choice of “other” was the desire to situate a makerspace in the central and open environment of the campus library. as one participant noted, there were “other access points and labs on campus,” but those labs were “more siloed” or cut off from the general population. by contrast, the campus library aimed to serve a broader population and anticipated a general “student need.” later, the same respondent added that the makerspace was an opportunity to promote social justice, cultivate student clubs, and encourage engagement at the hub of the campus community. this type of ecumenical thinking was manifested in a similar remark that the library’s role was to reinforce other learning environments on campus. one respondent saw the makerspace as an additional resource “that complemented the maker opportunities that we have had in our curriculum resource center for decades.” likewise, the library makerspace was intended to offer opportunities to a range of users on campus and beyond. funding, staffing, and service models when prompted to discuss how they gathered the resources for their makerspaces, the largest group (4 respondents) stated that a significant means for funding was through gifts and donations. thus, the majority of crl makerspaces in new england depended primarily on contributions from friends of the library, university/college alumni, and donors. the second most common source (3 respondents) was through the library budget, including surplus money at the end of the year. making use of grant money and cost recovery were mentioned by two library participants, and internal and constituent support was useful for two libraries. (see figure 8.) information technology and libraries | june 2018 104 figure 8. p3 methods for gathering and financing a makerspace (n = 9). among these, a particularly noteworthy case was a makerspace that had originated from a new student club focused on 3d printing. originally based in a student dorm, the club was funded by a campus student union, which allocated grant money to students through a budget derived from the college tuition. 
as the club quickly grew, it found significant support in the library, which subsequently provided space (on the top floor of the library), staff, and financial support from surplus funds in the library budget. as this example would suggest, the sum of the responses showed that financing the makerspaces depended on a combination of strategies. one participant summarized it best: “we’ve slowly accumulated resources over time, using different funding for different pieces. some grant funding. mostly annual budget.” regarding service models, more than half of these libraries (five) currently offer a combination of programming and open lab time where users could make appointments or just drop in. by contrast, two of the libraries offered programs only, and did not offer an open lab; another two did the opposite, offering no programming but an open makerspace at designated times. of the latter, one is open monday to friday from 8 a.m. to 4 p.m., and the other is open during regular hours, with spaces that “can be booked ahead for classes or projects.” most labs supported drop-in visitors and were open evenings and weekends. at one makerspace, where there was increasingly heavy demand, the staff required students to submit proposals with project goals. (see table 2.) while some libraries brought in community experts, others held faculty programs, and some scheduled lab time for individual classes. one makerspace prioritized not only the campus, but also the broader community, and thus featured programs for local high schools and seniors. responses from this library emphasized the social justice thread that inspired their work and the community culture that they aimed to foster. current trends and goals in the development of makerspaces | davis 105 https://doi.org/10.6017/ital.v37i2.9825 table 2. model for services offered in the crl makerspace or 3d printing lab do you offer programs in the makerspace/lab or is it simply opened at defined times for users to use? answer options responses % yes, we offer the following types of programs. 2 22.2 no, we simply leave the makerspace/lab open at the specific times. 2 22.2 we do both. we offer the programs and leave the makerspace/lab open at specific times. 5 55.6 as this data would suggest, most makerspaces were used by students (undergraduates and graduates) and faculty, in addition to local experts and generational groups. survey responses showed that undergraduate students were the most common users (9 of 9 respondents checked this group as the most frequent type of user), and faculty and graduate students were the second and third most common (8 of 9 respondents checked these groups as most frequent) user groups in the labs. local entrepreneurs, artists, designers, craftspeople, and campus and library staff also use the makerspaces. (see figure 9.) when prompted to identify “other” categories, one respondent specifically listed “learners, makers, sharers, studiers, [and] clubs.” figure 9. of the different types of users listed above, p3 respondents ranked them in order of who used the makerspace or equivalent lab most often (n = 9). the number and type of staff that managed and operated the makerspaces also varied widely at the nine crls in p3. seven of the crls employed full-time, dedicated staff, among whom four participants checked off the “dedicated staff”–only options. 
of the remaining two crls, one information technology and libraries | june 2018 106 reported staffing the makerspace with only one student, and one reported not having any staff working in the makerspace. i assume that the makerspace with no employees is managed by staff and students who are assigned to other, unspecified library departments or work groups. (see figure 10.) figure 10. the staffing situations at the p3 respondents (n = 9), where each respondent is assigned a letter from “a” to “i.” library programing was also diverse in terms of targeted audiences, speakers, and learning objectives. instructional workshops varied from 3d scanning and printing to soldering, felt making, sewing, knitting, robotics, and programming (e.g., raspberry pi.) the type of equipment contained in each lab is likely correlated to the range in programming; however, investigating these links was beyond the scope of this study. regarding this equipment, the size and activity of the participant crls varied considerably. some responses were more specific than others, and thus the resulting dataset was incomplete (see table 3.) challenges and philosophies of crl makerspaces the final portion of the survey invited participants to freely offer their thoughts about operating a crl makerspace. what follows below is a summary of the two most prominent themes that emerged: the challenges of building the lab and the social philosophies that framed these initiatives. in terms of challenges, the most common hurdle noted was the tremendous learning curve involved in establishing, maintaining, and promoting a makerspace. setting up some of the 3d printers, for example, required knowledge about electrical networks, computer systems, and safety policies at a federal and local level. once the hardware was running, lab managers needed to know how the machines interfaced with different and challenging software applications. communication skills were also critical, as one respondent reported, “printing anything and everything takes knowledge, experience.” communicating with stakeholders and users in accessible and proactive ways required strong teaching and customer service skills. current trends and goals in the development of makerspaces | davis 107 https://doi.org/10.6017/ital.v37i2.9825 table 3. the types of tools and equipment used at p3 crl respondents (n = 8), which are assigned letters from a to h. major equipment offered by individual library makerspaces or equivalent labs—path 3 crl label response text a die cut machine, 3d printer, 3d pens, raspberry pi, arduino, makey makey, art supplies, sewing supplies, pretty much anything anyone asks for we will try to get. b 2 makerbot replicators, 1 digital scanner, 1 othermill c 3d printing, 3d scanning, and laser cutting. d 3d printing, 3d scanning, laser cutting, vinyl cutting, large format printing, cnc machine, media production/postproduction. e no response f 3 creatorx, 1 powerspec, 3 m3d, 2 replicator 2, 1 replicator2x, 1 makergear, 1 leapfrogxl, 1 ultimaker, 1type a,1 deltaprinter, 1 delta maker, 2 printrbot, 2 filabots, 2x-box kinect for scanning, 2 oculus rifts, embedded systems cabinet with soldering stations, solar panels and micro controllers etc, 1 formlabs sla, 1 muve sla, rova 5, a bunch of quadcopters g 3d printers (4 printers, 3 models), 3d scanning/digitizing equipment (3 models), raspberry pi, arduino, a laser cutter and engraving system, poster printer, digital drawing tablets, gopro, a variety of editing and design software, a number of tools (e.g. 
dremel, soldering iron, wrenches, pliers, hammers, etc.), and a number of consumable or misc. items (e.g. paint, electrical tape, acetone, safety equipment, led lights, screws and nails, etc.) h 48 printers (all makerbot brand), 35 replicator 5th gen (a moderate size printer, 5 replicator z18 printers (larger built size), and 5 replicator minis, 3 replicator 2x) 5 makerbot digitzers (turntable scanners 8" by 8") 1 cubify sense hand scanner 7 still cameras for photogrammetry 21 i-mac computers 2 mac pros 2 wacom graphics tablets (thinking about complementing other resources at other labs on campus) another challenge that often came up was that of managing resources. as one respondent warned, crls should beware the “early adoption of certain technologies,” which can become “quickly information technology and libraries | june 2018 108 outdated by a rapidly growing field.” for others, it was a challenge to recruit the right staff that could run and fix machines in constant need of repair. in addition to hiring people with manufacturing and teaching skills, a successful lab required individuals who were savvy about outreach and community needs. despite such challenges, many respondents were eager to discuss the aspirations and rewards of crl makerspaces. above all, respondents focused on the pedagogical opportunities on the one hand, and the potential for outreach and social justice on the other. one participant conceded that measuring advances in literacy and education was “intangible,” but he saw great value in “giving students the experience of seeing their ideas come to fruition.” the excitement that this created for one student manifested in a buzz, and subsequently a “fever” or groundswell, in which more users came in to tinker and learn. meanwhile, the learning that took place among future professionals on campus was “critical,” even when results did not “go viral.” the aspiration to create human connections within and beyond campus was another striking theme. according to one respondent, the makerspace had “enabled some incredibly fruitful collaborations with different departments on campus.” this “fantastic outcome” was becoming more and more visible as the maker community grew. other crl makerspaces took pride in fostering a type of learning that was explicitly collaborative, exciting, and even “fun” for users. this in turn meant that some libraries were becoming “very popular,” generating a lot of “good pr,” and becoming central in the lives of new types of library users. along these lines, some respondents aimed to leverage the power of the makerspace to achieve social justice goals that resonated with core values of librarianship. according to one enthusiastic participant, the ethos of sharing was alive and strong among the staff and the many students who saw their participation in the lab as a lifestyle and culture of collaborating. in another initiative, the respondent looked forward to eventually offering grants to those users who proposed meaningful ways to use the makerspace to create practical value for the community. from this perspective, there was added value in having the 3d printing lab situated specifically on a college or university campus. 
according to this respondent, the unique quality of the crl makerspace was that by virtue of its location amid numerous and energetic young people, it was ripe for exploitation by those “who had great ideas and time and energy to do good.” discussion the aim of this study was to explore why and which types of crls had developed makerspaces (or an equivalent space) for their communities. of the 56 respondents, roughly half (46%) were p2 and p3 libraries who were currently developing or operating a makerspace, respectively. data from this survey indicated that none of the p2 or p3 crls fit a mold or pattern in terms of their size, educational models, or classifications. upon analyzing the data, i found that the differentiators between the three groups were less clearly defined than originally anticipated. in one example of blurred lines, at least two respondents in p1 indicated that they were more actively engaged with makerspaces than two respondents in p2. despite not having physical labs within their libraries, these p1 respondents were in the process of actively supporting or making plans for a makerspace within their crl community. one p1 respondent, for example, served on the planning board for a local community makerspace and had therefore “thoroughly investigated and used” the makerspace at a current trends and goals in the development of makerspaces | davis 109 https://doi.org/10.6017/ital.v37i2.9825 neighboring university. based on his knowledge, he decided to develop a complementary initiative (e.g., a book arts workshop) at his university library. although his library did not yet have a formal makerspace, he felt confident that the diffusion of 3d printers would come to his library in the near future. another p1 respondent was responsible for administering faculty teaching and innovation grants. among the recent grant recipients were two faculty collaborators who used the library’s funds to build a makerspace at a campus location that was separate from the library. although the makerspace was not directly developed by the respondent’s library, it was nevertheless a direct product of his library’s programmatic support. the respondent reported that for this reason, his library did not want to compete with its own faculty initiatives. in another example of blurred distinctions, one librarian in p2 was as deeply immersed in providing access and education on makerspaces as his colleagues in p3. although he was not clear on when or how his library would finance a future makerspace, his library already offered many of the same services and workshops as p3 libraries. as a “maker in the library,” he offered noncredit-bearing 3d printing seminars to students and offered trial 3d printing services in the library for graduates of the 3d printing seminar. in addition, he made appearances at relevant campus events. when the university museum ran a 3d printing day, for instance, he participated as an expert panelist and gave public demonstrations on library-owned 3d printers and a scanner kinect bar. in sum, despite the respondents’ categorization in p1 and p2, they sometimes shared more in common with the cohorts in p2 and p3, respectively. given their library’s programmatic involvement in creating and endorsing the maker movement, these respondents were more than just “interested” or “open to” the prospect of creating a makerspace. while only 16% of crls (p3 = 9) responded as actively operating a makerspace, another 30% (p2 = 17) were involved in developing a makerspace in the near future. 
moreover, the number of crls formally involved with the diffusion of maker technologies was not limited to just these two groups. although some makerspaces were not directly run by the library, they had come to fruition because of librarybased funding, grants, and professional support. and although some libraries did not have immediate plans for a makerspace, they were already promoting maker technologies and the maker ethos in other significant ways. conclusion this study is one of the first comprehensive and comparative studies on crl makerspace programs and their respective goals, policies, and outcomes. while the number of current crl makerspaces is relatively low, the data suggests that the population is increasing; a growing number of crls are involved in the makerspace movement. more than two dozen crls were planning to develop makerspaces in the near future, helping to diffuse maker technologies through crl programming, and/or supporting nonlibrary maker initiatives on campus and beyond. in addition, some crls were buying equipment, hiring dedicated staff, offering relevant workshops and demonstrations, and supporting community efforts to build labs beyond the library. although the author aimed to find structural commonalities between crls in groups p2 and p3, none were found. respondents in these groups came from institutions of all sizes , a wide variety information technology and libraries | june 2018 110 of endowment levels, and both public and private funding models, and they ranged in emphasis from the liberal arts to professional certifications and graduate-level research. although a majority of crl respondents were not currently making plans to create a makerspace, many respondents were enthusiastic about current trends, and some even promoted the maker movement in unexpected ways. acknowledging the steady diffusion of 3d printers, many anticipated using such technologies in the future to promote traditional library values and goals. respondents in p2 and p3 indicated that their primary rationale for developing a makerspace was to promote learning and literacy. other prominent reasons included promoting library outreach and the maker culture of learning. data from crls with makerspaces indicated that these benefits were often symbiotic and correlated to strong ideas about universal access to emergent tools and practices in learning. unexpected challenges for developing and operating makerspaces include staffing them with highly skilled, knowledgeable, and service-oriented employees. learning the necessary skills— including operating the printers, troubleshooting models, and maintaining a safe environment, to name a few—was time-consuming and labor intensive. the majority of funding for crls with or planning maker labs came from internal budgets, gifts and donors, and some grants. while some p1 crls indicated that their reason for not developing makerspaces was a lack of community interest, p2 and p3 crls were not necessarily motivated by user requests or needs, nor was lack of explicit need or interest a deterrent. on the contrary, a few reported a desire to promote the campus library as ahead of the curve by keeping in front of student and community needs. in a similar contradiction, some p1 respondents reported that their libraries did not want to compete with other labs on campus. respondents from p2 and p3, however, wanted to offer an alternative to the more siloed or structured model of departmentor lab-funded makerspaces. 
although makerspaces were sometimes forming in other parts of campus, some p2 and p3 crls felt there was a gap in accessibility and therefore aimed to offer more open and flexible spaces. a final salient theme among p2 and p3 respondents was their commitment to equity of access and issues of social justice. above all, they saw a unique fit for makerspaces in their crl philosophies to serve the greater good. among other advantages, crls were in a unique position to leverage the power of the makerspaces to take advantage of campus communities of “cognitive surplus” and millennial aspirations to share and create spontaneous communities of knowledge. given the amount of resources that are required to create and maintain a makerspace, this research will be useful for crls considering such a space in the future. the present data suggests that no one type of library currently has a monopoly on maker spaces; regardless of size or funding levels, the common thread among p2 and p3 crls was simply a commitment to providing access to emergent technologies and supporting new literacies. while annual budgets and grant applications were critical for some libraries, the majority of crls funded the bulk of their makerspaces through gifts and donations. future studies on the characteristics and challenges of p2 and p3 populations beyond those in new england will certainly amplify our understanding of these trends. current trends and goals in the development of makerspaces | davis 111 https://doi.org/10.6017/ital.v37i2.9825 appendix: survey questions informed consent current trends in the development of makerspaces and 3d printing labs at new england college and research libraries consent for the participation in a research study southern connecticut state university purpose you are invited to participate in a research project conducted by ann marie l. davis, a masters student in library and information studies at southern connecticut state university. the purpose of this project is to investigate the experiences and goals of college and research libraries (crls) that currently have or are making plans to have an open makerspace (or an equivalent room or space). the results from this study will be included in a special project report for the mls degree and the basis for an article to submit for peer-review. procedures if you decide to participate, you will volunteer to take a fifteen-minute online survey. risks and inconveniences there are no known risks associated with this research; other than taking a short amount of time, the survey should not burden you or infringe on your privacy in any way. potential benefits and incentive by participating in this research, you will be contributing to our understanding of current trends and practices with regards to community learning labs in crls. in addition, you will be providing useful knowledge that can support other libraries in making more informed decisions as they potentially develop their own makerspaces in the future. voluntary participation your participation in this research study is voluntary. you may choose not to participate and you may withdraw your consent to participate at any time. you will not be penalized in any way should you decide not to participate or withdraw from this study. protection of confidentiality the survey is anonymous and does not ask for sensitive or confidential information. contact information before you consent, please ask any questions on any aspect of this study that is unclear to you. 
you may contact me at my student email address at any time: xxx@owls.southernct.edu. if you have questions regarding your rights as a research participant, you may contact the southern connecticut state institutional review board at (203) xxx-xxxx. information technology and libraries | june 2018 112 consent by proceeding to the next page, you confirm that you understand the purpose of this research, the nature of this survey and the possible burdens and risks as well as benefits that you may experience. by proceeding, this indicates that you have read this consent form, understand it , and give your consent to participate and allow your responses to be used in this research. acrl survey on makerspaces and 3d printers q1. what is the size of your college or university? • 4,999 students or less • 5,000–9,999 students • 10,000–19,999 students • 20,000–29,999 students • 30,000 students or more q2. how would you categorize your institution? (please check all that apply) • private • public • doctorate-granting university (awards 20 or more doctorates) • master’s college or university (awards 50 or more master’s degrees, but fewer than 20 doctorates) liberal arts and sciences college • other q3. do any of the libraries at your institution have a makerspace or equivalent hands-on learning lab (including a 3-d printing station or lab)? • yes [if “yes,” respondents are directed to question 14] • no [if “no,” respondents are directed to question 4] q4. do any of the libraries at your institution have plans to develop a makerspace or equivalent learning lab in the near future? • yes [if “yes,” respondents are directed to question 8] • no [if “no,” respondents are directed to question 5] path one (crls with no makerspace, no plans for makerspace) q5. are there specific reasons why your institution has decided not to pursue developing a makerspace or equivalent lab in the near future? • no reasons. we have not given much thought to makerspaces for our library. • yes q6. thank you for your participation. would you like a copy of the results when the report is completed? if yes, please enter your email address in the space provided. current trends and goals in the development of makerspaces | davis 113 https://doi.org/10.6017/ital.v37i2.9825 • no • yes (please enter your email address below) q7. you have almost concluded this survey. before signing off, please feel free to share your thoughts and comments regarding the makerspace movement in college and research libraries. if no comments, please click “next” to end the survey. path two [crls with plans to build a makerspace] q8. what are the main goals that motivated your library’s decision to develop a makerspace or equivalent lab? (please check all that apply) • promote community outreach • promote learning and literacy • promote the library as relevant • promote the maker culture of making • provide access to expensive machines or tools • complement digital repository or digital scholarship projects • as a direct response to community requests or needs • other q9. of these goals, please rank them in order of their level of priority for your library. (choose “n/a” for goals that you did not select in the previous question) • promote community outreach • promote learning and literacy • promote the library as relevant • promote the maker culture of making • provide access to expensive machines or tools • complement digital repository or digital scholarship projects • as a direct response to community requests or needs • other q10. 
what is your library’s time frame for developing a makerspace or equivalent lab? q11. what are your library’s current plans for gathering and/or financing the resources needed for developing and maintaining the makerspace or equivalent lab? q12. thank you for your participation. would you like a copy of the results when the report is completed? • no • yes (please enter your email address below) q13. you have almost concluded this survey. before signing off, please feel free to share your thoughts and comments regarding the makerspace movement in college and research libraries. if no comments, please click “next” to end the survey. information technology and libraries | june 2018 114 path three [crls with a makerspace] q14. how long have you had your makerspace or equivalent learning lab? • less than 6 months • 6–12 months • 1–2 years • 2–3 years • more than 3 years q15. what were the main goals that motivated your library's decision to develop a makerspace or equivalent lab? (please check all that apply) • promote community outreach • promote learning and literacy • promote the library as relevant • promote the maker culture of making • provide access to expensive machines or tools • complement digital repository or digital scholarship projects • as a direct response to community requests or needs other q16. of these goals, please rank them in order of their level of priority for your library. (choose “n/a” for goals that you did not select in the previous question) • promote community outreach • promote learning and literacy • promote the library as relevant • promote the maker culture of making • provide access to expensive machines or tools • complement digital repository or digital scholarship projects • as a direct response to community requests or needs • other q17. how did your library gather and/or finance the resources needed for developing and maintaining the makerspace or equivalent learning lab? q18. do you offer programs in the makerspace/lab or is it simply opened at defined times for users to use? • yes, we offer the following types of programs: • no, we simply leave the makerspace/lab open at the following times (please note times and/or if a reservation is required): • we do both. we offer the following types of programs and leave the makerspace/lab open at the following times (please note types of programs, times open, and if a reservation is required): current trends and goals in the development of makerspaces | davis 115 https://doi.org/10.6017/ital.v37i2.9825 q19. what type of community members tend to use your library's makerspace or equivalent lab most? (please check all that apply) • undergraduate researchers • graduate researchers • faculty • staff • general public • local artists, designers, or craftspeople • local entrepreneurs • other q20. of the cohorts chosen above, please rank them in order of who uses the makerspace or equivalent lab most often. (use “n/a” for cohorts that are not relevant to your space or lab) • undergraduate researchers • graduate researchers • faculty • staff • general public • local artists, designers, or craftspeople • local entrepreneurs • other q21. how many dedicated staff does your library currently employ for the makerspace or equivalent? • 0 • 1 • 2 • 3 • other q22. where is your makerspace or equivalent lab located? q23. what is the title or name of your makerspace or equivalent lab, and if known, what were the reasons behind this particular name? q24. 
what major equipment and services does your library makerspace or equivalent lab provide? q25. what unexpected considerations, challenges, or failures has your library faced in developing and maintaining the makerspace or equivalent lab? q26. how would you assess the benefits or “return on investment” of having a makerspace or equivalent lab? q27. thank you for your participation. would you like a copy of the final results when the report is completed? if yes, please enter your email address in the space provided. information technology and libraries | june 2018 116 • no • yes (please enter your email address below) q28. you have almost concluded this survey. before signing off, please feel free to share your thoughts and comments regarding the makerspace movement in college and research libraries. if no comments, please click “next” to end the survey. references and notes 1 laura britton, “a fabulous laboratory: the makerspace at fayetteville free library,” public libraries 51, no. 4 (july/august 2012): 30–33, http://publiclibrariesonline.org/2012/10/afabulous-labaratory-the-makerspace-at-fayetteville-free-library/; madelynn martiniere, “hack the world: how the maker movement is impacting innovation: from diy geige,” medium, october 27, 2014, https://medium.com/@mmartiniere/hack-the-world-how-the-makermovement-is-impacting-innovation-bbc0b46bd820#.3mnhow4jz. 2 david v. loertscher, “maker spaces and the learning commons,” teacher librarian 39, no. 6 (october 2012): 45–46, accessed december 9, 2016, library, information science & technology abstracts with full text, ebscohost; jon kalish, “libraries make room for high-tech ‘hackerspaces,’” national public radio, december 25, 2011, http://www.npr.org/2011/12/10/143401182/libraries-make-room-for-high-techhackerspaces; diane slatter and zaana howard, “a place to make, hack, and learn: makerspaces in australian public libraries,” australian library journal 62, no. 4: 272–84, https://doi.org/10.1080/00049670.2013.853335. 3 sharon crawford barniskis, “makerspaces and teaching artists,” teaching artist journal 12, no. 1: 6–14. 4 anne wong and helen partridge, “making as learning: makerspaces in universities,” australian academic & research libraries 47, no. 3 (september 2016): 143–59, https://doi.org/10.1080/00048623.2016.1228163. 5 erich purpur et al., “refocusing mobile makerspace outreach efforts internally as professional development,” library hi tech 34, no. 1 (2016): 130–42. 6 britton, “a fabulous laboratory,” 30. 7 tj mccue, “first public library to create a maker space,” forbes, november 15, 2011, http://www.forbes.com/sites/tjmccue/2011/11/15/first-public-library-to-create-a-makerspace/. 8 phillip torrone, “is it time to rebuild and retool public libraries and make ‘techshops’?,” make:, march 20, 2011, http://makezine.com/2011/03/10/is-it-time-to-rebuild-retool-publiclibraries-and-make-techshops/. 9 r. david lankes, “killing librarianship,” (keynote speech, new england library association annual conference, october 3, 2011, burlington, vermont), https://davidlankes.org/killinglibrarianship/. 
10 janet l. balas, "do makerspaces add value to libraries?," computers in libraries 32, no. 9 (november 2012): 33. 11 balas, "do makerspaces add value to libraries?," 33; adrian g. smith et al., "grassroots digital fabrication and makerspaces: reconfiguring, relocating and recalibrating innovation?" (working paper, university of sussex, spru working paper swps, falmer, brighton, september 2013), https://doi.org/10.2139/ssrn.2731835. 12 the number of and interval between emails corresponded roughly with dillman's "five-contact framework" as outlined in carolyn hank, mary wilkins jordan, and barbara m. wildemuth, "survey research," in applications of social research methods to questions in information and library science, edited by barbara wildemuth, 256–69 (westport, ct: libraries unlimited, 2009), 261. 13 in choosing these priorities, respondents were asked to select as many of the reasons that applied to their own crl.

president's message: focus on information ethics aimee fifarek information technologies and libraries | december 2016 just a few weeks ago we held yet another successful lita forum1, this time in fort worth, tx. tight travel budgets and time constraints mean that only a few hundred people get to attend forum each year, but that is one of the things that make it a great conference.
because of its size you have a realistic chance of meeting everyone there, whether it’s at game night, one of the many networking dinners, or just for during hallway chitchat after a session. and the sessions really do give you something to talk about. this year i couldn’t help but notice a theme. among all the talk about makerspace technologies, analytics, and specific software platforms, the one bubble that kept rising to the surface was information ethics. why are you doing what you are doing with the information you have, and should you really be doing it? have you stopped to think what impact collecting, posting, sharing that information is going to have on the world around you? in a post-election environment replete with talk of fake news and other forms of deliberate misinformation, lita forum presenters seem to have tapped in to the zeitgeist. tara robertson, in her closing keynote2, talked about the harm digitizing analog materials can do when what is depicted is sensitive to individuals and communities. waldo jaquith of us open data talked about how a government decision to limit options on a birth certificate to either “white” or “colored” effectively wiped the native population out of political existence in virginia. and sam kome from claremont colleges talked about how well-meaning librarians can facilitate privacy invasion merely by collecting operational statistics3. there were many other examples brought out by forum speakers but these in particular emphasized the real consequences the serious consequences the use of data – intentional or not – can have on people. i think it is time for librarians4 to get more vocal about information ethics and the role we play in educating the population about humane information use. our profession has always been forward thinking about information literacy and is traditionally known for helping our communities make judgements about the information they consume. but we have not done enough to declare our expertise in the information economy, to stand up and say “we’re librarians – this is what we do.” now, more than ever, people need the skills to think critically about the information they are consuming via all kinds of media, understand the consequences of allowing algorithms to shape their information universe, and make quality judgments about trading their personal information for goods and services. to quote from unesco: aimee fifarek (aimee.fifarek@phoenix.gov) is lita president 2016-17 and deputy director for customer support, it and digital initiatives at phoenix public library, phoenix, az. president’s message | fifarek https://doi.org/10.6017/ital.v35i4.9602 2 changes brought about by the rapid development of information and communication technologies (ict) not only open tremendous opportunities to humankind but also pose unprecedented ethical challenges. ensuring that information society is based upon principles of mutual respect and the observance of human rights is one of the major ethical challenges of the 21st century.5 i challenge all librarians to make a commitment to propagating information ethics, both personally and professionally. make an effort to get out of your social media echo chamber6 and engage with uncomfortable ideas. when you see biased information being shared consider it a “teachable moment” and highlight the spin or present more neutral information. 
and if your library is not actively making information literacy and information ethics part of its programming and instruction, then do what you can to change it. offer to be on a panel, create a curriculum, or host a program that includes key concepts relating to information “ownership, access, privacy, security, and community”7. the focus of the libraries transform campaign this year is all about our expertise: “because the best search engine in the library is the librarian”8 it’s our time to shine. references 1. http://forum.lita.org/home/ 2. http://forum.lita.org/speakers/tara-robertson/ 3. http://forum.lita.org/sessions/patron-activity-monitoring-and-privacy-protection/ 4. as always, when i use the term “librarian” my intention is to include any person who works in a library and is skilled in information and library science, not to limit the reference to those who hold a library degree. 5. http://en.unesco.org/themes/ethics-information 6. https://www.wnyc.org/story/buzzfeed-echo-chamber-online-news-politics/ 7. https://en.wikipedia.org/wiki/information_ethics 8. http://www.ilovelibraries.org/librariestransform/ laneconnex | ketchell et al. 31 laneconnex: an integrated biomedical digital library interface debra s. ketchell, ryan max steinberg, charles yates, and heidi a. heilemann this paper describes one approach to creating a search application that unlocks heterogeneous content stores and incorporates integrative functionality of web search engines. laneconnex is a search interface that identifies journals, books, databases, calculators, bioinformatics tools, help information, and search hits from more than three hundred full-text heterogeneous clinical and bioresearch sources. the user interface is a simple query box. results are ranked by relevance with options for filtering by content type or expanding to the next most likely set. the system is built using component-oriented programming design. the underlying architecture is built on apache cocoon, java servlets, xml/xslt, sql, and javascript. the system has proven reliable in production, reduced user time spent finding information on the site, and maximized the institutional investment in licensed resources. m ost biomedical libraries separate searching for resources held locally from external database searching, requiring clinicians and researchers to know which interface to use to find a specific type of information. google, amazon, and other web search engines have shaped user behavior and expectations.1 users expect a simple query box with results returned from a broad array of content ranked or categorized appropriately with direct links to content, whether it is an html page, a pdf document, a streaming video, or an image. biomedical libraries have transitioned to digital journals and reference sources, adopted openurl link resolvers, and created institutional repositories. however, students, clinicians, and researchers are hindered from maximizing this content because of proprietary and heterogeneous systems. a strategic challenge for biomedical libraries is to create a unified search for a broad spectrum of licensed, open-access, and institutional content. n background studies show that students and researchers will use the search path of least cognitive resistance.2 ease and speed are the most important factors for using a particular search engine. 
a university of california report found that academic users want one search tool to cover a wide information universe, multiple formats, full-text availability to move seamlessly to the item itself, intelligent assistance and spelling correction, results sorted in order of relevance, help navigating large retrievals by logical subsetting and customization, and seamless access anytime, anywhere.3 studies of clinicians in the patient-care environment have documented that effort is the most important factor in whether a patient-care question is pursued.4 for researchers, finding and using the best bioinformatics tool is an elusive problem.5 in 2005, the lane medical library and knowledge management center (lane) at the stanford university medical center provided access to an expansive array of licensed, institutional, and open-access digital content in support of research, patient care, and education. like most of its peers, lane users were required to use scores of different interfaces to search external databases and find digital resources. we created a local metasearch application for clinical reference content, but it did not integrate result sets from disparate resources. a review of federated-search software in the marketplace found that products were either slow or they limited retrieval when faced with a broad spectrum of biomedical content. we decided to build on our existing application architecture to create a fast and unified interface. a detailed analysis of lane website-usage logs was conducted before embarking on the creation of the new search application. key points of user failure in the existing search options were spelling errors that could easily be corrected to avoid zero results; lack of sufficient intuitive options to move forward from a zero-results search or change topics without backtracking; lack of use of existing genre or role searches; confusion about when to use the resource, openurl resolver, or pubmed search to find a known item; and results that were cognitively difficult to navigate. studies of the web search engine and the pubmed search log concurred with our usagelog analysis: a single term search is the most common, with three words maximum entered by typical users.6 a pubmed study found that 22 percent of user queries were for known items rather than for a general subject, confirming our own log analysis findings that the majority of searches were for a particular source item.7 search-term analysis revealed that many of our users were entering partial article citations (e.g., author, date) in any query debra s. ketchell (debra.ketchell@gmail.com) is the former associate dean for knowledge management and library director; ryan max steinberg (ryan.max.steinberg@stanford .edu) is the knowledge integration programmer/architect; charles yates (charles.yates@stanford.edu) is the systems software developer; and heidi a. heilemann (heidi.heilemann@stanford .edu) is the former director for research & instruction and current associate dean for knowledge management and library director at the lane medical library & knowledge management center, information resources & technology, stanford university school of medicine, stanford, california. 32 information technology and libraries | march 2009 box expecting that article databases would be searched concurrently with the resource database. our displayed results were sorted alphabetically, and each version of an item was displayed separately. 
for the user, this meant a cluttered list with redundant title information that increased their cognitive effort to find meaningful items. overall, users were confronted with too many choices upfront and too few options after retrieving results. focus groups of faculty and students were conducted in 2005. attendees wanted local information integrated into the proposed single search. local information included content such as how-to information, expertise, seminars, grand rounds, core lab resources, drug formulary, patient handouts, and clinical calculators. most of this content is restricted to the stanford user population. users consistently described their need for a simple search interface that was fast and customized to the stanford environment. in late 2005, we embarked on a project to design a search application that would address both existing points of failure in the current system and meet the expressed need for a comprehensive discovery-and-finding tool as described in focus groups. the result is an application called laneconnex.

design objectives

the overall goal of laneconnex is to create a simple, fast search across multiple licensed, open-access, and special-object local knowledge sources that depackages and reaggregates information on the basis of stanford institutional roles. the content of lane's digital collection includes forty-five hundred journal titles and forty-two thousand other digital resources, including video lectures, executable software, patient handouts, bioinformatics tools, and a significant store of digitized historical materials as a result of the google books program. media types include html pages, pdf documents, jpeg images, mp3 audio files, mpeg4 videos, and executable applications. more than three hundred reference titles have been licensed specifically for clinicians at the point of care (e.g., uptodate, emedicine, stat-ref, and micromedex clinical evidence). clinicians wanted their results to reflect subcomponents of a package (e.g., results from the micromedex patient handouts). other clinical content is institutionally managed (e.g., institutional formulary, lab test database, or patient handouts). more than 175 biomedical research tools have been licensed or selected from open-access content. the needs of biomedical researchers include molecular biology tools and software, biomedical literature databases, citation analysis, chemical and engineering databases, expertise-finding tools, laboratory tools and supplies, institutional-research resources, and upcoming seminars. the specific objectives of the search application are the following:
• the user interface should be fast, simple, and intuitive, with embedded suggestions for improving search results (e.g., did you mean? didn't find it? have you tried?).
• search results from disparate local and external systems should be integrated into a single display based on popular search-engine models familiar to the target population.
• the query-retrieval and results display should be separated and reusable to allow customization by role or domain and future expansion into other institutional tools.
• resource results should be ranked by relevance and filtered by genre.
• metasearch results should be hit counts and filtered by category for speed and breadth. results should be reusable for specific views by role.
• finding a known article or journal should be streamlined and directly link to the item or "get item" option.
• the most popular search options (pubmed, google, and lane journals) should be ubiquitous.
• alternative pathways should be dynamic and interactive at the point of need to avoid backtracking and dead ends.
• user behavior should be tracked by search term, resource used, and user location to help the library make informed decisions about licensing, metadata, and missing content.
• off-the-shelf software should be used when available or appropriate, with development focused on search integration.
• the application should be built upon existing metadata-creation systems and trusted web-development technologies.
based on these objectives, we designed an application that is an extension of existing systems and technologies. resources are acquired and metadata are provided using the voyager integrated library system (ils). the sfx openurl link resolver provides full-text article access and expands the title search beyond biomedicine to all online journals at stanford. ezproxy provides seamless off-campus access. webtrends provides usage tracking. movable type is used to create faq and help information. a locally developed metasearch application provides a cross search with hit results from more than three hundred external and internal full-text sources. the technologies used to build laneconnex and integrate all of these systems include extensible stylesheet language transformations (xslt), java, javascript, the apache cocoon project, and oracle.

systems description

architecture

laneconnex is built on a principle of separation of concerns. the lane content owner can directly change the inclusion of search results, how they are displayed, and additional path-finding information. application programmers use java, javascript, xslt, and structured query language (sql) to create components that generate and modify the search results. the merger of content design and search results occurs "just in time" in the user's browser. we use component-oriented programming design whereby services provided within the application are defined by simple contracts. in laneconnex, these components (called "transformers") consume xml information and, after transforming it in some way, pass it on to some other component. a particular contract can be fulfilled in different ways for different purposes. this component architecture allows for easy extension of the underlying apache cocoon application. if laneconnex needs to transform some xml data that is not possible with built-in cocoon transformers, it is a simple matter to create a software component that does what is needed and fulfills the transformer contract. apache cocoon is the underlying architecture for laneconnex, as illustrated in figure 1. this java servlet is an xml-publishing engine that is built upon a component framework and uses a pipeline-processing model. a declarative language uses pattern matching to associate sets of processing components with particular request urls. content can come from a variety of sources. we use content from the local file system, network file system, http, and a relational database. the xslt language is used extensively in the pipelines and gives fine control of individual parts of the documents being processed. the end of processing is usually an xhtml document but can be any common mime type.
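to make the "transformer contract" idea above concrete, the following is a minimal sketch in java, not the actual cocoon transformer interface: the XmlTransformer interface, the TransformerPipeline class, and their method names are hypothetical simplifications. cocoon's real transformers are sax-event based and are wired together declaratively in the sitemap rather than in java code; the sketch only illustrates the idea that every component fulfills the same consume-transform-pass-on contract and can therefore be chained or swapped freely.

```java
// Illustrative sketch only: a minimal "transformer contract" in the spirit of the
// component design described above. Names are hypothetical simplifications and do
// not reproduce Cocoon's actual (SAX-event-based) Transformer API.
import org.w3c.dom.Document;

/** A component that consumes an XML document, transforms it, and passes it on. */
interface XmlTransformer {
    Document transform(Document input) throws Exception;
}

/** Example: chain transformers so that one component's output feeds the next. */
final class TransformerPipeline implements XmlTransformer {
    private final java.util.List<XmlTransformer> stages;

    TransformerPipeline(java.util.List<XmlTransformer> stages) {
        this.stages = stages;
    }

    @Override
    public Document transform(Document input) throws Exception {
        Document current = input;
        for (XmlTransformer stage : stages) {
            current = stage.transform(current); // each stage fulfills the same contract
        }
        return current;
    }
}
```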
we use cocoon to separate areas of concern so things like content, look and feel, and processing can all be managed as separate entities by different groups of people with little effect on another area. this separation of concerns is manifested by template documents that contain most of the html content common to all pages and are then combined with content documents within a processing pipeline. the declarative nature of the sitemap language and xslt facilitate rapid development with no need to redeploy the entire application to make changes in its behavior. the laneconnex search is composed of several components integrated into a query-and-results interface: oracle resource metadata, full-text metasearch application, movable type blogging software, “did you mean?” spell checker, ezproxy remote access, and webtrends tracking. n full-text metasearch integration of results from lane’s metasearch application illustrates cocoon’s many strengths. when a user searches laneconnex, cocoon sends his or her query to the metasearch application, which then dispatches the request to multiple external, full-text search engines and content stores. some examples of these external resources are uptodate, access medicine, micromedex, pubmed, and md consult. the metasearch application interacts with these external resources through jakarta commons http clients. responses from external resources are turned into w3c document object model (dom) objects, and xpath expressions are used to resolve hit counts from the dom objects. as result counts are returned, they are added to an xml–based result list and returned to cocoon. the power of cocoon becomes evident as the xml– based metasearch result list is combined with a separate display template. this template-based approach affords content curators the ability to directly add, group, and describe metasearch resources using the language and look that is most meaningful to their specific user communities. for example, there are currently eight metasearch templates curated by an informationist in partnership with a target community. curating these templates requires little to no assistance from programmers. in lane’s 2005 interface, a user’s request was sent to the metasearch application, and the application waited five seconds before responding to give external resources a chance to return a result. hit counts in the user interface included a link to refresh and retrieve more results from external resources that had not yet responded. usability studies showed this to be a significant user barrier, since the refresh link was rarely clicked. the initial five second delay also gave users the impression that the site was slow. the laneconnex application makes heavy use of javascript to solve this problem. after a user makes her initial request, javascript is used to poll the metasearch application (through cocoon) on the user’s behalf, popping in result counts as external resources respond. this adds a level of interactivity previously unavailable and makes the metasearch piece of laneconnex much more successful than its previous version. resource metadata laneconnex replaces the catalog as the primary discovery interface. metadata describing locally owned and 34 information technology and libraries | march 2009 licensed resources (journals, databases, books, videos, images, calculators, and software applications) are stored in the library’s current system of record, an instance of the voyager ils. 
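as a concrete illustration of the metasearch step described above, the sketch below fetches one external resource's response and pulls a hit count out of it with an xpath expression. it is a minimal, assumption-laden example: the url template, the {query} placeholder, and the xpath string are hypothetical, and it uses the jdk's HttpURLConnection and javax.xml classes for self-containment, whereas the production application used jakarta commons http clients.

```java
// A minimal sketch of the hit-count extraction described above. The URL template and
// XPath expression are hypothetical placeholders; only JDK classes are used here.
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

public class HitCountFetcher {
    /** Sends the user's query to one external resource and returns its hit count. */
    public static int fetchHitCount(String searchUrlTemplate, String countXPath, String query)
            throws Exception {
        String url = searchUrlTemplate.replace("{query}", URLEncoder.encode(query, "UTF-8"));
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(5000); // do not let one slow resource stall the metasearch
        conn.setReadTimeout(5000);
        try (InputStream in = conn.getInputStream()) {
            Document dom = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(in);               // response as a W3C DOM
            Number count = (Number) XPathFactory.newInstance().newXPath()
                    .evaluate(countXPath, dom, XPathConstants.NUMBER); // e.g. "count(//result)"
            return count.intValue();
        } finally {
            conn.disconnect();
        }
    }
}
```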
laneconnex makes no attempt to replace voyager ’s strengths as an application for the selection, acquisition, description, and management of access to library resources. it does, however, replace voyager ’s discovery interface. to this end, metadata for about eight thousand digital resources is extracted from voyager ’s oracle database, converted into marcxml, processed with xslt, and stored in a simple relational database (six tables and twenty-nine attributes) to support fast retrieval speed and tight control over search syntax. this extraction process occurs nightly, with incremental updates every five minutes. the oracle text search engine provides functionality anticipated by our internet-minded users. key features are speed and relevance-ranked results. a highly refined results ranking insures that the logical title appears in the first few results. a user ’s query is parsed for wildcard, boolean, proximity, and phrase operators, and then translated into an sql query. results are then transformed into a display version. related services laneconnex compares a user’s query terms against a dictionary. each query is sent to a cocoon spell-checking component that returns suggestions where appropriate. this component currently uses the simple object figure 1. laneconnex architecture. laneconnex | ketchell et al. 35 access protocol (soap)–based spelling service from google. google was chosen over the national center for biotechnology information (ncbi) spelling service because of the breadth of terms entered by users; however, cocoon’s component-oriented architecture would make it trivial to change spell checkers in the future. each query is also compared against stanford’s openurl link resolver (findit@stanford). client-side javascript makes a cocoon-mediated query of findit@stanford. using xslt, findit@stanford responses are turned into javascript object notation (json) objects and popped into the interface as appropriate. although the vast majority of laneconnex searches result in zero findit@stanford results, the convenience of searching all of lane’s systems in a single, unified interface far outweighs the effort of implementation. a commercial analytics tool called webtrends is used to collect web statistics for making data-centric decisions about interface changes. webtrends uses client-side javascript to track specific user click events. libraries need to track both on-site clicks (e.g., the user clicked on “clinical portal” from the home page) and off-site clicks (e.g., the user clicked on “yamada’s gastroenterology” after doing a search for “ibs”). to facilitate off-site click capture, webtrends requires every external link to include a snippet of javascript. requiring content creators to input this code by hand would be error prone and tedious. laneconnex automatically supplies this code for every class of link (search or static). this specialized webtrends method provides lane with data to inform both interface design and licensing decisions. n results laneconnex version 1.0 was released to the stanford biomedical community in july 2006. the current application can be experienced at http://lane.stanford.edu. the figure 2. laneconnex resource search results. resource results are ranked by relevance. single word titles are given a higher weight in the ranking algorithm to insure they are displayed in the first five results. uniform titles are used to co-locate versions (e.g., the three instances of science from different producers). 
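returning to the resource-metadata search above, the translation of a user query into an oracle text query can be sketched roughly as follows. this python fragment is hypothetical: the article does not publish the parser, ranking weights, or schema, so the table and column names and the operator handling are assumptions; only the general idea of building a contains() expression from user operators is taken from the text.

```python
# Hypothetical sketch of mapping a user query onto an Oracle Text search.
# Table and column names, the operator handling, and the bind-parameter style are
# assumptions; only the general approach is taken from the description above.
def to_oracle_text(query: str) -> str:
    """Translate simple user operators into an Oracle Text CONTAINS expression."""
    q = query.strip()
    if q.startswith('"') and q.endswith('"'):
        return q.strip('"')        # quoted phrase: keep the words in sequence
    return q.replace("*", "%")     # user wildcard -> Oracle Text wildcard


def build_sql(query: str) -> tuple:
    sql = ("SELECT title, url FROM resources "
           "WHERE CONTAINS(search_text, :expr, 1) > 0 "
           "ORDER BY SCORE(1) DESC")
    return sql, {"expr": to_oracle_text(query)}


print(build_sql('"pathway analysis"'))
print(build_sql("pediatr* AND cardiology"))
```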
journals titles are linked to their respective impact factor page in the isi web of knowledge. digital formats that require special players or restrictions are indicated. the metadata searched for ejournals, databases, ebooks, biotools, video, and medcalcs are lane’s digital resources extracted from the integrated library system into a searchable oracle database. the first “all” tab is the combined results of these genres and the lane site help and information. figure 3. laneconnex related services search enhancements. laneconnex includes a spell checker to avoid a common failure in user searches. ajax services allow the inclusion of search results from other sources for common zero results failures. for example, the stanford link resolver database is simultaneously searched to insure online journals outside the scope of biomedicine are presented as a linked result for the user. production version has proven reliable over two years. incremental user focus groups have been employed to improve the interface as issues arose. a series of vignettes will be used to illustrate how the current version of 36 information technology and libraries | march 2009 the “sunetid login” is required. n user query: “new yokrer.” a faculty member is looking for an article in the new yorker for a class reading assignment. he makes a typing error, which invokes the “did you mean?” function (see figure 3). he clicks on the correct spelling. no results are found in the resource search, but a simultaneous search of the link-resolver database finds an instance of this title licensed for the campus and displays a clickable link for the user. n user query: “pathway analysis.” a post–doc is looking for information on how to share an ingenuity pathway. figure 4 illustrates the integration of the locally created lane faqs. faqs comprise a broad spectrum of help and how-to information as described by our focus groups. help text is created in the movable type blog software, and made searchable through the laneconnex application. the movable type interface lowers the barrier to html content creation by any staff member. more complex answers include embedded images and videos to enable the user to see exactly how to do a particular procedure. cocoon allows for the syndication of subsets of this faq content back into static html pages where it can be displayed as both category-specific lists or as the text for scroll-over help for a link. having a single store of help information insures the content is updated once for all instances. n user query: “uterine cancer kapp.” a resident is looking for a known article. laneconnex simultaneously searches pubmed to increase the likelihood of user success (see figure 5). clicking on the pubmed tab retrieves the results in the native interface; however, the user sees the pubmed@stanford version, which includes embedded links to the article based on our openurl link resolver. the ability to retrieve results from bibliographic databases that includes article resolution insures that our biomedical community is always using the correct url to insure maximum full-text article access. user testing in 2007 found that adding the three most frequently used sources (pubmed, google, and lane catalog) into our one-box laneconnex search was a significant time saver. it addresses laneconnex meets the design objectives from the user’s perspective. n user query: “science.” a graduate student is looking for the journal science. the laneconnex results are listed in relevance order (see figure 2). 
singleword titles are given a higher weight in the ranking algorithm to insure they are displayed in the first five results. results from local metadata are displayed by uniform title. for example, lane has three instances of the journal science, and each version is linked to the appropriate external store. brief notes provide critical information for particular resources. for example, restricted local patient education documents and video seminars note that figure 4. example of integration of local content stores. help information is managed in moveable type and integrated into laneconnex search results. laneconnex | ketchell et al. 37 the expectation on the part of our users that they could search for an article or a journal title in a single search box without first selecting a database. n user query: “serotonin pulmonary hypertension.” a medical student is looking for the correlation of two topics. clicking on the “clinical” tab, the student sees the results of the clinical metasearch in figure 6. metasearch results are deep searches of sources within licensed packages (e.g., textbooks in md consult or a specific database in micromedex), local content (e.g., stanford’s lab-test database), and openaccess content (e.g., ncbi databases). pubmed results are tailored strategies tiered by evidence. for example, the evidence-summaries strategy retrieves results from twelve clinical-evidence resources (e.g., buj, clinical evidence, and cochrane systematic reviews) that link to the full-text licensed by stanford. an example of the bioresearch metasearch is shown in figure 7. content selected for this audience includes literature databases, funding sources, patents, structures, clinical trials, protocols, and stanford expertise integrated with gene, protein, and phenotype tools. user testing revealed that many users did not click on the “clinical” tab. the clinical metasearch was originally developed for the clinical portal page and focused on clinicians in practice; however, the results needed to be exposed more directly as part of the laneconnex search. figure 8 illustrates the “have you tried?” feature that displays a few relevant clinical-content sources without requiring the user to select the “clinical” tab. this feature is managed by the smartsearch component of the laneconnex system. smartsearch sends the user’s query terms to pubmed, extracts a subset of articles associated with those terms, extracts the mesh headings for those articles, and computes the frequency of headings in the articles to determine the most likely mesh terms associated with the user’s query terms. these mesh terms are mapped to mesh terms associated with each metasearch resource. preliminary evaluation indicates that the clinical content is now being discovered by more users. figure 5. example of integration of popular search engines into laneconnex results. three of the most popular searches based on usage analysis are included at the top level. pubmed and google are mapped to lane’s link resolver to retrieve the full article. creating or editing metasearch templates is a curator driven task. programming is only required to add new sources to the metasearch engine. a curator may choose from more than three hundred sources to create a discipline-based layout using general templates. names, categories, and other description information are all at the curator ’s discretion. 
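the smartsearch term mapping described above can be sketched in a few lines of python. the sample records and the heading-to-resource map are invented for illustration; the real component retrieves a subset of pubmed records for the user's terms, extracts their mesh headings, and maps the most frequent headings to curated metasearch resources.

```python
# Sketch of the SmartSearch idea: count MeSH headings in a sample of PubMed
# records for the user's query and map the most frequent headings to resources.
# The sample records and the heading-to-resource map are invented; the real
# component retrieves records from PubMed and uses curated mappings.
from collections import Counter

sample_records = [
    {"pmid": "1", "mesh": ["Rhabdoid Tumor", "Child", "Antineoplastic Agents"]},
    {"pmid": "2", "mesh": ["Rhabdoid Tumor", "Infant", "Kidney Neoplasms"]},
    {"pmid": "3", "mesh": ["Rhabdoid Tumor", "Child", "Kidney Neoplasms"]},
]

resource_map = {  # hypothetical mapping from MeSH headings to metasearch resources
    "Rhabdoid Tumor": ["oncology textbooks"],
    "Child": ["pediatric textbooks"],
    "Kidney Neoplasms": ["oncology textbooks"],
}


def have_you_tried(records, top_n=3):
    """Return resource suggestions for the most frequent MeSH headings."""
    counts = Counter(heading for record in records for heading in record["mesh"])
    suggestions = []
    for heading, _ in counts.most_common(top_n):
        suggestions.extend(resource_map.get(heading, []))
    return sorted(set(suggestions))


print(have_you_tried(sample_records))
```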
while developing new subspecialty templates, we discovered that clinicians were confused by the difference in layout of their specialty portal and their metasearch results (e.g., the cardiology portal used the generic clinical metasearch). to address this issue, we devised an approach that merges a portal and metasearch into a single entity as illustrated in figure 9. a combination of the component-oriented architecture of laneconnex and javascript makes the integration of metasearch results into a new template patterned after a portal easy to implement. this strategy will enable the creation of templates contextually appropriate to knowledge requests originating from electronic medical-record systems in the future. direct user feedback and usage statistics confirm that search is now the dominant mode of navigation. the amount of time each user spends on the website has dropped since the release of version 1.0. we speculate that the integrated search helps our users find relevant 38 information technology and libraries | march 2009 information more efficiently. focus groups with students are uniformly positive. graduate students like the ability to find digital articles using a single search box. medical students like the clinical metasearch as an easy way to look up new topics in texts and customized pubmed searches. bioengineering students like the ability to easily look up patient care–related topics. pediatrics residents and attendings have championed the development of their portal and metasearch focused on their patient population. medical educators have commented on their ability to focus on the best information sources. n discussion a review of websites in 2007 found that most biomedical libraries had separate search interfaces for their digital resources, library catalog, and external databases. biomedical libraries are implementing metasearch software to cross search proprietary databases. the university of california, davis is using the metalib software to federate searching multiple bibliographic databases.8 the university of south california and florida state university are using webfeat software to search clinical textbooks.9 the health sciences library system at the university of pittsburgh is using vivisimo to search clinical textbooks and bioresearch tools.10 academic libraries are introducing new “resource shopping” applications, such as the endeca project at north carolina state university, the summa project at the university of aarhus, and the vufind project at villanova university.11 these systems offer a single query box, faceted results, spell checking, recommendations based on user input, and asynchronous javascript and xml (ajax) for live status information. we believe our approach is a practical integration for our biomedical community that bridges finding a resource and finding a specific item through figure 6. integration of metasearch results into laneconnex. results from two general, role-based metasearches (bioresearch and clinical) are included in the laneconnex interface. the first image shows a clinician searching laneconnex for serotonin pulmonary hypertension. selecting the clinical tab presents the clinical content metasearch display (second image), and is placed deep inside the source by selecting a title (third image). laneconnex | ketchell et al. 39 a metasearch of multiple databases. the laneconnex application searches across digital resources and external data stores simultaneously and presents results in a unified display. 
the limitation to our approach is that the metasearch returns only hit counts rather than previews of the specific content. standardization of results from external systems, particularly receipt of xml results, remains a challenge. federated search engines do integrate at this level, but are usually slow or limit the number of results. true integration awaits health level seven (hl7) clinical decision support standards and national information standards organization (niso) metasearch initiative for query and retrieval of specific content.12 one of the primary objectives of laneconnex is speed and ease of use. ranking and categorization of results has been very successful in the eyes of the user community. the integration of metasearch results has been particularly successful with our pediatric specialty portal and search. however, general user understanding of how the clinical and biomedical tabs related to the genre tabs in laneconnex has been problematic. we reviewed web engines and found a similar challenge in presenting disparate format results (e.g., video or image search results) or lists of hits from different systems (e.g., ncbi’s entrez search results).13 we are continuing to develop our new specialty portal-and-search model and our smartsearch term-mapping component to further integrate results. n conclusion laneconnex is an effective and openended search infrastructure for integrating local resource metadata and full-text content used by clinicians and biomedical researchers. its effectiveness comes from the recognition that users prefer a single query box with relevance or categorically organized results that lead them to the most likely figure 7. example of a bioresearch metasearch. figure 8. the smartsearch component embeds a set of the metasearch results into the laneconnex interface as “have you tried?” clickable links. these links are the equivalent of selecting the title from a clinical metasearch result. the example search for atypical malignant rhabdoid tumor (a rare childhood cancer) invokes oncology and pediatric textbook results. these texts and pubmed provide quick access for a medical student or resident on the pediatric ward. figure 9. example of a clinical specialty portal with integrated metasearch. clinical portal pages are organized so metasearch hit counts can display next to content links if a user executes a search. this approach removes the dissonance clinicians felt existed between separate portal page and metasearch results in version 1.0. 40 information technology and libraries | march 2009 answer to a question or prospects in their exploration. the application is based on separation of concerns and is easily extensible. new resources are constantly emerging, and it is important that libraries take full advantage of existing and forthcoming content that is tailored to their user population regardless of the source. the next major step in the ongoing development of laneconnex is becoming an invisible backend application to bring content directly into the user’s workflow. n acknowledgements the authors would like to acknowledge the contributions of the entire laneconnex technical team, in particular pam murnane, olya gary, dick miller, rick zwies, and rikke ogawa for their design contributions, philip constantinou for his architecture contribution, and alain boussard for his systems development contributions. references 1. denise t. 
covey, “the need to improve remote access to online library resources: filling the gap between commercial vendor and academic user practice,” portal libraries and the academy 3 no.4 (2003): 577–99; nobert lossau, “search engine technology and digital libraries,” d-lib magazine 10 no. 6 (2004), www.dlib.org/dlib/june04/lossau/06lossau.html (accessed mar. 1, 2008); oclc, “college students’ perception of libraries and information resource,” www.oclc.org/reports/ perceptionscollege.htm (accessed mar 1, 2008); and jim henderson, “google scholar: a source for clinicians,” canadian medical association journal 12 no. 172 (2005). 2. covey, “the need to improve remote access to online library resources”; lossau, “search engine technology and digital libraries”; oclc, “college students’ perception of libraries and information resource.” 3. jane lee, “uc health sciences metasearch exploration. part 1: graduate student gocus group findings,” uc health sciences metasearch team, www.cdlib.org/inside/assess/ evaluation_activities/docs/2006/draft_gradreport_march2006. pdf (accessed mar. 1, 2008). 4. karen k. grandage, david c. slawson, and allen f. shaughnessy, “when less is more: a practical approach to searching for evidence-based answers,” journal of the medical library association 90 no. 3 (2002): 298–304. 5. nicola cannata, emanuela merelli, and russ b. altman, “time to organize the bioinformatics resourceome,” plos computational biology 1 no. 7 (2005): e76. 6. craig silverstein et al., “analysis of a very large web search engine query log,” www.cs.ucsb.edu/~almeroth/ classes/tech-soc/2005-winter/papers/analysis.pdf (accessed mar. 1, 2008); anne aula, “query formulation in web information search,” www.cs.uta.fi/~aula/questionnaire.pdf (accessed mar. 1, 2008); jorge r. herskovic, len y. tanaka, william hersh, and elmer v. bernstam, “a day in the life of pubmed: analysis of a typical day’s query log,” journal of the american medical informatics association 14 no. 2 (2007): 212–20. 7. herskovic, “a day in the life of pubmed.” 8. davis libraries university of california, “quicksearch,” http://mysearchspace.lib.ucdavis.edu/ (accessed mar. 1, 2008). 9. eileen eandi, “health sciences multi-ebook search,” norris medical library newsletter (spring 2006), norris medical library, university of southern california, www.usc.edu/hsc/ nml/lib-information/newsletters.html (accessed mar. 1, 2008); maguire medical library, florida state university, “webfeat clinical book search,” http://med.fsu.edu/library/tutorials/ webfeat2_viewlet_swf.html (accessed mar. 1, 2008). 10. jill e. foust, philip bergen, gretchen l. maxeiner, and peter n. pawlowski, “improving e-book access via a librarydeveloped full-text search tool,” journal of the medical library association 95 no. 1 (2007): 40–45. 11. north carolina state university libraries, “endeca at the ncsu libraries,” www.lib.ncsu.edu/endeca (accessed mar. 1, 2008); hans lund, hans lauridsen, and jens hofman hansen, “summa—integrated search,” www.statsbiblioteket.dk/ publ/summaenglish.pdf (accessed mar. 1, 2008); falvey memorial library, villanova university, “vufind,” www.vufind.org (accessed mar. 1, 2008). 12. see the health level seven (hl7) clinical decision support working committee activities, in particular the infobutton standard proposal at www.hl7.org/special/committees/dss/ index.cfm and the niso metasearch initiative documentation at www.niso.org/workrooms/mi (accessed mar 1, 2008). 13. 
national center for biotechnology information (ncbi) entrez cross-database search, www.ncbi.nlm.nih.gov/entrez (accessed mar. 1, 2008). acrl 5 alcts 15 lita cover 2, cover 3 jaunter cover 4 index to advertisers the lc/marc record as a national standard 159 the desire to promote exchange of bibliographic data has given rise to a rather cacophonous debate concerning marc as a "standard," and the definition of a marc compatible record. much of the confusion has arisen out of a failure to carefully separate the intellectual content of a bibliographic record, the specific analysis to which it is subjected in an lc/marc format, and its physical representation on magnetic tape. in addition, there has been a tendency to obscure the different requirements of users and creators of machine-readable bibliographic data. in general, the standards making process attempts to find a consensus among both groups based on existing practice. the process of standardization is rarely one which relies on enlightened legislation. rather, a more pragmatic approach is taken based on an evaluation of the costs to manufacturers weighed against costs to consumers. even this modest approach is not invested with lasting wisdom. ansi standards, for example, are subject to quinquennial review. standards, as already pointed out, have as their basis common acceptance of conventions. thus, it might prove useful to examine the conventions employed in an lc/marc record. the most important of these is the anglo-american cataloging rules as interpreted by lc. the use of these rules for descriptive cataloging and choice of entry is universal enough that they may safely be considered a standard. similar comments may be made concerning the subject headings used in the dictionary catalog of the library of congress. the physical format within which machine-readable bibliographic data may be transmitted is accepted as a codified national and international standard (ansi z39.2-1971 and iso 2709-1973 (e) ) . this standard, which is only seven pages in length, should be carefully read by anyone seriously concerned with the problems of bibliographic data interchange. ansi z39.2 is quite different from the published lc/ marc formats. it defines little more than the structure of a variable length record. simply stated, ansi z39.2 specifies only that a record shall contain a leader specifying its physical attributes, a directory for identifying elements within the record by numeric tag (the values of the tags are not defined), and optionally, additional designators which may be used to provide further information regarding fields and subfields. this structure is completely general. within this same structure one could transmit book 160 1 oumal of library automation vol. 7 i 3 september 197 4 orders, a bibliographic record, an abstract, or an authority record by adopting specific conventions regarding the interpretation of numeric tags. thus, we come to the crux of the problem, the meanings of the content designators. content designators (numeric tags, subfields, delimiters, etc.) are not synonymous with elements of bibliographic description; rather, they represent the level of explicitness we wish to achieve in encoding a record. it might safely be said that in the most common use of a marc record-card production-scarcely more than the paragraph distinctions on an lc card are really necessary. 
if we accept such an argument, then we can simply define compatibility with lc/marc by defining compatibility in terms of a particular class of applications, e.g., card, book, or crt catalog creation. a record may be said to be compatible with lcjmarc if a system which accepts a record as created by lc produces from the compatible 1·ecord products not discernibly different from those created from an lc/marc record. thus, what is called for is a family of standards all downwardly compatible with lc/marc, employing ansi z39.2 as a structural base. this represents the only rational approach. the alternative is to accept lc/ marc conventions as worthy of veneration as artistic expression. s. michael malinconico adding value to the university of oklahoma libraries history of science collection through digital enhancement maura valentino information technology and libraries | march 2014 25 “in getting my books, i have been always solicitous of an ample margin; this not so much through any love of the thing itself, however agreeable, as for the facility it affords me of penciling suggested thoughts, agreements and differences of opinion, or brief critical comments in general.” —edgar allan poe abstract much of the focus of digital collections has been and continues to be on rare and unique materials, including monographs. a monograph may be made even rarer and more valuable by virtue of hand written marginalia. using technology to enhance scans of unique books and make previously unreadable marginalia readable increases the value of a digital object to researchers. this article describes a case study of enhancing the marginalia in a rare book by copernicus. background the university of oklahoma libraries history of science collections holds many rare books and other objects pertaining to the history of science. one of the rarest holdings is a copy of nicolai copernici torinensis de revolvtionibvs orbium coelestium (on the revolutions of the heavenly spheres), libri vi, a book famous for copernicus’ revolutionary astronomical theory that rejected the ptolemaic earth-centered universe and promoted a heliocentric, sun-centered model. the history of science collections’ copy of this manuscript contains notes added to the margins. similar notes were made in eight different existing copies, and the astrophysicist owen gingerich determined that these notes were created by a group of astronomers in paris known as the offusius group.1 the notes are of significant historical importance as they offer information on the initial reception of copernicus’ theories by the catholic community. having been created almost five hundred years ago in 1543, the handwriting is faded and the ink has absorbed into the paper. maura valentino (maura.valentino@oregonstate.edu) is metadata librarian, oregon state university, corvalis, oregon. previously she was digital initiatives coordinator at the university of oklahoma. mailto:maura.valentino@oregonstate.edu adding value to collections through digital enhancement | valentino 26 written in cursive script, the letters have merged as the ink has dispersed, adding to the difficulties inherent in reading these valuable annotations. the book had previously been digitized, and while some of the margin notes were readable, many of the notes were barely visible. therefore much of the value of the book was being lost in digital form. to rectify this situation the decision was made to enhance the marginalia. 
it was further decided that once the margin notes were enhanced, two digital representations of each page that contained notes would be included in the digital collection. one copy would present the main text in the most legible fashion (figure 1) and the second copy would highlight the marginalia and ensure that these margin notes were as legible as possible, even if in doing so the readability of the main text was diminished (figure 2). figure 1. text readable. figure 2. marginalia enhanced. while creating a written transcript of the marginalia was considered and would have added some value to the digital object, this solution was rejected in favor of digital enhancement for the following reasons. many of the notes contained corrections with lines drawn to the area of text that was being changed, or bracketed numbers (figure 3). in addition, some of the notes are corrections of numbers or tables, so a transcript of the text would do little to demonstrate the writer’s intentions in creating the margin note (figure 4). figure 3. bracketed corrections. figure 4. numerical corrections. also, sometimes there was bleed-through from the reverse page, further disrupting the clarity of the marginalia (figures 5 and 6). therefore it was determined that making the notes more readable through digital enhancement would provide the collection’s users with the most useful resource. figure 5. highlighted—bleed-through reduced. figure 6. bleed-through behind marginalia. the book can be viewed in its entirety here: http://digital.libraries.ou.edu/cdm/landingpage/collection/copernicus literature review “modification of photographs to enhance or change their meaning is nothing new. however, the introduction of techniques for digitizing photographic images and the subsequent development of powerful image editing software has both broadened the possibilities for altering photographs and brought the means of doing so within the reach of anyone with imagination and patience.”2 —richard s. croft the primary goal of this project was to give researchers in the history of science the ability to clearly decipher the marginalia created by the astronomers of the offusius group as they annotated the book, using the margins as an editing space. the literature agrees that marginalia is an important piece of history worth preserving. hauptman states, “the thought that produces the necessity for a citation or remark leads directly into the marginal notation.”3 he also adds, “their close proximity to the text allows for immediate visual connection.”4 howard asserts, “for writers and scholars, the margins and endpapers became workshops in which to hammer out their own ideas, and offered spaces in which to file and store information.”5 she also adds that marginalia can “serve as a form of opposition.”6 this is true in this case, as some of the marginalia contradicts copernicus. nikolova-houston argues for the historical aspect: “each of the marginalia and colophons is a unique production by its author, and exists in only one copy.”7 she goes on to add, “manuscript marginalia and colophons possess historical value as primary historical sources. they are treated as historical evidence along with other written and oral traditions.”8 such ideas provide a strong justification for the implementation of marginalia enhancement in digital collections.
as mentioned above, it was determined that a transcription would not have had the same effect as digital enhancement of the margin notes. this approach is also supported by the literature. for example, ferrari argues for the digital publication of the marginalia that fernando pessoa, the portuguese writer, made while reading. one of the cornerstones of his argument is that digital representation of marginalia allows the reader not only to see the words but also the underlining and other symbols that are not easily put into a transcript. in this way, the user of the digital collection obtains a more complete view of the author of the marginalia’s intent.9 another goal of this project was the general promotion of the university of oklahoma’s history of science collections. johnson, in his new york times article, notes that marginalia lend books an historical context while enabling users to infer other meanings from their texts.10 he also quotes david spadafora, president of the newberry library in chicago, who proclaims that “the digital revolution is a good thing for the physical object.” as more people access historical artifacts in electronic form, he notes, “the more they’re going to want to encounter the real object.”11 in this way, enhancement of the marginalia in digital collections can lead to further exposure for the collection and to greater use of the physical objects themselves. using digital enhancement is not a new idea. morgan asserts, “the innovation of the world wide web is its exciting capacity for space that, while not limitless, is weightless and far less limited that that of the printed book.”12 le, anderson and agarwala also add, “local manipulation of color and tone is one of the most common operations in the digital imaging workflow.”13 the literature shows that other projects have used enhancement of the digital object to increase the usefulness of the original artifact. one of the projects pursued during the library of congress’s american memory initiative involved the digitization of the work of photographer solomon butcher. in this case, technicians were able to enhance an area of one photograph that was blurry in normal photographic processes and allow the viewer to see inside a building.14 the archivo general de indias also used digital enhancement to remove stains and bleed-through from ancient manuscripts and render previously unreadable manuscripts readable.15 in an article advocating for a digital version of william blakes’s poem the four zoas, morgan notes that some features of the manuscript can only be seen in the digital version rather than a transcription: “sections of the manuscript show intense revision, with passages rubbed out, moved earlier or later in the manuscript, and often, added in the margins.”16 information technology and libraries | march 2014 29 digital processing is not limited to the use of photo editing software. although giralt asserts that it is a common method, “the ample potential for image control and manipulation provided by digital technology has stirred a great interest in postproduction, and digital editing.”17 other projects have used various technologies to enhanced images to give added meaning to a digital image. once again, in her article advocating for the digitizing of william blake’s the four zoas, morgan asserts that various enhancement technologies would help readers obtain the greatest benefit from the manuscript. 
for example, providing “the added benefit of infra-red photography” would allow “readers to see many of the erased illustrations.”18 she even hopes coding will enhance the usefulness of a digital object: “our impulse to use xml in order to richly encode a text works against passivity. with coding we clarify a work down to its smallest units, and illuminate specific aspects of its structure, aspects that are often less obvious when the work is presented in the form of a printed book.”19 method locating the marginalia each page of the book had been previously scanned and was stored in tagged image file format (tiff). each digital page (tiff image) was carefully examined for marginalia. this was achieved by examining the image in adobe photoshop, using the zoom tool to enlarge the image as necessary. as many notes were barely visible, the entirety of each page had to be examined in detail to ensure that margin notes were not overlooked. enlargement of the image in photoshop greatly facilitated this process. enhancing the marginalia once all the pages with marginalia were identified, each page was loaded into adobe photoshop for digital processing and enhancement. the following procedure was used (note: the specific directions that follow reference adobe photoshop cs4 for windows but can be generally applied to most software programs intended for photo editing):
1. using the zoom tool, the image was enlarged to facilitate examination and interaction with the marginalia.
2. individual margin notes were selected using the rectangular marquee tool. the area selected included any lines that were drawn from the notes to the original text so it would be clear to what text the margin note referred.
3. as the handwritten margin notes were orange in tone, a blue filter was applied (as blue is the contrasting color to orange) by selecting adjustments from the image menu and then choosing black and white to display the black and white dialog box. in the black and white dialog box, blue filter was selected from the preset drop-down menu. this small adjustment greatly enhanced the readability of the margin notes.
4. with the area still selected, adjustments was again selected from the image menu. from that adjustments submenu, brightness and contrast was selected. adjustments were made to both these values using the sliders presented by the resulting dialog box to further enhance the legibility of the margin notes. for this particular project, the values selected were generally negative twenty for contrast and positive twenty for brightness.
file naming conventions each enhanced image was saved with the same filename as the digital image of the original manuscript page, but with an a (for annotated) added to the end of the filename. this naming scheme enabled a distinction between pages with and without enhanced marginalia. this series of steps was repeated for each page (see table 1).
table 1. filenames.
page name | explanation
book spine | pictures of the covers
book cover |
inside cover |
blank page with ruler | to measure page
folio 001 | page 1 as originally scanned
folio 001 verso | page 1, reverse side, as originally scanned
folio 001 verso a | page 1, reverse side, with highlighted marginalia
folio 002 | page 2 as originally scanned
folio 002a | page 2 with highlighted marginalia
folio 002 v | page 2, reverse side
folio 002 v a | page 2, reverse side, with highlighted marginalia
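for readers who would rather batch-script the enhancement than work interactively, the photoshop steps above can be approximated with python and the pillow imaging library. this is a rough analogue, not the published workflow: the crop box, file names, and enhancement factors are placeholders, and pillow's multiplicative brightness and contrast factors only loosely correspond to photoshop's +20/−20 settings.

```python
# A rough Pillow analogue of the Photoshop steps above, for batch scripting.
# The crop box, enhancement factors, and file names are placeholders; the
# published workflow was done interactively in Photoshop CS4.
from PIL import Image, ImageEnhance


def enhance_note(page_path, box, out_path):
    page = Image.open(page_path).convert("RGB")
    region = page.crop(box)                       # "rectangular marquee" selection

    # Blue-filter black-and-white conversion: keeping only the blue channel
    # renders the orange-toned ink dark against the lighter paper.
    _, _, blue = region.split()
    mono = blue.convert("RGB")

    # Approximate Photoshop's +20 brightness / -20 contrast with Pillow's
    # multiplicative factors (values chosen by eye, not an exact conversion).
    mono = ImageEnhance.Brightness(mono).enhance(1.2)
    mono = ImageEnhance.Contrast(mono).enhance(0.8)

    page.paste(mono, box)                         # drop the enhanced note back in place
    page.save(out_path)


# Hypothetical usage: enhance one margin note on folio 2 and save the "a" copy.
enhance_note("folio_002.tif", (40, 600, 380, 900), "folio_002a.tif")
```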
importing into the digital management system contentdm was the digital management system selected for this project. all original manuscript page images and enhanced marginalia page images were imported into contentdm following their creation. the next step was to bring all the pages into contentdm as one compound object. a microsoft excel spreadsheet was created with a line for each page, annotated or not. only three fields were used: title, rights, and filename. a description of the book was placed on its history of science digital collections webpage with a link to the compound object in contentdm, so further metadata was not necessary and can always be added later. the first row only contained the title of the book (no filename). there were tiffs available of the cover, the bookend, the inside cover, and the book with a ruler. these were the next rows. then we began with the pages and titled them as the pages were numbered. there were ten pages numbered with roman numerals and then the pages began with alphanumeric page numbers. each page that had handwritten notes had the original page (page 2, for example) and the page with the information technology and libraries | march 2014 31 notes highlighted (page 2 annotated). this would allow the viewer to view the pages in their original form or with the notes highlighted or both, depending on each user’s research interests. once the excel file was complete with each page and its filename entered in a row, the file was saved as a tab-delimited file. import into contentdm required that all the tiff files be in one folder. once the files were moved, the contentdm compound object wizard was used to import. this book was imported as a compound object with no hierarchy. as this book was published in 1593, it has no chapters. to specify page names, the choice to label pages using tab-delimited object for printing was used. the filenames did not contain page numbers, and the choice to label pages in a sequence was not an option, as two copies of each annotated page existed. each object imported into contentdm has a thumbnail image associated with it. contentdm will create this image, but the cover of this book is not attractive, so a jpeg file was created using an image from the book that is often associated with copernicus (see figure 3). conclusions this project resulted in a digital representation of the physical book that is much more useful to researchers than the original, unenhanced digital object. this history of science collection holds not only the first edition of books important to the history of science, but the subsequent editions so that researchers can see how the ideas of science have changed over time. this new digital edition of de revolutionibus allows researchers to see how another scientist made corrections in copernicus’ book as one step in the change in theory over time and insight into the reaction of the catholic church. the format that contentdm creates for the object and a clear naming scheme allow the user to view the pages with or without the marginalia, thus making this a useful object for many types of users (see figure 4). however, using photoshop to highlight areas of a page allowed the digital initiatives department to understand the power of this tool. in understanding the utility and power of photoshop, the digital initiatives department has determined it to be a useful tool in other projects. 
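as an aside on the import step described above, the tab-delimited file for the compound object wizard is easy to generate with a short script. the sketch below is python; the rights text and the page list are placeholders, and only the three fields used in this project (title, rights, filename) are written.

```python
# Sketch of generating the tab-delimited file used by the CONTENTdm compound
# object wizard. The rights statement and the page list are placeholders; only
# the three fields used in this project (title, rights, filename) are written.
import csv

RIGHTS = "Courtesy of the History of Science Collections"  # placeholder text

pages = [
    ("De revolutionibus orbium coelestium", ""),   # object-level row, no file
    ("Book spine", "book_spine.tif"),
    ("Folio 001", "folio_001.tif"),
    ("Folio 001 verso", "folio_001_verso.tif"),
    ("Folio 001 verso annotated", "folio_001_verso_a.tif"),
]

with open("import.txt", "w", newline="", encoding="utf-8") as fh:
    writer = csv.writer(fh, delimiter="\t")
    writer.writerow(["Title", "Rights", "Filename"])
    for title, filename in pages:
        writer.writerow([title, RIGHTS if filename else "", filename])
```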
a project to eliminate some images of people’s fingers that inadvertently were photographed along with pages in a book or manuscript has been added to the queue. in future, digitized books or manuscripts with useful notes will undergo these enhancement processes. adding value to collections through digital enhancement | valentino 32 references 1. owen gingerich, “the master of the 1550 radices: jofrancus offusius,” journal for the history of astronomy 11 (1993): 235–53, http://adsabs.harvard.edu/full/1993jha....24..235g. 2. richard s. croft, “fun and games with photoshop: using image editors to change photographic meaning” (in: visual literacy in the digital age: selected readings from the annual conference of the international visual literacy association (rochester, ny october 13-17, 1993)): 3-10. 3. robert hauptman, documentation: a history and critique of attribution, commentary, glosses, marginalia, notes, bibliographies, works-cited lists, and citation indexing and analysis (jefferson, nc: mcfarland, 2008). 4. ibid. 5. jennifer howard, “scholarship on the edge,” chronicle of higher education 52, no. 9 (2005). 6. ibid. 7. tatiana nikolova-houston,“marginalia and colophons in bulgarian manuscripts and early printed books,” journal of religious & theological information 8, no. 1/2, (2009), http://www.tandfonline.com/doi/abs/10.1080/10477840903459586#preview. 8. ibid. 9. patricio ferrari, “fernando pessoa as a writing-reader: some justifications for a complete digital edition of his marginalia,” portuguese studies 24, no. 2 (2008): 69–114, http://www.jstor.org/stable/41105307. 10. dirk johnson, “book lovers fear dim future for notes in the margins,” new york times, february 20, 2011, http://www.nytimes.com/2011/02/21/books/21margin.html?_r=3&emc=tnt&tntemail1=y & 11. ibid 12. paige morgan, “the minute particular in the immensity of the internet: what coleridge, hartley and blake can teach us about digital editing,” romanticism 15, no. 2 (2009), http://www.euppublishing.com/doi/abs/10.3366/e1354991x09000774. 13. y. li, e. adelson, and a. agarwala, “scribbleboost: adding classification to edge-aware interpolation of local image and video adjustments,” eurographics symposium on rendering27, no. 4 (2008), http://www.mit.edu/~yzli/eg08.pdf. 14. s. michael malinconico, “digital preservation technologies and hybrid libraries,” information services & use 22, no. 4 (2002): 159–74, http://iospress.metapress.com/content/gep1rx9rednylm2n. http://adsabs.harvard.edu/full/1993jha....24..235g http://www.tandfonline.com/doi/abs/10.1080/10477840903459586%23preview http://www.jstor.org/stable/41105307 http://www.nytimes.com/2011/02/21/books/21margin.html?_r=3&emc=tnt&tntemail1=y& http://www.nytimes.com/2011/02/21/books/21margin.html?_r=3&emc=tnt&tntemail1=y& http://www.euppublishing.com/doi/abs/10.3366/e1354991x09000774 information technology and libraries | march 2014 33 15. ibid. 16. morgan, “minute particular.” 17. gabriel f. giralt, “realism and realistic representation in the digital age,” journal of film & video 62, no. 3 (2010): 3, http://muse.jhu.edu/journals/journal_of_film_and_video/v062/62.3.giralt.html. 18. morgan, “minute particular.” 19. 
morgan, “minute particular.” http://muse.jhu.edu/journals/journal_of_film_and_video/v062/62.3.giralt.html reducing psychological resistance to digital repositories | quinn 67 and mit mandates, and other mandates such as the one instituted at stanford’s school of education, have come to pass, and the registry of open access repository material archiving policies (roarmap) lists more than 120 mandates around the world that now exist.3 while it is too early to tell whether these developments will be successful in getting faculty to deposit their work in digital repositories, they at least establish a precedent that other institutions may follow. how many institutions follow and how effective the mandates will be once enacted remains to be seen. will all colleges and universities, or even a majority, adopt mandates that require faculty to deposit their work in repositories? what of those that do not? even if most institutions are successful in instituting mandates, will they be sufficient to obtain faculty cooperation? for those institutions that do not adopt mandates, how are they going to persuade faculty to participate in self-archiving, or even in some variation—such as having surrogates (librarians, staff, or graduate assistants) archive the work of faculty? are mandates the only way to ensure faculty cooperation and compliance, or are mandates even necessarily the best way? to begin to adequately address the problem of user resistance to digital repositories, it might help to first gain some insight into the psychology of resistance. the existing literature on user behavior with regard to digital repositories devotes scant attention to the psychology of resistance. in an article entitled “institutional repositories: partnering with faculty to enhance scholarly communication,” johnson discusses the inertia of the traditional publishing paradigm. he notes that this inertia is most evident in academic faculty. this would suggest that the problem of eliciting user cooperation is primarily motivational and that the problem is more one of indifference than active resistance.4 heterick, in his article “faculty attitudes toward electronic resources,” suggests that one reason faculty may be resistant to digital repositories is because they do not fully trust them. in response to a survey he conducted, 48 percent of faculty felt that libraries should maintain paper archives.5 the implication is that digital repositories and archives may never completely replace hard copies in the minds of scholars. in “understanding faculty to improve content recruitment for institutional repositories,” foster and gibbons point out that faculty complain of having too much work already. they resent any additional work that contributing to a digital repository might entail. thus the authors echo johnson in suggesting that faculty resistance the potential value of digital repositories is dependent on the cooperation of scholars to deposit their work. although many researchers have been resistant to submitting their work, the literature on digital repositories contains very little research on the psychology of resistance. this article looks at the psychological literature on resistance and explores what its implications might be for reducing the resistance of scholars to submitting their work to digital repositories. psychologists have devised many potentially useful strategies for reducing resistance that might be used to address the problem; this article examines these strategies and how they might be applied. 
o bserving the development and growth of digital repositories in recent years has been a bit like riding an emotional roller coaster. even the definition of what constitutes a repository may not be the subject of complete agreement, but for the purposes of this study, a repository is defined as an online database of digital or digitized scholarly works constructed for the purpose of preserving and disseminating scholarly research. the initial enthusiasm expressed by librarians and advocates of open access toward the potential of repositories to make significant amounts of scholarly research available to anyone with internet access gradually gave way to a more somber appraisal of the prospects of getting faculty and researchers to deposit their work. in august 2007, bailey posted an entry to his digital koans blog titled “institutional repositories: doa?” in which he noted that building digital repository collections would be a long, arduous, and costly process.1 the success of repositories, in his view, will be a function not so much of technical considerations as of attitudinal ones. faculty remain unconvinced that repositories are important, and there is a critical need for outreach programs that point to repositories as an important step in solving the crisis in scholarly communication. salo elaborated on bailey’s post with “yes, irs are broken. let’s talk about it,” on her own blog, caveat lector. salo points out that institutional repositories have not fulfilled their early promise of attracting a large number of faculty who are willing to submit their work. she criticizes repositories for monopolizing the time of library faculty and staff, and she states her belief that repositories will not work without deposit mandates, but that mandates are impractical.2 subsequent events in the world of scholarly communication might suggest that mandates may be less impractical than salo originally thought. since her post, the national institutes of health mandate, the harvard brian quinn (brian.quinn@ttu.edu) is social sciences librarian, texas tech university libraries, lubbock. brian quinn reducing psychological resistance to digital repositories 68 information technology and libraries | june 2010 whether or not this was actually the case.11 this study also suggests that a combination of both cognitive and affective processes feed faculty resistance to digital repositories. it can be seen from the preceding review of the literature that several factors have been identified as being possible sources of user resistance to digital repositories. yet the authors offer little in the way of strategies for addressing this resistance other than to suggest workaround solutions such as having nonscholars (e.g., librarians, graduate students, or clerical staff) serve as proxy for faculty and deposit their work for them, or to suggest that institutions mandate that faculty deposit their work. similarly, although numerous arguments have been made in favor of digital repositories and open access, they do not directly address the resistance issue.12 in contrast, psychologists have studied user resistance extensively and accumulated a body of research that may suggest ways to reduce resistance rather than try to circumvent it. it may be helpful to examine some of these studies to see what insights they might offer to help address the problem of user resistance. 
it should be pointed out that resistance as a topic has been addressed in the business and organizational literature, but has generally been approached from the standpoint of management and organizational change.13 this study has chosen to focus primarily on the psychology of resistance because many repositories are situated in a university setting. unlike employees of a corporation, faculty members typically have a greater degree of autonomy and latitude in deciding whether to accommodate new work processes and procedures into their existing routines, and the locus of change will therefore be more at an individual level. ■■ the psychology of user resistance psychologists define resistance as a preexisting state or attitude in which the user is motivated to counter any attempts at persuasion. this motivation may occur on a cognitive, affective, or behavioral level. psychologists thus distinguish between a state of not being persuaded and one in which there is actual motivation to not comply. the source of the motivation is usually an affective state, such as anxiety or ambivalence, which itself may result from cognitive problems, such as misunderstanding, ignorance, or confusion.14 it is interesting to note that psychologists have long viewed inertia as one form of resistance, suggesting paradoxically that a person can be motivated to inaction.15 resistance may also manifest itself in more subtle forms that shade into indifference, suspicion of new work processes or technologies, and contentment with the status quo. may be attributed at least in part to motivation.6 in another article published a few months later, foster and gibbons suggest that the main reason faculty have been slow to deposit their work in digital repositories is a cognitive one: faculty have not understood how they would benefit by doing so. the authors also mention that users may feel anxiety when executing the sequence of technical steps needed to deposit their work, and that they may also worry about possible copyright infringement.7 the psychology of resistance may thus manifest itself in both cognitive and affective ways. harley and her colleagues talk about faculty not perceiving any reward for depositing their work in their article “the influence of academic values on scholarly publication and communication practices.” this perception results in reduced drive to participate. anxiety is another factor contributing to resistance: faculty fear that their work may be vulnerable to plagiarism in an openaccess environment.8 in “towards user responsive institutional repositories: a case study,” devakos suggests that one source of user resistance is cognitive in origin. scholars do not submit their work frequently enough to be able to navigate the interface from memory, so they must reinitiate the learning process each time they submit their work. the same is true for entering metadata for their work.9 their sense of control may also be threatened by any limitations that may be imposed on substituting later iterations of their work for earlier versions. davis and connolly point to several sources of confusion, uncertainty, and anxiety among faculty in their article “institutional repositories: evaluating the reasons for non-use of cornell university’s installation of dspace.” cognitive problems arise from having to learn new technology to deposit work and not knowing copyright details well enough to know whether publishers would permit the deposit of research prior to publication. 
faculty wonder whether this might jeopardize their chances of acceptance by important journals whose editors might view deposit as a form of prior publication that would disqualify them from consideration. there is also fear that the complex structure of a large repository may actually make a scholar’s work more difficult to find; faculty may not understand that repositories are not isolated institutional entities but are usually searchable by major search engines like google.10 kim also identifies anxiety about plagiarism and confusion about copyright as being sources of faculty resistance in the article “motivating and impeding factors affecting faculty contribution to institutional repositories.” kim found that plagiarism anxiety made some faculty only willing to deposit already-published work and that prepublication material was considered too risky. faculty with no self-archiving experience also felt that many publishers do not allow self-archiving, reducing psychological resistance to digital repositories | quinn 69 more open to information that challenges their beliefs and attitudes and are more open to suggestion.18 thus before beginning a discussion of why users should deposit their research in repositories, it might help to first affirm the users’ self-concept. this could be done, for example, by reminding them of how unbiased they are in their work or how important it is in their work to be open to new ideas and new approaches, or how successful they have been in their work as scholars. the affirmation should be subtle and not directly related to the repository situation, but it should remind them that they are openminded individuals who are not bound by tradition and that part of their success is attributable to their flexibility and adaptability. once the users have been affirmed, librarians can then lead into a discussion of the importance of submitting scholarly research to repositories. self-generated affirmations may be even more effective. for example, another way to affirm the self would be to ask users to recall instances in which they successfully took a new approach or otherwise broke new ground or were innovative in some way. this could serve as a segue into a discussion of the repository as one more opportunity to be innovative. once the self-concept has been boosted, the threatening quality of the message will be perceived as less disturbing and will be more likely to receive consideration. a related strategy that psychologists employ to reduce resistance involves casting the user in the role of “expert.” this is especially easy to do with scholars because they are experts in their fields. casting the user in the role of expert can deactivate resistance by putting that person in the persuasive role, which creates a form of role reversal.19 rather than the librarian being seen as the persuader, the scholar is placed in that role. by saying to the scholar, “you are the expert in the area of communicating your research to an audience, so you would know better why the digital repository is an alternative that deserves consideration once you understand how it works and how it may benefit you,” you are empowering the user. casting the user as an expert imparts a sense of control to the user. it helps to disable resistance by placing the user in a position of being predisposed to agree to the role he or she is being cast in, which also makes the user more prone to agree with the idea of using a digital repository. 
priming and imaging one important discovery that psychologists have made that has some bearing on user resistance is that even subtle manipulations can have a significant effect on one’s judgments and actions. in an interesting experiment, psychologists told a group of students that they were to read an online newspaper, ostensibly to evaluate its design and assess how easy it was to read. half of them read an editorial discussing a public opinion survey of youth ■■ negative and positive strategies for reducing resistance just as the definition of resistance can be paradoxical, so too may be some of the strategies that psychologists use to address it. perhaps the most basic example is to counter resistance by acknowledging it. when scholars are presented with a message that overtly states that digital repositories are beneficial and desirable, it may simultaneously generate a covert reaction in the form of resistance. rather than simply anticipating this and attempting to ignore it, digital repository advocates might be more persuasive if they acknowledge to scholars that there will likely be resistance, mention some possible reasons (e.g., plagiarism or copyright concerns), and immediately introduce some counterrationales to address those reasons.16 psychologists have found that being up front and forthcoming can reduce resistance, particularly with regard to the downside of digital repositories. they have learned that it can be advantageous to preemptively reveal negative information about something so that it can be downplayed or discounted. thus talking about the weaknesses or shortcomings of digital repositories as early as possible in an interaction may have the effect of making these problems seem less important and weakening user resistance. not only does revealing negative information impart a sense of honesty and credibility to the user, but psychologists have found that people feel closer to people who reveal personal information.17 a librarian could thus describe some of his or her own frustrations in using repositories as an effective way of establishing rapport with resistant users. the unexpected approach of bringing up the less desirable aspects of repositories—whether this refers to the technological steps that must be learned to submit one’s work or the fact that depositing one’s work in a repository is not a guarantee that it will be highly cited—can be disarming to the resistant user. this is particularly true of more resistant users who may have been expecting a strong hard-sell approach on the part of librarians. when suddenly faced with a more candid appeal the user may be thrown off balance psychologically, leaving him or her more vulnerable to information that is the opposite of what was anticipated and to possibly viewing that information in a more positive light. if one way to disarm a user is to begin by discussing the negatives, a seemingly opposite approach that psychologists take is to reinforce the user’s sense of self. psychologists believe that one source of resistance stems from when a user’s self-concept—which the user tries to protect from any source of undesired change—has been threatened in one way or another. a stable self-concept is necessary for the user to maintain a sense of order and predictability. reinforcing the self-concept of the user should therefore make the user less likely to resist depositing work in a digital repository. 
self-affirmed users are 70 information technology and libraries | june 2010 or even possibly collaborating on research. their imaginations could be further stimulated by asking them to think of what it would be like to have their work still actively preserved and available to their successors a century from now. using the imagining strategy could potentially be significantly more effective in attenuating resistance than presenting arguments based on dry facts. identification and liking conscious processes like imagining are not the only psychological means of reducing the resistance of users to digital repositories. unconscious processes can also be helpful. one example of such a process is what psychologists refer to as the “liking heuristic.” this refers to the tendency of users to employ a rule-of-thumb method to decide whether to comply with requests from persons. this tendency results from users constantly being inundated with requests. consequently, they need to simplify and streamline the decision-making process that they use to decide whether to cooperate with a request. the liking heuristic holds that users are more likely to help someone they might otherwise not help if they unconsciously identify with the person. at an unconscious level, the user may think that a person acts like them and dresses like them, and therefore the user identifies with that person and likes them enough to comply with their request. in one experiment that psychologists conducted to see if people are more likely to comply with requests from people that they identify with, female undergraduates were informed that they would be participating in a study of first impressions. the subjects were instructed that they and a person in another room would each learn a little about one another without meeting each other. each subject was then given a list of fifty adjectives and was asked to select the twenty that were most characteristic of themselves. the experimenter then told the participants that they would get to see each other’s lists. the experimenter took the subject’s list and then returned a short time later with what supposedly was the other participant’s list, but was actually a list that the experimenter had filled out to indicate that either the subject had much in common with the other participant’s personality (seventeen of twenty matches), some shared attributes (ten of twenty matches), or relatively few characteristics in common (three of twenty matches). the subject was then asked to examine the list and fill out a survey that probed their initial impressions of the other participant, including how much they liked them. at the end of the experiment, the two subjects were brought together and given credit for participating. the experimenter soon left the room and the confederate participant asked the other participant if she would read and critically evaluate an eight-page paper for an english class. the results of the experiment indicated that the more the participant thought she shared in consumer patterns that highlighted functional needs, and the other half read a similar editorial focusing on hedonistic needs. the students next viewed an ad for a new brand of shampoo that featured either a strong or a weak argument for the product. 
the results of the experiment indicated that students who read the functional editorial and were then subsequently exposed to the strong argument for the shampoo (a functional product) had a much more favorable impression of the brand than students who had received the mismatched prime.20 while it may seem that the editorial and the shampoo were unrelated, psychologists found that the subjects engaged in a process of elaborating the editorial, which then predisposed them to favor the shampoo. the presence of elaboration, which is a precursor to the development of attitudes, suggests that librarians could reduce users’ resistance to digital repositories by first involving them in some form of priming activity immediately prior to any attempt to persuade them. for example, asking faculty to read a brief case study of a scholar who has benefited from involvement in open-access activity might serve as an effective prime. another example might be to listen briefly to a speaker summarizing the individual, disciplinary, and societal benefits of sharing one’s research with colleagues. interventions like these should help mitigate any predisposition toward resistance on the part of users. imagining is a strategy related to priming that psychologists have found to be effective in reducing resistance. taking their cue from insurance salesmen—who are trained to get clients to actively imagine what it would be like to lose their home or be in an accident—a group of psychologists conducted an experiment in which they divided a sample of homeowners who were considering the purchase of cable tv into two groups. one group was presented with the benefits of cable in a straightforward, informative way that described various features. the other group was asked to imagine themselves enjoying the benefits and all the possible channels and shows that they might experience and how entertaining it might be. the psychologists then administered a questionnaire. the results indicated that those participants who were asked to imagine the benefits of cable were much more likely to want cable tv and to subscribe to it than were those who were only given information about cable tv.21 in other words, imagining resulted in more positive attitudes and beliefs. this study suggests that librarians attempting to reduce resistance among users of digital repositories may need to do more than merely inform or describe to them the advantages of depositing their work. they may need to ask users to imagine in vivid detail what it would be like to receive periodic reports indicating that their work had been downloaded dozens or even hundreds of times. librarians could ask them to imagine receiving e-mail or calls from colleagues indicating that they had accessed their work in the repository and were interested in learning more about it, reducing psychological resistance to digital repositories | quinn 71 students typically overestimate the amount of drinking that their peers engage in at parties. these inaccurate normative beliefs act as a negative influence, causing them to imbibe more because they believe that is what their peers are doing. 
by informing students that almost threequarters of their peers have less than three drinks at social gatherings, psychologists have had some success in reducing excessive drinking behavior by students.23 the power of normative messages is illustrated by a recent experiment conducted by a group of psychologists who created a series of five cards to encourage hotel guests to reuse their towels during their stay. the psychologists hypothesized that by appealing to social norms, they could increase compliance rates. to test their hypothesis, the researchers used a different conceptual appeal for each of the five cards. one card appealed to environmental concerns (“help save the environment”), another to environmental cooperation (“partner with us to save the environment”), a third card appealed to the advantage to the hotel (“help the hotel save energy”), a fourth card targeted future generations (“help save resources for future generations”), and a final card appealed to guests by making reference to a descriptive norm of the situation (“join your fellow citizens in helping to save the environment”). the results of the study indicated that the card that mentioned the benefit to the hotel was least effective in getting guests to reuse their towels, and the card that was most effective was the one that mentioned that descriptive norm.24 this research suggests that if users who are resistant to submitting their work to digital repositories were informed that a larger percentage of their peers were depositing work than they realized, resistance may be reduced. this might prove to be particularly true if they learned that prominent or influential scholars were engaged in populating repositories with their work. this would create a social-norms effect that would help legitimize repositories to other faculty and help them to perceive the submission process as normal and desirable. the idea that accomplished researchers are submitting materials and reaping the benefits might prove very attractive to less experienced and less well-regarded faculty. psychologists have a considerable body of evidence in the area of social modeling that suggests that people will imitate the behavior of others in social situations because that behavior provides an implicit guideline of what to do in a similar situation. a related finding is that the more influential people are, the more likely it is for others to emulate their actions. this is even more probable for highstatus individuals who are skilled and attractive and who are capable of communicating what needs to be done to potential followers.25 social modeling addresses both the cognitive dimension of how resistant users should behave and also the affective dimension by offering models that serve as a source of motivation to resistant users to change common with the confederate, the more she liked her. the more she liked the confederate and experienced a perception of consensus, the more likely she was to comply with her request to critique the paper.22 thus, when trying to overcome the resistance of users to depositing their work in a digital repository, it might make sense to consider who it is that is making the request. universities sometimes host scholarly communication symposia that are not only aimed at getting faculty interested in open-access issues, but to urge them to submit their work to the institution’s repositories. 
frequently, speakers at these symposia consist of academic administrators, members of scholarly communication or open-access advocacy organizations, or individuals in the library field. the research conducted by psychologists, however, suggests that appeals to scholars and researchers would be more effective if they were made by other scholars and those who are actively engaged in research. faculty are much more likely to identify with and cooperate with requests from their own tribe, as it were, and efforts need to be concentrated on getting faculty who are involved in and understand the value of repositories to articulate this to their colleagues. researchers who can personally testify to the benefits of depositing their work are most likely to be effective at convincing other researchers of the value of doing likewise and will be more effective at reducing resistance. librarians need to recognize who their potentially most effective spokespersons and advocates are, which the psychological research seems to suggest is faculty talking to other faculty. perceived consensus and social modeling the processes of faculty identification with peers and perceived consensus mentioned above can be further enhanced by informing researchers that other scholars are submitting their work, rather than merely telling researchers why they should submit their work. information about the practices of others may help change beliefs because of the need to identify with other in-group members. this is particularly true of faculty, who are prone to making continuous comparisons with their peers at other institutions and who are highly competitive by nature. once they are informed of the career advantages of depositing their work (in terms of professional visibility, collaboration opportunities, etc.), and they are informed that other researchers have these advantages, this then becomes an impetus for them to submit their work to keep up with their peers and stay competitive. a perception of consensus is thus fostered—a feeling that if one’s peers are already depositing their work, this is a practice that one can more easily agree to. psychologists have leveraged the power of identification by using social-norms research to inform people about the reality of what constitutes normative behavior as opposed to people’s perceptions of it. for example, college 72 information technology and libraries | june 2010 highly resistant users that may be unwilling to submit their work to a repository. rather than trying to prepare a strong argument based on reason and logic, psychologists believe that using a narrative approach may be more effective. this means conveying the facts about open access and digital repositories in the form of a story. stories are less rhetorical and tend not to be viewed by listeners as attempts at persuasion. the intent of the communicator and the counterresistant message are not as overt, and the intent of the message might not be obvious until it has already had a chance to influence the listener. a well-crafted narrative may be able to get under the radar of the listener before the listener has a chance to react defensively and revert to a mode of resistance. in a narrative, beliefs are rarely stated overtly but are implied, and implied beliefs are more difficult to refute than overtly stated beliefs. 
listening to a story and wondering how it will turn out tends to use up much of the cognitive attentional capacity that might otherwise be devoted to counterarguing, which is another reason why using a narrative approach may be particularly effective with users who are strongly resistant. the longer and more subtle nature of narratives may also make them less a target of resistance than more direct arguments.28 using a narrative approach, the case for submitting work to a repository might be presented not as a collection of dry facts or statistics, but rather as a story. the protagonists are the researchers, and their struggle is to obtain recognition for their work and to advance scholarship by providing maximum access to the greatest audience of scholars and to obtain as much access as possible to the work of their peers so that they can build on it. the protagonists are thwarted in their attempts to achieve their ends by avaricious publishers who obtain the work of researchers for free and then sell it back to them in the form of journal and database subscriptions and books for exorbitant prices. these prices far exceed the rate of inflation or the budgets of universities to pay for them. the publishers engage in a series of mergers and acquisitions that swallow up small publishing firms and result in the scholarly publishing enterprise being controlled by a few giant firms that offer unreasonable terms to users and make unreasonable demands when negotiating with them. presented in this dramatic way, the significance of scholar participation in digital repositories becomes magnified to an extent that it becomes more difficult to resist what may almost seem like an epic struggle between good and evil. and while this may be a greatly oversimplified example, it nonetheless provides a sense of the potential power of using a narrative approach as a technique to reduce resistance. introducing a time element into the attempt to persuade users to deposit their work in digital repositories can play an important role in reducing resistance. given that faculty are highly competitive, introducing the idea not only that other faculty are submitting their work but that they are already benefiting as a result makes the their behavior in the desired direction. redefinition, consistency, and depersonalization another strategy that psychologists use to reduce resistance among users is to change the definition of the situation. resistant users see the process of submitting their research to the repository as an imposition at best. in their view, the last thing that they need is another obligation or responsibility to burden their already busy lives. psychologists have learned that reframing a situation can reduce resistance by encouraging the user to look at the same phenomenon in a different way. in the current situation, resistant users should be informed that depositing their work in a digital repository is not a burden but a way to raise their professional profile as researchers, to expose their work to a wider audience, and to heighten their visibility among not only their peers but a much larger potential audience that would be able to encounter their work on the web. seen in this way, the additional work of submission is less of a distraction and more of a career investment. moreover, this approach leverages a related psychological concept that can be useful in helping to dissolve resistance. 
psychologists understand that inconsistency has a negative effect on self-esteem, so persuading users to believe that submitting their work to a digital repository is consistent with their past behavior can be motivating.26 the point needs to be emphasized with researchers that the act of submitting their work to a digital repository is not something strange and radical, but is consistent with prior actions intended to publicize and promote their work. a digital repository can be seen as analogous to a preprint, book, journal, or other tangible and familiar vehicles that faculty have used countless times to send their work out into the world. while the medium might have changed, the intention and the goal are the same. reframing the act of depositing as “old wine in new bottles” may help to undermine resistance. in approaching highly resistant individuals, psychologists have discovered that it is essential to depersonalize any appeal to change their behavior. instead of saying, “you should reduce your caloric intake,” it is better to say, “it is important for people to reduce their caloric intake.” this helps to deflect and reduce the directive, judgmental, and prescriptive quality of the request, thus making it less likely to provoke resistance.27 suggestion can be much less threatening than prescription among users who may be suspicious and mistrusting. reverting to a third-person level of appeal may allow the message to get through without it being immediately rejected by the user. narrative, timing, and anticipation psychologists recommend another strategy to help defuse reducing psychological resistance to digital repositories | quinn 73 technological platforms, and so on. this could be followed by a reminder to users that it is their choice—it is entirely up to them. this reminder that users have the freedom of choice may help to further counter any resistance generated as a result of instructions or inducements to anticipate regret. indeed, psychologists have found that reinstating a choice that was previously threatened can result in greater compliance than if the threat had never been introduced.32 offering users the freedom to choose between alternatives tends to make them more likely to comply. this is because having a choice enables users to both accept and resist the request rather than simply focus all their resistance on a single alternative. when presented with options, the user is able to satisfy the urge to resist by rejecting one option but is simultaneously motivated to accept another option; the user is aware that there are benefits to complying and wants to take advantage of them but also wants to save face and not give in. by being offered several alternatives that nonetheless all commit to a similar outcome, the user is able to resist and accept at the same time.33 for example, one alternative option to self-archiving might be to present the faculty member with the option of an authorpays publishing model. the choice of alternatives allows the faculty member to be selective and discerning so that a sense of satisfaction is derived from the ability to resist by rejecting one alternative. at the same time, the librarian is able to gain compliance because one of the other alternatives that commits the faculty member to depositing research is accepted. options, comparisons, increments, and guarantees in addition to offering options, another way to erode user resistance to digital repositories is to use a comparative strategy. 
one technique is to first make a large request, such as “we would like you to submit all the articles that you have published in the last decade to the repository,” and then follow this with a more modest request, such as “we would appreciate it if you would please deposit all the articles you have published in the last year.” the original request becomes an “anchor” or point of reference in the mind of the user against which the subsequent request is then evaluated. setting a high anchor lessens user resistance by changing the user’s point of comparison of the second request from nothing (not depositing any work in the repository) to a higher value (submitting a decade of work). in this way, a high reference anchor is established for the second request, which makes it seem more reasonable in the newly created context of the higher value.34 the user is thus more likely to comply with the second request when it is framed in this way. using this comparative approach may also work because it creates a feeling of reciprocity in the user. when proposition much more salient. it not only suggests that submitting work is a process that results in a desirable outcome, but that the earlier one’s work is submitted, the more recognition will accrue and the more rapidly one’s career will advance.29 faculty may feel compelled to submit their work in an effort to remain competitive with their colleagues. one resource that may be particularly helpful for working with skeptical faculty who want substantiation about the effect of self-archiving on scholarly impact is a bibliography created by the open citation project titled, “the effect of open access and downloads (hits) on citation impact: a bibliography of studies.”30 it provides substantial documentation of the effect that open access has on scholarly visibility. an additional stimulus might be introduced in conjunction with the time element in the form of a download report. showing faculty how downloads accumulate over time is analogous to arguments that investment counselors use showing how interest on investments accrues and compounds over time. this investment analogy creates a condition in which hesitating to submit their work results in faculty potentially losing recognition and compromising their career advancement. an interesting related finding by psychologists suggests that an effective way to reduce user resistance is to have users think about the future consequences of complying or not complying. in particular, if users are asked to anticipate the amount of future regret they might experience for making a poor choice, this can significantly reduce the amount of resistance to complying with a request. normally, users tend not to ruminate about the possibility of future disappointment in making a decision. if users are made to anticipate future regret, however, they will act in the present to try to minimize it. studies conducted by psychologists show that when users are asked to anticipate the amount of future regret that they might experience for choosing to comply with a request and having it turn out adversely versus choosing to not comply and having it turn out adversely, they consistently indicate that they would feel more regret if they did not comply and experienced negative consequences as a result.31 in an effort to minimize this anticipated regret, they will then be more prone to comply. 
based on this research, one strategy to reduce user resistance to digital repositories would be to get users to think about the future, specifically about future regret resulting from not cooperating with the request to submit their work. if they feel that they might experience more regret in not cooperating than in cooperating, they might then be more inclined to cooperate. getting users to think about the future could be done by asking users to imagine various scenarios involving the negative outcomes of not complying, such as lost opportunities for recognition, a lack of citation by peers, lost invitations to collaborate, an inability to migrate one’s work to future 74 information technology and libraries | june 2010 submit their work. mandates rely on authority rather than persuasion to accomplish this and, as such, may represent a less-than-optimal solution to reducing user resistance. mandates represent a failure to arrive at a meeting of the minds of advocates of open access, such as librarians, and the rest of the intellectual community. understanding the psychology of resistance is an important prerequisite to any effort to reduce it. psychologists have assembled a significant body of research on resistance and how to address it. some of the strategies that the research suggests may be effective, such as discussing resistance itself with users and talking about the negative effects of repositories, may seem counterintuitive and have probably not been widely used by librarians. yet when other more conventional techniques have been tried with little or no success, it may make sense to experiment with some of these approaches. particularly in the academy, where reason is supposed to prevail over authority, incorporating resistance psychology into a program aimed at soliciting faculty research seems an appropriate step before resorting to mandates. most strategies that librarians have used in trying to persuade faculty to submit their work have been conventional. they are primarily of a cognitive nature and are variations on informing and educating faculty about how repositories work and why they are important. researchers have an important affective dimension that needs to be addressed by these appeals, and the psychological research on resistance suggests that a strictly rational approach may not be sufficient. by incorporating some of the seemingly paradoxical and counterintuitive techniques discussed earlier, librarians may be able to penetrate the resistance of researchers and reach them at a deeper, less rational level. ideally, a mixture of rational and less-conventional approaches might be combined to maximize effectiveness. such a program may not eliminate resistance but could go a long way toward reducing it. future studies that test the effectiveness of such programs will hopefully be conducted to provide us with a better sense of how they work in real-world settings. references 1. charles w. bailey jr., “institutional repositories: doa?,” online posting, digital koans, aug. 22, 2007, http://digital -scholarship.org/digitalkoans/2007/08/21/institutional -repositories-doa/ (accessed apr. 21, 2010). 2. dorothea salo, “yes, irs are broken. let’s talk about it,” online posting, caveat lector, sept. 5, 2007, http://cavlec. yarinareth.net/2007/09/05/yes-irs-are-broken-lets-talk-about -it/ (accessed apr. 21, 2010). 3. eprints services, roarmap (registry of open access repository material archiving policies) http://www.eprints .org/openaccess/policysignup/ (accessed july 28, 2009). 
4. richard k. johnson, “institutional repositories: partnering the requester scales down the request from the large one to a smaller one, it creates a sense of obligation on the part of the user to also make a concession by agreeing to the more modest request. the cultural expectation of reciprocity places the user in a situation in which they will comply with the lesser request to avoid feelings of guilt.35 for the most resistant users, breaking the request down into the smallest possible increment may prove helpful. by making the request seem more manageable, the user is encouraged to comply. psychologists conducted an experiment to test whether minimizing a request would result in greater cooperation. they went door-to-door, soliciting contributions to the american cancer society, and received donations from 29 percent of households. they then made additional solicitations, this time asking, “would you contribute? even a penny will help!” using this approach, donations increased to 50 percent. even though the solicitors only asked for a penny, the amounts of the donations were equal to that of the original request. by asking for “even a penny,” the solicitors made the request appear to be more modest and less of a target of resistance.36 librarians might approach faculty by saying “if you could even submit one paper we would be grateful,” with the idea that once faculty make an initial submission they will be more inclined to submit more papers in the future. one final strategy that psychological research suggests may be effective in reducing resistance to digital repositories is to make sure that users understand that the decision to deposit their work is not irrevocable. with any new product, users have fears about what might happen if they try it and they are not satisfied with it. not knowing the consequences of making a decision that they may later regret fuels reluctance to become involved with it. faculty need to be reassured that they can opt out of participating at any time and that the repository sponsors will guarantee this. this guarantee needs to be repeated and emphasized as much as possible in the solicitation process so that faculty are frequently reminded that they are entering into a decision that they can reverse if they so decide. having this reassurance should make researchers much less resistant to submitting their work, and the few faculty who may decide that they want to opt out are worth the reduction in resistance.37 the digital repository is a new phenomenon that faculty are unfamiliar with, and it is therefore important to create an atmosphere of trust. the guarantee will help win that trust. ■■ conclusion the scholarly literature on digital repositories has given little attention to the psychology of resistance. yet the ultimate success of digital repositories depends on overcoming the resistance of scholars and researchers to reducing psychological resistance to digital repositories | quinn 75 20. curtis p. haugtvedt et al., “consumer psychology and attitude change,” in knowles and linn, resistance and persuasion, 283–96. 21. larry w. gregory, robert b. cialdini, and kathleen m. carpenter, “self-relevant scenarios as mediators of likelihood estimates and compliance: does imagining make it so?” journal of personality & social psychology 43, no. 1 (1982): 89–99. 22. jerry m. burger, “fleeting attraction and compliance with requests,” in the science of social influence: advances and future progress, ed. anthony r. 
pratkanis (new york: psychology pr., 2007): 155–66. 23. john d. clapp and anita lyn mcdonald, “the relationship of perceptions of alcohol promotion and peer drinking norms to alcohol problems reported by college students,” journal of college student development 41, no. 1 (2000): 19–26. 24. noah j. goldstein and robert b. cialdini, “using social norms as a lever of social influence,” in the science of social influence: advances and future progress, ed. anthony r. pratkanis (new york: psychology pr., 2007): 167–90. 25. dale h. schunk, “social-self interaction and achievement behavior,” educational psychologist 34, no. 4 (1999): 219–27. 26. rosanna e. guadagno et al., “when saying yes leads to saying no: preference for consistency and the reverse foot-inthe-door effect,” personality & social psychology bulletin 27, no. 7 (2001): 859–67. 27. mary jiang bresnahan et al., “personal and cultural differences in responding to criticism in three countries,” asian journal of social psychology 5, no. 2 (2002): 93–105. 28. melanie c. green and timothy c. brock, “in the mind’s eye: transportation-imagery model of narrative persuasion,” in narrative impact: social and cultural foundations, ed. melanie c. green, jeffrey j. strange, and timothy c. brock (mahwah, n.j.: lawrence erlbaum, 2004): 315–41. 29. oswald huber, “time pressure in risky decision making: effect on risk defusing,” psychology science 49, no. 4 (2007): 415–26. 30. the open citation project, “the effect of open access and downloads (‘hits’) on citation impact: a bibliography of studies,” july 17, 2009, http://opcit.eprints.org/oacitation -biblio.html (accessed july 29, 2009). 31. matthew t. crawford et al., “reactance, compliance, and anticipated regret,” journal of experimental social psychology 38, no. 1 (2002): 56–63. 32. nicolas gueguen and alexandre pascual, “evocation of freedom and compliance: the ‘but you are free of . . .’ technique,” current research in social psychology 5, no. 18 (2000): 264–70. 33. james p. dillard, “the current status of research on sequential request compliance techniques,” personality & social psychology bulletin 17, no. 3 (1991): 283–88. 34. thomas mussweiler, “the malleability of anchoring effects,” experimental psychology 49, no. 1 (2002): 67–72. 35. robert b. cialdini and noah j. goldstein, “social influence: compliance and conformity,” annual review of psychology 55 (2004): 591–21. 36. james m. wyant and stephen l. smith, “getting more by asking for less: the effects of request size on donations of charity,” journal of applied social psychology 17, no. 4 (1987): 392–400. 37. lydia j. price, “the joint effects of brands and warranties in signaling new product quality,” journal of economic psychology 23, no. 2 (2002): 165–90. with faculty to enhance scholarly communication,” d-lib magazine 8, no. 11 (2002), http://www.dlib.org/dlib/november02/ johnson/11johnson.html (accessed apr. 2, 2008). 5. bruce heterick, “faculty attitudes toward electronic resources,” educause review 37, no. 4 (2002): 10–11. 6. nancy fried foster and susan gibbons, “understanding faculty to improve content recruitment for institutional repositories,” d-lib magazine 11, no. 1 (2005), http://www.dlib.org/ dlib/january05/foster/01foster.html (accessed july 29, 2009). 7. suzanne bell, nancy fried foster, and susan gibbons, “reference librarians and the success of institutional repositories,” reference services review 33, no. 3 (2005): 283–90. 8. 
diane harley et al., “the influence of academic values on scholarly publication and communication practices,” center for studies in higher education, research & occasional paper series: cshe.13.06, sept. 1, 2006, http://repositories.cdlib.org/ cshe/cshe-13-06/ (accessed apr. 17, 2008). 9. rea devakos, “towards user responsive institutional repositories: a case study,” library high tech 24, no. 2 (2006): 173–82. 10. philip m. davis and matthew j. l. connolly, “institutional repositories: evaluating the reasons for non-use of cornell university’s installation of dspace,” d-lib magazine 13, no. 3/4 (2007), http://www.dlib.org/dlib/march07/davis/03davis .html (accessed july 29, 2009). 11. jihyun kim, “motivating and impeding factors affecting faculty contribution to institutional repositories,” journal of digital information 8, no. 2 (2007), http://journals.tdl.org/jodi/ article/view/193/177 (accessed july 29, 2009). 12. peter suber, “open access overview” online posting, open access news: news from the open access environment, june 21, 2004, http://www.earlham.edu/~peters/fos/overview .htm (accessed 29 july 2009). 13. see, for example, jeffrey d. ford and laurie w. ford, “decoding resistance to change,” harvard business review 87, no. 4 (2009): 99–103.; john p. kotter and leonard a. schlesinger, “choosing strategies for change,” harvard business review 86, no. 7/8 (2008): 130–39; and paul r. lawrence, “how to deal with resistance to change,” harvard business review 47, no. 1 (1969): 4–176. 14. julia zuwerink jacks and maureen e. o’brien, “decreasing resistance by affirming the self,” in resistance and persuasion, ed. eric s. knowles and jay a. linn (mahwah, n.j.: lawrence erlbaum, 2004): 235–57. 15. benjamin margolis, “notes on narcissistic resistance,” modern psychoanalysis 9, no. 2 (1984): 149–56. 16. ralph grabhorn et al., “the therapeutic relationship as reflected in linguistic interaction: work on resistance,” psychotherapy research 15, no. 4 (2005): 470–82. 17. arthur aron et al., “the experimental generation of interpersonal closeness: a procedure and some preliminary findings,” personality & social psychology bulletin 23, no. 4 (1997): 363–77. 18. geoffrey l. cohen, joshua aronson, and claude m. steele, “when beliefs yield to evidence: reducing biased evaluation by affirming the self,” personality & social psychology bulletin 26, no. 9 (2000): 1151–64. 19. anthony r. pratkanis, “altercasting as an influence tactic,” in attitudes, behavior and social context: the role of norms and group membership, ed. deborah j. terry and michael a.hogg (mahwah, n.j.: lawrence erlbaum, 2000): 201–26. 34 information technology and libraries | march 2010 tagging: an organization scheme for the internet marijke a. visser how should the information on the internet be organized? this question and the possible solutions spark debates among people concerned with how we identify, classify, and retrieve internet content. this paper discusses the benefits and the controversies of using a tagging system to organize internet resources. tagging refers to a classification system where individual internet users apply labels, or tags, to digital resources. tagging increased in popularity with the advent of web 2.0 applications that encourage interaction among users. as more information is available digitally, the challenge to find an organizational system scalable to the internet will continue to require forward thinking. 
trained to ensure access to a range of informational resources, librarians need to be concerned with access to internet content. librarians can play a pivotal role by advocating for a system that supports the user at the moment of need. tagging may just be the necessary system. w ho will organize the information available on the internet? how will it be organized? does it need an organizational scheme at all? in 1998, thomas and griffin asked a similar question, “who will create the metadata for the internet?” in their article with the same name.1 ten years later, this question has grown beyond simply supplying metadata to assuring that at the moment of need, someone can retrieve the information necessary to answer their query. given new classification tools available on the internet, the time is right to reassess traditional models, such as controlled vocabularies and taxonomies, and contrast them with folksonomies to understand which approach is best suited for the future. this paper gives particular attention to delicious, a social networking tool for generating folksonomies. the amount of information available to anyone with an internet connection has increased in part because of the internet’s participatory nature. users add content in a variety of formats and through a variety of applications to personalize their web experience, thus making internet content transitory in nature and challenging to lock into place. the continual influx of new information is causing a rapid cultural shift, more rapid than many people are able to keep up with or anticipate. conversations on a range of topics that take place using web technologies happen in real time. unless you are a participant in these conversations and debates using web-based communication tools, changes are passing you by. internet users in general have barely grasped the concept of web 2.0 and already the advanced “internet cognoscenti” write about web 3.0.2 regarding the organization and availability of internet content, librarians need to be ahead of the crowd as the voice who will assure content will be readily accessible to those that seek it. internet users actively participating in and shaping the online communities are, perhaps unintentionally, influencing how those who access information via the internet expect to be able to receive and use digital resources. librarians understand that the way information is organized is critical to its accessibility. they also understand the communities in which they operate. today, librarians need to be able to work seamlessly among the online communities, the resources they create, and the end user. as internet use evolves, librarians as information stakeholders should stay abreast of web 2.0 developments. by positioning themselves to lead the future of information organization, librarians will be able to select the best emerging web-based tools and applications, become familiar with their strengths, and leverage their usefulness to guide users in organizing internet content. shirky argues that the internet has allowed new communities to form. primarily online, these communities of internet users are capable of dramatically changing society both onand offline. shirky contends that because of the internet, “group action just got easier.”3 according to shirky, we are now at the critical point where internet use, while dependent on technology, is actually no longer about the technology at all. the web today (web 2.0) is about participation. 
“this [the internet] is a medium that is going to change society.”4 lessig points out that content creators are “writing in the socially, culturally relevant sense for the 21st century and to be able to engage in this writing is a measure of your literacy in the 21st century.”5 it is significant that creating content is no longer reserved for the internet cognoscenti. internet users with a variety of technological skills are participating in web 2.0 communities. information architects, web designers, librarians, business representatives, and any stakeholder dependent on accessing resources on the internet have a vested interest in how internet information is organized. not only does the architecture of participation inherent in the internet encourage completely new creative endeavors, it serves as a platform for individual voices as demonstrated in marijke a. visser (marijkea@gmail.com) is a library and information science graduate student at indiana university, indianapolis, and will be graduating may 2010. she is currently working for ala’s office for information and technology policy as an information technology policy analyst, where her area of focus includes telecommunications policy and how it affects access to information. tagging: an organization scheme for the internet | visser 35 personal and organizationally sponsored blogs: lessig 2.0, boing boing, open access news, and others. these internet conversations contribute diverse viewpoints on a stage where, theoretically, anyone can access them. web 2.0 technologies challenge our understanding of what constitutes information and push policy makers to negotiate equitable internet-use policies for the public, the content creators, corporate interests, and the service providers. to maintain an open internet that serves the needs of all the players, those involved must embrace the opportunity for cultural growth the social web represents. for users who access, create, and distribute digital content, information is anything but static; nor is using it the solitary endeavor of reading a book. its digital format makes it especially easy for people to manipulate it and shape it to create new works. people are sharing these new works via social technologies for others to then remix into yet more distinct creative work. communication is fundamentally altered by the ability to share content on the internet. today’s internet requires a reevaluation of how we define and organize information. the manner in which digital information is classified directly affects each user’s ability to access needed information to fully participate in twenty-first-century culture. new paradigms for talking about and classifying information that reflect the participatory internet are essential. n background the controversy over organizing web-based information can be summed up comparing two perspectives represented by shirky and peterson. both authors address how information on the web can be most effectively organized. in her introduction, peterson states, “items that are different or strange can become a barrier to networking.”6 shirky maintains, “as the web has shown us, you can extract a surprising amount of value from big messy data sets.”7 briefly, in this instance ontology refers to the idea of defining where digital information can and should be located (virtually). folksonomy describes an organizational system where individuals determine the placement and categorization of digital information. both terms are discussed in detail below. 
although any organizational system necessitates talking about the relationship(s) among the materials being organized, the relationships can be classified in multiple ways. to organize a given set of entities, it is necessary to establish in what general domain they belong and in what ways they are related. applying an ontological, or hierarchical, classification system to digital information raises several points to consider. first, there are no physical space restrictions on the internet, so relationships among digital resources do not need to be strictly identified. second, after recognizing that internet resources do not need the same classification standards as print material, librarians can begin to isolate the strengths of current nondigital systems that could be adapted to a system for the internet. third, librarians must be ready to eliminate current systems entirely if they fail to serve the needs of internet users. traditional systems for organizing information were developed prior to the information explosion on the internet. the internet’s unique platform for creating, storing, and disseminating information challenges pre– digital-age models. designing an organizational system for the internet that supports creative innovation and succeeds in providing access to the innovative work is paramount to moving the twenty-first-century culture forward. n assessing alternative models controversy encourages scrutiny of alternative models. in understanding the options for organizing digital information, it is important to understand traditional classification models. smith discusses controlled vocabularies, taxonomies, and facets as three traditional methods for applying metadata to a resource. according to smith, a controlled vocabulary is an unambiguous system for managing the meanings of words. it links synonyms, allowing a search to retrieve information on the basis of the relationship between synonyms.8 taxonomies are hierarchical, controlled vocabularies that establish parent–child relationships between terms. a faceted classification system categorizes information using the distinct properties of that information.9 in such a system, information can exist in more than one place at a time. a faceted classification system is a precursor to the bottom-up system represented by folksonomic tagging. folksonomy, a term coined in 2004 by thomas vander wal, refers to a “user-created categorical structure development with an emergent thesaurus.”10 vander wal further separates the definition into two types: a narrow and a broad folksonomy.11 in a broad folksonomy, many people tag the same object with numerous tags or a combination of their own and others’ tags. in a narrow folksonomy, one or few people tag an object with primarily singular terms. internet searching represents a unique challenge to people wanting to organize its available information. search engines like yahoo! and google approach the chaotic mass of information using two different techniques. yahoo! created a directory similar to the file folder system with a set of predetermined categories that were intended to be universally useful. in so doing, the yahoo! developers made assumptions about how the general public would categorize and access information. the categories 36 information technology and libraries | march 2010 and subsequent subcategories were not necessarily logically linked in the eyes of the general public. the yahoo! 
directory expanded as internet content grew, but the digital folder system, like a taxonomy, required an expert to maintain. shirky notes the yahoo! model could not scale to the internet. there are too many possible links to be able to successfully stay within the confines of a hierarchical classification system. additionally, on the internet, the links are sufficient for access because if two items are linked at least once, the user has an entry point to retrieve either one or both items.12 a hierarchical system does not assure a successful internet search and it requires a user to comprehend the links determined by the managing expert. in the google approach, developers acknowledged that the user with the query best understood the unique reasoning behind her search. the user therefore could best evaluate the information retrieved. according to shirky, the google model let go of the hierarchical file system because developers recognized effective searching cannot predetermine what the user wants. unlike yahoo!, google makes the links between the query and the resources after the user types in the search terms.13 trusting in the link system led google to understand and profit from letting the user filter the search results. to select the best organizational model for the internet it is critical to understand its emergent nature. a model that does not address the effects of web 2.0 on internet use and fails to capture participant-created content and tagging will not be successful. one approach to organizing digital resources has been for users to bookmark websites of personal interest. these bookmarks have been stored on the user’s computer, but newer models now combine the participatory web with saving, or tagging, websites. social bookmarking typifies the emergent web and the attraction of online networking. innovative and controversial, the folksonomy model brings to light numerous criteria necessary for a robust organizational system. a social bookmarking network, delicious is a tool for generating folksonomies. it combines a large amount of self-interest with the potential for an equal, if not greater, amount of social value. delicious users add metadata to resources on the internet by applying terms, or tags, to urls. users save these tagged websites to a personal library hosted on the delicious website. the default settings on delicious share a user’s library publicly, thus allowing other people—not limited to registered delicious account holders—to view any library. that the delicious developers understood how internet users would react to this type of interactive application is reflected in the popularity of delicious. delicious arrived on the scene in 2003, and in 2007 developers introduced a number of features to encourage further user collaboration. with a new look (going from the original del.icio.us to its current moniker, delicious) as well as more ways for users to retrieve and share resources by 2007, delicious had 3 million registered users and 100 million unique urls.14 the reputation of delicious has generated interest among people concerned with organizing the information available via the internet. how does the folksonomy or delicious model of open-ended tagging affect searching, information retrieving, and resource sharing? delicious, whose platform is heavily influenced by its users, operates with no hierarchical control over the vocabulary used as tags. this underscores the organization controversy. 
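the open-ended tagging model described above can be made concrete with a small sketch. the following python fragment is only an illustration of a broad folksonomy, not the delicious data model; the class name, user names, and urls are invented for the example. it shows how many users attach their own free-form tags to one resource, how overlapping tags gain weight, and how even a single-use tag remains a working access point.

```python
from collections import defaultdict

class BroadFolksonomy:
    """toy model of a broad folksonomy: many users apply their own
    free-form tags to the same resource, and no vocabulary is imposed."""

    def __init__(self):
        # tag -> url -> set of users who applied that tag to that url
        self._index = defaultdict(lambda: defaultdict(set))

    def tag(self, user, url, *tags):
        """record that a user labeled a url with one or more tags."""
        for t in tags:
            self._index[t][url].add(user)

    def urls_for(self, tag):
        """every tagged url is an access point, ranked by how many
        users applied the tag; single-use tags still resolve."""
        urls = self._index.get(tag, {})
        return sorted(urls, key=lambda u: len(urls[u]), reverse=True)

# three users tag the same article with overlapping personal vocabularies
f = BroadFolksonomy()
f.tag("alice", "http://example.org/article", "folksonomy", "tagging")
f.tag("bob",   "http://example.org/article", "folksonomy", "classification")
f.tag("carol", "http://example.org/article", "toread")

print(f.urls_for("folksonomy"))  # shared tag, applied by two users
print(f.urls_for("toread"))      # personal tag, still a working entry point
```

the point of the sketch is the absence of any gatekeeping step: every tag is accepted exactly as applied, and usefulness emerges only from how many users converge on the same term.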
bottom-up tagging gives each person tagging an equal voice in the categorization scheme that develops through the user generated tags. at the same time, it creates a chaotic information-retrieval system when compared to traditional controlled vocabularies, taxonomies, and other methods of applying metadata.15 a folksonomy follows no hierarchical scheme. every tag generated supplies personal meaning to the associated url and is equally weighted. there will be overlap in some of the tags users select, and that will be the point of access for different users. for the unique tags, each delicious user can choose to adopt or reject them for their personal tagging system. either way, the additional tags add possible future access points for the rest of the user community. the social usefulness of the tags grows organically in relationship to their adoption by the group. can the internet support an organizational system controlled by user-generated tags? by the very nature of the participatory web, whose applications often get better with user input, the answer is yes. delicious and other social tagging systems are proving that their folksonomic approach is robust enough to satisfy the organizational needs of their users. defined by vander wal, a broad folksonomy is a classification system scalable to the internet.16 the problem with projecting already-existing search and classification strategies to the internet is that the internet is constantly evolving, and classic models are quickly overcome. even in the nonprint world of the internet, taxonomies and controlled vocabulary entail a commitment both from the entity wanting to organize the system and the users who will be accessing it. developing a taxonomy involves an expert, which requires an outlay of capital and, as in the case with yahoo!, a taxonomy is not necessarily what users are looking for. to be used effectively, taxonomies demand a certain amount of user finesse and complacency. the user must understand the general hierarchy and by default must suspend their own sense of category and subcategory if they do not mesh with the given system. the search model used by google, where the user does the filtering, has been a significantly more successful search engine. google recognizes natural language, making it user friendly; however, it remains merely a search engine. it is successful at making links, but it leaves the user stranded without a means to organize search results beyond simple page rank. traditional hierarchical systems and search strategies like those of yahoo! and google neglect to take into account the tremendous popularity of the participatory web. successful web applications today support user interaction; to disregard this is naive and short-sighted. in contrast to a simple page-rank results list or a hierarchical system, delicious results provide the user with rich, multilayer results. figure 1 shows four of the first ten results of a delicious search for the term "folksonomy." the articles by the four authors in the left column were tagged according to the diagram. two of the articles are peer-reviewed, and two are cited repeatedly by scholars researching tagging and the internet. in this example, three unique terms are used to tag those articles, and the other terms provide additional entry points for retrieval.
figure 1. search results for "folksonomy" using delicious.
further information available using delicious shows that the guy article was tagged by 1,323 users, the mathes article by 2,787 users, the shirky article by 4,383 users, and the peterson article by 579 users.17 from the basic delicious search, the user can combine terms to narrow the query as well as search what other users have tagged with those terms. similar to the card catalog, where a library patron would often unintentionally find a book title by browsing cards before or after the actual title she originally wanted, a delicious user can browse other users' libraries, often finding additional pertinent resources. a user will return a greater number of relevant and automatically filtered results than with an advanced google search. as an ancillary feature, once a delicious user finds an attractive tag stream—a series of tags by a particular user—they can opt to follow the user who created the tag stream, thereby increasing their personal resources. hence delicious is effective personally and socially. it emulates what internet users expect to be able to do with digital content: find interesting resources, personalize them, in this case with tags, and put them back out for others to use if they so choose. proponents of folksonomy recognize there are benefits to traditional taxonomies and controlled vocabulary systems. shirky delineates two features of an organizational system and their characteristics, providing an example of when a hierarchical system can be successful (see table 1).18
table 1. domains and their participants
domain to be organized | participants in the domain
small corpus | expert catalogers
formal categories | authoritative source of judgment
restricted entities | coordinated users
clear edges | expert users
these characteristics apply to situations using databases, journal articles, and dissertations as spelled out by peterson, for example.19 specific organizations with identifiable common terminology—for example, medical libraries—can also benefit from a traditional classification system. these domains are the antithesis of the domain represented by the web. the success of controlled vocabularies, taxonomies, and their resulting systems depends on broad user adoption. that, in combination with the cost of creating and implementing a controlled system, raises questions as to their utility and long-term viability for use on the web. though meant for longevity, a taxonomy fulfills a need at one fixed moment in time. a folksonomy is never static. taxonomies developed by experts have not yet been able to be extended adequately for the breadth and depth of internet resources. neither have traditional viewpoints been scaled to accept the challenges encountered in trying to organize the internet. folksonomy, like taxonomy, seeks to provide the information critical to the user at the moment of need. folksonomy, however, relies on users to create the links that will retrieve the desired results. doctorow puts forward three critiques of a hierarchical metadata system, emphasizing the inadequacies of applying traditional classification schemes to the digital stage:
1. there is not a "correct" way to categorize an idea.
2. competing interests cannot come to a consensus on a hierarchical vocabulary.
3. there is more than one way to describe something.
doctorow elaborates: “requiring everyone to use the same vocabulary to describe their material denudes the cognitive landscape, enforces homogeneity in ideas.”20 the internet raises the level of participation to include innumerable voices. the astonishing thing is that it thrives on this participation. guy and tonkin address the “folksonomic flaw” by saying user-generated tags are by definition imprecise. they can be ambiguous, overly personal, misspelled, and a contrived compound word. guy and tonkin suggest the need to improve tagging by educating the users or by improving the systems to encourage more accurate tagging.21 this, however, does not acknowledge that successful web 2.0 applications depend on the emergent wisdom of the user community. the systems permit organic evolution and continual improvement by user participation. a folksonomy evolves much the way a species does. unique or single-use tags have minimal social import and do not gain recognition. tags used by more than a few people reinforce their value and emerge as the more robust species. n conclusion the benefits of the internet are accessible to a wide range of users. the rewards of participation are immediate, social, and exponential in scope. user-generated content and associated organization models support the internet’s unique ability to bring together unlikely social relationships that would not necessarily happen in another milieu. to paraphrase shirky and lessig, people are participating in a moment of social and technological evolution that is altering traditional ways of thinking about information, thereby creating a break from traditional systems. folksonomic classification is part of that break. its utility grows organically as users add tagged content to the system. it is adaptive, and its strengths can be leveraged according to the needs of the group. while there are “folksonomic flaws” inherent in a bottomup classification system, there is tremendous value in weighting individual voices equally. following the logic of web 2.0 technology, folksonomy will improve according to the input of the users. it is an organizational system that reflects the basic tenets of the emergent internet. it may be the only practical solution in a world of participatory content creation. shirky describes the internet by saying, “there is no shelf in the digital world.”22 classic organizational schemes like the dewey decimal system were created to organize resources prior to the advent of the internet. a hierarchical system was necessary because there was a physical limitation on where a resource could be located; a book can only exist in one place at one time. in the digital world, the shelf is simply not there. material can exist in many different places at once and can be retrieved through many avenues. a broad folksonomy supports a vibrant search strategy. it combines individual user input with that of the group. this relationship creates data sets inherently meaningful to the community of users seeking information on any given topic at any given moment. this is why a folksonomic approach to organizing information on the internet is successful. users are rewarded for their participation, and the system improves because of it. folksonomy mirrors and supports the evolution of the internet. librarians, trained to be impartial and ethically bound to assure access to information, are the logical mediators among content creators, the architecture of the web, corporate interests, and policy makers. 
critical conversations are no longer happening only in traditional publications of the print world. they are happening with communication platforms like youtube, twitter, digg, and delicious. information organization is one issue on which librarians can be progressive. dedicated to making information available, librarians are in a unique position to take on challenges raised by the internet. as the profession experiments with the introduction of web 3.0, librarians need to position themselves between what is known and what has yet to evolve. librarians have always leveraged the interests and needs of their users to tailor their services to the individual entry point of every person who enters the library. because more and more resources are accessed via the internet, librarians will have to maintain a presence throughout the web if they are to continue to speak for the informational needs of their users. part of that presence necessitates an ability to adapt current models to the internet. more importantly, it requires recognition of when to forgo conventional service methods in favor of more innovative approaches. working in concert with the early adopters, corporate interests, and general internet users, librarians can promote a successful system for organizing internet resources. for the internet, folksonomic tagging is one solution that will assure users can retrieve information necessary to answer their queries. references and notes 1. charles f. thomas and linda s. griffin, “who will create the metadata for the internet?” first monday 3, no. 12 (dec. 1998). 2. web 2.0 is a fairly recent term, although now ubiquitous among people working in and around internet technologies. attributed to a conference held in 2004 between medialive tagging: an organization scheme for the internet | visser 39 international and o’reilly media, web 2.0 refers to the web as being a platform for harnessing the collective power of internet users interested in creating and sharing ideas and information without mediation from corporate, government, or other hierarchical policy influencers or regulators. web 3.0 is a much more fluid concept as of this writing. there are individuals who use it to refer to a semantic web where information is analyzed or processed by software designed specifically for computers to carry out the currently human-mediated activity of assigning meaning to information on a webpage. there are librarians involved with exploring virtual-world librarianship who refer to the 3d environment as web 3.0. the important point here is that what internet users now know as web 2.0 is in the process of being altered by individuals continually experimenting with and improving upon existing web applications. web 3.0 is the undefined future of the participatory internet. 3. clay shirky, “here comes everybody: the power of organizing without organizations” (presentation videocast, berkman center for internet & society, harvard university, cambridge, mass., 2008), http://cyber.law.harvard.edu/inter active/events/2008/02/shirky (accessed oct. 1, 2008). 4. ibid. 5. lawerence lessig, “early creative commons history, my version,” videocast, aug. 11, 2008, lessig 2.0, http://lessig.org/ blog/2008/08/early_creative_commons_history.html (accessed aug. 13, 2008). 6. elaine peterson, “beneath the metadata: some philosophical problems with folksonomy,” d-lib magazine 12, no. 11 (2006), http://www.dlib.org/dlib/november06/peterson/11peterson .html (accessed sept. 8, 2008). 7. 
clay shirky, "ontology is overrated: categories, links, and tags," online posting, spring 2005, clay shirky's writings about the internet, http://www.shirky.com/writings/ontology_overrated.html#mind_reading (accessed sept. 8, 2008).
8. gene smith, tagging: people-powered metadata for the social web (berkeley, calif.: new riders, 2008): 68.
9. ibid., 76.
10. thomas vander wal, "folksonomy," online posting, feb. 7, 2007, vanderwal.net, http://www.vanderwal.net/folksonomy.html (accessed aug. 26, 2008).
11. thomas vander wal, "explaining and showing broad and narrow folksonomies," online posting, feb. 21, 2005, personal infocloud, http://www.personalinfocloud.com/2005/02/explaining_and_.html (accessed aug. 29, 2008).
12. shirky, "ontology is overrated."
13. ibid.
14. michael arrington, "exclusive: screen shots and feature overview of delicious 2.0 preview," online posting, june 16, 2005, techcrunch, http://www.techcrunch.com/2007/09/06/exclusive-screen-shots-and-feature-overview-of-delicious-20-preview/ (accessed jan. 6, 2010).
15. smith, tagging, 67–93.
16. vander wal, "explaining and showing broad and narrow folksonomies."
17. adam mathes, "folksonomies—cooperative classification and communication through shared metadata" (graduate paper, university of illinois urbana–champaign, dec. 2004); peterson, "beneath the metadata"; shirky, "ontology is overrated"; thomas and griffin, "who will create the metadata for the internet?"
18. shirky, "ontology is overrated."
19. peterson, "beneath the metadata."
20. cory doctorow, "metacrap: putting the torch to seven straw-men of the meta-utopia," online posting, aug. 26, 2001, the well, http://www.well.com/~doctorow/metacrap.htm (accessed sept. 15, 2008).
21. marieke guy and emma tonkin, "folksonomies: tidying up tags?" d-lib magazine 12, no. 1 (2006), http://www.dlib.org/dlib/january06/guy/01guy.html (accessed sept. 8, 2008).
22. shirky, "ontology is overrated."
hackathons and libraries: the evolving landscape 2014–2020
meris mandernach longmeier
information technology and libraries | december 2021, https://doi.org/10.6017/ital.v40i4.13389
meris mandernach longmeier (longmeier.10@osu.edu) is head of research services, the ohio state university libraries. © 2021.
abstract libraries foster a thriving campus culture and function as “third space,” not directly tied to a discipline.1 libraries support both formal and informal learning, have multipurpose spaces, and serve as a connection point for their communities. for these reasons, they are an ideal location for events, such as hackathons, that align with library priorities of outreach, data and information literacy, and engagement focused on social good. hackathon planners could find likely partners in either academic or public libraries as their physical spaces accommodate public outreach events and many are already providing similar services, such as makerspaces. libraries can act solely as a host for events or they can embed in the planning process by building community partnerships, developing themes for the event, or harnessing the expertise already present in the library staff. this article, focusing on years from 2014 to 2020, will highlight the history and evolution of hackathons in libraries as outreach events and as a focus for using library materials, data, workflows, and content. introduction as a means of introduction to hackathons for those unfamiliar with these events, the following definition was developed after reviewing the literature. hackathons are time-bound events where participants gather to build technology projects, learn from each other and experts, and create innovative solutions that are often judged for prizes. while hacking can have negative connotations when it comes to security vulnerabilities, typically for hackathon events hacking refers to modifying original lines of code or devices with the intent of creating a workable prototype or product. events may have a specific theme (use of a particular dataset or project based on a designated platform) or may be open-ended with challenges focused on innovation or social good. while hackathons have been a staple in software and hardware design for decades, the first hackathons with a library focus were sponsored by vendors, focused on topics such as accessibility and adaptive technology for their content and platforms.2 other industry hackathons focused on re-envisioning the role of the book in 2013 and 2014.3 as hackathons became more popular at colleges and universities, library participation evolved from content provider to event host. these partnerships were beneficial to libraries interested in shifting the perception of libraries from books to newer areas of expertise around data and information literacy. however, many libraries realized that by partnering in planning the events greater possibilities existed to educate participants about library content and staff expertise. some examples include working with public library communities to highlight text as data, having academic subject librarians work with departmental faculty to embed events within curriculum and assignments, and for both academic and public libraries to promote library-produced and publicly available datasets.4 information technology and libraries december 2021 hackathons and libraries |longmeier 2 there are many roles that libraries can take in these events. 
libraries can act as event hosts where they provide the space at a cost or for free.5 in other cases, library staff become collaborators and in addition to space may assist with planning logistics, judging, building partnerships, and have some staff present at the events.6 in public libraries this often includes building relationships with the city or specific segments of the community based on the theme of the event. on college campuses, it may be a partnership with a specific disciplines or campus it or an outside sponsor. in this way, the libraries are building and sustaining the event due to aligned priorities with the other partners. another option would be for the library to be the primary sponsor, where the library may provide prizes, the theme for the hackathon, as well as many of the items listed above.7 however, instead of specific categories, it should be viewed as a continuum of partnership and the amount of involvement with the event should align with the library’s priorities of what it hopes to accomplish through the event. how involved in event planning specific libraries want to be may depend on the depth of the existing partnerships as well as how many resources the library wants to commit to the event. libraries have always existed as curators and distributors of knowledge. some libraries are using hackathons to advance both their image and their practices. libraries are evolving into new roles and have grown to support more creative endeavors, such as the maker movement. this shift of libraries from book-provider to social facilitator and information co-creator aligns with hackathon events. the physical spaces themselves are ideal to support public outreach events and libraries are already providing makerspaces or similar services that would overlap with a hackathon audience.8 additionally, the spaces afforded by libraries allow flexibility and creativity to flourish, ideas to be exchanged, and different disciplines to mingle and co-produce. library staff focused on software development may have projects that would benefit from outside perspectives as well. in recent years libraries have become stewards of digital collections that can be used and reused in innovative ways. many libraries have chosen wikipedia edit-a-thons as a means of engaging with the public and enhancing access to materials.9 similarly, the collections-as-data movement is blossoming and allowing galleries, libraries, archives, and museum (glam) institutions to rethink the possible ways of interacting with collections. many public libraries are partnering with local or regional governments to build awareness of data sources and build bridges with the community around how they would like to interact with the data.10 additionally, as data science continues to grow in importance in both public and academic libraries, data fluency, data cleaning, and data visualization could be themes for a hackathon or data-thon.11 for those unfamiliar with these events, table 1 provides some generalized definitions created by the author of the different types of events and their intended purpose. for some organizations, there are ways to support these events that consume fewer resources or require less technical knowledge, such as an edit-a-thons or code jams. information technology and libraries december 2021 hackathons and libraries |longmeier 3 table 1. 
defining common hackathon and hackathon-like events, purpose, and typical size of events
type of event | definition | purpose | size of event
hackathon | a team-based, sprint-like event focused on hardware or software that brings together programmers, graphic designers, interface designers, project managers, or domain experts; can be open-ended idea generation or built around a specific provided theme | build a working prototype, typically software | up to 1,000 participants, usually determined by space available
idea fest | business pitch competition where individuals or teams pitch a solution or new company (startup) idea to a panel of judges | deliver an elevator pitch for an idea, possibly to secure funding | <100
coding contest or code jam | an individual or team competition to work through algorithmic puzzles or on specific code provided | learning to code or solving challenges through coding; may produce a pitch at the end rather than a product | 20–50
edit-a-thon | an event where users improve content in a specific online community; can focus on a theme (art, country, city) or type of material (map) | improving information in online communities such as wikipedia, openstreetmap, or localwiki | 20–100
datathon | a data-science-focused event where participants are given a dataset and a limited amount of time to test, build, and explore solutions | usually a visualization or software development around a particular dataset | 50–100
makeathon | hardware-focused hackathon | build a working prototype of hardware | up to 300 participants
methods
to find articles in the library and information science literature related to hackathons and libraries, the author searched the association for computing machinery (acm) digital library, scopus, library literature and information science, and library and information science and technology abstracts (lista) databases. in scopus and the acm digital library, the most successful searches included the following: (hackathon* or makeathon*) and library; in library literature and information science and lista, the most successful searches included: hackathon or makeathon or "coding contest." the author also searched google scholar in an attempt to locate other studies or reports, some of which came from institutional repositories. while this search strategy was not meant to be exhaustive, it uncovered many articles about hackathons and libraries; others were found by chaining citations in the articles' reference lists. based on search locales, international articles were found, but only those available in english were included, which means that articles from asia, africa, and the global south may have been inadvertently overlooked. only two of the articles found in the search results were not held in library locations, did not use library/archival materials, or were not an outreach event where library staff were integral in planning (these were discarded).
findings
the author grouped the literature into two categories: library as place and library as source. in the realm of library as place, the literature consisted of reports on hackathons where the library was the host location for the event, those where the hackathon was an outreach event, and those where the hackathon was an extension of the library's teaching and education mission. most of these articles were case studies and often shared tips for other libraries to consider when hosting a hackathon in library spaces.
the second category, those that use library as source, focused on highlighting library spaces or services, workflows, or collections as the theme of the events. additionally, there were a few articles in the second category that discussed how to prepare or clean library data or library sources before the event to ensure that participants were able to use the materials during the time-bound event. in some cases where the source materials were from the libraries, the event also occurred in the library; thus, some articles fit into both categories and are highlighted in both sections. results: library as place the following summaries of hackathons and libraries as places for events will be grouped into two subgenres: library spaces and outreach events. libraries, both public and academic, are ideal locations for hosting large, technology-driven events given the usual amenities of ample parking, ubiquitous wi-fi, adequate outlets, and at times already having 24-hour spaces built into their infrastructure. more and more libraries are offering generous food and drink policies, a benefit as sustenance is a mainstay at these multiday events. additionally, libraries already host a number of outreach events and serve as a community information hub. using libraries as event hosts for hackathons a number of articles detail the use of library spaces to host hackathon events.12 the university of michigan library, a local hackerspace (all hands active), and the ann arbor district library teamed up to host a hackathon focused on oculus rift.13 this event grew out of a larger partnership with the community and sought to mix teams to include participants from all three areas. the 2018 article by demeter et al. highlights lessons learned from florida state university library and many of the planning steps involved when hosting large outreach events in library spaces.14 while the library initially hosted a 36-hour event, hackfsu, as a favor to the provost in the first year, they continue to host the event, providing library staff as mentors and logistical support. after the first year they started charging the student organization for use of the space and direct staffing costs for the hours beyond normal operating hours. while focused primarily on providing a central campus space, the library also sees it as a way to highlight the teaching and learning role of the library. similarly, nandi and mandernach detail the steps involved in planning information technology and libraries december 2021 hackathons and libraries |longmeier 5 hackathon events and some benefits of choosing the library as a location for the event.15 at ohio state, hackathon events in 2014 and 2015 were held in the library due to twenty-four-hour spaces, interest by the libraries in supporting innovative endeavors on campus, and a participant size (100–200 attendees) that could be accommodated in the space. other events chose academic libraries as locations for hackathons due to their central location on campus.16 an initial summary of library hackathons was captured by r. c. davis who detailed that libraries may be motivated to host such events as they align with library principles of “community, innovation, and outreach.”17 she points out that libraries are ideal locations because of small modular workspaces paired with a large space for final presentations. additionally, adequate and sufficiently strong wi-fi or hardwired connections, a multitude of power outlets, and 24-hour spaces are appealing for these kinds of events. 
event planners should know that the necessities include free food and multidisciplinary involvement. davis details ways to plan smaller events, such as code days or edit-a-thons, if staffing does not allow for a large hackathon event. in all cases, the libraries serve a purpose to either campus or community as the location and sometimes also provide staff for the events. hackathons as library outreach hackathon events are a great way to reach out to the community and provide a fresh look into libraries as purveyors of information focused on more than books. at the 2014 computers in libraries conference, chief library officer mary lee kennedy delivered a keynote sharing stories of the new york public libraries experiences hosting wikipedia editathons and other hackathons at various branches since 2014.18 the goals for these outreach events were to highlight strategic priorities around making knowledge accessible, re-examine the library purpose, and spark connections. early library hackathon events focused on outreach included topics of accessibility or designing library mobile apps.19 more recent events have focused on outreach but with an eye toward sharing content as part of the coding contest.20 even library associations have hosted preconference hacking events to highlight what libraries are doing to foster innovation.21 the future libraries product forge, a four-day event, was hosted in collaboration with the scottish library and information council and delivered by product forge, a company focused on running hackathons that tackle challenging social issues. the 2016 event focused specifically on public libraries in scotland and seven teams, comprised mainly of students from a local university, worked with public library staff and users as well as regional experts in technology, design, and business.22 the goals of the event were to raise awareness of digital innovation with library services, generate enthusiasm for approaches to digital service design, and codesign new services around digital ventures. participants created novel products including digital signage, a game for young readers, a tool for collecting user stories about library services, and an app to reserve specific library spaces. another common focus for library hackathon outreach events is the theme of data and data literacy. in july 2016, the los angeles public library hosted the civic information lab’s immigration hackathon.23 this outreach event gathered 100 participants to address local issues around immigration. the library, motivated by establishing itself as a “welcoming, trusting environment,” wanted to be a “prominent destination of immigrant and life-enrichment information and programs and services.”24 newcastle libraries ran two-day-long events focused on promoting data they released under an open license as part of the commons are forever project.25 they used both events to educate users about tools such as github, a gif-making session with historical photographs, and data visualization tools. similarly, toronto public library hosted a series of open data hackathons to highlight the role of the libraries in civic issue information technology and libraries december 2021 hackathons and libraries |longmeier 6 discourse, data literacy, and data education.26 their events combined the hackathon with other panel presentations and resources focused on mentorship and connection-building in the technology sector. 
the library also used the event to promote their open data policy, build awareness around the data provided by the library for the community, and highlight their role in facilitating conversations around civic issues through data literacy and data education. edmonton public library hosted its first hackathon in 2014 for international open data day. one of the main drivers was to build the relationship with their local government.27 they built their event around the tenets laid out in the open data hackathon how-to guide and by a blog post about the city of vancouver’s 2013 international open data day hackathon.28 they took a structured approach to documenting expectations of both partners around areas such as resources, staffing, and costs, which served as a roadmap for the hackathon and the partnership. the library provided the event space, coffee and pizza, an emcee, tech help and wi-fi, door prizes and “best idea” prize, and promotional material. the city recruited participants and provided an orientation, promotional banners, and a keynote. the event led to a deeper partnership with the city and additional hacking events. in these ways, the hackathon served a greater purpose of community building and awareness around data, the role the library plays in interpreting data, and how the libraries serve as a resource hub to the community. events supporting library teaching mission at academic institutions, the events often focus on outreach to their own campus community. in 2015, adelphi university hosted their first hackathon and the libraries funded the event themselves rather than seeking outside funding.29 the article details the considerable lessons learned through the process as well as a step-by-step guide to planning a smaller event. similarly, york university science and engineering library hosted hackfests in the library and embedded an event as part of an introductory computer science course.30 shujah highlighted some of the benefits to the library hosting a hackathon included: establishing libraries as part of the research landscape, providing a constructive space for innovation and innate collaborative environment, highlighting the commitment to openness and democratizing knowledge, and acknowledging the library’s role in boosting critical thinking and information literacy concepts. shin, vela, and evans highlight a community hackathon at washington state university college of medicine where a group of librarians from multiple institutions staffed a research station throughout the event.31 while the station was underutilized by participants, as only seven questions were asked during the event, the libraries deemed their participation a success as it worked as an outreach and promotion mechanism for both library services and expertise. at some public libraries, the focus of the hackathon is on education and teaching basic coding skills. whether called a coding contest, hackathon, or tech sandbox, there are opportunities for programming with a focus on learning and skill-building and fun.32 santa clara county library district used a peer-to-peer approach for mentoring and hosted a hackathon in 2015 for middle and high-school students.33 the library staff facilitated the event planning and recruited judges from the community, but the bulk of the event was coordinated by the students. 
considerations when hosting events in library spaces a couple of substantive reports provide overarching recommendations and considerations for hosting hackathons in library spaces, including planning checklists, tips on getting funding, building partnerships with local community officials, and thinking through the event systematically. recently, the digital public library of america (dpla) created a hackathon information technology and libraries december 2021 hackathons and libraries |longmeier 7 planning guide that details a number of logistical issues to address during the planning phases, both preand post-event.34 this report highlights specific considerations for galleries, libraries, archives and museums that are looking to host a hackathon. after hosting a successful hackathon, librarians at new york university created a libguide called hack your library which is a planning guide for other libraries considering hosting a similar event.35 the engage respond innovate final report: the value of hackathons in public libraries was put together following an event the carnegie uk trust sponsored.36 this guide highlights some of the challenges present with hackathons, including: intellectual property of the creations, prizes, participant diversity, and complications that arise from either approach of using specific themes or open-ended challenges. it also highlights some of the main reasons a library should consider hackathons and other coding events, including ways to promote new roles of libraries within communities, promote specific collections, capitalize on community expertise, gain insight about users, help users build new skills and improve digital literacy, and develop tools that increase access to materials. finally, the report points out that hosting an event will not be the only solution for a library’s innovation problem. yet if the library is clear on why it wants to hold a hackathon, being planful about expectations and outcomes the library is trying to achieve will increase the chances for success. results: library as source the other category of articles about hackathons and libraries focuses on the library as the source for the challenge or theme of the hackathon. the following summaries highlight articles include those where the libraries provided the challenges around library spaces or services, library datasets, workflows or collections as the theme for the hackathon. this section also details steps involved in cleaning data for use/re-use in time-bound events. using hackathons to improve library services and spaces a few articles discuss libraries that proposed hackathon themes around improving library services. a 2016 article describes how adelphi university libraries hosted a hackathon and provided the theme of developing library mobile apps and web software applications.37 the winning student team created an app for library group study meetups. similarly, the librarians from university of illinois tried three approaches for library app development: a student competition, a project in a computer science course, and a coding camp. with the adventure code camp, students co-designed with librarians over the course of two days.38 they advertised to specific departments and courses and ten students were selected with six ultimately participating in the two-day coding camp. students were sent a package of library data, available apis, and brief tutorials on coding languages that may be useful. mentors and coaches were available throughout the coding camp. 
the authors provided tips for others trying to replicate their approach as well as insights from the students about interest in developing apps that include library data but that don’t solely focus on library services. the following year the librarians hosted a coding contest focused specifically on app development related to library services and spaces.39 the library sponsored the event and served as both a traditional client and partner in the design process. ultimately six teams with a total of 26 individuals participated and each app was “required to address student needs for discovery of and access to information about library services, collections, and/or facilities” but not duplicate existing library mobile apps. they based their approach on massachusetts institute of technology’s entrepreneurship competition. through this process, co-ownership was preferred and many teams set up a licensing agreement as part of the competition to handle intellectual property for the software. students had two weeks to complete the apps and were judged by both library and campus it administration. this article details what information technology and libraries december 2021 hackathons and libraries |longmeier 8 they learned through the process given the amount of attrition from selection of teams to final product presentations. the new york university school of engineering worked with the libraries and used a hackathon theme of noise issues to coincide with the renovation of the library.40 the libraries created a libguide to provide structured information about the event itself (https://guides.nyu.edu/hackdibner). they used the event to market the new maker space and held workshops there leading up to the event. in the inaugural year they held the event over the course of two semesters and saw a lot of attrition due to the event length. in the second year, following focus groups with participants, they designed a library hackathon with four goals: 1) appeal to a large base of the student population, 2) create a triangle of engagement between the student and the library, the library and the faculty, and the faculty and the students, 3) provide an adaptable model to other libraries, and 4) highlight the development of student information literacy skills.41 the second year’s approach required more work by the participants due to pitching an initial concept, providing a written proposal, and giving a final presentation. library staff and guest speakers offered workshops to help students hone their skills. the planners evaluated the event through surveys and student focus groups. overall the students applied what they learned about information literacy and were highly engaged with the codesign approach to library service improvements. similarly, mcgowan highlights two hackathons at purdue that focused on inclusive healthcare and how the libraries applied design thinking processes as part of the events.42 the librarian wanted to encourage health sciences students to examine health data challenges. to examine this issue, she applied the blended librarians adapted addie model (blaam) as a guide to developing a service to prepare students to participate in a hackathon. a number of pre-event training sessions were held in the libraries and covered topics such as research data management, openrefine and data cleaning, gephi for data visualization, and javascript. 
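the workshop topics above point to the practical data-preparation work that sits behind these events. as a purely illustrative sketch (not drawn from the purdue events or any cited library, and using an invented file name and invented column names), a library offering a catalog export as a hackathon dataset might tidy it along these lines with python and pandas:

```python
import pandas as pd

# hypothetical csv export of catalog records offered as a hackathon dataset
records = pd.read_csv("catalog_export.csv", dtype=str)

# normalize column names and trim stray whitespace in every text field
records.columns = [c.strip().lower().replace(" ", "_") for c in records.columns]
records = records.apply(lambda col: col.str.strip())

# drop exact duplicates and rows missing the fields teams will rely on
records = records.drop_duplicates()
records = records.dropna(subset=["title", "publication_year"])

# keep publication years numeric and plausible so visualizations are not skewed by typos
records["publication_year"] = pd.to_numeric(records["publication_year"], errors="coerce")
records = records[records["publication_year"].between(1450, 2021)]

records.to_csv("catalog_export_clean.csv", index=False)
```

the particular steps matter less than the outcome: teams receive a dataset with consistent fields and documented assumptions, so the time-bound event can be spent building rather than cleaning.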
while this initial approach was in tandem with the hackathon events, students reported that they needed assistance in finding and cleaning datasets for use. in this case, developing library services to prepare for hackathon events ended up out of alignment with both the library’s mission and the participants’ expectations. using library materials for hackathon themes several events have focused on library as source where the library’s materials or processes serve as the theme of the hackathon, particularly around digital humanities (dh) topics.43 in september 2016, over 100 participants worked with materials from the special collections of hamburg state and university library, a space that serves both the university and the public.44 it followed the process established by coding da vinci (https://codingdavinci.de/en), an event that occurred in 2014 and 2015. the event at hamburg state and university library had a kick-off day for sharing available datasets, brainstorming projects using library materials, and team building opportunities. the event had a second day of programming and then teams had six weeks to complete their projects. some exemplary products included a sticker printer that would print old photographs, a quiz app based on engraving plates, and using a social media platform to bring the engravings to the public. the event was successful and resulted in opening additional data from the institution. several examples focus on highlighting digital humanities approaches as part of the events. in 2016, four teams from across european institutions participated over five days in kibbutz lotan in the arava region of israel to develop linguistic tools for tibetan buddhist studies with the goal of information technology and libraries december 2021 hackathons and libraries |longmeier 9 revealing their collections to the public.45 the planning team recruited international scholars to participate in prestructured teams (teams consisted of computer scientists as well as a tibetan scholar) in israel. although it was less of a traditional hackathon, this event being more akin to an event/coding contest around a specific task, it highlighted tools and methods for understanding literary texts. the format of the event for encouraging interdisciplinary efforts in the computational humanities was deemed successful and it was repeated the next year on manuscripts and computer-vision approaches. recently the university of waterloo detailed a series of datathons using archives unleashed to engage the community in an open-source digital humanities project.46 the goal of the events was to engage dh practitioners with the web archive analysis tools and attempt to build a web archiving analysis community. in 2016, the american museum of natural history in new york hosted their third annual hackathon event, hack the stacks, with more than 100 participants.47 the event focused on creating innovative solutions for libraries or archives and to “animate, organize, and enable greater access to the increasing body of digitized content.” ten tasks were available for participants to work on and ranged from a unified search interface, reassembling fragments of scientific notebooks, and creating timelines of archival photos of the museum itself. in addition to planning the tasks, the library staff ensured that the databases and applications could handle the additional traffic. a multitude of platforms were provided (omeka, dspace, the catalog, apis, archivespace, etc) for hackers to use. 
all prototypes that were developed were open source and deposited on github at “hack the stacks.”48 some cultural institutions have used hackathons as a means of outreach and publicity and then have showcased the outputs at the museums. vhacks, a hackathon at the vatican, was held in 2018 and gathered 24 teams from 30 countries for a 36-hour event.49 the three themes for the event focused on social inclusion, interfaith dialogue, and migrants and refugees. a winner was announced for each thematic area and sponsors enticed participants to continue working on projects by having a venture capitalist pitch a few weeks after the event. another program, museomix, concentrates on a three-day rapid prototyping event where outputs are highlighted in the museum or cultural institution.50 this event has happened annually in november since 2011 and the goal is to create interdisciplinary networks and encourage innovation and community partnership. improving library workflows and processes other hackathons have focused on library staff working on library processes themselves. bergland, davis, and traill detail a two-day event, catdoc hack doc, hosted by the university of minnesota data management and access department focused on increasing documentation by library staff.51 this article details logistics of preparing for the event as well and a summary of the work completed. they based their approach on the islandora collaboration group’s template on how to run a hack/doc.52 they were pleased with the workflow overall, refined some of the steps, and held it again for library staff the following year. similarly, dunsire highlights using a hackathon format to encourage adoption of a cataloging approach of research description and access (rda) through a “jane-athon.”53 events occurred at library conferences or in conjunction with other hackathon events, such as the thing-athon at harvard, with the intention of promoting the use of rda, to help users understand the utility of rda, and to spark discussions. this approach proved useful in uncovering some limitations with rda as well as valuable feedback that could be incorporated into its ongoing development. information technology and libraries december 2021 hackathons and libraries |longmeier 10 considerations when using libraries as source if libraries are interested in hosting a hackathon where the library plays a more central role, there are several options of ready-to-use library and museum data that could allow the host to also serve as the content provider. the digital public library of america released a hackathon guide, glam hack-in-a-box: a short guide for helping you organize a glam hackathon with several sources at the end for finding data related to libraries.54 the university of glasgow began a project called the global history hackathons that seeks to improve access and excitement around global history research.55 additionally, candela et al. 
detail the new array of sources for sharing glam data for reuse in multiple ways, including using data in hackathon projects.56 planners could look to the collections-as-data conversations for other data sources that could be adapted for hackathon projects.57 when thinking about hackathons and cultural institutions, sustainability of projects and choice of platforms is an important consideration for planners.58 ultimately, the top priority when providing a dataset is to ensure that it is clean and enough details about the dataset are available for participants to make use of it in their designs given the time constraints of most events. discussion hackathons often have a dual purpose of educating the participants and serving as an advertisement for the sponsor for either a platform or content. participants will develop a working prototype or improving their coding abilities; sponsors, including libraries, can benefit from rapid prototyping and idea generation using either their platforms or content. while usable apps or new ideas are a welcome outcome, even if the applications are not used, the events still feed into the larger goal of marketing libraries and their data, building relationships with local communities, or drawing attention to social good. there are benefits to libraries in either hosting or collaborating on the events. in both areas, those of library as space and library as source, hackathons help realign user expectations of libraries. if libraries choose to become involved with hackathons or other coding or data contests, the library should be deliberate in its goals and intended outcomes as those will help shape both the event and its planning. libraries are naturally aligned with teaching and learning, are already offering co-curricular programming, and typically serve as physical and communication hubs for campus. libraries already prioritize outreach and engagement with constituents both on campuses and in the community. therefore, when programs align with library priorities of data literacy, data fluency, and information evaluation, it is a natural fit to propose involvement in hosting hackathons. many libraries are able to customize their spaces, services, and vendor interfaces, which is a benefit when thinking about having libraries as a theme for an event. other benefits exist for the hackathon event planners when partnering with a library. hackathon planners should consider reaching out to libraries as they already serve as a cross-disciplinary event spaces, host many other outreach events, and are often connected to other campus and community stakeholders and communication outlets. since students from all disciplines and colleges already use the library spaces on college campuses, they are an ideal location for fostering collaborations from different colleges and majors. public libraries function as community gathering spots as well. as libraries consider hosting events, several articles provide overarching tips for planning and hosting hackathons and other time-bound events.59 table 2 provides an overview of articles and the areas of coverage for planning topics. information technology and libraries december 2021 hackathons and libraries |longmeier 11 table 2. 
selected articles for tips on planning hackathon events based on common article theme areas article author location details sample agenda + timelines power and computing mentors/ judging further readings carruthers (2014) x x x x nelson & kashyap (2014) x x x x x jansen-dings, dijk, van westen (2017) x x x bogdanov & isaacmendard (2016) x x x nandi & mandernach (2016) x x x grant (2017) x x x x x as library data becomes more open and reusable, hackathons will be a way to highlight data availability, promote its use and reuse, and reach out to the community. the issues present when considering library collections as potential hackathon themes are that libraries will need to ensure the data are cleaned and contain sufficient metadata so that the data are ready to use. additionally, if there are programming language restrictions for ongoing maintenance by the library after the event, those should be specified when advertising the event. ultimately, the libraries will likely not control the intellectual property (ip) of the tool or visualization developed, but several libraries have specified the ultimate ip as part of the event details either as open source or co-owned.60 often the goal of the event is the promotion of specific materials or building awareness of a collection rather than any biproduct created during the event. however, it is important for the library to be clear about their intent when advertising to participants. the collections-as-data movement will continue to evolve and there will be a multitude of library resources that could be mined for use at hackathons or other similar events. while libraries provide an ideal location and have access to data that can be used for an event, they can also leverage their wealth of experts. library staff can serve as judges, mentors, and connectors to the wider campus or community. events could highlight specific expertise when hackathons focus on particular approaches (data visualization), processes (metadata management or documentation), or codesign of services (physical spaces). table 3 provides examples of hackathon events from a variety of library contexts. hackathons are a great way for libraries to serve as a connector to others on campus or in their communities. if libraries are not interested or able to host an event themselves, library staff can act as mentors or event judges. at smaller schools, library staff can partner with other campus units to plan a hackathon; similarly, smaller public libraries could work with community organizations to host events. at a smaller scale if staffing is a concern or full hackathons are unrealistic, a coding contest or datathon, both of which typically have a shorter duration, might be an option. edit-a-thons are even easier to host as they require only an introduction to the editing process, ample computer space (or laptop hook-ups), and a small food budget. some edit-a-thon events happen in a single afternoon. information technology and libraries december 2021 hackathons and libraries |longmeier 12 table 3. 
selected hackathon event summaries from various library contexts based on themes and products of the event
article author | type of library | size of event | time for event | purpose of event | role of the library | output
carruthers (2014) | public + city | 29 participants | 1 day | highlight open data from the libraries | event space, coffee + pizza, emcee, some prizes, assessment | building partnerships with the city, getting dataset requests
ward, hahn, mestre (2015) | academic | 6 teams; 25 participants | 2 weeks | develop apps using library data | event sponsor, mentor | app development for library using library data
mititelu & grosu (2016) | academic | 100 participants | 48 hours | bring together tech students | event space | app development for sponsors
nandi & mandernach (2016) | academic | 200 participants | 36 hours | bring together tech students | event space, planning logistics, judges | various apps, not library related
baione (2017) | private museum | 100 participants | 2 days | animate, organize, and enable greater access to digitized content from the library | create challenges, event space, judges | open source apps for glam institutions
theise (2017) | academic + public | 100 participants | 2 days + 6-week sprint | cultural hackathon to highlight library data and resources | event space, challenges, datasets for hacking | highlighted data available for use, created apps focused on library materials
almogi et al. (2019) | academic | 23 participants | 5 days | develop linguistic tools for buddhist studies | provided cleaned datasets for manipulation | linguistic tools for buddhist studies
one area for iteration around these events relates to timing. while most hackathons last 24–36 hours, some are run over the course of a one- or two-month period where coding happens remotely, with a few scheduled check-ins with mentors before judging and presentations. this notion of a remote event may have more appeal for collections-as-data-themed events, as experts are more likely to be available for keynotes or mentoring. if the process instead of the product is the focus of the event, then providing a flexible structure may be more appealing to participants. if a library has more limited resources or capacity, stretching the event out over a longer period would allow for sustained interactions. however, libraries should be aware that the longer the event period, the greater the attrition of participants. an area for future research is assessment of library participation in events. a couple of articles highlighted the value the libraries found in the events, but it is unclear whether the participants also gained value from the libraries.61 typically, post-event surveys have focused on the participant experience or the overall event space, rather than on whether the event affected participants' view of the libraries, which would be another area of interest for future research.62
conclusion
in the realm of hackathons and libraries, hackathon themes were originally a way that vendors could highlight new content or improve interfaces. libraries followed this trend and used events to reach out to constituents, make connections with their communities, and highlight evolving library services. with the growth of flexible spaces, ample technology support, and more relaxed food policies, libraries have become ideal event locations.
as the collections-as-data movement evolves, there will be more opportunities to develop services related to these data and other library data that would lend themselves easily as themes for hackathons, edit-a-thons, or datathons. libraries thinking about hosting events will need to weigh the amount of time and resources they want to invest against the intended goals of hosting an event. planning is essential whether the library is the event host, a collaborator, or a sponsor of a hackathon. for those libraries that are unable to host a full hackathon, smaller events, such as a datathon or edit-a-thon, are possibilities to provide support without the same time and resource commitment. given the growing popularity of hackathons and other coding contests, they may be a catch-all for solving several library issues simultaneously: updating the library's image as more than book-centric, supporting the collections-as-data movement, and engaging community partners in new ways.

acknowledgements
thank you to jody condit fagan for providing valuable suggestions on a draft of this paper and to the two anonymous reviewers whose feedback improved the quality of this manuscript.

endnotes
1 james k. elmborg, "libraries as the spaces between us: recognizing and valuing the third space," reference and user services quarterly 50, no. 4 (2011): 338–50.
2 "a brief open source timeline: roots of the movement," online searcher 39, no. 5 (2015): 44–45; patrick timony, "accessibility and the maker movement: a case study of the adaptive technology program at district of columbia public library," in accessibility for persons with disabilities and the inclusive future of libraries, advances in librarianship, vol. 40 (emerald group publishing limited, 2015), 51–58; kurt schiller, "elsevier challenges library community," information today 28, no. 7 (july 2011): 10; eric lease morgan, "worldcat hackathon," infomotions mini-musings (blog), last modified november 9, 2008, http://infomotions.com/blog/2008/11/worldcat-hackathon/; margaret heller, "creating quick solutions and having fun: the joy of hackathons," acrl techconnect (blog), last modified july 23, 2012, http://acrl.ala.org/techconnect/post/creating-quick-solutions-andhaving-fun-the-joy-of-hackathons.
3 clemens neudecker, "working together to improve text digitization techniques: 2nd succeed hackathon at the university of alicante," impact centre of competence in digitisation blog, last updated april 22, 2014, https://www.digitisation.eu/succeed-2nd-hackathon/; porter anderson, "futurebook hack," bookseller no. 5628 (june 20, 2014): 20–21; sarah shaffi, "inaugural hack crowns its diamond project," bookseller no. 5628 (june 20, 2014): 18–19.
4 rose sliger krause, james rosenzweig, and paul victor jr., "out of the vault: developing a wikipedia edit-a-thon to enhance public programming for university archives and special collections," journal of western archives 8, no. 1 (2017): 3; stanislav bogdanov and rachel isaac-menard, "hack the library: organizing aldelphi [sic] university libraries' first hackathon," college and research libraries news 77, no. 4 (2016): 180–83; matt enis, "civic data partnerships," library journal 145, no. 1 (2020): 26–28; alex carruthers, "open data day hackathon 2014 at edmonton public library," partnership: the canadian journal of library & information practice & research 9, no. 2 (2014): 1–13, https://doi.org/10.21083/partnership.v9i2.3121; sarah shujah, "organizing and embedding a library hackfest into a 1st year course," information outlook 18, no. 5 (2014): 32–48; lindsay anderberg, matthew frenkel, and mikolaj wilk, "project shhh! a library design contest for engineering students," in american society for engineering education 2018 annual conference proceedings (2018): paper id 21058, https://cms.jee.org/30900.
5 michelle demeter et al., "send in the crowds: planning and benefiting from large-scale academic library events," marketing libraries journal 2, no. 1 (2018): 86–95, https://bearworks.missouristate.edu/cgi/viewcontent.cgi?article=1089&context=articles-lib.
6 jamie lausch vander broek and emily puckett rodgers, "better together: responsive community programming at the um library," journal of library administration 55, no. 2 (2015): 131–41; arnab nandi and meris mandernach, "hackathons as an informal learning platform," in sigcse 2016 – proceedings of the 47th acm technical symposium on computing science education (february 2016): 346–51, https://doi.org/10.1145/2839509.2844590; lindsay anderberg, matthew frenkel, and mikolaj wilk, "hack your library: engage students in information literacy through a technology-themed competition," in american society for engineering education 2019 annual conference proceedings (2019): paper id 26221, https://peer.asee.org/32883; anna grant, hackathons: a practical guide, insights from the future libraries project forge hackathon (carnegieuk trust, 2017), https://www.carnegieuktrust.org.uk/publications/hackathons-practical-guide/; carruthers, "open data day hackathon 2014 at edmonton public library"; chad nelson and nabil kashyap, glam hack-in-a-box: a short guide for helping you organize a glam hackathon (digital public library of america, summer 2014), http://dpla.wpengine.com/wpcontent/uploads/2018/01/dpla_hackathonguide_forcommunityreps_9-4-14-1.pdf.
7 david ward, james hahn, and lori mestre, "adventure code camp: library mobile design in the backcountry," information technology and libraries 33, no. 3 (2014): 45–52; david ward, james hahn, and lori mestre, "designing mobile technology to enhance library space use: findings from an undergraduate student competition," journal of learning spaces 4, no. 1 (2015): 30–40.
8 ann marie l. davis, "current trends and goals in the development of makerspaces at new england college and research libraries," information technology and libraries 37, no. 2 (2018): 94–117, https://doi.org/10.6017/ital.v37i2.9825; mark bieraugel and stern neill, "ascending bloom's pyramid: fostering student creativity and innovation in academic library spaces," college & research libraries 78, no. 1 (2017): 35–52; elyssa kroski, the makerspace librarian's sourcebook (chicago: ala editions, 2017); angela pashia, "empty bowls in the library: makerspaces meet service," college & research libraries news 76, no. 2 (2015): 79–82; h. michele moorefield-lang, "makers in the library: case studies of 3d printers and maker spaces in library settings," library hi tech 32, no. 4 (2014): 583–93; adetoun a. oyelude, "virtual reality (vr) and augmented reality (ar) in libraries and museums," library hi tech news 35, no. 5 (2018): 1–4.
9 krause, rosenzweig, and victor jr., "out of the vault"; ed yong, "edit-a-thon gets women scientists into wikipedia," nature news (october 22, 2012), https://doi.org/10.1038/nature.2012.11636; angela l. pratesi et al., "rod library art+feminism wikipedia edit-a-thon," community engagement celebration day (2018): 10, https://scholarworks.uni.edu/communityday/2018/all/10; maitrayee ghosh, "hack the library! a first timer's look at the 29th computers in libraries conference in washington, dc," library hi tech news 31, no. 5 (2014): 1–4, https://doi.org/10.1108/lhtn-05-20140031.
10 carruthers, "open data day hackathon 2014 at edmonton public library"; bob warburton, "civic center," library journal 141, no. 15 (2016): 38.
11 matt burton et al., shifting to data savvy: the future of data science in libraries (project report, university of pittsburgh, pittsburgh, pa, 2018): 1–24, https://d-scholarship.pitt.edu/33891/.
12 vander broek and rodgers, "better together"; nandi and mandernach, "hackathons as an informal learning platform"; robin camille davis, "hackathons for libraries and librarians," behavioral & social sciences librarian 35, no. 2 (2016): 87–91; bogdanov and isaac-menard, "hack the library"; ward, hahn, and mestre, "adventure code camp"; ward, hahn, and mestre, "designing mobile technology to enhance library space use"; demeter et al., "send in the crowds"; carruthers, "open data day hackathon 2014 at edmonton public library."
13 vander broek and rodgers, "better together."
14 demeter et al., "send in the crowds."
15 nandi and mandernach, "hackathons as an informal learning platform."
16 eduard mititelu and vlad-alexandru grosu, "hackathon event at the university politehnica of bucharest," international journal of information security & cybercrime 6, no. 1 (2017): 97–98; orna almogi et al., "a hackathon for classical tibetan," journal of data mining and digital humanities, episciences.org, special issue on computer-aided processing of intertextuality in ancient languages, hal-01371751v3 (2019): 1–10, https://jdmdh.episciences.org/5058/pdf.
17 davis, "hackathons for libraries and librarians."
18 ghosh, "hack the library!"
19 timony, "accessibility and the maker movement"; ward, hahn, and mestre, "adventure code camp."
20 gérald estadieu and carlos sena caires, "hacking: toward a creative methodology for cultural institutions" (presented at the viii lisbon summer school for the study of culture "cuber+cipher+culture", september 2017); andrea valdez, "the vatican hosts a hackathon," wired magazine, last updated march 7, 2018, https://www.wired.com/story/vaticanhackathon-2018/; leonardo moura de araujo, "hacking cultural heritage: the hackathon as a method for heritage interpretation" (phd diss., university of bremen, 2018): 181–231, 235–38.
21 thomas finley, "innovation lab: a conference highlight," texas library journal 94, no. 2 (summer 2018): 61–62.
22 grant, hackathons: a practical guide.
23 warburton, "civic center."
24 warburton, "civic center."
25 aude charillon and luke burton, "engaging citizens with data that belongs to them," cilip update magazine (november 2016).
26 enis, "civic data partnerships."
27 carruthers, "open data day hackathon 2014 at edmonton public library."
28 kevin mcarthur, herb lainchbury, and donna horn, "open data hackathon how to guide v. 1.0," october 2012, https://docs.google.com/document/d/1fbuisdtiibaz9u2tr7sgv6gddlov_ahbafjqhxsknb0/edit?pli=1; david eaves, "open data day 2013 in vancouver," eaves.ca (blog), march 11, 2013, https://eaves.ca/2013/03/11/open-data-day-2013-in-vancouver/.
29 bogdanov and isaac-menard, "hack the library."
30 shujah, "organizing and embedding a library hackfest into a 1st year course."
31 nancy shin, kathryn vela, and kelly evans, "the research role of the librarian at a community health hackathon—a technical report," journal of medical systems 44 (2020): 36.
32 geri diorio, "programming by the book," voices of youth advocates 35, no. 4 (2012): 326–327.
33 lauren barack and matt enis, "where teens teach," school library journal (april 2016): 30.
34 nelson and kashyap, glam hack-in-a-box.
35 lindsay anderberg, matthew frenkel, and mikolaj wilk, "hack your library: a library competition toolkit," june 6, 2019, https://wp.nyu.edu/hackyourlibrary/; anderberg, frenkel, and wilk, "hack your library: engage students in information literacy through a technologythemed competition."
36 anna grant, engage. respond. innovate. the value of hackathons in public libraries (carnegieuk trust, 2020), https://www.carnegieuktrust.org.uk/publications/engage-respond-innovatethe-value-of-hackathons-in-public-libraries/.
37 bogdanov and isaac-menard, "hack the library."
38 ward, hahn, and mestre, "adventure code camp."
39 ward, hahn, and mestre, "designing mobile technology to enhance library space use."
40 anderberg, frenkel, and wilk, "project shhh!"
41 anderberg, frenkel, and wilk, "hack your library: engage students in information literacy through a technology-themed competition."
42 bethany mcgowan, "the role of the university library in creating inclusive healthcare hackathons: a case study with design-thinking processes," international federation of library associations and institutions 45, no. 3 (2019): 246–53, https://doi.org/10.1177/0340035219854214.
43 marco büchler et al., "digital humanities hackathon on text re-use 'don't leave your data problems at home!'" electronic text reuse acquisition project, event held july 27–31, 2015, http://www.etrap.eu/tutorials/2015-goettingen/; helsinki centre for digital humanities, "helsinki digital humanities hackathon 2017 #dhh17," event held may 15–19, 2017, https://www.helsinki.fi/en/helsinki-centre-for-digital-humanities/dhh-hackathon/helsinkidigital-humanities-hackathon-2017-dhh17.
44 antje theise, "open cultural data hackathon coding da vinci–bring the digital commons to life," in ifla wlic 2017 wroclaw poland, session 231—rare books and special collections (2017), http://library.ifla.org/id/eprint/1785.
45 almogi et al., "a hackathon for classical tibetan."
46 samantha fritz et al., "fostering community engagement through datathon events: the archives unleashed experience," digital humanities quarterly 15, no. 1 (2021): 1–13, http://digitalhumanities.org/dhq/vol/15/1/000536/000536.html.
47 tom baione, "hackathon & 21st-century challenges," library journal 142, no. 2 (2017): 14–17.
48 american museum of natural history, "hack the stacks," https://www.amnh.org/learnteach/adults/hackathon/hack-the-stacks, https://github.com/amnh/hackthestacks/wiki, https://github.com/hackthestacks.
49 andrea valdez, "inside the vatican's first-ever hackathon: this is the holy see of the 21st century," wired magazine, march 12, 2018, https://www.wired.com/story/inside-vhacksfirst-ever-vatican-hackathon/.
50 museomix, "concept," accessed march 29, 2021, https://www.museomix.org/en/concept/.
51 kristi bergland, kalan knudson davis, and stacie traill, "catdoc hackdoc: tools and processes for managing documentation lifecycle, workflows, and accessibility," cataloging and classification quarterly 57, no. 7–8 (2019): 463–95.
52 islandora collaboration group, "templates: how to run a hack/doc," last modified december 5, 2017, https://github.com/islandora-collaborationgroup/icg_information/tree/master/templates_how_to_run_a_hack_doc.
53 gordon dunsire, "toward an internationalization of rda management and development," italian journal of library and information science 7, no. 2 (may 2016): 308–31, http://dx.doi.org/10.4403/jlis.it-11708.
54 nelson and kashyap, glam hack-in-a-box.
55 hannah-louise clark, "global history hackathons information," accessed april 19, 2021, https://www.gla.ac.uk/schools/socialpolitical/research/economicsocialhistory/projects/global%20historyhackathons/history%20hackathons/.
56 gustavo candela et al., "reusing digital collections from glam institutions," journal of information science (august 2020): 1–10, https://doi.org/10.1177/0165551520950246.
57 thomas padilla, "on a collections as data imperative," uc santa barbara, 2017, https://escholarship.org/uc/item/9881c8sv; rachel wittmann et al., "from digital library to open datasets," information technology and libraries 38, no. 4 (2019): 49–61, https://doi.org/10.6017/ital.v38i4.11101; sandra tuppen, stephen rose, and loukia drosopoulou, "library catalogue records as a research resource: introducing 'a big data history of music,'" fontes artis musicae 63, no. 2 (2016): 67–88.
58 moura de araujo, "hacking cultural heritage."
59 grant, hackathons: a practical guide; grant, engage. respond. innovate.; joshua tauberer, "hackathon guide," accessed march 26, 2021, https://hackathon.guide/; alexander nolte et al., "how to organize a hackathon—a planning kit," arxiv preprint arxiv:2008.08025 (2020), https://arxiv.org/abs/2008.08025v2; ivonne jansen-dings, dick van dijk, and robin van westen, hacking culture: a how-to guide for hackathons in the cultural sector, waag society (2017): 1–41, https://waag.org/sites/waag/files/media/publicaties/es-hacking-culturesingle-pages-print.pdf.
60 ward, hahn, and mestre, "designing mobile technology to enhance library space use."
61 mcgowan, "the role of the university library in creating inclusive healthcare hackathons."
62 nandi and mandernach, "hackathons as an informal learning platform"; carruthers, "open data day hackathon 2014 at edmonton public library."

automatic extraction of figures from scientific publications in high-energy physics
piotr adam praczyk, javier nogueras-iso, and salvatore mele
information technology and libraries | december 2013

piotr adam praczyk (piotr.praczyk@gmail.com) is a phd student at universidad de zaragoza, spain, and research grant holder at the scientific information service of cern, geneva, switzerland. javier nogueras-iso (jnog@unizar.es) is associate professor, computer science and systems engineering department, universidad de zaragoza, spain. salvatore mele (salvatore.mele@cern.ch) is leader of the open access section at the scientific information service of cern, geneva, switzerland.

abstract
plots and figures play an important role in the process of understanding a scientific publication, providing overviews of large amounts of data or ideas that are difficult to intuitively present using only the text. state-of-the-art digital libraries, which serve as gateways to knowledge encoded in scholarly writings, do not yet take full advantage of the graphical content of documents.
enabling machines to automatically unlock the meaning of scientific illustrations would allow immense improvements in the way scientists work and the way knowledge is processed. in this paper, we present a novel solution for the initial problem of processing graphical content, obtaining figures from scholarly publications stored in pdf. our method relies on vector properties of documents and, as such, does not introduce additional errors, unlike methods based on raster image processing. emphasis has been placed on correctly processing documents in high-energy physics. the described approach distinguishes different classes of objects appearing in pdf documents and uses spatial clustering techniques to group objects into larger logical entities. many heuristics allow the rejection of incorrect figure candidates and the extraction of different types of metadata.

introduction
notwithstanding the technological advances of large-scale digital libraries and novel technologies to package, store, and exchange scientific information, scientists' communication pattern has changed little in the past few decades, if not the past few centuries. the key information of scientific articles is still packaged in the form of text and, for several scientific disciplines, in the form of figures. new semantic text-mining technologies are unlocking the information in scientific discourse, and there exist some remarkable examples of attempts to extract figures from scientific publications,1 but current attempts do not provide a sufficient level of generality to deal with figures from high energy physics (hep) and cannot be applied in a digital library like inspire, which is our main point of interest. scholarly publications in hep tend to contain highly specific types of figures (understood as any type of graphical content illustrating the text and referenced from it). in particular, they contain a high volume of plots, which are line-art images illustrating a dependency of a certain quantity on a parameter. the graphical content of scholarly publications allows much more efficient access to the most important results presented in a publication.2,3 the human brain perceives graphical content much faster than reading an equivalent block of text. presenting figures with the publication summary when displaying search results would allow more accurate assessment of the article content and in turn lead to a better use of researchers' time. enabling users to search for figures describing similar quantities or phenomena could become a very powerful tool for finding publications describing similar results. combined with additional metadata, it could provide knowledge about the evolution of certain measurements or ideas over time. these and many more applications created an incentive to research possible ways to integrate figures in inspire. inspire is a digital library for hep,4 the application field of this work.
it provides a large-scale digital library service (1 million records, fifty thousand users), which is starting to explore new mechanisms of using figures in articles of the field to index, retrieve, and present information.5,6 as a first step, direct access to graphical content before accessing the text of a publication can be provided. second, a description of graphics ("blue-band plot," "the yellow shape region") could be used in addition to metadata or full-text queries to retrieve a piece of information. finally, articles could be aggregated into clusters containing the same or similar plots, as a possible alternative automated answer to a standing issue in information management. the indispensable step to realize this vision is an automated, resilient, and high-efficiency extraction of figures from scientific publications. in this paper, we present an approach that we have developed to address this challenge. the focus has been put on developing a general method allowing the extraction of data from documents stored in portable document format (pdf). the results of the algorithm consist of metadata and raster images of a figure, but also vector graphics, which allows easier further processing. the pdf format has been chosen as the input of the algorithm because it is a de facto standard in scientific communication. in the case of hep, mathematics, and other exact sciences, the majority of publications are prepared using the latex document formatting system and later compiled into a pdf file. the electronic versions of publications from outstanding scientific journals are also provided in pdf. the internal structure of pdf files does not always reveal the location of graphics. in some cases, images are included as external entities and are easily distinguishable from the rest of a document's content, but other times they are mixed with the rest of the content. therefore, not to miss any figures, the low-level structure of a pdf had to be analyzed. the work described in this paper focuses on the area of hep. however, with minor variations, the described methods could be applicable to a different area of knowledge.

related work
over years of development of digital libraries and document processing, researchers came up with several methods of automatically extracting and processing graphics appearing in pdf documents. based on properties of the processed content, these methods can be divided into two groups. the attempts of the first category deal with pdf documents in general, not making any assumptions about the content of encoded graphics or document type. the methods from the second group are more specific to figures from scientific publications. our approach belongs to the second group. general-purpose tools include command line programs like pdf-images (http://sourceforge.net/projects/pdfimages/) or web-based applications like pdf to word (http://www.pdftoword.com/). these solutions are useful for general documents, but all suffer from the same difficulties when processing scientific publications: graphics that are recognized by such tools have to be marked as graphics inside pdf documents. this is the case with raster graphics and some other internally stored objects. in the case of scholarly documents, most graphics are constructed internally using pdf primitives and thus cannot be correctly processed by tools from the first group. moreover, general tools do not have the necessary knowledge to produce metadata describing the extracted content.
with respect to specific tools for scientific publications, it must be noted first that important scientific publishers like springer or elsevier have created services to allow access to figures present in scientific publications: the improvement of the sciverse science direct site (http://www.sciencedirect.com) for searching images in the case of elsevier7 and the springerimages service (http://www.springerimages.com/) in the case of springer.8 these services allow searches triggered from a text box, where the user can introduce a description of the required content. it is also possible to browse images by categories such as types of graphics (image, table, line art, video, etc.). the search engines are limited to searches based on figure captions. in this sense, there is little difference between the image search and text search implemented in a typical digital library. most existing works aiming at the retrieval and analysis of figures use the rasterized graphical representation of source documents as their basis. browuer et al. and kataria et al. describe a method of detecting plots by means of wavelet analysis.9,10 they focus on the extraction of data points from identified figures. in particular, they address the challenge of correctly identifying overlapping points of data in plots. this problem would not manifest itself often in the case of vector graphics, which is the scenario proposed in our extraction method. vector graphics preserve much more information about the document's content than simple values of pixel colours. in particular, vector graphics describe overlapping objects separately. raster methods are also much more prone to additional errors being introduced during the recognition/extraction phase. the methods described in this paper could be used with kataria's method for documents resulting from a digitization process.11 liu et al. present a page box-cutting algorithm for the extraction of tables from pdf documents.12 their approach is not directly applicable, but their ideas of geometrical clustering of pdf primitives are similar to the ones proposed in our work. however, our experiments with their implementation and hep publications have shown that the heuristics used in their work cannot be directly applied to hep, showing the need for an adapted approach, even in the case of tables. a different category of work, not directly related to graphics extraction but useful when designing algorithms, has been devoted to the analysis of graph use in scientific publications. the results presented by cleveland describe a more general case than hep publications.13 even if the data presented in the work came from scientific publications before 1984, included observations—for example, typical sizes of graphs—were useful with respect to general properties of figures and were taken into account when adjusting parameters of the presented algorithm. finally, there exist attempts to extract layout information from pdf documents. the knowledge of page layout is useful to distinguish independent parts of the content.
the approach of layout and content extraction presented by chao and fan is the closest to the one we propose in this paper.14 the difference lies in the fact that we are focusing on the extraction of plots and figures from scientific documents, which usually follow stricter conventions. therefore we can make more assumptions about their content and extract more precise data. for instance, our method emphasizes the role of detected captions and permits them to modify the way in which graphics are treated. we also extract portions of information that are difficult to extract using more general methods, such as captions of figures.

method
pdf files have a complex internal structure allowing them to embed various external objects and to include various types of metadata. however, the central part of every pdf file consists of a visual description of the subsequent pages. the imaging model of pdf uses a language based on a subset of the postscript language. postscript is a complete programming language containing instructions (also called operators) allowing the rendering of text and images on a virtual canvas. the canvas can correspond to a computer screen or to another, possibly virtual, device used to visualize the file. the subset of postscript used to describe the content of pdfs has been stripped of all flow-control operations (like loops and conditional executions), which makes it much simpler to interpret than the original postscript. additionally, the state of the renderer is not preserved between subsequent pages, making their interpretation independent. to avoid many technical details, which are irrelevant in this context, we will consider a pdf document as a sequence of operators (also called the content stream). every operator can trigger a modification of the graphical state of the pdf interpreter, which might be drawing a graphical primitive, rendering an external attached object, or modifying a position of the graphical pointer15 or a transformation matrix.16 the outcome of an atomic operation encoded in the content stream depends not only on the parameters of the operation, but also on the way previous operators modified the state of the interpreter. such a design makes a pdf file easy to render but not necessarily easy to analyze. figure 1 provides an overview of the proposed extraction method. at the very first stage, the document is pre-processed and operators are extracted (see "pre-processing of operators" below). later, graphical17 and textual18 operators are clustered using different criteria (see "inclusion of text parts" and "detection and matching of captions" below), and the first round of heuristics rejects regions that cannot be considered figures. in the next phase, the clusters of graphical operators are merged with text operators representing fragments of text to be included inside a figure (see "inclusion of text parts" below). the second round of heuristics detects clusters that are unlikely to be figures. text areas detected by means of clustering text operations are searched for possible figure captions (see "detection and matching of captions" below). captions are matched with corresponding figure candidates, and geometrical properties of captions are used to refine the detected graphics. the last step generates data in a format convenient for further processing (see "generation of the output" below).

figure 1. overview of the figure extraction method.
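to make the flow of figure 1 easier to follow, the sketch below strings the described stages together in python. it is only an illustration: every helper is a deliberately trivial stand-in, and all names are hypothetical placeholders rather than part of the authors' java implementation.

# a toy, runnable outline of the stages in figure 1; helpers are stand-ins.

def bbox_of(ops):
    # smallest rectangle containing the boundaries of all operators in a cluster
    xs0, ys0, xs1, ys1 = zip(*(op["bbox"] for op in ops))
    return min(xs0), min(ys0), max(xs1), max(ys1)

def cluster(ops):
    # stand-in for algorithm 1: here, everything ends up in a single cluster
    return [ops] if ops else []

def plausible_figure(ops):
    # stand-in for the first round of heuristics (relative size, aspect ratio, ...)
    return bool(ops)

def merge_text(graphic_clusters, text_ops):
    # include text fragments (axis labels, legend entries) lying inside a cluster
    merged = []
    for g in graphic_clusters:
        x0, y0, x1, y1 = bbox_of(g)
        inside = [t for t in text_ops
                  if x0 <= t["bbox"][0] and t["bbox"][2] <= x1
                  and y0 <= t["bbox"][1] and t["bbox"][3] <= y1]
        merged.append(g + inside)
    return merged

def detect_captions(text_ops):
    # stand-in for the caption test (see the regular-expression sketch further below)
    return [t for t in text_ops
            if t.get("payload", "").lower().startswith(("figure", "fig.", "plot"))]

def extract_figures(operators):
    graphics = [op for op in operators if op["kind"] == "graphical"]
    texts = [op for op in operators if op["kind"] == "textual"]
    candidates = [c for c in cluster(graphics) if plausible_figure(c)]
    candidates = merge_text(candidates, texts)
    captions = detect_captions(texts)
    # caption matching and output generation would follow here
    return [{"figure": bbox_of(c),
             "caption": captions[0]["payload"] if captions else None}
            for c in candidates]

ops = [{"kind": "graphical", "bbox": (10, 10, 100, 80)},
       {"kind": "textual", "bbox": (20, 40, 60, 50), "payload": "signal region"},
       {"kind": "textual", "bbox": (10, 85, 100, 95), "payload": "figure 1. a sample plot"}]
print(extract_figures(ops))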
additionally, it must be noted that another important pre-processing step of the method consists of the layout detection. an algorithm for segmenting pages into layout elements called page divisions is presented later in the paper. this considerably improves the accuracy of the extraction method because elements from different page divisions can no longer be considered to belong to the same cluster (and subsequently figure). this allows the method to be applied separately to different columns of a document page.

pre-processing of operators
the proposed algorithm considers only certain properties of a pdf operator rather than trying to completely understand its effect. considered properties consist of the operator's type, the region of the page where the operator produces output and, in the case of textual operations, the string representation of the result. for simplicity, we suppress the notion of coordinate system transformation, inherent in pdf rendering, and describe all operators in a single coordinate system of a virtual 2-dimensional canvas where operations take effect. transformation operators19 are assigned an empty operation region as they do not modify the result directly but affect subsequent operations. in our implementation, an existing pdf rendering library has been used to determine boundaries of operators. rather than trying to understand all possible types of operators, we check the area of the canvas that has been affected by an operation. if the area is empty, we consider the operation to be a transformation. if there exists a non-empty area that has been changed, we check if the operator belongs to a maintained list of textual operators. this list is created based on the pdf specification. if so, the operator's argument list is scanned searching for a string and the operation is considered to be textual. an operation that is neither a transformation nor a textual operation is considered to be graphical. it might happen that text is generated using a graphical operator; however, such a situation is unusual. in the case of operators triggering the rendering of other operators, which is the case when rendering text using type-3 fonts, we consider only the top-level operation. in most cases, separate operations are not equivalent to logical entities considered by a human reader (such as a paragraph, a figure, or a heading). graphical operators are usually responsible for displaying lines or curve segments, while humans think in terms of illustrations, data lines, etc. similarly, in the case of text, operators do not have to represent complete or separate words or paragraphs. they usually render parts of words and sometimes parts of more than one word. the only assumption we make about the relation between operators and logical entities is that a single operator does not trigger rendering of elements from different detected entities (figures, captions). this is usually true because logical entities tend to be separated by a modification of the context—there is a distance between text paragraphs or an empty space between curves.
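as a concrete illustration of the classification rule just described, the sketch below assigns one of the three categories to an operator from its name, the canvas area it affected, and its arguments. the list of text-showing operator names is a small illustrative subset of the pdf specification, and the function names are hypothetical, not taken from the described implementation.

# illustrative classification of a single content-stream operator.

TEXT_SHOWING_OPERATORS = {"Tj", "TJ", "'", '"'}   # pdf text-showing operators

def area_is_empty(box):
    x0, y0, x1, y1 = box
    return x1 <= x0 or y1 <= y0

def classify_operator(name, affected_area, arguments):
    """return 'transformation', 'textual', or 'graphical'."""
    if affected_area is None or area_is_empty(affected_area):
        # nothing was drawn: the operator only changed the interpreter state
        return "transformation"
    if name in TEXT_SHOWING_OPERATORS and any(isinstance(a, str) for a in arguments):
        return "textual"
    return "graphical"

print(classify_operator("cm", None, [1, 0, 0, 1, 5, 5]))          # transformation
print(classify_operator("Tj", (10, 10, 60, 22), ["some text"]))   # textual
print(classify_operator("re", (0, 0, 50, 50), []))                # graphical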
clustering of graphical operators
the clustering algorithm
the representation of a document as a stream of rectangles allows the calculation of more abstract elements of the document. in our model, every logical entity of the document is equivalent to a set of operators. the set of all operators of the document is divided into disjoint subsets in the process called clustering. operators are assigned to the same cluster based on the position of their boundaries. the criteria for the clustering are based on a simple but important observation: operations forming a logical entity have boundaries lying close to each other. groups of operations forming different entities are separated by empty spaces. the clustering of textual operations yields text paragraphs and smaller objects like section headings. however, in the case of graphical operations, we can obtain consistent parts of images, but usually not complete figures yet. outcomes of the clustering are utilized during the process of figures detection. algorithm 1 shows the pseudo-code of the clustering algorithm. the input of the algorithm consists of a set of pre-processed operators annotated with their affected area. the output is a division of the input set into disjoint clusters. every cluster is assigned a boundary equal to the smallest rectangle containing boundaries of all included operations. in the first stage of the algorithm (lines 6–20), we organize all input operations in a forest-of-trees data structure. every tree describes a separate cluster of operations. the second stage (lines 21–29) converts the results (clusters) into a more suitable format.

algorithm 1. the clustering algorithm.
 1: input: operationset input_operations {set of operators of the same type}
 2: output: map {spatial clusters of operators}
 3: intervaltree tx ← intervaltree()
 4: intervaltree ty ← intervaltree()
 5: map parent ← map()
 6: for all operation op ∈ input_operations do
 7:   rectangle boundary ← extendbymargins(op.boundary)
 8:   repeat
 9:     operationset int_opsx ← tx.getintersectingops(boundary)
10:     operationset int_opsy ← ty.getintersectingops(boundary)
11:     operationset int_ops ← int_opsx ∩ int_opsy
12:     for all operation int_op ∈ int_ops do
13:       rectangle bd ← tx[int_op] × ty[int_op]
14:       boundary ← smallestenclosing(bd, boundary)
15:       parent[int_op] ← op
16:       tx.remove(int_op); ty.remove(int_op)
17:     end for
18:   until int_ops = ∅
19:   tx.add(boundary, op); ty.add(boundary, op)
20: end for
21: map results ← map()
22: for all operation op ∈ input_operations do
23:   operation root_ob ← getroot(parent, op)
24:   rectangle rec ← tx[root_ob] × ty[root_ob]
25:   if not results.has_key(rec) then
26:     results[rec] ← list()
27:   end if
28:   results[rec].add(op)
29: end for
30: return results

the clustering of operations is based on the relation of their rectangles being close to each other. definition 1 formalizes the notion of being close, making it useful for the algorithm.

definition 1: two rectangles are considered to be located close to each other if they are intersecting after expanding their boundaries in every direction by a margin.

the value by which rectangles should be extended is a parameter of the algorithm and might be different in various situations. to detect if rectangles are close to each other, we needed a data structure allowing the storage of a set of rectangles. this data structure was required to allow retrieving all stored rectangles that intersect a given one. we have constructed the necessary structure using an important observation about the operation result areas.
in our model all bounding rectangles have their edges parallel to the edges of the reference canvas on which the output of the operators is rendered. this allowed us to reduce our problem from the case of 2-dimensional rectangles to the case of 1-dimensional intervals. we can assume that the edges of the rectangular canvas define the coordinate system. it is easy to prove that two rectangles with edges parallel to the axes of the coordinate system intersect only if both their projections in the directions of the axes intersect. the projection of a rectangle onto an axis is always an interval. the observation made above has allowed us to build the required 2-dimensional data structure by remembering two 1-dimensional data structures that recall a number of intervals and, for a given interval, return the set of intersecting ones. such a 1-dimensional data structure has been provided by interval trees.20 every interval inside the tree has an arbitrary object assigned to it, which in this case is a representation of the pdf operator. this object can be treated as an identifier of the interval. the data structure also implements a dictionary interface, mapping objects to actual intervals. at the beginning, the algorithm initializes two empty interval trees representing projections on the x and y axes, respectively. those trees store values about projections of the biggest so-far calculated areas rather than about particular operators. each cluster is represented by the most recently discovered operation belonging to it. during the algorithm execution, each operator from the input set is considered only once. the order of processing is not important. the processing of a single operator proceeds as follows (the interior of the outermost "for all" loop of the algorithm):
1. the boundary of the operation is extended by the width of margins. the spatial data structure described earlier is utilized to retrieve boundaries of all already detected clusters (lines 9–10).
2. the forest of trees representing clusters is updated. the currently processed operation is added without a parent. roots of all trees representing intersecting clusters (retrieved in the previous step) are attached as children of the new operation.
3. the boundary of the processed operation is extended to become the smallest rectangle containing all boundaries of intersecting clusters and the original boundary. finally, all intersecting clusters are removed from the spatial data structure.
4. lines 9–17 of the algorithm are repeated as long as there exist areas intersecting the current boundary. in some special cases, more than one iteration may be necessary.
5. finally, the calculated boundary is inserted into the spatial data structure as a boundary of a new cluster. the currently processed operation is designated to represent the cluster and so is remembered as a representation of the cluster.
after processing all available operations, the post-processing phase begins. all the trees are transformed into lists. the resulting data structure is a dictionary having boundaries of detected clusters as keys and lists of belonging operations as values. this is achieved in lines 21–29. during the process of retrieving the cluster to which a given operation belongs, we use a technique called path compression, known from the union-find data structure.21
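the two observations above, definition 1 and the reduction of box intersection to two interval intersections, can be written down compactly. the sketch below is a brute-force stand-in: where the described implementation keeps two interval trees for efficient queries, it simply scans a list, and the margin value is an arbitrary placeholder.

# definition 1 plus the projection argument, in brute-force form.

def expand(box, margin):
    x0, y0, x1, y1 = box
    return x0 - margin, y0 - margin, x1 + margin, y1 + margin

def intervals_intersect(a0, a1, b0, b1):
    return a0 <= b1 and b0 <= a1

def close(box_a, box_b, margin):
    """axis-aligned boxes intersect iff both their x and y projections intersect,
    so closeness reduces to two 1-dimensional interval tests after expansion."""
    ax0, ay0, ax1, ay1 = expand(box_a, margin)
    bx0, by0, bx1, by1 = expand(box_b, margin)
    return (intervals_intersect(ax0, ax1, bx0, bx1) and
            intervals_intersect(ay0, ay1, by0, by1))

def intersecting_clusters(boundary, cluster_boxes, margin=5):
    """linear-scan stand-in for the getintersectingops queries of algorithm 1."""
    return [i for i, b in enumerate(cluster_boxes) if close(boundary, b, margin)]

print(close((0, 0, 10, 10), (12, 0, 20, 10), margin=3))   # True: the gap of 2 closes up
print(close((0, 0, 10, 10), (30, 0, 40, 10), margin=3))   # False: the gap of 20 remains
print(intersecting_clusters((0, 0, 10, 10), [(12, 0, 20, 10), (30, 0, 40, 10)]))  # [0]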
filtering of clusters
graphical areas detected by a simple clustering usually do not directly correspond to figures. the main reason for this is that figures may contain not only graphics, but also portions of text. moreover, not all graphics present in the document must be part of a figure. for instance, common graphical elements not belonging to a figure include logos of institutions and text separators like lines and boxes; various parts of mathematical formulas usually include graphical operations; and in the case of slides from presentations, the graphical layout should not be considered part of a figure. the above shows that the clustering algorithm described earlier is not sufficient for the purpose of figures detection and that it yields a result set wider than expected. in order to take into account the aforementioned characteristics, pre-calculated graphical areas are subject to further refinement. this part of the processing is highly domain-dependent as it is based on properties of scientific publications in a particular domain, in this case publications of hep. in the course of the refinement process, previously computed clusters can be completely discarded, extended with new elements, or some of their parts might be removed. in this subsection we discuss the heuristics applied for rejecting and splitting clusters of graphical operators. there are two main reasons for rejecting a cluster. the first of them is a size that is too small compared to the page size. the second is the figure candidate having its aspect ratio outside a desired interval of values. the first heuristic is designed to remove small graphical elements appearing, for example, inside mathematical formulas, but also small logos and other decorations. the second one discards text separators and different parts of mathematical equations, such as a line separating the numerator from the denominator inside a fraction. the thresholds used for filtering are provided as configurable properties of the algorithm and their values are assigned experimentally in a way that maximises the accuracy of figures detection.
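a sketch of these two rejection rules follows; the numeric thresholds are arbitrary placeholders rather than the experimentally tuned values mentioned above.

# rejecting clusters that are too small relative to the page or too elongated.

MIN_PAGE_FRACTION = 0.02            # placeholder: reject clusters below 2% of the page area
ASPECT_RATIO_RANGE = (0.1, 10.0)    # placeholder: reject very flat or very tall clusters

def keep_cluster(cluster_box, page_box):
    cw, ch = cluster_box[2] - cluster_box[0], cluster_box[3] - cluster_box[1]
    pw, ph = page_box[2] - page_box[0], page_box[3] - page_box[1]
    if cw * ch < MIN_PAGE_FRACTION * pw * ph:
        return False                          # too small: formula fragments, logos, ...
    ratio = cw / ch if ch else float("inf")
    lo, hi = ASPECT_RATIO_RANGE
    return lo <= ratio <= hi                  # discards rules, fraction bars, separators

page = (0, 0, 595, 842)                       # an a4 page in points
print(keep_cluster((50, 400, 350, 650), page))   # plausible figure -> True
print(keep_cluster((50, 400, 350, 402), page))   # thin horizontal rule -> False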
additionally, the analysis of the order of operations forming the content stream of a pdf document may help to split clusters that were incorrectly joined by algorithm 1. parts of the stream corresponding to logical parts of the document usually form a consistent subsequence. this observation allows the construction of a method of splitting elements incorrectly clustered together. we can assign content streams not only to entire pdf documents or pages, but also to every cluster of operations. the clustering algorithm presented in algorithm 1 returns a set of areas with a list of operations assigned to each of them. the content stream of a cluster consists of all operations from such a set, ordered in the same manner as in the original content stream of the pdf document. the usage of the original content stream allows us to define a distance in the content stream as follows:

definition 2: if o1 and o2 are two operations appearing in the content stream of the pdf document, by the distance between these operations we understand the number of textual and graphical operations appearing after the first of them and before the second of them.

to detect situations when a figure candidate contains unnecessary parts, the content stream of a figure candidate is read from the first to the last operation. for every two subsequent operations, the distance between them in the sense of the original content stream is calculated. if the value is larger than a given threshold, the content stream is split into two parts, which become separate figure candidates. for both candidates, a new boundary is calculated. this heuristic is especially important in the case of less formal publications such as slides from presentations at conferences. presentation slides tend to have a certain number of graphics appearing on every page and not carrying any meaning. simple geometrical clustering would connect elements of page style with the rest of the document content. measuring the distance in the content stream and defining a threshold on the distance facilitates the distinction between the layout and the rest of the page. this technique might also be useful to automatically extract the template used for a presentation, although this transcends the scope of this publication.
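the splitting rule built on definition 2 can be sketched as follows, with each operator represented by its index in the document content stream; the gap threshold is a placeholder value, and the distance is approximated by the difference of stream indices.

# splitting a cluster by gaps in the content stream (definition 2, simplified).

def split_by_stream_distance(op_indices, max_gap=50):
    """split whenever two consecutive operators of the cluster are separated by
    more than max_gap other operators in the original content stream."""
    if not op_indices:
        return []
    ordered = sorted(op_indices)            # restore content-stream order
    parts, current = [], [ordered[0]]
    for prev, nxt in zip(ordered, ordered[1:]):
        if nxt - prev - 1 > max_gap:
            parts.append(current)
            current = []
        current.append(nxt)
    parts.append(current)
    return parts

# a decorative slide-template element (indices 3-6) geometrically clustered with
# a genuine figure (indices 400-420) is separated again:
print(split_by_stream_distance([3, 4, 5, 6, 400, 410, 420]))
# -> [[3, 4, 5, 6], [400, 410, 420]]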
clustering of textual operators
the same algorithm that clusters graphical elements can cluster parts of text. detecting larger logically consistent parts of text is important because they should be treated as single entities during subsequent processing. this comprises, for example, inclusion inside a figure candidate (e.g., captions of axes, parts of a legend) and classification of a text paragraph as a figure caption.

inclusion of text parts
the next step in figures extraction involves the inclusion of lost text parts inside figure candidates. at the stage of operations clustering, only the operations of the same type (graphical or textual) were considered. the results of those initial steps become the input to the clustering algorithm that will detect relations between previously detected entities. by doing this, we move one level farther in the process of abstracting from operations. we start from basic meaningless operations. later we detect parts of graphics and text, and finally we are able to see the relations between both. not all clusters detected at this stage are interesting because some might consist uniquely of text areas. only those results that include at least one graphical cluster may be subsequently considered figure candidates. another round of heuristics marks unnecessary intermediate results as deleted. applied methods are very similar to those described in "filtering of clusters" (above); only the thresholds deciding on the rejections must change because we operate on geometrically much larger entities. also the way of application is different—candidates rejected at this stage can later be restored to the status of a figure. instead of permanently removing them, heuristics of this stage only mark figure candidates as rejected. this happens in the case of candidates having an incorrect aspect ratio, incorrect sizes, or consisting only of horizontal lines (which is usually the case with mathematical formulas but also tables). in addition to using the aforementioned heuristics, having clusters consisting of a mixture of textual and graphical operations allows the application of new heuristics. during the next phase, we analyze the type of operations rather than their relative location. in some cases, steps described earlier might detect objects that should not be considered a figure, such as text surrounded by a frame. this situation can be recognized by the calculation of a ratio between the number of graphical and textual operations in the content stream of a figure candidate. in our approach we have defined a threshold that indicates which figure candidates should be rejected because they contain too few graphics. this allows the removal of, for instance, blocks of text decorated with graphics for aesthetic reasons. the ratio between the numbers of graphical and textual operations is smaller for tables than for figures, so extending the heuristic with an additional threshold could improve the table–figure distinction. another heuristic analyzes the ratio between the total area of graphical operations and the area of the entire figure candidate. subsequently, we mark as deleted the figure candidates containing horizontal lines as the only graphical operations. these candidates describe tables or mathematical formulas that have survived previous steps of the algorithm. tables can be reverted to the status of figure candidates in later stages of processing. figure candidates that survive all the phases of filtering are finally considered to be figures. figure 2 shows a fragment of a publication page with indicated text areas and final figure candidates detected by the algorithm.

figure 2. a fragment of the pdf page with boxes around every detected text area and each figure candidate. dashed rectangles indicate figure candidates. solid rectangles indicate text areas.

detection and matching of captions
the input of the part of the algorithm responsible for detecting figure captions consists of previously determined figures and all text clusters. the observation of scientific publications shows that, typically, captions of figures start with a figure identifier (for instance, see the grammar for figure captions proposed by bathia, lahiri, and mitra).22 the identifier usually starts with a word describing a figure type and is followed by a number or some other unique identifier. in more complex documents, the figure number might have a hierarchical structure reflecting, for example, the chapter number. the set of possible figure types is very limited. in the case of hep publications, the most usual combinations include the words "figure" and "plot" and different variations of their spelling and abbreviation. during the first step of the caption detection, all text clusters from the publication page are tested for the possibility of being a caption. this consists of matching the beginning of the text contained in a textual cluster with a regular expression determining what is a figure caption. the role of the regular expression is to select strings starting with one of the predefined words, followed by an identifier or the beginning of a sentence. the identifier is subsequently extracted and included in the metadata of a caption. the caption detection has to be designed to reject paragraphs of the type "figure 1 presents results of (. . .)". to achieve this, we reject the possibility of having any lowercase text after the figure identifier.
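as an illustration of such a test, the sketch below uses a single regular expression: it accepts strings that begin with a figure word and an identifier and rejects running text by refusing lowercase characters right after the identifier. the pattern is only indicative of the approach, not the rule set actually used.

# an illustrative caption test in the spirit of the description above.
import re

CAPTION_RE = re.compile(
    r"^(?i:fig\.?|figure|plot)\s*"        # figure word, possibly abbreviated
    r"(?P<id>\d+(?:\.\d+)*[a-z]?)"        # identifier, possibly hierarchical (e.g. 2.1)
    r"(?!\s*[.:]?\s*[a-z])")              # reject lowercase text after the identifier

def caption_identifier(text):
    """return the extracted figure identifier, or None if the text is not a caption."""
    m = CAPTION_RE.match(text.strip())
    return m.group("id") if m else None

print(caption_identifier("Figure 3. Sample page layouts."))          # '3'
print(caption_identifier("Fig. 2.1: Invariant mass distribution"))   # '2.1'
print(caption_identifier("figure 1 presents results of ..."))        # None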
having the set of all the captions, we start searching for corresponding figures. all previous steps of the algorithm take into account the division of a page into text columns (see "detection of the page layout" below). when matching captions with figure candidates, we do not take into account the page layout. matching between figure candidates and captions happens on every document page separately. we consider every detected caption once, starting with those located at the top of the page and moving down toward the end. for every caption we search for figure candidates lying nearby. first we search above the caption and, in the case of failure, we move below the caption. we take into account all figure candidates, including those rejected by heuristics. in the case of finding multiple figure candidates corresponding to a caption, we merge them into a single figure, treating the previous candidates as subfigures of a larger figure. we also include small portions of text and graphics previously rejected from figure candidates that lie between figure and caption and between different parts of a figure. these parts of text usually contain identifiers of the subfigures. the amount of unclustered content that can be included in a figure is a parameter of the extraction algorithm and is expressed as a percentage of the height of the document page. it might happen that a caption is located in a completely different place, but this case is rare and tends to appear in older publications. the distance from the figure is calculated based on the page geometry. the captions should not be too distant from the figure.
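the matching step just described can be sketched as follows, with bounding boxes given as (x0, y0, x1, y1) and the y axis growing downwards; the maximum caption-to-figure distance, expressed as a fraction of the page height, is a placeholder value.

# prefer candidates above the caption, fall back to candidates below, and merge
# multiple matches into one figure (treating them as subfigures).

def vertical_gap(caption_box, candidate_box):
    if candidate_box[3] <= caption_box[1]:        # candidate entirely above the caption
        return caption_box[1] - candidate_box[3]
    if candidate_box[1] >= caption_box[3]:        # candidate entirely below the caption
        return candidate_box[1] - caption_box[3]
    return 0

def match_caption(caption_box, candidate_boxes, page_height, max_fraction=0.15):
    limit = max_fraction * page_height
    above = [b for b in candidate_boxes
             if b[3] <= caption_box[1] and vertical_gap(caption_box, b) <= limit]
    below = [b for b in candidate_boxes
             if b[1] >= caption_box[3] and vertical_gap(caption_box, b) <= limit]
    matched = above or below
    if not matched:
        return None
    xs0, ys0, xs1, ys1 = zip(*matched)            # merged boundary of all subfigures
    return min(xs0), min(ys0), max(xs1), max(ys1)

subfigures = [(50, 100, 300, 250), (320, 100, 560, 250)]   # two side-by-side plots
caption = (50, 260, 560, 280)                              # caption just below them
print(match_caption(caption, subfigures, page_height=842))
# -> (50, 100, 560, 250): both candidates merged into a single figure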
generation of the output
the choice of the format in which data should be saved at the output of the extraction process should take into account further requirements. the most obvious use case of displaying figures to end users in response to text-based search queries does not yield very sophisticated constraints. a simple raster graphic annotated with captions and possibly some extracted portions of metadata would be sufficient. unfortunately, the process of generating raster representations of figures might lose many important pieces of information that could be used in the future for an automatic analysis. to store as much data as possible, apart from storing the extracted figures in a raster format (e.g., png), we also decided to preserve their original vector character. vector graphics formats, similarly to pdf documents, contain information about graphical primitives. primitives can be organized in larger logical entities. sometimes rendering of different primitives leads to a modification of the same pixel of the resulting image. such a situation might happen, for example, when circles are used to draw data points lying nearby on the same plot. to avoid such issues, we convert figures into the scalable vector graphics (svg) format.23 on the implementation level, the extraction of the vector representation of a figure proceeds in a manner similar to regular rendering of a pdf document. the interpreter preserves the same elements of the state and allows their modification by transformation operations. a virtual canvas is created for every detected figure. the content stream of the document is processed and all the transformation operations are executed, modifying the interpreter's state. the textual and graphical operators are also interpreted, but they affect only the appropriate canvas of the figure to which the operation belongs. if a particular operation does not belong to any figure, no canvas is affected. the behaviour of graphical canvases used during the svg generation is different from the case of raster rendering. instead of creating graphical output, every operation is transformed into a corresponding primitive and saved within an svg file. pdf was designed in such a manner that the number of external dependencies of a file is minimized. this design decision led to the inclusion of the majority of fonts in the document itself. it would be possible to embed font glyphs in the svg file and use them to render strings. however, for the sake of simplicity, we decided to omit font definitions in the svg output. a text representation is extracted from every text operation, and the operation is replaced by an svg text primitive with a standard font value. this simplification affects what the output looks like, but the amount of formatting information that is lost is minimal. moreover, this does not pose a problem because vector representations are intended to be used during automatic analysis of figures rather than for displaying purposes. a possible extension of the presented method could involve embedding complete information about used glyphs. finally, the generation of the output is completed with some metadata elements. an exhaustive categorization of the metadata that can be compiled for figures could be the customization of the one proposed by liu et al. for table metadata.24 in the case of figures, the following categories could be distinguished: (1) environment/geography metadata (information about the document where the figure is located); (2) affiliated metadata (e.g., captions, references, or footnotes); (3) layout metadata (information about the original visualization of the figure); (4) content data; and (5) figure type metadata. for the moment, we compile only environment/geography metadata and affiliated metadata. the geography/environment metadata consists of the document title, the document authors, the document date (creation and publication), and the exact location of a figure inside a publication (page and boundary). most of these elements are provided by simply referencing the original publication in the inspire repository. the affiliated metadata consists of the text caption and the exact location of the caption in the publication (page and boundary). in the future, metadata from other categories will be annotated for each figure.

detection of the page layout
figure 3. sample page layouts that might appear in a scientific publication. the black color indicates areas where content is present.
in this section we discuss how to detect the page layout, an issue which has been omitted in the main description of the extraction algorithm, but which is essential for an efficient detection of figures. figure 3 depicts several possibilities of organising content on the page. as mentioned in previous sections, the method of clustering operations based on their geometrical position may fail in the case of documents having a complex page layout. the content appearing in different columns should never be considered as belonging to the same figure. this cannot be assured without enforcing additional constraints during the clustering phase. to address this difficulty, we enhanced the figure extractor with a pre-processing phase of detecting the page layout. being able to identify how the document page is divided into columns enables us to execute the clustering within every column separately. it is intuitively obvious what can be understood as a page layout, but to provide a method of calculating it, we need a more formal definition, which we provide below. by the layout of a page, we understand a particular division of a page into areas called columns.
each area is a sum of disjoint rectangles. the division of a page into areas must satisfy a set of conditions summarized in definition 3.

definition 3: let $P$ be a rectangle representing the page. the set $D$ containing subareas of a page is called a page division if and only if

$$\bigcup_{Q \in D} Q = P$$

$$\forall x, y \in D,\ x \neq y:\ x \cap y = \emptyset$$

$$\forall Q \in D:\ Q \neq \emptyset$$

$$\forall Q \in D\ \exists R = \{x : x \text{ is a rectangle},\ \forall y \in R \setminus \{x\}:\ y \cap x = \emptyset\} \text{ such that } Q = \bigcup_{x \in R} x$$

every element of a division is called a page area. to be considered a page layout, the borders of areas from the division must not intersect the content of the page. definition 3 does not guarantee that the layout is unique. a single page might be assigned different divisions satisfying the definition. additionally, not all valid page layouts are interesting from the point of view of figure detection. the segmentation algorithm calculates one such division, imposing additional constraints on the detected areas. the layout-calculation procedure utilizes the notion of separators, introduced by definition 4.

definition 4: a vertical (or horizontal) line inside a page or on its borders is called a separator if its horizontal (vertical) distance from the page content is larger than a given constant value.

the algorithm consists of two stages. first, the vertical separators of a sufficient length are detected and used to divide the page into disjoint rectangular areas. each area is delimited by two vertical lines, each of which forms a consistent interval inside one of the detected vertical separators. at this stage, horizontal separators are completely ignored. figure 4 shows a fragment of a publication page processed by the first stage of the layout detection. the upper horizontal edge of one of the areas lies too close to two text lines. with the constant of definition 4 chosen to be sufficiently large, this edge would not be a horizontal separator and thus the generated division of the page would require additional processing to become a valid page layout. the second stage of the algorithm transforms the previously detected rectangles into a valid page layout by splitting rectangles into smaller parts and by joining appropriate rectangles to form a single area.

figure 4. example of intermediate layout-detection results requiring the refinement.

algorithm 2 shows the pseudo-code of the detection of vertical separators. the input of the algorithm consists of the image of the publication page. the output is a list of vertical separators aggregated by their x-coordinates. every element of this list consists of two elements: an integer indicating the x-coordinate and the list of y-coordinates describing the separators. the first element of this list indicates the y-coordinate of the beginning of the first separator. the second element is the y-coordinate of the end of the same separator. the third and fourth elements describe the second separator, and the same mechanism is used for the remaining separators (if they exist). the algorithm proceeds according to the sweeping principle known from computational geometry.25 the algorithm reads the publication page starting from the left. for every x-coordinate value, a set of corresponding vertical separators is detected (lines 9–18). vertical separators are detected as consistent sequences of blank points.
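the conditions of definition 3 can be checked mechanically; the sketch below does so on a discrete pixel grid. it is an illustration only (it does not verify that the rectangles making up a single area are themselves disjoint) and is not part of the extraction algorithm.

from typing import List, Tuple

Rect = Tuple[int, int, int, int]      # x0, y0, x1, y1 with exclusive upper bounds

def cells(rect: Rect):
    x0, y0, x1, y1 = rect
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}

def is_page_division(page: Rect, areas: List[List[Rect]]) -> bool:
    page_cells = cells(page)
    covered = set()
    for area in areas:
        area_cells = set().union(*(cells(r) for r in area))
        if not area_cells:                 # every area must be non-empty
            return False
        if area_cells & covered:           # distinct areas must be disjoint
            return False
        covered |= area_cells
    return covered == page_cells           # together the areas must cover the page

# example: a valid two-column division of a 10 x 10 page, and an invalid one
page = (0, 0, 10, 10)
print(is_page_division(page, [[(0, 0, 5, 10)], [(5, 0, 10, 10)]]))   # True
print(is_page_division(page, [[(0, 0, 5, 10)], [(4, 0, 10, 10)]]))   # False (overlap)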
a point is considered blank if all the points in its horizontal surrounding of the radius defined by the constant from definition 5 are of the background colour. not all blank vertical lines can be considered separators. short empty spaces usually delimit lines of text or different small units of the content. in line 11 we test detected vertical separators for being long enough. if a separator has been detected in a particular column of a publication page, the adjacent columns also tend to contain similar separators. lines 19–31 of the algorithm are responsible for electing the longest candidate among the adjacent columns of the page. the maximization is performed across a set of adjacent columns for which at least one separator exists.

algorithm 2. detecting vertical separators.

1: input: the page image
2: output: vertical separators of the input page
3: list<pair<int, list<int>>> separators ← ∅
4: int max_weight ← 0
5: boolean maximising ← false
6: for all x ∈ {x_min … x_max} do
7:   emptyb ← 0, current_eval ← 0
8:   empty_areas ← list()
9:   for all y ∈ {0 … page_height} do
10:     if point at (x, y) is not blank then
11:       if y − emptyb − 1 > heightmin then
12:         empty_areas.append(emptyb)
13:         empty_areas.append(y = page_height ? y : y − 1)
14:         current_eval ← current_eval + (y − emptyb)
15:       end if
16:       emptyb ← y + 1
17:     end if
18:   end for
    {we have already processed the entire column. now we compare it with the adjacent, already processed columns}
19:   if max_weight < current_eval then
20:     max_weight ← current_eval
21:     max_separators ← empty_areas
22:     maxx ← x
23:   end if
24:   if maximising then
25:     if empty_areas = ∅ then
26:       separators.add(⟨maxx, max_separators⟩)
27:       maximising ← false, max_weight ← 0
28:     end if
29:   else
30:     maximising ← (empty_areas ≠ ∅)
31:   end if
32: end for
33: return separators

the detected separators are used to create the preliminary division of the page, similar to the one from the example of figure 4. as in the previous step, separators are considered one by one in the order of increasing x-coordinate. at every moment of the execution, the algorithm maintains a division of the page into rectangles. this division corresponds only to the already detected vertical separators. updating the previously considered division is facilitated by processing separators in a particular well-defined order.

before presenting the final outcome, the algorithm must refine the previously calculated division. this happens in the second phase of the execution. all the horizontal borders of the division are then moved along adjacent vertical separators until they become horizontal separators in the sense of definition 4. typically, moving the horizontal borders results in dividing already existing rectangles into smaller ones. if such a situation happens, both newly created parts are assigned to different page layout areas. sometimes, when moving a border is not possible, different areas are combined, forming a larger one.

tuning and testing

the extraction algorithm described here has been implemented in java and tested on a random set of scientific articles coming from the inspire repository. the testing procedure has been used to evaluate the quality of the method, but it also allowed us to tweak the parameters of the algorithm to maximize the outcomes.

preparation of the testing set

to prepare the testing set, we randomly selected 207 documents stored in inspire.
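to illustrate the idea behind algorithm 2, the following is a simplified python re-implementation sketch (not the production java code). it assumes page[y][x] already holds whether the pixel at (x, y) is blank in the sense of the definition above, and it reproduces only the election of the heaviest column within each group of adjacent candidate columns.

from typing import List, Tuple

def blank_runs(column: List[bool], min_height: int) -> List[Tuple[int, int]]:
    """maximal runs of blank pixels in one column that are long enough."""
    runs, start = [], None
    for y, blank in enumerate(column + [False]):       # sentinel ends the last run
        if blank and start is None:
            start = y
        elif not blank and start is not None:
            if y - start >= min_height:
                runs.append((start, y - 1))
            start = None
    return runs

def vertical_separators(page: List[List[bool]], min_height: int):
    """sweep the columns left to right; within each group of adjacent columns that
    contain long blank runs, elect the column whose runs cover the most pixels."""
    height, width = len(page), len(page[0])
    separators, best, best_weight = [], None, 0
    for x in range(width):
        runs = blank_runs([page[y][x] for y in range(height)], min_height)
        weight = sum(y1 - y0 + 1 for y0, y1 in runs)
        if runs and weight > best_weight:
            best, best_weight = (x, runs), weight
        if not runs and best is not None:               # a group of columns ended
            separators.append(best)
            best, best_weight = None, 0
    if best is not None:
        separators.append(best)
    return separators                                    # list of (x, [(y0, y1), ...])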
in total, these documents consisted of 3,728 pages, which contained 1,697 figures altogether. the records have been selected according to a uniform probability distribution across the entire record space. this way, we have created a collection that is representative of the entire inspire corpus, including historical entries. currently, inspire consists of: 1,140 records describing publications written before 1950; 4,695 between 1950 and 1960; 32,379 between 1960 and 1970; 108,525 between 1970 and 1980; 167,240 between 1980 and 1990; 251,133 between 1990 and 2000; and 333,864 in the first decade of the twenty-first century. in total, up to july 2012, inspire manages 952,026 records. it can be seen that the rate of growth has increased with time and that most inspire documents come from the last decade.

the results on such a testing set should accurately estimate the efficiency of extraction for existing documents but not necessarily for new documents being ingested into inspire. this is because inspire contains entries describing old articles which were created using obsolete technologies or scanned and encoded in pdf. the extraction algorithm is optimized for born-digital objects. to test the hypothesis that the extractor provides better results for newer papers, the testing set has been split into several subsets. the first set consists of publications published before 1980. the rest of the testing set has been split into subsets corresponding to decades of publication. to simplify the counting of correct figure detections and to provide a more reliable execution and measurement environment, every testing document has been split into multiple single-page pdf documents. subsequently, every single-page document has been manually annotated with the number of figures appearing inside.

execution of the tests

the efficient execution of the testing was possible thanks to a special script executing the plots extractor on every single page separately and then computing the total number of successes and failures. the script allows the execution of tests in a distributed heterogeneous environment and allows dynamic connection and disconnection of computing nodes. in the case of a software failure, the extraction request is resubmitted to a different computation node, allowing us to avoid problems related to worker-node configuration rather than to the algorithm implementation itself. during the preparation of the testing set, we manually annotated all the expected extraction results. subsequently, the script compared these metadata with the output of the extractor. using aggregated numbers from all extracted pages allowed us to calculate efficiency measures of the extraction algorithm. as quality measures, we used recall and precision.26 their definitions are included in the following equations:

$$\mathrm{recall} = \frac{\#\ \text{figures extracted correctly}}{\#\ \text{figures present in the test set}}$$

$$\mathrm{precision} = \frac{\#\ \text{figures extracted correctly}}{\#\ \text{figures extracted}}$$

at every place where we needed a single comparable quality measure rather than two semi-independent numbers, we have used the harmonic average of the precision and the recall.27 table 1 summarizes the results obtained during the test execution for every subset of our testing set. figure 5 shows the dependency of recall and precision on the time of publication. the extractor parameters used in this test execution were chosen based on intuition and a small number of manually triggered trials.
in the next section we describe an automatic tuning procedure we have used to find optimal algorithm arguments.

table 1. results of the test execution.

                                          –1980   1980–90   1990–2000   2000–10   2010–12
number of existent figures                  114        60         170       783       570
number of correctly detected figures         59        53         164       703       489
number of incorrectly detected figures       26        78          65        40        73
total number of pages                        85       136         760      1919       828
number of correctly processed pages          20        44         712      1816       743

figure 5. recall and precision as functions of the decade of publication.

it can be seen that, as expected, the efficiency increases with the time of publication. the total recall and precision for all samples since 1990, which constitute the majority of the inspire corpus, were both 88 percent. precision and recall based on the correctly detected figures do not give a full picture of the algorithm efficiency, because the extraction has been executed on a number of pages not containing any figures. the correctly extracted pages not having any figures do not appear in the recall and precision statistics because in their case the expected and detected numbers of figures are both equal to 0. besides recall and precision, figure 5 also depicts the fraction of pages that have been extracted correctly. taking into account the samples since 1990, 3,271 pages out of 3,507 have been detected completely correctly, which makes a 93 percent success rate counted by the number of pages. as can be seen, this measure is higher than both the precision and the recall. the analysis of the extractor results in the case of failure shows that in many cases, even if the results are not completely correct, they are not far from the expectation. there are different reasons for the algorithm failing. some of them may result from a non-optimal choice of algorithm parameters, others from the document layout being too far from the assumed one. in some rare cases, even manual inspection of the document does not allow an obvious identification of figures.

the automatic tuning of parameters

in the previous section we have shown the results obtained by executing the extraction algorithm on a sample set. during this execution we were using extractor arguments which seemed to be the most correct based on our observations but also on other research (typical sizes of figures, margin sizes, etc.).28 this way of configuring the algorithm was useful during development, but it is not likely to yield the best possible results. to find better parameters, we have implemented a method of automatic tuning. the metrics described in the previous section provided a good method of measuring the efficiency of the algorithm for a given set of parameters. the choice of optimal parameters can be relative to the choice of documents on which the extraction is to be performed. the way in which the testing set has been selected allowed us to use it as representative of hep publications. to tune the algorithm, we have used a subset of the testing set described in the previous step as a reference. the subset consisted of all entries created after 1990. this allowed us to minimize the presence of scanned documents which, by design, cannot be correctly processed by our method.
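the aggregate numbers reported above can be recomputed directly from table 1; the short sketch below does so for the samples since 1990 and also shows the harmonic mean used as the single quality measure (the resulting values land close to the 88 percent quoted in the text).

def recall(correct: int, existent: int) -> float:
    return correct / existent

def precision(correct: int, incorrect: int) -> float:
    return correct / (correct + incorrect)

def harmonic_mean(p: float, r: float) -> float:
    return 2 * p * r / (p + r)

# aggregated counts for the columns 1990-2000, 2000-10, and 2010-12 of table 1
correct = 164 + 703 + 489       # correctly detected figures
existent = 170 + 783 + 570      # existent figures
incorrect = 65 + 40 + 73        # incorrectly detected figures

r, p = recall(correct, existent), precision(correct, incorrect)
print(f"recall = {r:.1%}, precision = {p:.1%}, harmonic mean = {harmonic_mean(p, r):.1%}")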
the adjustment of parameters has been performed by a dedicated script which has executed the extraction using various parameter values and has read back the results. the script has been configured with a list of tuneable parameters together with their types and allowed value ranges. additionally, the script had knowledge of the believed-best value, which was the one used in the previous testing. to decrease the complexity of training, we have made several assumptions about the parameters. these assumptions are only an approximation of the real nature of the parameters, but practice has shown that they are good enough to permit the optimization:

• we assume that the precision and recall are continuous with respect to the parameters. this allows us to assume that the efficiency of the algorithm for parameter values close to a given one will be close. the optimization has proceeded by sampling the parametric space in a number of points and executing tests using the selected points as parameter values. having $n$ parameters to optimize and dividing the space of every parameter into $m$ regions leads to the execution of $m^n$ tests. the execution of every test is a time-consuming operation due to the size of the training set.

• we assume that the parameters are independent of each other. this means that we can divide the problem of finding an optimal solution in the $n$-dimensional space of $n$ configuration arguments into finding $n$ solutions in 1-dimensional subspaces. such an assumption seems to be intuitive and considerably reduces the number of necessary tests from $O(m^n)$ to $O(m \cdot n)$, where $m$ is the number of samples taken from a single dimension.

in our tests, the parametric space has been divided into 10 equal intervals in every direction. in addition to checking the extraction quality at those points, we have executed one test for the so-far best argument. in order to increase the level of fine-tuning of the algorithm, each test has been re-executed in the region where the chances of finding a good solution were considered the highest. this consisted of a region centred around the highest result and having a radius of 10 percent of the parameter space.

figure 6 and figure 7 show the dependency of the recall and the precision on an algorithm parameter. the parameter depicted in figure 6 indicates what minimal aspect ratio the figure candidate must have in order to be considered a correct figure. it can be seen that tuning this heuristic increases the efficiency of the extraction. moreover, the dependency of recall and precision on the parameter is monotonic, which is the most compatible with the chosen optimization method. the parameter of figure 7 specifies which fraction of the area of the entire figure candidate has to be occupied by graphical operations. this parameter has a lower influence on the extraction efficiency. such a situation can happen when more than one heuristic influences the same aspect of the extraction. this contradicts the assumption of parameter independence, but we have decided to use the present model for simplicity.

figure 6. effect of the minimal aspect ratio on precision and recall.

figure 7. effect of the area fraction occupied by graphical operations on precision and recall.
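the coordinate-wise tuning described above (grid sampling of each parameter independently, followed by a finer pass around the best value with a radius of 10 percent of the range) can be sketched as follows. evaluate() stands for a full extraction run over the reference set returning the harmonic mean of precision and recall; the signature is our own assumption, not the original script's interface.

from typing import Callable, Dict, Tuple

def tune(evaluate: Callable[[Dict[str, float]], float],
         ranges: Dict[str, Tuple[float, float]],
         start: Dict[str, float],
         samples: int = 10) -> Dict[str, float]:
    best = dict(start)
    for name, (lo, hi) in ranges.items():
        def grid(a: float, b: float):
            step = (b - a) / samples
            return [a + i * step for i in range(samples + 1)]
        # coarse pass over the whole range, plus the so-far best value
        values = grid(lo, hi) + [best[name]]
        best_value = max(values, key=lambda v: evaluate({**best, name: v}))
        # fine pass around the best coarse value (radius = 10% of the range)
        radius = 0.1 * (hi - lo)
        values = grid(max(lo, best_value - radius), min(hi, best_value + radius))
        best[name] = max(values, key=lambda v: evaluate({**best, name: v}))
    return best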
after executing the optimization algorithm, we have managed to achieve a recall of 94.11 percent and a precision of 96.6 percent, which is a considerable improvement compared to the previous results of 88 percent.

conclusions and future work

this work has presented a method for extracting figures from scientific publications in a machine-readable format, which is the main step toward the development of services enabling access to and search of images stored in scientific digital libraries. in recent years, figures have been gaining increasing attention in the digital libraries community. however, little has been done to decipher the semantics of these graphical representations and to bridge the semantic gap between content that can be understood by machines and that which is managed by digital libraries. extracting figures and storing them in a uniform and machine-readable format constitutes the first step towards the extraction and the description of the internal semantics of figures. storing semantically described and indexed figures would open completely new possibilities of accessing the data and discovering connections between different types of publishing artefacts and different resources describing related knowledge.29

our method of detecting fragments of pdf documents that correspond to figures is based on a series of observations of the character of publications. however, tests have shown that additional work is needed to improve the correctness of the detection. also, the performance should be re-evaluated after we have a large set of correctly annotated figures, confirmed by users of our system. the heuristics used by the algorithm are based on a number of numeric parameters that we have tried to optimize using automatic techniques. the tuning procedure has made several arbitrary assumptions on the nature of the dependency between parameters and extraction results. a future approach to the parameter optimization, requiring much more processing, could involve the execution of a genetic algorithm that would treat the parameters as gene samples.30 this could potentially allow the discovery of a better parameter set because a smaller set of assumptions would be imposed on the parameters. a vector of algorithm parameters could play the role of a gene, and random mutations could be introduced to previously considered and subsequently crossed genes. the evaluation and selection of surviving genes could be performed by using the metrics described previously. another approach to improving the quality of the tuning could involve extending the present algorithm by the discovery of mutually dependent parameters and the usage of special techniques (relaxing the assumptions) to fine-tune in subspaces spanned by these parameters.

all of our experiments have been performed using a corpus of publications from hep. the usage of the extraction algorithm on a different corpus would require tuning the parameters for the specific domain of application. for the area of hep, we can also consider preparing several sets of execution parameters varying by the decade of document publication or by other easy-to-determine characteristics. subsequently, we could decide which set of parameters to use based on those characteristics. in addition to a better tuning of the existing heuristics, there are improvements that can be made at the level of the algorithm. for example, we could mention extending the process of clustering text parts.
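one possible realization of the genetic-algorithm idea suggested above is sketched below. everything in it (population size, selection scheme, mutation rate) is an assumption for illustration, not a description of work actually carried out.

import random
from typing import Callable, List, Tuple

Gene = List[float]    # one gene = one vector of extractor parameters

def evolve(fitness: Callable[[Gene], float],
           bounds: List[Tuple[float, float]],
           population_size: int = 20,
           generations: int = 30,
           mutation_rate: float = 0.1) -> Gene:
    population = [[random.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[:population_size // 2]             # selection
        children = []
        while len(survivors) + len(children) < population_size:
            a, b = random.sample(survivors, 2)
            child = [random.choice(pair) for pair in zip(a, b)]   # crossover
            for i, (lo, hi) in enumerate(bounds):                 # mutation
                if random.random() < mutation_rate:
                    child[i] = random.uniform(lo, hi)
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)

# fitness() would run the extractor with the given parameter vector on the
# annotated reference set and return the harmonic mean of precision and recall.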
in the current implementation, the margins by which textual operations are extended during the clustering process are fixed as algorithm parameters. this approach proved to be robust in most cases. however, distances between text lines tend to be different depending on the currently utilized style. every text portion tends to have one style that dominates. an improved version of the text-clustering algorithm could use local rather than global properties of the content. this would not only allow us to correctly handle an entire document written using different text styles, but would also help to manage cases of single paragraphs differing from the rest of the content.

another important, not-yet-implemented improvement related to figure metadata is the automatic extraction of figure references from the text content. important information about figure content might be stored in the surroundings of the places where the publication text refers to a figure. furthermore, the metadata could be extended by the usage of some type of classifier that would assign a graphics type to the extracted result. currently, we are only distinguishing between tables and figures based on simple heuristics involving the number and type of graphical areas and the text inside the detected caption. in the future, we could distinguish line plots from photos, histograms, and so on. such a classifier could be implemented using artificial intelligence techniques such as support vector machines.31

finally, partial results of the figure extraction algorithm might be useful in performing other pdf analyses:

• the usage of clustered text areas could allow a better interpretation and indexing of textual content stored in digital libraries with full-text access. clusters of text tend to describe logical parts like paragraphs, section and chapter titles, etc. a simple extension of the current schema could allow the extraction of the predominant formatting style of the text encoded in a page area. text parts written in different styles could be indexed in a different manner, giving, for instance, more importance to segments written with a larger font.

• we mentioned that the algorithm detects not only figures but also tables. a heuristic is being used in order to distinguish tables from different types of figures. our present effort concentrates on the correct treatment of figures, but a useful extension could allow the extraction of different types of entities. for instance, another common type of content ubiquitous in hep documents is mathematical formulas. thus, in addition to figures, it would be important to extract tables and formulas in a structured format allowing further processing. the internal architecture of the implemented prototype of the figure extractor allows easy implementation of extension modules which can compute other properties of pdf documents.

acknowledgements

this work has been partially supported by cern and the spanish government through the project tin2012-37826-c02-01.

references

1. saurabh kataria, “on utilization of information extracted from graph images in digital documents,” bulletin of ieee technical committee on digital libraries 4, no. 2 (2008), http://www.ieee-tcdl.org/bulletin/v4n2/kataria/kataria.html.
2. marti a. hearst et al., “exploring the efficacy of caption search for bioscience journal search interfaces,” proceedings of the workshop on bionlp 2007: biological, translational and clinical language processing: 73–80, http://dl.acm.org/citation.cfm?id=1572406.
3. lisa johnston, “web reviews: see the science: scitech image databases,” sci-tech news 65, no. 3 (2011), http://jdc.jefferson.edu/scitechnews/vol65/iss3/11.
4. annette holtkamp et al., “inspire: realizing the dream of a global digital library in high-energy physics,” 3rd workshop conference: towards a digital mathematics library, paris, france (july 2010): 83–92.
5. piotr praczyk et al., “integrating scholarly publications and research data—preparing for open science, a case study from high-energy physics with special emphasis on (meta)data models,” metadata and semantics research—ccis 343 (2012): 146–57.
6. piotr praczyk et al., “a storage model for supporting figures and other artefacts in scientific libraries: the case study of invenio,” 4th workshop on very large digital libraries (vldl 2011), berlin, germany (2011).
7. “sciverse science direct: image search,” elsevier, http://www.info.sciverse.com/sciencedirect/using/searching-linking/image.
8. guenther eichhorn, “trends in scientific publishing at springer,” in future professional communication in astronomy ii (new york: springer, 2011), doi: 10.1007/978-1-4419-8369-5_5.
9. william browuer et al., “segregating and extracting overlapping data points in two-dimensional plots,” proceedings of the 8th acm/ieee-cs joint conference on digital libraries (jcdl 2008), new york: 276–79.
10. saurabh kataria et al., “automatic extraction of data points and text blocks from 2-dimensional plots in digital documents,” proceedings of the 23rd aaai conference on artificial intelligence, chicago (2008): 1169–74.
11. saurabh kataria, “on utilization of information extracted from graph images in digital documents,” bulletin of ieee technical committee on digital libraries 4, no. 2 (2008), http://www.ieee-tcdl.org/bulletin/v4n2/kataria/kataria.html.
12. ying liu et al., “tableseer: automatic table metadata extraction and searching in digital libraries,” proceedings of the 7th acm/ieee-cs joint conference on digital libraries (jcdl ’07), vancouver (2007): 91–100.
13. william s. cleveland, “graphs in scientific publications,” american statistician 38, no. 4 (1984): 261–69, doi: 10.1080/00031305.1984.10483223.
14. hui chao and jian fan, “layout and content extraction for pdf documents,” document analysis systems vi, lecture notes in computer science 3163 (2004): 213–24.
15. at every moment of the execution of a postscript program, the interpreter maintains many variables. some of them encode current positions within the rendering canvas. such positions are used to locate the subsequent character or to define the starting point of the subsequent graphical primitive.
16. transformation matrices are encoded inside the interpreters’ state. if an operator requires arguments indicating coordinates, these matrices are used to translate the provided coordinates to the coordinate system of the canvas.
17. graphical operators are those that trigger the rendering of a graphical primitive.
18. textual operations are the pdf instructions that cause the rendering of the text. textual operations receive the string representation of the desired text and use the current font, which is saved in the interpreters’ state.
19. operations that do not produce any visible output, but solely modify the interpreters’ state.
20. herbert edelsbrunner and hermann a. maurer, “on the intersection of orthogonal objects,” information processing letters 13, nos. 4, 5 (1981): 177–81.
21. thomas h. cormen, charles e. leiserson, and ronald l. rivest, introduction to algorithms (cambridge: mit electrical engineering and computer science series, 1990).
22. sumit bhatia, shibamouli lahiri, and prasenjit mitra, “generating synopses for document-element search,” proceedings of the 18th acm conference on information and knowledge management, new york (2009): 2003–6, doi: 10.1145/1645953.1646287.
23. jon ferraiolo, ed., “scalable vector graphics (svg) 1.0 specification,” w3c recommendation, 1 september 2001, http://www.w3.org/tr/svg10/.
24. liu et al., “tableseer.”
25. cormen, leiserson, and rivest, introduction to algorithms.
26. ricardo a. baeza-yates and berthier ribeiro-neto, modern information retrieval (boston: addison-wesley, 1999).
27. ibid.
28. cleveland, “graphs in scientific publications.”
29. praczyk et al., “a storage model for supporting figures and other artefacts in scientific libraries.”
30. stuart russell and peter norvig, artificial intelligence: a modern approach, third edition (prentice hall, 2009).
31. sergios theodoridis and konstantinos koutroumbas, pattern recognition, third edition (boston: academic press, 2006).

public access technologies in public libraries: effects and implications

john carlo bertot

john carlo bertot (jbertot@umd.edu) is professor and director of the center for library innovation in the college of information studies at the university of maryland, college park.

public libraries were early adopters of internet-based technologies and have provided public access to the internet and computers since the early 1990s. the landscape of public-access internet and computing was substantially different in the 1990s as the world wide web was only in its initial development. at that time, public libraries essentially experimented with public-access internet and computer services, largely absorbing this service into existing service and resource provision without substantial consideration of the management, facilities, staffing, and other implications of public-access technology (pat) services and resources. this article explores the implications for public libraries of the provision of pat and seeks to look further to review issues and practices associated with pat provision resources. while much research focuses on the amount of public access that public libraries provide, little offers a view of the effect of public access on libraries. this article provides insights into some of the costs, issues, and challenges associated with public access and concludes with recommendations that require continued exploration.
public libraries were early adopters of internet-based technologies and have provided public access to the internet and computers since the early 1990s.1 in 1994, 20.9 percent of public libraries were connected to the internet, and 12.7 percent offered public-access computers. by 1998, internet connectivity in public libraries grew to 83.6 percent, and 73.3 percent of public libraries provided public internet access.2 the landscape of public-access internet and computing was substantially different in the 1990s, as the world wide web was only in its initial development. at that time, public libraries essentially experimented with public-access internet and computer services, largely absorbing this service into existing service and resource provision without substantial consideration of the management, facilities, staffing, and other implications of public-access technology (pat) services and resources.3

using case studies conducted at thirty-five public libraries in five geographically dispersed and demographically diverse states, this article explores the implications for public libraries of the provision of pat. the researcher also conducted interviews with state library agency staff prior to visiting libraries in each state. the goals of this article are to

• explore the level of support pat requires within public libraries;
• explore the implications of pat on public libraries, including management, building planning, staffing, and other support issues;
• explore current pat support practices;
• identify issues and challenges public libraries face in maintaining and supporting their pat infrastructure; and
• identify factors that contribute to successful pat practices.

this article seeks to look beyond the provision of pat by public libraries and review issues and practices associated with pat provision resources. while much research focuses on the amount of public access that public libraries provide, little offers a view of the effect of public access on libraries. this article provides insights into some of the costs, issues, and challenges associated with public access, and it concludes with recommendations that require continued exploration.

literature review

over time, public libraries quickly and substantially increased their public-access provision (see figures 1 and 2). connectivity grew from 20.9 percent in 1994 to nearly 100 percent in 2006.4 moreover, nearly all libraries that connected to the internet offered public-access internet services. simultaneously, the average number of public-access computers grew from 1.9 per public library in 1996 to 12 per public library in 2007.5 accompanying and in support of the continual growth of basic connectivity and computing infrastructure was a demand for broadband connectivity. indeed, since 1994, connectivity has progressed from dial-up phone lines to leased lines and other forms of high-speed connectivity.
the extent of the growth in public-access services within public libraries is profound and substantive, leading to the development of new internet-based service roles for public libraries.6 public access to the internet through public libraries also provides a number of community benefits to different populations within served communities.7 overlaid onto the public-access infrastructure is an increasingly complex service mix that now includes access to digital content (e.g., databases and digital libraries), integrated library systems (ilss), voice over internet protocol (voip), digital reference, and a host of other services and resources—some for public access, others for back-office library operations. and patrons do use these services in increasing amounts—both in the library and in everyday life.8 in fact, 82.5 percent of public libraries report that they do not have an adequate number of public-access computers some or all of the time and have resorted to time limits and wireless access to extend public-access services.9 by 2007, as connectivity and public-access computer infrastructure grew, so ensued the need to provide a range of publicly available services and resources:

• 87.7 percent of public libraries provide access to licensed databases
• 83.4 percent of public libraries offer technology training
• 74.1 percent of public libraries provide e-government services (e.g., locating government information and helping patrons complete online applications)
• 62.5 percent of public libraries provide digital reference services
• 51.8 percent of public libraries offer access to e-books10

the list is not exhaustive but illustrative, since libraries do offer other services such as access to homework resources, video content, audio content, and digitized collections. as public libraries expanded these services, management realized that they needed to plan and evaluate technology-based services. over the years, a range of technology management, planning, and evaluation resources emerged to help public libraries cope with their technology-based resources—those both publicly available and for administrative operations.11 but increasingly, public libraries report the strain that pat services create. this centers on four key areas:

• maintenance and management. the necessary maintenance and management requirements of pat place an additional burden on existing staff, many of whom do not possess the technology expertise to troubleshoot, fix, and support internet-based services and resources that patrons access.
• staff. libraries consistently cite staff expertise and availability as a barrier to the addition, support, and management of pat. indeed, as described in previous sections, some libraries have experienced a decline in library staff.
• finances. there is evidence of stagnant funding for libraries at the local level as well as a shift in expenditures from staff and collections to operational costs such as utilities and maintenance.
• buildings. the buildings are inadequate in terms of space and infrastructure (e.g., wiring and cabling) to support additional public access.12

this article explores these four areas through a site-visit method in an effort to go beyond a quantitative assessment of pat within the public library community.
though related in terms of topic area and author, this study was conducted separately from the public library internet surveys conducted since 1994 and offers insights into the provision of pat services and resources that a national survey cannot explore in such depth.

figure 1. public-access internet connectivity from 1994 through 2008.

figure 2. public-access internet workstations from 1996 through 2008.

method

the researcher visited thirty-five public libraries in five geographically and demographically diverse states between october 2007 and may 2008. the states were in the west, southwest, southeast, and mid-atlantic regions. the libraries visited included urban, suburban, rural, and native american public libraries that served populations ranging from a few hundred to more than half a million. the communities that the libraries served varied in terms of poverty, race, income, age, employment, and education demographics. prior to visiting the public library sites, the researcher conducted interviews with state library agency staff to better understand the public library context within each state and to explore overall pat issues, strategies, and other factors within the state. the following research questions guided the site visits:

• what are the community and library contexts in which the library provides pat?
• what are the pat services and resources that the library makes available to its community?
• what pat services and resources does the library desire to provide to its community?
• what is the relationship between provided and desired pat and the effect on the library (e.g., staff, finances, the building, and management)?
• what are the perceived benefits that the library and its community gain through pat in the library?
• what are the issues and barriers that the library encounters in providing pat services and resources?
• how does the library manage and maintain its pat?

the researcher visited each library for four to six hours. during that time, he interviewed the library director and/or branch manager and technology support staff (either a specific library position, a designated library employee, or a city or county it staff person), toured the library facility, and conducted a brief technology inventory. at some libraries, the researcher was able to meet with community partners that in some way collaborated with the library to provide pat services and resources (e.g., educational institutions that collaborated with libraries to provide access to broadband, or volunteers who conducted technology training sessions). interviews were recorded and transcribed, and the technology inventories were entered into a microsoft excel spreadsheet for analysis. the transcripts were coded using thematic content analytic schemes to allow for the identification of key issues regarding pat areas.13 this approach enabled the researcher to use an iterative site-visit strategy that used findings from previous site visits to inform subsequent visits. to ensure valid and reliable data, the researcher used a three-stage strategy:

1. site-visit reports were completed and sent to the libraries for review. corrections from libraries were incorporated into a final site-visit report.
2. a final state-based site-visit report was compiled for distribution to state library agency staff and also incorporated their corrections. this provided a state-level reliability and validity check.
3. a summary of key findings was distributed to six experts in the public library technology environment, three of whom were public library technology managers and three of whom were technology consultants who worked with public libraries.

in combination, this approach provided three levels of data quality checks, thus providing both internal (library and state) and external (technology expert) support for the findings. the findings in this article are limited to the libraries visited and the interviews conducted with public librarians and state library agency staff. however, themes emerged early during the site-visit process and were reinforced through subsequent interviews and visits across the states and libraries visited. in addition, the use of external reviewers of the findings lends additional, but limited, support to the findings.

findings

this section presents the results of the site visits and interviews with state library agency staff and public librarians. the article presents the findings by key areas surrounding pat in public libraries.

the public-access context

public libraries have a range of pat installed in their libraries for patron use. these technologies include public-access computers, wireless (wifi) access, ilss, online databases, digital reference, downloadable audio and video, and others. many of these services and resources are also available to patrons from outside library buildings, thus extending the reach (and support issues) of the library beyond the library's walls. in addition, when libraries do not provide direct access to resources and services, they serve as access points to those services, such as online gaming and social networking. while libraries can and do deploy a number of technologies for public use, it is possible to group these technologies broadly into two overlapping categories:

• hardware. library pat hardware can include public-access computers, public-access computing registration (i.e., reservation) systems, self-checkout stations, printers, faxes, laptops, and a range of other devices and systems. some of these technologies may have additional devices, such as those required for persons with disabilities. within the hardware grouping are networking technologies that include a range of hardware and software to enable a range of library networks to run (e.g., routers, hubs, switches, telecommunications lines, and networking software).
• software. software can include device operating system software (e.g., microsoft windows, mac os, and linux), device application software (e.g., microsoft office, openoffice, graphics software, audio software, e-book readers, assistive software, and others), and functional software (e.g., web browsers, online databases, and digital reference).

in short, public libraries make use of a range of technologies that the public uses in some way. each type of technology requires skills, management, implementation, and maintenance, all of which are discussed later. in the building, all of these products and services come together at the library's public-access computers, or at the patron's mobile device if wifi is available. moreover, patrons increasingly want to use their portable devices (e.g., usb drives, ipods, and others) with library technology. this places pressure on libraries not just to offer public-access computers, but also to support a range of technologies and services.
thus the environment in which libraries offer pat is complex and requires substantial technical expertise, support, and maintenance in the key areas of applications, computers, and networking. moreover, as discussed below, patrons are increasingly demanding market-based approaches to pat. these demands—which are largely about single-point access to a range of information services and resources—are often at odds with library technologies that are based on stove-piped approaches (e.g., ils, e-books, and licensed resources) and that do not necessarily lend themselves to seamless integration.

external pressures on pat

the advent and increased use by the public of google, amazon, itunes, youtube, myspace, second life, and other networked services affects public libraries in a number of ways. this article discusses these services and resources from the perspective of an information marketplace of which the public library is one entrant. interviewed librarians overwhelmingly indicated that users now expect library services to resemble those in the marketplace. users expect the look and feel, integration, service capabilities, interactivity, and personalization and customization that they experience while engaging in social networking, online searching, online purchasing, or other online activities. and within the library building, patrons expect the services to integrate at the public-access computer entry point—not to be distributed throughout the library in a range of locations, workstations, or devices. said differently, they expect to have a "mylibrary.com" experience that allows for seamless integration across the library's services but also facilitates the use of personal technologies (e.g., ipods, mp3 players, and usb devices). thus users expect the library's services to resemble those offered by a range of information service providers.

importantly, however, librarians indicated that the library systems on which their services and resources reside by and large do not integrate seamlessly—nor were they designed to do so. public-access computers are gateways to the internet; the ils exists for patrons to search for and locate library holdings; and online databases, e-books, audiobooks, etc., are extensions of the library's holdings but are not physical items under a library's control and are thus subject to a vendor's information and business models. while library vendors and the library community are working to develop more integrated products that lead users to the information they seek, the technology is still under development. there are three significant issues that libraries face because of market pressures: (1) the pressures all come together at a single point—the public-access computer; (2) users want a customized experience while using technology designed for the general public, not the individual user; and (3) users have choices in the information marketplace. one participant indicated, "if the library cannot match what users have access to on the outside, users will and do move on."

managing and maintaining public access

managing the public-access computer environment is a growing challenge for public libraries.
there are a number of management areas with which public librarians contend:

• public-access computers: the computers and laptops (if applicable) themselves, which can include anything from keyboards and mice to troubleshooting a host of computer problems (it is important to note that these may be computers that vary in age and composition, come from a range of vendors, run different operating systems, and often have different application software versions).
• peripheral management: the printers, faxes, scanners, and other equipment that are part of the library's overall public-access infrastructure.
• public-access management software or systems: these may include online or in-building computer-based reservations (which encompass specialized reservations such as teen machines, gaming computers, computers for seniors, and so on), time management (set to the library's decided-upon time allotment), filtering, security, logins, virtual machines, etc.
• wireless access: this may include logins and configurations for patrons to gain access to the library's wireless network.
• bandwidth management: this may include the need to allocate bandwidth differently as needs increase and decrease in a typical day.
• training and patron assistance: for a vast array of services such as databases, online searching, e-government (e.g., completing government forms and seeking government information), and others. training can take place formally through classes, but also through point-of-use tutorials requested by patrons.

to some extent, librarians commented that, while they do have issues with the public-access computers themselves from time to time, the real challenges that they face regard the actual management of the public-access environment—sign-ups, time limits, cost recovery for print jobs, helping patrons, and so on. one librarian commented that "the computers themselves are pretty stable. we don't really have too many issues with them per se. it's everything that goes into, out from, or around the computer that creates issues for us." as a result of the management challenges, several libraries have adopted turn-key solutions, such as public-access management systems (e.g., comprise technology's smart access manager [http://www.comprisetechnologies.com/product_29.html]) and all-encompassing public computing management systems that include networking and desktops (e.g., userful's discoverstations [http://userful.com/libraries/]). these systems allow for an all-in-one approach to sign-up, print cost recovery, filtering (if desired), and security. the discoverstations are a linux-based, all-encompassing public-access management environment. a clear advantage of the discoverstation approach is that the discoverstation is connected to the internet and is remotely accessible by userful staff to update software and perform other maintenance functions. they also use open-source operating and application software. while these solutions do provide efficiencies, they can also create limitations. for example, the discoverstations are a thin-client system and are dependent on the server for graphics and memory, thus limiting their ability to access gaming and social-networking sites. the smart access manager, and similar programs, can rely on smart cards or other technology that users must purchase in order to print. another limitation is that the time limits are fixed, and, while users get warnings as time runs out, the session can end abruptly.
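the management functions described above (sign-up, a fixed time allotment, warnings before the session ends) can be pictured with a small sketch; this is purely illustrative and does not describe any of the vendor products named in the text.

import time
from dataclasses import dataclass

@dataclass
class Session:
    patron: str
    workstation: int
    started: float
    allotment_minutes: int = 60

    def minutes_left(self) -> float:
        elapsed = (time.time() - self.started) / 60
        return self.allotment_minutes - elapsed

class AccessManager:
    def __init__(self, workstations: int):
        self.free = set(range(workstations))
        self.sessions = {}

    def sign_up(self, patron: str, allotment_minutes: int = 60) -> Session:
        if not self.free:
            raise RuntimeError("no workstation available; add patron to wait list")
        ws = self.free.pop()
        session = Session(patron, ws, time.time(), allotment_minutes)
        self.sessions[ws] = session
        return session

    def check(self, ws: int) -> str:
        session = self.sessions[ws]
        left = session.minutes_left()
        if left <= 0:                       # fixed limits can end sessions abruptly
            self.free.add(ws)
            del self.sessions[ws]
            return "session ended"
        if left <= 5:
            return f"warning: {left:.0f} minutes remaining"
        return f"{left:.0f} minutes remaining"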
these approaches are by and large adopted by libraries to ease the management associated with public-access computers and to let staff concentrate on other duties and responsibilities. one librarian indicated that "until we had our management system, we would spend most of the day signing people up for the computers, or asking them to finish their work for the next person in line."

planning for pat services and resources

public libraries face a number of challenges when planning for pat services and resources. this is primarily because pat planning involves more than computers. any planning needs to encompass

• building needs, requirements, limitations, and design;
• technology assessment that considers the library's existing technology, technology potential, current practices, and future trends;
• planning for and supporting multiple technology platforms;
• telecommunications and networking;
• services and resources available in the marketplace—those specifically for libraries and those more broadly available to consumers and used by patrons;
• specific needs and requirements of technology (e.g., memory, disk space, training, other);
• requirements of other it groups with which the library may need to integrate, for example, city or county technology mandates;
• support needs, including the need to enter into maintenance agreements for computer, network, and other equipment and software;
• staff capabilities, such as current staff skill sets and their ability to handle the technologies under review or purchased; and
• policy, such as requirements to filter because of local, state, or federal mandates.

the above list may not be exhaustive; rather, it is based on the main items that librarians identified during the site visits, and it serves to provide indicators of the challenges faced by those planning library it initiatives.

the endless upgrade and planning

one librarian likened the pat environment to "being a gerbil on a treadmill. you go round and round and never really arrive," a reference to the fact that public libraries are in a perpetual cycle of planning and implementing various pat services and resources. either hardware needs to be updated or replaced, or there is a software update that needs to be installed, or libraries are looking to the next technology coming down the road. in short, the technology planning-to-implementation cycle is perpetual. the upgrade and replacement cycle is further exacerbated by the funding situation in which most public libraries find themselves. increasingly, public library local and state funding, which combined can account for more than 90 percent of library funding, is flat or declining.14 the most recent series of public library internet studies indicates an increase in reliance by public libraries on fees and fines, fundraising, private foundation, and grant funding to finance collections and technology within libraries.15 this places key aspects of library operations in the realm of unreliable and one-time funding sources, thus making it difficult for libraries to develop multiyear plans for pat.

multiple support models

to cope with pat management and maintenance issues, public libraries are developing various support strategies. the site visits found a number of technology-support approaches in effect, ranging from no it support to highly centralized statewide approaches. the following list describes the technology-support models encountered during the site visits:
1. no technology support. libraries in this group have neither technology-support staff nor any type of organized technology-support mechanism with existing library staff. nor do they have access to external support providers such as county or city it staff. libraries in this group might rely on volunteers or engage in ad hoc maintenance, but by and large they have no formal approach to supporting or maintaining their technology.

2. internal library support without technology staff. in this model, the library provides its own technology support but does not necessarily have dedicated technology staff. rather, the library has designated one or more staff members to serve as the it person. usually this person has an interest in technology but has other primary responsibilities within the library. there may be some structure to the support—such as updating software (e.g., windows patches) once a week at a certain time—but it may be more ad hoc in approach. also, the library may try to provide its designated it person(s) with training to develop his or her skills further over time.

3. internal library support with technology staff. in this model, the library has at least one dedicated it staff person (part- or full-time) who is responsible for maintaining and planning the library's pat environment. the person may also have responsibilities for network maintenance and a range of technology-based services and resources. at the higher end of this approach are libraries with multiple it staff with differing responsibilities, such as networking, telecommunications, public-access computers, the ils, etc. libraries at this end of the spectrum tend to have a high degree of technology sophistication but may face other challenges (i.e., staffing shortages in key areas).

4. library consortia. over the years, public libraries have developed consortia for a range of services—shared ilss, resource sharing, resource licensing, and more. as public-library needs evolve, so too do the roles of library consortia. consortia increasingly provide training and technology-support services, and may be funded through membership fees, state aid, or other sources.

5. technology partners. while some libraries may rely on consortia for their technology support, others are seeking to partner with libraries that have more technology expertise, infrastructure, and abilities. this can be a fee-for-service arrangement that may involve sharing an ils, a maintenance agreement for network and public-access computer support, and a range of services. these arrangements allow the partner libraries to have some input into the technology planning and implementation processes without incurring the full expense of testing the technologies, having to implement them first, or hiring necessary staff (e.g., to manage the ils). the disadvantage to this model is that the smaller partner libraries are dependent on the technology decisions that the primary partner makes, including upgrade cycles, technology choices, migration time frames, etc.

6. city, county, or other agency it support. as city or county government agencies, some libraries receive technology support from the city or county it department (or in some cases the education department). this support ranges from a full slate of services and support available to the library to support only for the staff network and computers. even at the higher end of the support spectrum, librarians gave mixed reviews of the support received from it agencies.
this was primarily because of competing philosophies regarding the pat environment, with public librarians wanting an open-access policy to allow users access to a range of information service and resources and it agency staff wanting to essentially lock down the public-access environment and thus severely limit the functionality of the public-access computers and network services (i.e., wireless). other limitations might include prescribed pat, specified vendors, and bidding requirements. 7. state library support. one state library visited provides a high degree of service through its statewide approach to supporting public-access computing in the state’s public libraries. the state library has it staff in five locations throughout the state to provide support on a regional level but also has additional staff in the capital. these staff offer training, inhouse technical support, phone support, and can remote access the public-access computers in public libraries to troubleshoot, update, and perform other functions. moreover, this state built a statewide network through a statewide application to the federal e-rate program, thus providing broadband to all libraries. this model extends the availability of qualified technical support staff to all public libraries in the state—by phone as well as in person if need be. as a result, this enables public libraries to concentrate on service delivery to patrons. it is important to note that there are combinations of the above models in public libraries. for example, some libraries support their public-access networks and technology while the county or city it department supports the staff network and technology. it is clear, however, that there are a number of models for technology support in public libraries, and likely more than are presented in this article. the key issue is that public libraries are engaging in a broad spectrum of strategies to support, maintain, and manage their pat infrastructure. also of significance is that there are public libraries that have no technology-support services that provide pat services and resources. these libraries tend to serve populations of less than ten thousand, are rural, have fewer than five full-time equivalents (ftes), and are unlikely to be staffed by professional librarians. staff needs and pressures the study found a number of issues related to the effect of pat on library staff. this section of the findings discusses the primary factors affecting library staff as they work in the public-access context. n multiple skills needed not only is the pace of technological change increasing, but the change requires an ever-increasing array of skills because of the complexity of applications, technologies, and services. an example of such complexity is the library opac or ils. visited libraries indicated that such systems are becoming so complex and technologically sophisticated that there is a need for a full-time staff person to run and maintain the library ils. given the range of hardware, software, and networking infrastructure, as well as planning and pat management requirements, public librarians need a number of skills to successfully implement and maintain their pat environments. moreover, the skill needs depend on the librarian’s position—for example, an actual it staff person versus a reference librarian who does double duty by serving as the library’s it person. 
the skills required fall into technology, information literacy, service and facilities planning, management, and leadership and advocacy areas: n technology o general computer troubleshooting o basic maintenance, such as mouse and keyboard cleaning o basic computer repair, such as memory replacement, floppy drive replacement, disk defragmentation, etc. o basic networking, such as troubleshooting an “internet” issue versus a computer problem o telecommunications so as to understand the design and maintenance of broadband networks o integrated library systems o web design n information literacy o searching and using internet-based resources o searching and using library licensed resources o training patrons on the use of the publicaccess computers, general internet resources, and library resources o designing curriculum for various patron training courses n services and facilities planning o technology plan development and implementation (including budgeting) o telecommunications planning (including 88 information technology and libraries | june 2009 e-rate plan and application development) o building design so as to accommodate the requirements of public access technologies n management o license and contract negotiation for licensed resources, various public-access software and licenses, and maintenance agreements (service and repair agreements) o integration of pat into library operations o troubleshooting guidelines and process o policy development, such as acceptable use, filtering, filtering removal requests by patrons, etc. n leadership and advocacy o grant writing and partnership development so as to fund pat services and resources and extend out into the community that the library serves o advocacy so as to be able to demonstrate the value of pat in the library as a community good o leadership so as to build a community approach to public access with the library as one of the foundational institutions these items provide a broad cross section of the skills that public library staff may need to offer a robust pat environment. in the case of smaller, rural libraries, these requirements in general fall to the library director—along with all other duties of running the public library. in libraries that have separate technology, collections development, and other specialized staff, the skills and expertise may be dispersed throughout various areas in the library. n training public librarians receive a range of technology training— including none at all. in some cases, this might be a basic workshop on some aspect of technology at a state library association annual meeting or a regional workshop hosted by the library’s consortium. it could be an online course through webjunction (http://www.webjunction .org/). it could be a one-on-one session with a vendor representative or colleague. or it could be a formal, multiday class regarding the latest release of an ils. if available, public librarians have access to technology training that can take many forms, has a wide array of content (basic to expert), and can enhance staff knowledge about it with varying degrees of success. an issue raised by librarians was that having access to training and being able to take advantage of training are two separate things. regardless of the training delivery medium, librarians indicated that they were not always able to get release time to attend a training session. this was particularly the case for small, rural libraries that had less than five ftes spread out over several part-time individuals. 
for these staff to take advantage of training would require a substitute to cover public-service hours—or shut down the library. funding information technology as one might expect, there was a range of technology budgets in the public libraries visited or interviewed— from no technology budget to a substantial technology budget, and many points in between. some libraries had a dedicated it budget line item, others had only an operating budget out of which they might carve some funds for technology. libraries with dedicated it budgets by and large had at least one it staff person; libraries with no it budget largely relied on a staff person responsible for other library functions to manage their technology. in the smallest libraries, the library director served as the technology specialist in addition to being the general library operation manager. some libraries have established foundations through which they can raise funds for technology, among other library needs. many seek grants and thus devote substantial effort to seeking grant initiatives and writing grant proposals. some libraries held fundraisers and worked with their library friends groups to generate funds. other libraries engage in all of the above efforts to provide for their pat infrastructure, services, and resources. in short, there are several budgetary approaches public libraries use to support their pat environment. critical to note is that a number of libraries are increasingly relying on nonrecurring funds to support pats, a fact corroborated by the 2007 and 2008 public library internet surveys.16 the buildings when one visits public libraries, one is immediately struck by the diversity in design, functionality, and architecture of the buildings. public libraries often reflect the communities that they serve not only in the collection and service, but also in the facilities. this diversity serves the public library community well because it allows for a custom approach to libraries and their community. the building design, however, can also be a source of substantial challenge for public libraries. the increased integration of technology into library service places a range of stresses on buildings—physical space for workstations and other equipment and specialized furniture, power, server rooms, and cabling, for example. along with the library-based technology requirements come those of patrons—particularly the need for power so that public access technologies in public libraries | bertot 89 patrons may plug in their laptops or other devices. also important to note is that the building limitations also extend to staff and their access to computing and networked technologies. a number of librarians commented that they are “simply at capacity.” one librarian summed it up by stating that “there’s no more room at the inn. unless we start removing parts of our collection, we don’t have any more room for workstations.” another said that, “while we do have the space to add more computers, we don’t have enough power or outlets to support them. and, with our building, it’s not a simple thing to add.” in short, many libraries are reaching, or have reached, a saturation point as to just how much pat they can support. n discussion and implications over time, pat services have become essential services that public libraries provide their communities. 
with nearly all public libraries connected to the internet and offering public-access computers, the high percentage of libraries that offer internet-based services and resources, the overall usage of these resources by the public,17 and 73 percent of public libraries reporting that they are the only free provider of pat in their communities, it is clear that the provision of pat services is a key and critical service role that public libraries offer.18 it is also clear, however, that the extent to which public libraries can continue to absorb, update, and expand their pat depends on the resolution of a number of staffing, financial, maintenance and management, and building barriers. in a time of constrained budgets, it is unlikely that libraries will receive increased operational funding. indeed, reports of library funding cuts are increasing in the current economic downturn, which affects the ability of libraries to increase, or significantly update, staff—particularly in the areas of technology, licensing additional resources, procuring additional and new computers, and purchasing and offering expanded services such as digital photography, gaming, or social networking.19 moreover, the same financial constraints can affect the ability of libraries to raise capital funds for building improvements and new construction. funding also has an effect on the training that public libraries can offer or develop for their staff. and training is becoming increasingly important to the success of pat services and resources in public libraries—but not just training regarding the latest technologies. rather, there is a need for training that provides instruction on the relationship between the level of pat services and resources a library can or desires to provide and advocacy; broadband, computing, and other needs; technology planning and management; collaboration and partnering; and leadership. the public library pat environment is complex, encompasses a number of technologies, and has ties to many community services and resources. training programs need to reflect this complexity. the continued provision of pat services in public libraries is increasingly burdensome on the public library community, and the pressures to expand their pat services and resources continues to grow—particularly as libraries report their “sole provider” of free pat status in their communities. the successful libraries in terms of pat services and resources visited had staff that could n understand pat (both in terms of functionality and potential); n think creatively across the technology and library service spectrum; n integrate online content, pat, and library services; n articulate the value of pat as an essential community need and public library service; n articulate the role of the perception of the library by its community as a critical bridge to online content; n demonstrate leadership within the community and library; n form partnerships and extend pat services and resources into the community; and n raise funds and develop other support mechanisms to enhance pat services and resources in the library and throughout the community. in short, successful pat in libraries was being redefined in the context of communitywide pat service and resource provision. this approach not only can lead to a more robust community pat infrastructure, but it also lessens the library’s burden of pat service and resource provision. 
but equally important to note is that the extent to which all public libraries can engage in these activities on their own is unclear. indeed, several libraries visited were struggling to maintain basic pat service levels and indicated that increasing pat services came at the expense of other library services. “we’re trying to meet demand,” one librarian said, “but we have too few computers, too slow a connection, and staff don’t always know what to do when things go wrong or someone comes in talking about the latest technology or website.” for some libraries, therefore, quality pat services that meet community needs are simply out of reach. thus another implication and finding of the study is the need for libraries to explore other models of support for their pat environments—for example, using the services of a regional cooperative, if available; if none is available, libraries could form their own cooperative for resource sharing, technology support, and other aspects of pat service provision. the same approach could be 90 information technology and libraries | june 2009 taken within a city or county to enhance technology support throughout a region. another approach would be to outsource a library’s pat support and maintenance to a nearby library with support staff in a fee-for-service approach. there are a number of approaches that libraries could take to support their pat infrastructure. a key point is that libraries need to consider pat service provision in a broader community, regional, or state context, and the study found some libraries doing so. the need to avail staff of the skills required to truly support pat was a recurring theme throughout the site visits. approaches and access to training varied. for example, some state libraries provided—either directly or through the hiring of consultants and instructors—a number of technology-related courses taught in regional locations. an example of this approach is california’s infopeople project (http://www.infopeople .org/). some state libraries subscribed to webjunction (http://www.webjunction.org/), which provides access to online instructional content. online manuals provided by compumentor through a grant funded by the bill and melinda gates foundation aimed at helping rural libraries support their pat (www.maintainitproject.org) are another resource. beyond technology skills training, however, is the need for technology planning, effective communication, leadership, value demonstration, and advocacy. the extent to which leadership, advocacy, and library marketing, for example, are able to be taught remains a question. all of these issues take place with the backdrop of an economic downturn and budgetary constraints. increased operating costs created through inflation and higher energy costs place substantial pressures on public libraries simply to maintain current levels of service— much less engage in the additional levels of service that the pat environment brings. indeed, as the 2008 public library funding and technology access study demonstrated, public libraries are increasingly funding their technology-based services through non-recurring funds such as fines and fundraising activities.20 thus, the ability of public libraries to provide robust pat services and resources is increasingly limited unless such service provision comes at the expense of other library services. alone, the financial pressures place a high burden on public libraries. 
combined with the building, staffing, skills, and other constraints reported by public libraries, however, the emerging picture for library pat services and resources is one of significant challenge. n three key areas for additional exploration the findings from the study point to the need for additional research and exploration of three key services areas and issues related to pat support and services: 1. develop a better understanding of success in the pat environment. this study and the 2006 study by bertot et al. point to what is required for libraries to be successful in a networked environment.21 in fact, the 2007 public libraries and the internet report contained a section entitled “the successfully networked public library,” which offered a range of checklists for public libraries (and others) to consider as they planned and implemented their networked services.22 this study identified additional success factors and considerations focused specifically on the public access technology environment. together, these efforts point to the need to better understand and articulate the critical success factors necessary for public libraries to plan, implement, and update their pat given current service contexts. this is particularly necessary in the context of meeting user expectations and needs regarding networked technologies and services. 2. further identify technology-support models. this study uncovered a number of different technologysupport models implemented by public libraries. undoubtedly there are additional models that require identification. but, more importantly, there is a need to further explore how each technologysupport model assists libraries, under what circumstances, and in what ways. some models may be more or less appropriate on the basis of the service context of the library—and that is not clearly understood at this time. 3. levels of service capabilities. an underlying theme throughout this research, and one that is increasingly supported by the public library and the internet studies, is that the pat service context is essentially a continuum from low service and capability to high service and capability. there are a number of factors contributing to where libraries may lie on the success continuum—funding, management, leadership, attitude, skills, community support, and innovation, to name a few. this continuum requires additional research, and the research implications could be profound. emerging data indicate that there are public libraries that will be unable to continue to evolve and meet the increased demands of the networked environment, both in terms of staff and infrastructure. public libraries will have to make choices regarding the provision of pat services and resources in light of their ability to provide high-quality services (as defined by their service communities). for better or worse, the technology environment continually evolves and requires new technologies, management, and support. that is, public access technologies in public libraries | bertot 91 and will continue to be, the nature of public access to the internet. though there are likely other issues worthy of exploration, these three are critical to further our understanding of the pat environment and public library roles and issues associated with the provision of public access. n conclusion the pat environment in which public libraries operate is increasingly complex and continues to grow in funding, maintenance and management, staffing, and building demands. 
public libraries have navigated this environment successfully for more than fifteen years; however, stresses are now evident. libraries rose quickly to the challenge of providing public-access services to the communities that they serve. the challenges libraries face are not necessarily insurmountable, and there is a range of tools designed to help public libraries plan and manage their public-access services. these tools, however, place the burden of public access, or assume that the burden of public access is placed, on the public library. given increased operating costs because of inflation, the continual need to innovate and upgrade technologies, staff technology skills requirements, and other factors discussed in this article, libraries may not be in a position to shoulder the burden of public access alone. thus there is a need to reconsider the extent to which pat provision is the sole responsibility of the library; perhaps there is a need to integrate and expand public access throughout a community. such an approach can benefit a community through an integrated and broader access strategy, and it can also relieve the pressure on the public library as the sole provider of public access. acknowledgement this research was made possible in part through the support of the maintainit project (http://www.maintainitproject.org/), an effort of the nonprofit techsoup web resource (http://www.techsoup.org/). references 1. charles r. mcclure, john carlo bertot, and douglas l. zweizig, public libraries and the internet: study results, policy issues, and recommendations (washington, d.c.: national commission on libraries and information science, 1994). 2. john carlo bertot and charles r. mcclure, moving toward more effective public internet access: the 1998 national survey of public library outlet internet connectivity (washington, d.c.: national commission on libraries and information science, 1998), http://www.liicenter.org/reports/1998_plinternet_study.pdf (accessed apr. 22, 2009). 3. charles r. mcclure, john carlo bertot, and john c. beachboard, internet costs and cost models for public libraries (washington, d.c.: national commission on libraries and information science, 1995). 4. charles r. mcclure, john carlo bertot, and douglas l. zweizig, public libraries and the internet: study results, policy issues, and recommendations (washington, d.c.: national commission on libraries and information science, 1994); john carlo bertot, charles r. mcclure, paul t. jaeger, and joe ryan, public libraries and the internet 2006: study results and findings (tallahassee, fla.: information institute, 2006), http://www.ii.fsu.edu/projectfiles/plinternet/2006/2006_plinternet.pdf (accessed mar. 5, 2009). 5. john carlo bertot, charles r. mcclure, carla b. wright, elise jensen, and susan thomas, public libraries and the internet 2007: study results and findings (tallahassee, fla.: information institute, 2008), http://www.ii.fsu.edu/projectfiles/plinternet/2007/2007_plinternet.pdf (accessed sept. 10, 2008). 6. charles r. mcclure and paul t. jaeger, public libraries and internet service roles: measuring and maximizing internet services (chicago: ala, 2008). 7. george d'elia, june abbas, kay bishop, donald jacobs, and eleanor jo rodger, "the impact of youth's use of the internet on the use of the public library," journal of the american society for information science & technology 58, no. 14 (2007): 2180–96; george d'elia, corinne jorgensen, joseph woelfel, and eleanor jo rodger, "the impact of the internet on public library use: an analysis of the current consumer market for library and internet services," journal of the american society for information science & technology 53, no. 10 (2002): 802–20. 8. national center for education statistics (nces), public libraries in the united states: fiscal year 2005 [nces 2008-301] (washington, d.c.: national center for education statistics, 2007); pew internet and american life project, "internet activities," http://www.pewinternet.org/trends/internet_activities_2.15.08.htm (accessed mar. 5, 2009). 9. bertot et al., public libraries and the internet 2007. 10. ibid. 11. cheryl bryan, managing facilities for results: optimizing space for services (chicago: public library association, 2007); joseph matthews, strategic planning and management for library managers (westport, conn.: libraries unlimited, 2005); joseph matthews, technology planning: preparing and updating a library technology plan (westport, conn.: libraries unlimited, 2004); diane mayo and jeanne goodrich, staffing for results: a guide to working smarter (chicago: public library association, 2002). 12. ala, libraries connect communities: public library funding & technology access study (chicago: ala, 2008), http://www.ala.org/ala/aboutala/offices/ors/plftas/0708report.cfm (accessed mar. 5, 2008). 13. charles p. smith, ed., motivation and personality: handbook of thematic content analysis (new york: cambridge univ. pr., 1992); klaus krippendorf, content analysis: an introduction to its methodology (beverly hills, calif.: sage, 1980). 14. ala, libraries connect communities. 15. bertot et al., public libraries and the internet 2006; bertot et al., public libraries and the internet 2007. 16. ibid. 17. nces, public libraries in the united states. 18. bertot et al., public libraries and the internet 2007. 19. american libraries, "branch closings and budget cuts threaten libraries nationwide," nov. 7, 2008, http://www.ala.org/ala/alonline/currentnews/newsarchive/2008/november2008/branchesthreatened.cfm (accessed nov. 17, 2008). 20. ala, libraries connect communities. 21. bertot et al., public libraries and the internet 2006. 22. bertot et al., public libraries and the internet 2007. a comparative analysis of the effect of the integrated library system on staffing models in academic libraries ping fu and moira fitzgerald information technology and libraries | september 2013 abstract this analysis compares how the traditional integrated library system (ils) and the next-generation ils may impact system and technical services staffing models at academic libraries. the method used in this analysis is to select two categories of ilss—two well-established traditional ilss and three leading next-generation ilss—and compare them by focusing on two aspects: (1) software architecture and (2) workflows and functionality. the results of the analysis suggest that the next-generation ils could have substantial implications for library systems and technical services staffing models in particular: library staffing models could be redesigned and key librarian and staff positions redefined to meet the opportunities and challenges brought on by the next-generation ils. introduction today, many academic libraries are using well-established traditional integrated library systems (ilss) built on the client-server computing model.
the client-server model aims to distribute applications that partition tasks or workloads between the central server of a library automation system and all the personal computers throughout the library that access the system. the client applications are installed on the personal computers and provide a user-friendly interface to library staff. however, this model may not significantly reduce workload for the central servers and may increase overall operating costs because of the need to maintain and update the client software across a large number of personal computers throughout the library.1 since the global financial crisis, libraries have been facing severe budget cuts, while hardware maintenance, software maintenance, and software licensing costs continue to rise. the technology adopted by the traditional ils was developed more than ten years ago and is evidently outdated. the traditional ils does not have sufficient capacity to provide efficient processing for meeting the changing needs and challenges of today's libraries, such as managing a wide variety of licensed electronic resources and collaborating, cooperating, and sharing resources with different libraries.2 ping fu (pingfu@cwu.edu), a lita member, is associate professor and head of technology services in the brooks library, central washington university, ellensburg, wa. moira fitzgerald (moira.fitzgerald@yale.edu), a lita member, is access librarian and assistant head of access services in the beinecke rare book and manuscript library, yale university, new haven, ct. today's libraries manage a wide range of licensed electronic resource subscriptions and purchases. the traditional ils is able to maintain the subscription records and payment histories but is unable to manage details about trial subscriptions, license negotiations, license terms, and use restrictions. some vendors have developed electronic resources management system (erms) products as standalone products or as fully integrated components of an ils. however, it would be more efficient to manage print and electronic resources using a single, unified workflow and interface. to reduce costs, today's libraries not only band together in consortia for cooperative resource purchasing and sharing, but often also want to operate one "shared ils" for managing, building, and sharing the combined collections of members.3 such consortia are seeking a new ils that exceeds traditional ils capabilities and uses new methods to deliver improved services. the new ils should be more cost effective, should provide prospects for cooperative collection development, and should facilitate collaborative approaches to technical services and resource sharing. one example of a consortium seeking a new ils is the orbis cascade alliance, which includes thirty-seven universities, colleges, and community colleges in oregon, washington, and idaho. as a response to this need, many vendors have started to reintegrate or reinvent their ilss.
library communities have expressed interest in the new characteristics of these next-generation ilss; their ability to manage print materials, electronic resources, and digital materials within a unified system and a cloud-computing environment is particularly welcome.4 however, one big question remains for libraries and librarians, and that is what implications the next-generation ils will have for libraries' staffing models. little on this topic has been presented in the library literature. this comparative analysis intends to answer this question by comparing the next-generation ils with the traditional ils from two perspectives: (1) software architecture, and (2) workflows and functionality, including the capacity to facilitate collaboration between libraries and engage users. scope and purpose the purpose of the analysis is to determine what potential effect the next-generation ils will have on library systems and technical services staffing models in general. two categories of ilss were chosen and compared. the first category consists of two major traditional ilss: ex libris's voyager and innovative interfaces' millennium. the second category includes three next-generation ilss: ex libris's alma, oclc's worldshare management services (wms), and innovative interfaces' sierra. voyager and millennium were chosen because they hold a large portion of the current market share and because the authors have experience with these systems. yale university library is currently using voyager, while central washington university library is using millennium. alma, wms, and sierra were chosen because these three next-generation ilss are produced by market leaders in the library automation industry. the authors have learned about these new products by reading and analyzing literature and vendors' proposals, as well as by attending vendors' webinars and product demonstrations. in the long run, yale university library must look for a new library service platform to replace voyager, verde, metalib, sfx, and other add-ons. central washington university library is affiliated with the orbis cascade alliance mentioned above. the alliance is implementing a new library management service to be shared by all thirty-seven members of the consortium. ex libris, innovative interfaces, oclc, and serials solutions all bid for the alliance's shared ils. after an extensive rfp process, in july 2012 the orbis cascade alliance decided to choose ex libris's alma and primo as their shared library services platform. the system will be implemented in four cohorts of approximately nine member libraries each over a two-year period, beginning in january 2013. the central washington university library is in the fourth migration cohort, and its new system will be live in december 2014. it is important to emphasize that the next-generation ils has no local online public access catalog (opac) interface. vendors use additional discovery products as the discovery-layer interfaces for their next-generation ilss. specifically, ex libris uses primo as the opac for alma, while oclc's worldcat local provides the front-end interface for wms. innovative interfaces offers encore as the discovery layer for sierra. as front-end systems, these discovery platforms provide library users with one-stop access to their library resources, including print materials, electronic resources, and digital materials.
while these discovery platforms will also impact library organization and librarianship, they will have more impact on the way that end-users, rather than library staff, discover and interact with library collections. in this analysis, we focus on the effects that back-end systems such as alma, wms, and sierra will have on library organizational structure and staffing, rather than the end-user experience. as our sample only includes five ilss, the scope of the analysis is limited, and the findings cannot be universal or extended to all academic libraries. however, readers will gain some insight into what challenges any library may face when migrating to a next-generation ils. literature review a few studies have been published on library staffing models. patricia ingersoll and john culshaw's 2004 book about systems librarianship describes vital roles that systems librarians play, with responsibilities in the areas of planning, staffing, communication, development, service and support, training, physical space, and daily operations.5 systems librarians are the experts who understand both library and information technology and can put the two fields together in context. they point out that systems librarians are the key players who ensure that a library stays current with new information technology. the daily and periodic operations for systems librarians include ils administration, server management, workstation maintenance, software and applications maintenance and upgrades, configuration, patch management, data backup, printing issues, security, and inventory. all of these duties together constitute the workloads of systems librarians. ingersoll and culshaw also emphasize that systems librarians must be proactive in facing constant changes and keep abreast of emerging library technologies. edward iglesias et al., based on their own experiences and observations at their respective institutions, studied the impact of information technology on systems staff.6 their book covers concepts such as the client-server computing model, web 2.0, electronic resource management, open-source, and emerging information technologies. their studies show that, though there are many challenges inherent in the position, there are also many ways for systems staff to improve their knowledge, skills, and abilities to adapt to the changing information technologies. janet guinea has also studied the roles of systems librarians at an academic library.7 her 2003 study shows that systems librarians act as bridge-builders between the library and other university units in the development of library-initiated projects and in the promotion of information technology-based applications across campus. another relevant study was conducted by marshall breeding at vanderbilt university in an investigation of the library automation market. his 2012 study compares the well-established, traditional ilss that dominate the current market (and are based on client-server computing architecture developed more than a decade ago) to the next-generation ilss deployed through multitenant software-as-a-service (saas) models, which are based on service-oriented architecture (soa).8 through this comparison, breeding indicates that next-generation ilss will differ substantially from existing traditional ilss and will eliminate many hardware and maintenance investments for libraries.
the next-generation ils will bring traditional ils functions, erms, digital asset management, link resolvers, discovery layers, and other add-on products together into one unified service platform, he argues.9 he gave the next-generation ils a new term, library services platform.10 this term signifies that a conceptual and technical shift is happening: the next-generation ils is designed to realign traditional library functions and simplify library operations through a more inclusive platform designed to handle different forms of content within a unified single interface. breeding’s findings conclude that the next-generation ils provides significant innovations, including management of print and electronic library materials, reliance on global knowledge bases instead of localized databases, deployment through multitenant saas based on a service-oriented architecture, and the provision of a suite of application programming interfaces (apis) that enable greater interoperability and extensibility.11 he also predicts that the next-generation ils will trigger a new round of ils migration.12 method our method narrowed down the analysis for the implications of ilss on library systems and technical services staffing models to two major aspects: (1) software architecture, and (2) workflows and functionality, including facilitation of collaborations between libraries and user engagement. first, we analyzed two traditional ilss, voyager and millennium, which are built on a client-server computing model, deliver modular workflow functionality, and are implemented in our institutions. through the analysis, we determined how these two aspects affect library organizational structure and librarian positions designed for managing these modular tasks. then, information technology and libraries | september 2013 51 based on information we collected and grouped from vendors’ documents, rfp responses, product demonstrations, and webinars, we examined the next-generation ilss alma, wms, and sierra— which are based on soa and intended to realign traditional library functions and simplify library operations—to evaluate how these two factors will impact staffing models. to provide a more in-depth analysis, particularly for systems staffing models, we also gathered and analyzed online systems librarian job postings, particularly for managing the voyager or millennium system, for the past five years. the purpose of this compilation is to cull a list of typical responsibilities of systems librarians and then determine what changes may occur when they must manage a next-generation ils such as alma, wms, or sierra. data on job postings were gathered from online job banks that keep an archive of past listings, including code4lib jobs, ala joblist, and various university job listing sites. duplicates and reposts were removed. the responsibilities and duties described in the job descriptions were examined for similarities to determine a typical list. the data from all sources were gathered together in a single database to facilitate its organization and manipulation. specific responsibilities, such as administering an ils, were listed individually, while more general responsibilities for which descriptions may vary from one posting to another were grouped under an appropriate heading. to ensure complete coverage, all postings were examined a second time after all categories had been determined. we also used our own institutions as examples to support the analysis. 
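to make the grouping and tallying step just described concrete, the toy sketch below maps free-text responsibilities from a handful of invented job postings onto broader headings using simple keyword rules and counts the results. the postings, the keyword rules, and the heading names are illustrative stand-ins only; they are not the coding scheme or data actually used in this analysis, whose categories were derived from the 47 postings themselves and confirmed on a second pass.

```python
from collections import Counter

# Toy illustration of the grouping step: free-text duties pulled from postings
# are mapped onto broader headings by simple keyword rules and then tallied.
# All postings, rules, and headings below are invented for illustration only.
postings = [
    ["administer the integrated library system", "maintain library servers", "train staff"],
    ["manage the ILS and discovery layer", "apply security patches", "serve on committees"],
    ["coordinate system upgrades", "back up databases", "provide reference service"],
]

rules = {
    "ils administration": ("integrated library system", "ils"),
    "server and database maintenance": ("server", "patch", "back up", "upgrade"),
    "training and service": ("train", "committee", "reference"),
}

def categorize(duty):
    """Return the first heading whose keywords appear in the duty text."""
    text = duty.lower()
    for heading, keywords in rules.items():
        if any(keyword in text for keyword in keywords):
            return heading
    return "other"

tally = Counter(categorize(duty) for posting in postings for duty in posting)
for heading, count in tally.most_common():
    print(f"{heading}: {count}")
```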
the implications of ils software architecture on staffing models voyager and millennium are built on client-server architecture. libraries that use these ilss also use add-ons, such as erms and link resolvers, to manage their print materials and licensed electronic resources. the installation, configuration, and updates of the client software require a significant amount of work for library it staff. many libraries must allocate substantial staff effort and resources to coordinating the installation of the new software on all computers throughout the library that access the system. those libraries that allow staff to work remotely have experienced additional costs and it challenges. in addition, server maintenance, backups, upgrades, and disaster recovery also require excessive time and effort of library it staff. administering ilss, erms, and other library hardware, software, and applications is one of the primary responsibilities for a library systems department. positions such as systems librarian, electronic resource librarian, and library it specialist were created to handle this complicated work. at a very large library, such as yale university library, the systems group of library it is only responsible for voyager's configuration, operation, maintenance, and troubleshooting. two other it support groups—a library server support group and a workstation support group—are responsible for installation, maintenance, and upgrade of the servers and workstations. specifically, the library server support group deals with the maintenance and upgrade of ils servers and the software and relational database running on the servers, while the workstation support group takes care of the installation and upgrade of the client software on hundreds of workstations throughout twenty physical libraries. at a smaller library, such as central washington university library, on the other hand, one systems librarian is responsible for the administration of millennium, including configuration, maintenance, backup, and upgrade on the server. another library it staff member helps install and upgrade the millennium client on about forty-five staff computers throughout its main library and two center campus libraries. comparatively, the next-generation ilss alma, wms, and sierra have a saas model designed by soa principles and deployed through a cloud-based infrastructure. oclc defines this model as "web-scale management services."13 using this innovation, service providers are able to deliver services to their participating member institutions on a single, highly scalable platform, where all updates and enhancements can be done automatically through the internet. the different participating member institutions using the service can configure and customize their views of the application with their own brandings, color themes, and navigational controls. the participating member institutions are able to set functional preferences and policies according to their local needs. web-scale services reduce the total cost of ownership by spreading infrastructure costs across all the participating member institutions. the service providers have complete control over hardware and software for all participating member institutions, dramatically eliminating capital investments on local hardware, software, and other peripheral services.
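as a concrete illustration of the kind of routine, server-side chore that the locally hosted systems described above impose on library it staff—and that a vendor-hosted, web-scale deployment takes off their hands—the sketch below shows a nightly backup job with simple rotation. the paths and retention window are hypothetical, and this is a generic sketch rather than a procedure documented by any ils vendor.

```python
import datetime
import pathlib
import shutil

# Hypothetical paths and retention window; a sketch of a nightly chore,
# not a vendor-documented procedure for any particular ILS.
DATA_DIR = "/var/lib/ils/data"          # where the locally hosted ILS keeps its files
BACKUP_DIR = pathlib.Path("/backups/ils")
KEEP = 14                                # retain two weeks of nightly archives

def nightly_backup():
    BACKUP_DIR.mkdir(parents=True, exist_ok=True)
    stamp = datetime.date.today().isoformat()
    # create /backups/ils/ils-YYYY-MM-DD.tar.gz from the data directory
    archive = shutil.make_archive(str(BACKUP_DIR / f"ils-{stamp}"), "gztar", DATA_DIR)
    # prune the oldest archives beyond the retention window
    archives = sorted(BACKUP_DIR.glob("ils-*.tar.gz"))
    for old in archives[:-KEEP]:
        old.unlink()
    return archive

if __name__ == "__main__":
    print("wrote", nightly_backup())
```

under a multitenant saas deployment, scheduling, verifying, and restoring from this sort of job becomes the provider's responsibility rather than the local systems staff's.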
service providers can centrally implement applications and upgrades, integration across services, and system-wide infrastructure requirements such as performance reliability, security, privacy, and redundancy. thus participating member institutions are relieved from this burdensome responsibility that has traditionally been undertaken by their it staff.14 from this perspective, the next-generation ils will have a huge impact on library organizational structure, staffing, and librarianship. since the next-generation ils is implemented through the cloud-computing model, there is no requirement for local staff to perform the functions traditionally defined as "systems" staff activities, such as server and storage administration, backup and recovery administration, and server-side network administration. for example, the entire interfaces of alma and wms are served via web browser; there is no need for local staff to install and maintain clients on local workstations. therefore, if an institution decided to migrate to a next-generation ils, the responsibilities and roles of systems staff within the institution would need to be readdressed or redefined. we have learned from attending oclc's webinars and product demonstrations that library systems staff would be required to prepare and extract data from their local systems during new systems implementation. they also would be required to configure their own settings such as circulation policies. however, after the migration, a systems staff member would likely serve as a liaison with the vendor. this would require, according to oclc's proposal, only 10 percent of the systems staff's time on an ongoing basis. through attending ex libris's webinars and product demonstrations, we have learned that a local system administrator may be required to take on basic management processes, such as record-loading or integrating data from other campus systems. similarly, we have learned from innovative interfaces' webinars and product demonstrations that sierra would still need local systems expertise to perform the installations of the client software on staff workstations. sierra would require library it staff to perform administrative tasks like the user account administration and to support sierra in interfacing with local institution-specific resources. in general, as shown in table 1, local systems staff could be freed from the burdensome responsibility of administering the traditional ils because of the software architecture of the next-generation ils.

table 1. systems librarian responsibilities comparison for traditional ils and next-generation ils.
systems librarian responsibility | workload percentage | traditional ils | next-generation ils
managing ils applications, including modules and the opac | 10 | x |
managing associated products such as discovery systems, erms, link resolver, etc. | 10 | x |
day-to-day operations, including management, maintenance, troubleshooting, and user support | 10 | x | x
server maintenance, database maintenance, and backup | 10 | x |
customizations and integrations | 5 | x | x
configurations | 5 | x | x
upgrades and enhancements | 5 | x |
patches or other fixes | 5 | x |
design and coordination of statistical and managerial reports | 5 | x | x
overall staff training | 5 | x | x
primary representative and contact to the designated library system vendors | 5 | x | x
keeping abreast of developments in library technologies to maintain current awareness of information tools | 5 | x | x
engaging in scholarly pursuit and other professional activities | 10 | x | x
serving on various teams and committees | 5 | x | x
reference and instruction | 5 | x | x
total | 100 | 100% | 60%

note: the systems librarian responsibilities and the approximate percentage of time devoted to each function are slightly readjusted based on the compiled descriptions of the systems librarian job postings we collected and analyzed from the internet and from vendors' claims. a total of 47 position descriptions were gathered. the workload percentage is adopted from the job description of the systems librarian position at one of our institutions.

our analysis shows that systems staff might reduce their workload by approximately 40 percent. therefore library systems staff could use their time to focus on local applications development and other library priority projects. however, it is important to emphasize that library systems staff should reengineer themselves by learning how to use apis provided by the next-generation ils so that they will be able to support the customization of their institutions' discovery interfaces and the integration of the ils with other local enterprise systems, such as financial management systems, learning management systems, and other local applications. the implications of ils workflows and functionality on staffing models the typical workflow and functionality of both voyager and millennium are built on a modular structure. major function modules, called client modules, include systems administration, cataloging, acquisitions, serials, circulation, and statistics and reports. additionally, the traditional ils provides an opac interface for library patrons to access library materials and manage their accounts. millennium has an erms module built in as a component of its ils, while ex libris has developed an independent erms as an add-on to voyager. the systems administration module is used to add system users and to set up locations, patron types, material types, and other library policies. the cataloging module supports the functions of cataloging resources, managing the authority files, tagging and categorizing content, and importing and exporting bibliographic records. the sophistication of the cataloging module depends primarily on the ils. the acquisitions module helps in the tracking of purchases and acquisition of materials for a library by facilitating ordering, invoicing, and data exchange with serial, book, and media vendors through electronic data interchange (edi). the circulation module is used to set up rules for circulating materials and for tracking those materials, allowing the library to add patrons, issue borrowing cards, and form loan rules.
it also automates the placing of holds, interlibrary loan (ill), and course reserves. self-checkout functionality can be integrated as well. the serials module is essentially a cataloging module for serials. libraries are often dependent on the serials module to help them track and check-in serials. the statistics and reports module is used to generate reports such as circulation statistics, age of collection, collection development, and other customized statistical reports. a typical traditional ils comprises a relational database, software to interact with that database, and two graphical user interfaces—one for patrons and one for staff. it usually separates software functions into discrete modules, each of them integrated with a unified interface. the traditional ils’s modular design was a perfect fit for a traditional library organizational structure. the staff at central washington university library, for example, under the library administration, are organized into the following three major groups: public services, including the reference and circulation departments; technical and technology services, including the cataloging, collection development, serials & electronic resource, and systems departments; and information technology and libraries | september 2013 55 other library services and centers, including the government documents department, the music library, two center campus libraries, the academic and research commons, and the rare book collection & archive. each department has at least one professional librarian and other library staff members responsible for their daily operations. for example, the collection development librarian is responsible for the acquisition of print monographs and serials, while the electronic resource librarian is responsible for purchasing and managing licensed databases or e-journals. however, the next-generation ils significantly enhances and reintegrates the workflow of traditional ils functions. the functionality is quite different from the traditional ils’s modular structure. the design of the functionality stresses two principles: modularity and extensibility. it brings together the selection, acquisition, management, and distribution of the entire library collection. it provides a centralized data-services environment to its unified workflows for all types of library assets. one of the big enhancements of the next-generation ils is the acquisitions module, which enables the management of both print and electronic materials within a single unified interface, with no need to move between modules or multiple systems for different formats and related activities. for example, according to oclc, wms streamlines selection and acquisition processes via built-in access to worldcat records and publisher data. vendor, local, consortium, and global library data share the same workflows. wms automatically creates holdings for both physical and electronic resources. the worldcat knowledge-base simplifies electronic resource management and delivery. order data from external systems can be automatically uploaded. for consortium users, wms’s unified workflow and interface fosters efficient resource-sharing between different institutions whose holdings share a common format. similarly, ex libris’s alma has an integrated central knowledge base (ckb) that describes available electronic resources and packages, so there is no need to load additional descriptive records when acquiring electronic resources based on the ckb. 
the purchasing workflow manages orders for both print and electronic resources in a very similar way and handles some aspects unique to electronic resources, such as license management and the identification of an access provider. staff users can start the ordering process by searching the ckb directly and ordering from there. this search is integrated into the repository search, allowing a staff user to perform searches both in his or her institution and in the community zone, which holds the ckb. the next-generation ils provides unified data services and workflows, and a single interface to manage all physical, electronic, and digital materials. this will require libraries to rethink their acquisitions staffing models. for example, small libraries could merge the acquisition librarian position and the electronic resource librarian position or reorganize the two departments. another functionality enhancement of the next-generation ils provides the authoritative ability for consortia users to manage local holdings and collections as well as shared resources. for example, wms's single shared knowledge base eliminates the need for each library to maintain a copy of a knowledge base locally, because all consortia members can easily see what is licensed by other members of the consortia. cataloging records are shared at the consortium and global levels in real time. each institution immediately benefits from original cataloging records added to the system and from enhancements to existing records. authority control is built into worldcat, so there is no need to do authority processing against local bibliographic databases. with real-time circulation between libraries' collections, there is no need to re-create bibliographic and item data in separate local systems. similarly, sierra enhances the traditional technical services workflows by providing a shared bibliographic database. whenever a member library performs selection or ordering, the library is able to determine if other consortia members have already selected, ordered, and cataloged the title. this may impact a local selection, allowing consortia members to more collectively develop their individual collections and reduce duplication. alma's centralized metadata management service (mms) takes a very similar approach to wms and sierra, allowing several options for local control and shared cataloging, depending on an institution's needs, while ex libris maintains authority files. very large institutions, for example, might manage some records in the local catalog and most records in a shared bibliographic database, while smaller institutions might manage all of their records in the shared bibliographic database. all these approaches require more collaboration and cooperation between consortia members. according to vendors' claims in their proposals to the orbis cascade alliance, small institutions might not need to have a professional cataloger, since the cataloging process is simplified and it is therefore easier for paraprofessional staff to operate and copy bibliographic records from the knowledge bases of these ilss. in addition, the next-generation ils also allows library users to actively engage with ils software development.
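before turning to the specific vendor mechanisms, the sketch below gives a generic picture of what such engagement can look like in practice: a short script that pulls item availability from a web-service endpoint so it could be surfaced in a campus portal or a small staff-facing gadget. the base url, the /items path, the response fields, and the bearer-token header are invented placeholders for illustration only; they are not the documented api of alma, wms, or sierra.

```python
import json
import urllib.request

# Everything below is hypothetical: the base URL, the /items endpoint, the
# response fields, and the API key are stand-ins, not any vendor's actual API.
BASE_URL = "https://ils.example.edu/api/v1"
API_KEY = "replace-with-a-real-key"

def get_item_availability(barcode):
    """Fetch availability for one item so another campus system can display it."""
    request = urllib.request.Request(
        f"{BASE_URL}/items/{barcode}",
        headers={"Authorization": f"Bearer {API_KEY}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        record = json.load(response)
    return {
        "title": record.get("title"),
        "location": record.get("location"),
        "status": record.get("status", "unknown"),
    }

if __name__ == "__main__":
    print(get_item_availability("39076001234567"))
```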
for example, by adding opensocial containers to the product, wms allows library developers to use api to build social applications called gadgets and add these gadgets to wms. one example highlighted by oclc is a gadget in the acquisitions area of wms that will show the latest new york times best sellers and how many copies the library has available for each of those titles. similarly, sierra’s open developer community will allow library developers to share ideas, reference code samples, and build a wide range of applications using sierra’s web services. also, sierra will provide a centralized online resource called sierra developer sandbox to offer a comprehensive library of documented apis for library-developed applications. all these enhancements provide library staff with new opportunities to redefine their roles in a library. conclusions and arguments in summary, compared to the client-server architecture and modular design of the traditional ils, the next-generation ils has an open architecture and is more flexible and unified in its workflow and interface, which will have a huge impact on library staffing models. the traditional ils specifies clear boundaries between staff modules and workflows while the next-generation ils has blurred these boundaries. the integration and enhancement of the functionality of the nextgeneration ils will help libraries streamline and automate workflows and processes for managing both print and electronic resources. it will increase libraries’ operational efficiency, reduce the information technology and libraries | september 2013 57 total cost of ownership, and improve services for users. particularly, it will free approximately 40 percent of library systems staff time from managing servers, software upgrades, client application upgrades, and data backups. moreover, the next-generation ils provides a new way for consortial libraries to collaborate, cooperate, and share resources. in addition, the web-scale services provided by the next-generation ils allow libraries to access an infrastructure and platforms that enable them to reach a broad, geographically diverse community while simultaneously focusing their services on meeting the specific needs of their end-users. thus the more integrated workflows and functionality allow library staff to work with more modules, play multiple roles, and back up each other, which will bring changes to traditional staffing models. however, the next-generation ils also brings libraries new challenges along with its clear advantages. librarians and library staff might have concerns pertaining to their job security and can be fearful of new technologies. they may feel anxious about how to reengineer their business processes, how to get training, how to improve their technological skills, and how to prepare for a transition. we argue here that library directors might think about these staff frustrations and find ways to address their concerns. libraries should provide staff more opportunities and training to help them to improve their knowledge and skills. redefining job descriptions and reorganizing library organizational structures might be necessary to better adapt to the changes brought about by the next-generation ils. systems staff might invest more time in local application developments, other digital initiatives, website maintenance, and other library priority projects. technical staff might reconsider their workflows and cross-train themselves to expand their knowledge and improve their work efficiency. 
they might spend more time on data quality control and special collection development or interact more with faculty on book and e-resource selections. we hope this analysis will provide some useful information and insights for those libraries planning to move to the next-generation ils. the shift will require academic libraries to reconsider their organizational structures and rethink their manpower distribution and staffing optimization to better focus on library priorities, projects, and services critical to their users.

references
1. marshall breeding, “a cloudy forecast for libraries,” computers in libraries 31, no. 7 (2011): 32–34.
2. marshall breeding, “current and future trends in information technologies for information units,” el profesional de la información 21, no. 1 (2012): 11.
3. jason vaughan and kristen costello, “management and support of shared integrated library systems,” information technology & libraries 30, no. 2 (2011): 62–70.
4. marshall breeding, “agents of change,” library journal 137, no. 6 (2012): 30–36.
5. patricia ingersoll and john culshaw, managing information technology: a handbook for systems librarians (westport, ct: libraries unlimited, 2004).
6. edward g. iglesias, an overview of the changing role of the systems librarian: systemic shifts (oxford, uk: chandos, 2010).
7. janet guinea, “building bridges: the role of the systems librarian in a university library,” library hi tech 21, no. 3 (2003): 325–32.
8. breeding, “agents of change,” 30.
9. ibid.
10. ibid., 33.
11. ibid., 33.
12. ibid., 30.
13. sally bryant and grace ye, “implementing oclc’s wms (web-scale management services) circulation at pepperdine university,” journal of access services 9, no. 1 (2012): 1.
14. gary garrison et al., “success factors for deploying cloud computing,” communications of the acm 55, no. 9 (2012): 62–68.

technology skills in the workplace: information professionals’ current use and future aspirations
monica maceli and john j. burke
information technology and libraries | december 2016

abstract
information technology serves as an essential tool for today’s information professional, and ongoing research is needed to assess the technological directions of the field over time. this paper presents the results of a survey of the technologies used by library and information science practitioners, with attention to the combinations of technologies employed and the technology skills that practitioners wish to learn. the most common technologies employed were email, office productivity tools, web browsers, library catalog- and database-searching tools, and printers, with programming topping the list of most-desired technology skills to learn. similar technology usage patterns were observed for early and later-career practitioners. findings also suggested the relative rarity of emerging technologies, such as the makerspace, in current practice.

introduction
over the past several decades, technology has rapidly moved from a specialized set of tools to an indispensable element of the library and information science (lis) workplace, and today it is woven throughout all aspects of librarianship and the information professions.
information professionals engage with technology in traditional ways, such as working with integrated library systems, and in new innovative activities, such as mobile-app development or the creation of makerspaces.1 the vital role of technology has motivated a growing body of research literature, exploring the application of technology tools in the workplace, as well as within lis education, to effectively prepare tech-savvy practitioners. such work is instrumental to the progression of the field, and with the rapidly-changing technological landscape, requires ongoing attention from the research community. one of the most valuable perspectives in such research is that of the current practitioner. understanding current information professionals’ technology use can help in understanding the role and shape of the lis field, provide a baseline for related research efforts, and suggest future monica maceli (mmaceli@pratt.edu) is assistant professor, school of information, pratt institute, new york. john j. burke (burkejj@miamioh.edu) is library director and principal librarian, gardner-harvey library, miami university middletown, middletown, ohio. technology skills in the workplace: information professionals’ current use and future aspirations | maceli and burke | https://doi.org/10.6017/ital.v35i4.9540 36 directions. the practitioner perspective is also valuable in separating the hype that often surrounds emerging technologies from the reality of their use and application within the lis field. this paper presents the results of a survey of lis practitioners, oriented toward understanding the participants’ current technology use and future technology aspirations. the guiding research questions for this work are as follows: 1. what combinations of technology skillsets do lis practitioners commonly use? 2. what combinations of technology skillsets do lis practitioners desire to learn? 3. what technology skillsets do newer lis practitioners use and desire to learn as compared to those with ten-plus years of experience in the field? literature review the growth and increasing diversity of technologies used in library settings has been matched by a desire to explore how these technologies impact expectations for lis practitioner skill sets. triumph and beile examined the academic library job market in 2011 by describing the required qualifications for 957 positions posted on the ala joblist and arl job announcements websites.2 the authors also compared their results with similar studies conducted in 1996 and 1988 to see if they could track changes in requirements over a twenty-three-year period. they found that the number of distinct job titles increased in each survey because of the addition of new technologies to the library work environment that require positions focused on handling them. the comparison also found that computer skills as a position requirement increased by 100 percent between 1988 and 2011, with 55 percent of 2011 announcements requiring them. looking more deeply at the technology requirements specifically, mathews and pardue conducted a content analysis of 620 jobs ads from the ala joblist to identify skills required in those positions.3 the top technology competencies required were web development, project management, systems development, systems applications, networking, and programming languages. they found a significant overlap of librarian skill sets with those of it professionals, particularly in the areas of web development, project management, and information systems. 
riley-huff and rholes found that the most commonly sought technology-related job titles were systems/automation librarian, digital librarian, emerging and instructional technology librarian, web services/development librarian, and electronic resources librarian.4 a few years later, maceli added to this list with newly popular technology-relating titles, including emerging technologies librarian, metadata librarian, and user experience/architect librarian.5 beyond examining which specific technologies librarians should be able to use, researchers have also pondered whether a list of skills is even possible to create. crawford synthesized a series of blog posts from various authors to discuss which technology skills are essential and which are too specialized to serve as minimum technology requirements for librarians.6 he questioned whether universal skill sets should be established given the variety of tasks within libraries and the unique backgrounds of each library worker. crawford also questioned the expectation that every librarian information technology and libraries | december 2016 37 will have a broad array of technology skills from programming to video editing to game design and device troubleshooting. partridge et al. reported on a series of focus groups held with 76 librarians that examined the skills required for members of the profession, especially those addressing technology.7 in the questions they asked the focus groups, the authors focused on the term “library 2.0” and attempted to gather suggestions on skills that current and future librarians need to assist users. they concluded that the groups identified that a change in attitudes by librarians was more important to future library service than the acquisition of skills with specific technology tools. importance was given to librarians’ abilities to stay aware of technological changes, be resilient and reflective in the face of them, and to communicate regularly and clearly with the members of their communities. another area examined in the studies is where the acquisition of technology skills should and does happen for librarians. riley-huff and rholes reported on a dual approach to measure librarians’ preparation for performing technology-related tasks.8 the authors assessed course offerings for lis programs to see if they included sufficient technology preparation for new graduates to succeed in the workplace. they then surveyed lis practitioners and administrators to learn how they acquired their skills and how difficult it is to find candidates with enough technology preparation for library positions. their findings suggest that while lis programs offer many technology courses, they lack standardization, and graduates of any program cannot be expected to have a broad education in library technologies. further research confirmed this troubling lack of consistency in technology-related curricula. singh and mehra assessed a variety of stakeholders, including students, employers, educators, and professional organizations, finding widespread concern about the coverage of technology topics in lis curricula.9 despite inconsistencies between individual programs, several studies provided a holistic view of the popular technology offerings within lis curricula. 
programs commonly offered one or more introductory technology courses, as well as courses in database design and development, web design and development, digital libraries, systems analysis, and metadata.10,11,12 as researchers have emphasized from a variety of perspectives, new graduates could not realistically be expected to know every technology with application to the field of information.13 there was widespread acknowledgement that learning in this area can, and must, continue in a lifelong fashion throughout one’s career. riley-huff and rholes reported that lis practitioners saw their own experiences involving continuing skill development on the job, both before and after taking on a technology role.14 however, literature going back many decades suggests that the increasing need for continuing education in information technology has generally not been matched by increasing organizational support for these ventures. numerous deterrents to continuing technology education were noted, including lack of time,15 organizational climate, and the perception of one’s age.16 while studies in this area have primarily focused on mls-level positions, jones reported on academic library support staff members and their perceptions of technology use over a ten-year period and found that increased technology responsibilities added technology skills in the workplace: information professionals’ current use and future aspirations | maceli and burke | https://doi.org/10.6017/ital.v35i4.9540 38 to workloads and increased workplace stress.17 respondents noted that increasing use of technology in their libraries has increased their individual workloads along with the range of responsibilities that they hold. method to build an understanding of the research questions stated above, which focus on the technologies currently used by information professionals and those they desired to learn, we designed and administered a thirteen-question anonymous survey (see appendix) to the subscribers of thirty library-focused electronic discussion groups between february 25 and march 13, 2015. the groups were chosen to target respondents employed in multiple types of libraries (academic, public, school, and special) with a wide array of roles in their libraries (public services librarians, systems staff members, catalogers, and so on). we solicited respondents with an email sent to the groups asking for their participation in the survey and with the promise to post initial results to the same groups. the survey included closed and open-ended questions oriented toward understanding current technology use and future aspirations as well as capturing demographics useful in interpreting and generalizing the results. the survey questions have been previously used and iteratively expanded over time by the second author, first in the fall of 2008, then spring of 2012, with summative results presented in the last three editions of the neal-schuman library technology companion. we obtained a total of 2,216 responses to the question, “which of the following technologies or technology skills are you expected to use in your job on a regular basis?” of these responses, 1,488 (67 percent) of the respondents answered the question regarding technologies they would like to learn: “what technology skill would you like to learn to help you do your job better?” we conducted basic reporting of response frequency for closed questions to assess and report the demographics of the respondents. 
to analyze the open-ended survey question results in greater depth, we conducted a textual analysis using the r statistical package (https://www.r-project.org/). we used the tm (text mining) package in r (http://cran.r-project.org/package=tm) to calculate term frequencies and correlations, generate plots, and cluster terms.

results
the following section will first present an overview of survey responses and respondents, and then explore results as related to the three research questions stated above. the lis practitioners who responded to the survey reported that their libraries are located in forty us states, eight canadian provinces, and forty-three other countries. academic libraries were the most common type of library represented, followed by public, school, special, and other (see table 1).

library type | number of respondents | percentage of all respondents
academic | 1,206 | 54.4
public | 545 | 24.6
school | 266 | 12.0
special | 138 | 6.2
other | 61 | 2.8
table 1. the types of libraries in which survey respondents work

respondents also provided their highest level of education. a total of 77 percent of responding lis practitioners have earned a library-related or other master’s degrees, dual master’s degrees, or doctoral degrees. from these reported levels of education, it is likely that more respondents are in librarian positions than in library support staff positions. however, individuals with master’s degrees serve in various roles in library organizations, so the percentage of graduate degree holders may not map exactly to the percentage of individuals in positions that require those degrees. significantly fewer respondents (16 percent) reported holding a high school diploma, some college credit, an associate degree, or a bachelor’s degree as their highest level of education. another aspect we measured in the survey was tasks that respondents performed on a regular basis. the range of tasks provided in the survey allowed for a clearer analysis of job responsibilities than broad categories of library work such as “public services” or “technical services.” some respondents appeared to be employed in solo librarian environments where they are performing several roles. even respondents who might have more focused job titles such as “reference librarian” or “cataloger” may be performing tasks that overlap traditional roles and categories of library work. the tasks offered in the survey and the responses to each are shown in table 2.

task | number of respondents | percentage of respondents
reference | 1,404 | 63.4
instruction | 1,296 | 58.5
collection development | 1,260 | 56.9
circulation | 917 | 41.4
cataloging | 905 | 40.8
electronic resource management | 835 | 37.7
acquisitions | 789 | 35.6
user experience | 775 | 35.0
library administration | 769 | 34.7
outreach | 758 | 34.2
marketing/public relations | 722 | 32.6
library/it systems | 672 | 30.3
periodicals/serials | 659 | 29.7
media/audiovisuals | 566 | 25.5
interlibrary loan | 518 | 23.4
distance library services | 474 | 21.4
archives/special collections | 437 | 19.0
other | 209 | 9.4
table 2. tasks performed on a regular basis by survey respondents
while public services-related activities lead the list, with reference, instruction, collection development, and circulation as the top four task areas, technical services-related activities are well represented; the next three in rank are cataloging, electronic resource management, and acquisitions. the overall list of tasks shows the diversity of work lis practitioners engage in, as each respondent chose an average of six tasks. the results also suggest that the survey respondents are well acquainted with a wide variety of library work rather than only having experience in a few areas, making their uses of technology more representative of the broader library world. the survey also questioned the barriers lis practitioners face as they try to add more technology to their libraries, and 2,161 respondents replied to the question, “which of the following are barriers to new technology adoption in your library?” financial considerations proved to be the most common barrier, with “budget” chosen by 80.7 percent of respondents, followed by “lack of staff time” (62.4 percent), “lack of staff with appropriate skill sets” (48.5 percent), and “administrative restrictions” (36.7 percent).

what combinations of technology skillsets do lis practitioners commonly use?
responses from survey question 8, “which of the following technologies or technology skills are you expected to use in your job on a regular basis?,” were analyzed to build an understanding of this research question. a total of 2,216 responses to this question were received. survey respondents were asked to select from a detailed list of technologies/skills (visible in question 8 of the appendix) that they regularly used. the top answers respondents chose for this question were: email, word processing, web browser, library catalog (public side), and library database searching. the full list of the top twenty-five technology skills and tools used is detailed in figure 1, with the list of the bottom fifteen technology skills used presented in figure 2.

[figure 1. top twenty-five technology skills/tools used by respondents (n = 2,216); skills shown: email, word processing, web browser, library catalog (public side), library database searching, spreadsheets, printers, web searching, teaching others to use technology, presentation software, windows os, laptops, scanners, library management system (staff side), downloadable ebooks, web-based ebook collections, cloud-based storage, technology troubleshooting, teaching using technology, online instructional materials/products, tablets, web video conferencing, educational copyright knowledge, library website creation or management, cloud-based productivity apps]

[figure 2. bottom fifteen technology skills/tools used by respondents (n = 2,216); skills shown: mac os, audio recording and editing, technology equipment installation, computer programming or coding, assistive/adaptive technology, rfid, chromebooks, network management, server management, statistical analysis software, makerspace technologies, linux, 3d printers, augmented reality, virtual reality]

text analysis techniques were then used to determine the frequent combinations of technology skills used in practice. first, a clustering approach was taken to visualize the most popular technologies that were commonly used in combination (figure 3). clustering helps in organizing and categorizing a large dataset when the categories are not known in advance, and, when plotted in a dendrogram chart, assists in visualizing these commonly co-occurring terms.
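to make the correlation and clustering steps concrete, the following is a minimal sketch of this style of co-occurrence analysis in python rather than the r/tm pipeline described above; the file name and column labels are hypothetical stand-ins for a respondent-by-skill matrix built from question 8.

```python
# a minimal sketch of co-occurrence analysis in python (the study itself used r's
# tm package); "skills_matrix.csv" and the "server management" column are
# hypothetical placeholders, not the authors' actual data.
import pandas as pd
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import squareform

skills = pd.read_csv("skills_matrix.csv")  # one row per respondent, one 0/1 column per skill

# restrict to the most frequently reported skills, as in figure 3
top_cols = skills.sum().sort_values(ascending=False).head(25).index
top = skills[top_cols]

# terms correlated with a single skill, analogous to figures 4-6
print(top.corrwith(skills["server management"]).sort_values(ascending=False).head(10))

# hierarchical clustering of skills by co-occurrence, drawn as a dendrogram (cf. figure 3)
dissimilarity = 1 - top.corr()  # low values mean the skills are often reported together
tree = linkage(squareform(dissimilarity.values, checks=False), method="average")
dendrogram(tree, labels=list(top_cols), leaf_rotation=90)
plt.tight_layout()
plt.show()
```

converting the correlation matrix into a dissimilarity matrix before linkage mirrors the idea behind the dendrogram in figure 3: skills that tend to be reported by the same respondents end up in the same branch.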
the authors numbered the clusters identified in figure 3 for ease of reference. from left to right, the first cluster focuses on communication and educational tools, the second emphasizes devices and software, the third contains web and multimedia creation tools, the fourth contains office productivity and public-facing information retrieval tools, and the fifth cluster has a diverse collection of responsibilities including systems-oriented responsibilities (from operating systems to specific hardware devices), working with ebooks, teaching with technology, and teaching technology to others.

[figure 3. cluster analysis of most frequent technology skills used in practice, with red outlines on each numbered cluster]

notably, the list of top skills used (figure 1) falls more on the end-user side of technology; skills more oriented toward systems work (e.g., linux, server management, computer programming, or coding) were less frequently mentioned, and several were among the lowest reported (figure 2). of the 2,216 respondents, 15 percent used programming or coding skills regularly in their job (which is of interest as programming or coding was the skill most desired to learn by respondents; this will be discussed further in the context of the next research question). plotting the correlations between the more advanced technology skillsets can provide a picture of the work such systems-oriented positions are commonly responsible for, particularly as they are less well represented in the responses as a whole. figure 4 plots the correlated terms for those tasked with “server management.” it is fair to assume someone with such responsibilities falls on the highly technical end of the spectrum.

[figure 4. terms correlated with “server management,” indicating commonly co-occurring workplace technologies for highly technical positions]

the more common task of “library website creation or management,” which fell to those with a broad level of technological expertise, had numerous correlated terms. figure 5 demonstrated a wide array of technology tools and responsibilities.

[figure 5. terms correlated with “library website creation or management,” indicating commonly co-occurring technologies used on the job]

and lastly, teaching using technology and teaching technology to others is a long-standing responsibility of librarians and library staff. the following plot (figure 6) presents the skills correlated with “teaching others to use technology.”

[figure 6. terms correlated with “teaching others to use technology,” indicating commonly co-occurring technologies used on the job]

what combinations of technology skillsets do lis practitioners desire to learn?
we analyzed responses to survey question 10, “what technology skill would you like to learn to help you do your job better?,” to explore this research question.
as summarized in burke18—and consistent with the prior year’s findings—coding or programming remained the most desired technology skillset, mentioned by 19 percent of respondents. the raw text analysis yielded a fuller list of the top terms mentioned by participants (table 3 and visualized in figure 7).

technology term | number of respondents | percentage of respondents
coding or programming (combined for reporting) | 292 | 19.59
web | 178 | 11.96
software | 158 | 10.62
video | 112 | 7.53
apps | 106 | 7.12
editing | 105 | 7.06
design | 85 | 5.71
database | 76 | 5.11
table 3. terms mentioned by 5 percent or more of survey respondents

[figure 7. wordcloud of responses to “what technology skill would you like to learn to help you do your job better?”]

we then explored the deeper context of responses and individually analyzed responses specific to the more popular technology desires. first, we assessed the responses mentioning the desire to learn coding or programming. of these responses, the most common specific technologies mentioned were html, python, css, javascript, ruby, and sql, listed in decreasing order of interest. although most participants did not describe what they would like to do with their desired coding or programming skills, of those that did, the responses indicated interest in
● becoming more empowered to solve their own technology problems (e.g., “i would like to learn the [programming languages] so i don't have to rely on others to help with our website,” “i’m one of the most tech-skilled people at my library, but i’d like to be able to build more of my own tools and manage systems without needing someone from it or outside support.”);
● improving communication with it (e.g., “how to speak code, to aid in communication with it,” “to better identify problems and work with it to fix them”);
● creating novel tools and improving system interoperability (e.g. “coding for app and api creation”); and
● bringing new technologies to their library and patrons (e.g., “coding so that i can incorporate a hackerspace in my library”).
next, we took a clustering approach to visualize the terms commonly desired in combination. figure 8 describes the clustered terms that we found within the programming or coding responses. the terms “programming” and “coding” form a distinct cluster to the right of the diagram, indicating that many responses contained only those two terms.

[figure 8. clustering of terms present in responses indicating the desire to learn coding or programming]

the remaining portion of the diagram begins to illustrate the specific technologies mentioned for those respondents that answered in greater detail or expanded on their general answer of programming or coding. other related desired technology-skill areas become apparent: database management, html and css (as well as the more general “web design,” which appeared in the top terms in table 3), php and javascript, python and sql, and xml creation, among others. the bulleted list presented in the previous paragraph illustrates some of the potential applications participants envisioned these skills being useful in, but the majority did not provide this level of detail in their response.
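for readers who want to reproduce this kind of open-text analysis, the sketch below shows one way to derive term frequencies and term associations from the free-text answers in python; it is not the authors’ code (they used r’s tm package), the file and column names are hypothetical, and it will not match table 3 exactly because related terms such as “coding” and “programming” were combined for reporting.

```python
# a rough sketch in python (not the authors' r/tm code) of mining the free-text
# answers to question 10; "desired_skills.csv" and its "answer" column are
# hypothetical placeholders for the survey export.
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

answers = pd.read_csv("desired_skills.csv")["answer"].dropna()

# binary document-term matrix: did a respondent mention the term at all?
vectorizer = CountVectorizer(stop_words="english", binary=True)
dtm = pd.DataFrame(vectorizer.fit_transform(answers).toarray(),
                   columns=vectorizer.get_feature_names_out())

# share of respondents mentioning each term, comparable to table 3
print((dtm.mean() * 100).sort_values(ascending=False).head(10).round(2))

# terms most associated with "coding", similar in spirit to tm's findAssocs()
if "coding" in dtm.columns:
    assoc = dtm.drop(columns="coding").corrwith(dtm["coding"])
    print(assoc.sort_values(ascending=False).head(10))
```

the correlation step plays the same role as tm’s findAssocs(), surfacing terms that tend to appear in the same answers, such as the video- and photo-related terms that co-occur with “editing” in the next paragraph.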
editing was another prominent term that appeared across participant responses and was largely meant in the context of video editing. because of the vagueness of the term “editing,” a closer look was necessary to determine other technology desires. looking at terms highly correlated with “editing” revealed both video and photo editing to be important to respondents. several of the top-appearing terms were used more generally: “database” and mobile “apps” were mentioned without specifying the technology tool or scenario of use, such that a more contextual analysis could not be conducted. these responses can be particularly difficult to interpret as the term “databases” can have a technical meaning (e.g., working with sql) or it can refer to the use of library databases from an end user perspective.

what technology skillsets do newer lis practitioners use and desire to learn as compared to those with ten-plus years’ experience in the field?
of the 2,216 survey responses, 877 stated they had worked in libraries for ten or fewer years. we analyzed these responses separately from the remaining 1,334 respondents who had worked in libraries for more than ten years. of this group, 644 had worked in libraries for twenty-plus years (figure 9). a handful of participants did not answer the question and were omitted from the analysis.

[figure 9. number of survey responses falling into the various categories for number of years working in libraries (0–2, 3–5, 6–10, 11–15, 16–20, and 21+ years)]

the top technology skills used in the workplace did not differ significantly between the different groups. the top skills, as discussed earlier and presented in figure 1, were well represented and similarly ordered. a few small percentage points of difference were noted in a handful of the top skills (figure 10). those newer to the field were slightly more likely to teach others to use technology, use cloud-based storage, and use cloud-based productivity apps. more experienced practitioners regularly used the library management system (on the staff side) more than those that were newer to the field.

[figure 10. top twenty-five technology skills used by respondents in the zero to ten years’ experience (dark blue) and eleven-plus years’ experience (light blue) groups]

for the question regarding technologies they would like to learn, 69 percent of the participants with zero to ten years’ experience answered the question compared to a slightly smaller 65 percent of the participants with more than ten years’ experience. top terms for both groups were very similar, including coding or programming, software, web, video, design, and editing. these terms were not dissimilar to the responses taken as a whole (table 3), indicating that respondents were generally interested in learning the same sorts of technology skills regardless of how long they had been in the field.
a few noticeable differences between the two groups emerged. the most popular skills, coding or programming, were mentioned by 28 percent of the respondents with zero to ten years’ experience, and by 15 percent of the respondents with eleven-plus years’ experience. there was slightly more interest (by a few percentage points) in databases, design, python, and ruby in the zero to ten years’ experience group. taking a closer look at the different year ranges within the zero to ten years’ experience group revealed that those with three to five years of experience were most likely to be interested in learning coding or programming skills.

[figure 11. percentage of respondents interested in learning coding or programming in the groups with ten or fewer years’ experience (0–2, 3–5, and 6–10 years)]

of the participants that answered the question at all, several stated that there were no technology skills they would need or like to learn for their position, either because they were comfortable with their existing skills or were simply open to learning more as needed (but nothing specific came to mind). combined with those who did not answer the question (and so presumably did not have a particular technology they were interested in learning), 28 percent of the zero to ten years’ experience group and 31 percent of the eleven-plus years’ experience group did not have any technologies that they desired to learn at the moment.

discussion
as detailed earlier, the most common technologies employed by lis practitioners were email, office productivity tools, web browsers, library catalog and database searching tools, and printers. generally similar technology usage patterns were observed for early and later-career practitioners, and programming topped the list of most-desired technology skills to learn. the cluster analysis presented in figure 3 suggests that a relatively small percentage of practitioners have technology-intensive roles that would require skills such as programming, working with databases, systems administration, etc. rather, the cluster analysis showed common technology skillsets focused on the end-user side of technology tools. in fact, most of the top ten skills used—email, office productivity tools (word processing, spreadsheets and presentation software), web browsers, library catalog and database searching, printers, and teaching others to use technology—are fairly nontechnical in nature. a potential exception is that of teaching technology.
figure 6 suggests that teaching others to use technology entails several hardware devices (for example, laptops, tablets, smartphones, and scanners) as well as online and digital resources, such as ebooks. however, most of the popular skills used would be considered baseline skills for information workers in any domain. as suggested by tennant, programming and other advanced technical skills do not necessarily need to be a core skill for all information professionals, but knowledge of the potential applications and possibilities of such tools is required.19 this idea was echoed by partridge et al., whose findings emphasized the need for awareness and resilience in tackling new technological developments.20 these skills alone would obviously be too little for lis practitioners explicitly seeking a high-tech role, as discussed in maceli.21 however, further research directed toward exploring the mental models and general technological understanding of information professionals would be helpful in understanding the true level of practitioner engagement with technology, to complement the list of relatively low-tech tools employed. programming has been a skill of great interest within the information professions for many years and the respondents’ enthusiasm and desire to learn in this area was readily apparent from the survey results, with nearly 20 percent of participants citing either “programming” or “coding” as a skill they desired to learn. in the context of their current responsibilities, 15 percent of respondents overall mentioned “computer programming or coding” as a regular technological skill they employed (figure 2). there was a slight difference between the librarians with fewer than eleven years of experience—19 percent coded regularly—compared to 13 percent of those with eleven or more years of experience. within the years-of-experience divisions, the newer practitioners were more interested in learning programming, with the peak of interest at three to five years in the workplace (figure 11). the relatively low interest or need to learn programming in the newest practitioners potentially indicates a hopeful finding—that their degree program was sufficient preparation for the early years of their career. prior research would contradict this finding. for example, choi and rasmussen’s 2006 survey found that, in the workplace, librarians frequently felt unprepared in their knowledge of programming and scripting languages.22 in the intervening years, curriculum has shifted to more heavily emphasize technology skills, including web development and other topics covering programming,23 perhaps better preparing early career practitioners. overall, information technology and libraries | december 2016 53 programming remains a popular skill in continuing education opportunities as well as in job listings,24 which aligns well with the respondents’ strong interest in this area. the skills commonly co-occurring with programming in practice included working with linux, database software, managing servers, and webpage creation (figure 4). taken as a whole, these skills indicate job responsibilities falling toward the systems side, with webpage creation a skill that bridged intensely technical and more user-focused work (as also evident in figure 4).this indicates that, though programming may be perceived as highly desirable for communicating and extending systems, as a formal job responsibility it may still fall to a relatively small number of information professionals in any significant manner. 
makerspace technologies and their implementation possibilities within libraries have garnered a great deal of excitement and interest in recent years, with much literature highlighting innovative projects in this area (such as american library association25 and bagley26). fourie and meyer provided an overview of the existing makerspace literature, finding that most research efforts focus on the needs and construction of the physical space.27 given the general popularity of the topic (as detailed in moorefield-lang),28 it is interesting to note that such technologies were infrequently mentioned by survey participants, both in those desiring to learn these tools and those who were currently using them. the most infrequent skills used (figure 2) included makerspace technologies, 3d printers, augmented, and virtual reality. only a small number of respondents currently used this mix of makerspace-oriented and emerging technologies, and only 3 percent of respondents mentioned interest in learning makespace-related skills. despite many research efforts exploring the particulars of unique makerspaces in a case-study approach (for example, moorefield-lang),29 little data exists on the total number of makerspaces within libraries, and the skillset is largely absent from prior research describing lis curriculum and job listings. this makes it difficult to determine whether the low number of participants that reported working with makerspace technologies is reflective of the small number of such spaces in existence or simply that few practitioners are assigned to work in this area, no matter their popularity. in either case, these findings provide a useful baseline with which to track the growth of makerspace offerings over time and librarian involvement in such intensely technological work. despite the interest and clear willingness to learn and use technology, several workplace challenges became apparent from participant responses. as prior research explored (notable riley-huff and rholes),30 practitioners assumed they would be continually learning and building skills on the job throughout their career to stay current technologically. as described in the earlier results section, many participants mentioned that, although they were highly willing and able to learn, the necessary organizational resources were lacking. as one participant noted, “i’d like to learn anything but the biggest problem seems to be budget (time and monetary).” several participants expressed feeling overwhelmed with their current workload. new learning opportunities, technological or otherwise, were simply not feasible. although the survey results indicated that practitioners of all ages were roughly equally interested in learning new technology skills in the workplace: information professionals’ current use and future aspirations | maceli and burke | https://doi.org/10.6017/ital.v35i4.9540 54 technologies, a handful of responses mentioned that ageist issues were creating barriers. though few, these respondents described being dismissed as technologists because of their age. these themes have long been noted in the large body of continuing-education-related literature going back several decades. 
stone’s study ranked lack of time as the top deterrent to professional development for librarians, and it appears little has changed.31 chan and auster noted that organizational climate and the perception of one’s age may impair the pursuit of professional development, among other impediments.32 however, research has noted a generally strong drive in older librarians to continue their education; long and applegate found a preference in latercareer librarians for learning outlets provided by formal library schools and related professional organizations, but a lower interest in generally popular topics such as programming.33 these findings were consistent with the participant responses gathered in this survey. finally, as detailed in the results section, a significant percent of respondents (33 percent) did not answer the question regarding what technologies they would like to learn. as is a limitation with survey research, it is difficult to know what the respondent’s intention was in not answering the question, i.e., are they comfortable with their current technology skills? do they lack the time or interest in pursuing further technology education? and of those that did answer, many did not specify their intended use of the technologies they desired to learn. so a deeper exploration of what technologies lis practitioners desire to learn and why would be of value as well. these questions are worth pursuing in more depth through further research efforts. conclusion this study provides a broad view into the technologies that lis practitioners currently use and desire to learn, across a variety of types of libraries, through an analysis of survey responses. despite a marked enthusiasm toward using and learning technology, respondents described serious organizational limitations impairing their ability to grow in these areas. the lis practitioners surveyed have interested patrons, see technology as part of their mission, and are not satisfied with the current state of affairs, but they seem to lack money, time, skills, and a willing library administration. though respondents expressed a great deal of interest in more advanced technology topics, such as programming, the majority typically engaged with technology on an end-user level, with a minority engaged in deeply technical work. this study suggests future work in exploring information professionals’ conceptual understanding of and attitudes toward technology, and a deeper look at the reasoning behind those who did not express a desire to learn new technologies. information technology and libraries | december 2016 55 references 1. marshall breeding, “library technology: the next generation,” computers in libraries 33, no. 8 (2013): 16–18, http://librarytechnology.org/repository/item.pl?id=18554. 2. therese f. triumph and penny m. beile, “the trending academic library job market: an analysis of library position announcements from 2011 with comparisons to 1996 and 1988,” college & research libraries 76, no. 6 (2015): 716–39, https://doi.org/10.5860/crl.76.6.716. 3. janie m. mathews and harold pardue, “the presence of it skill sets on librarian position announcements,” college & research libraries 70, no. 3 (2009): 250–57, https://doi.org/10.5860/crl.70.3.250. 4. debra a. riley-huff and julia m. rholes, “librarians and technology skill acquisition: issues and perspectives,” information technology and libraries 30, no. 3 (2011): 129–40, https://doi.org/10.6017/ital.v30i3.1770. 5. 
monica maceli, “creating tomorrow’s technologists: contrasting information technology curriculum in north american library and information science graduate programs against code4lib job listings,” journal of education for library and information science 56, no. 3 (2015): 198–212, https://doi.org/10.12783/issn.2328-2967/56/3/3. 6. walt crawford, “making it work perspective: techno and techmusts,” cites and insights 8, no. 4 (2008): 23–28. 7. helen partridge et al., “the contemporary librarian: skills, knowledge and attributes required in a world -f emerging technologies,” library & information science research 32, no. 4 (2010): 265–71, https://doi.org/10.1016/j.lisr.2010.07.001. 8. riley-huff and rholes, “librarians and technology skill acquisition.” 9. vandana singh and bharat mehra, “strengths and weaknesses of the information technology curriculum in library and information science graduate programs,” journal of librarianship and information science 45, no. 3 (2013): 219–231, https://doi.org/10.1177/0961000612448206. 10. riley-huff and rholes, “librarians and technology skill acquisition.” 11. sharon hu, “technology impacts on curriculum of library and information science (lis)—a united states (us) perspective,” libres: library & information science research electronic journal 23, no. 2 (2013): 1–9, http://www.libres-ejournal.info/1033/. 12. singh and mehra, “strengths and weaknesses of the information technology curriculum.” 13. see, for example, crawford, “making it work perspective”; partridge et al., “the contemporary librarian.” technology skills in the workplace: information professionals’ current use and future aspirations | maceli and burke | https://doi.org/10.6017/ital.v35i4.9540 56 14. riley-huff and rholes, “librarians and technology skill acquisition.” 15. elizabeth w. stone, factors related to the professional development of librarians (metuchen, nj: scarecrow, 1969). 16. donna c. chan and ethel auster, “factors contributing to the professional development of reference librarians,” library & information science research 25, no. 3 (2004): 265–86, https://doi.org/10.1016/s0740-8188(03)00030-6. 17. dorothy e. jones, “ten years later: support staff perceptions and opinions on technology in the workplace,” library trends 47, no. 4 (1999): 711–45. 18. john j. burke, the neal-schuman library technology companion: a basic guide for library staff, 5th edition (new york: neal-schuman, 2016). 19. roy tennant, “the digital librarian shortage,” library journal 127, no. 5 (2002): 32. 20. partridge et al., “the contemporary librarian.” 21. monica maceli, “what technology skills do developers need? a text analysis of job listings in library and information science (lis) from jobs.code4lib.org,” information technology and libraries 34, no. 3 (2015): 8–21, https://doi.org/10.6017/ital.v34i3.5893. 22. youngok choi and edie rasmussen, “what is needed to educate future digital libraries: a study of current practice and staffing patterns in academic and research libraries,” d-lib magazine 12, no. 9 (2006), http://www.dlib.org/dlib/september06/choi/09choi.html. 23. see, for example, maceli, “creating tomorrow's technologists.” 24. elías tzoc and john millard, “technical skills for new digital librarians,” library hi tech news 28, no. 8 (2011): 11–15, https://doi.org/10.1108/07419051111187851. 25. american library association, “manufacturing makerspaces,” american libraries 44, no. 1/2 (2013), https://americanlibrariesmagazine.org/2013/02/06/manufacturing-makerspaces/. 26. caitlin a. 
bagley, makerspaces: top trailblazing projects, a lita guide (chicago: american library association, 2014). 27. ina fourie and anika meyer, “what to make of makerspaces: tools and diy only or is there an interconnected information resources space?,” library hi tech 33, no. 4 (2015): 519–25, https://doi.org/10.1108/lht-09-2015-0092. 28. heather moorefield-lang, “change in the making: makerspaces and the ever-changing landscape of libraries,” techtrends 59, no. 3 (2015): 107–12, https://doi.org/10.1007/s11528-015-0860-z. information technology and libraries | december 2016 57 29. heather moorefield-lang, “makers in the library: case studies of 3d printers and maker spaces in library settings,” library hi tech 32, no. 4 (2014): 583–93, https://doi.org/10.1108/lht-06-2014-0056. 30. riley-huff and rholes, “librarians and technology skill acquisition.” 31. stone, factors related to the professional development of librarians. 32. chan and auster, “factors contributing to the professional development of reference librarians.” 33. chris e. long and rachel applegate, “bridging the gap in digital library continuing education: how librarians who were not ‘born digital’ are keeping up,” library leadership & management 22, no. 4 (2008), https://journals.tdl.org/llm/index.php/llm/article/view/1744. technology skills in the workplace: information professionals’ current use and future aspirations | maceli and burke | https://doi.org/10.6017/ital.v35i4.9540 58 appendix. survey questions 1. what type of library do you work in? 2. where is your library located (state/province/country)? 3. what is your job title? 4. what is your highest level of education? 5. which of the following methods have you used to learn about technologies and how to use them? please mark all that apply. • articles • as part of a degree i earned • books • coworkers • face-to-face credit courses • face-to-face training sessions • library patrons • online credit courses • online training sessions (webinars, etc.) • practice and experiment on my own • web resources i regularly check (sites, blogs, twitter, etc.) • web searching • other: 6. which of the following skill areas are part of your responsibilities? please mark all that apply. • acquisitions • archives/special collections • cataloging • circulation • collection development • distance library services • electronic resource management • instruction • interlibrary loan information technology and libraries | december 2016 59 • library administration • library it/systems • marketing/public relations • media/audiovisuals • outreach • periodicals/serials • reference • user experience • other: 7. how long have you worked in libraries? • 0–2 years • 3–5 years • 6–10 years • 11–15 years • 16–20 years • 21 or more years 8. which of the following technologies or technology skills are you expected to use in your job on a regular basis? please mark all that apply • assistive/adaptive technology • audio recording and editing • augmented reality (google glass, etc.) • blogging • cameras (still, video, etc.) • chromebooks • cloud-based productivity apps (google apps, office 365, etc.) • cloud-based storage (google drive, dropbox, icloud, onedrive, etc.) • computer programming or coding • computer security and privacy knowledge • database creation/editing software (ms access, etc.) • dedicated e-readers (kindle, nook, etc.) 
• digital projectors technology skills in the workplace: information professionals’ current use and future aspirations | maceli and burke | https://doi.org/10.6017/ital.v35i4.9540 60 • discovery layer/service/system • downloadable e-books • educational copyright knowledge • e-mail • facebook • fax machine • image editing software (photoshop, etc.) • laptops • learning management system (lms) or virtual learning environment (vle) • library catalog (public side) • library database searching • library management system (staff side) • library website creation or management • linux • mac operating system • makerspace technologies (laser cutters, cnc machines, arduinos, etc.) • mobile apps • network management • online instructional materials/products (libguides, tutorials, screencasts, etc.) • presentation software (ms powerpoint, prezi, google slides, etc.) • printers (public or staff) • rfid (radio frequency identification) • scanners and similar devices • server management • smart boards/interactive whiteboards • smartphones (iphone, android, etc.) • software installation • spreadsheets (ms excel, google sheets, etc.) • statistical analysis software (sas, spss, etc.) • tablets (ipad, surface, kindle fire, etc.) • teaching others to use technology information technology and libraries | december 2016 61 • teaching using technology (instruction sessions, workshops, etc.) • technology equipment installation • technology purchase decision-making • technology troubleshooting • texting, chatting, or instant messaging • 3d printers • twitter • using a web browser • video recording and editing • virtual reality (oculus rift, etc.) • virtual reference (text, chat, im, etc.) • word processing (ms word, google docs, etc.) • web-based e-book collections • web conferencing/video conferencing (webex, google hangouts, goto meeting, etc.) • webpage creation • web searching • windows operating system • other: 9. which of the following are barriers to new technology adoption in your library? please mark all that apply. • administrative restrictions • budget • lack of fit with library mission • lack of patron interest • lack of staff time • lack of staff with appropriate skill sets • satisfaction with amount of available technology • other: 10. what technology skill would you like to learn to help you do your job better? 11. what technologies do you help patrons with the most? 12. what technology item do you circulate the most? technology skills in the workplace: information professionals’ current use and future aspirations | maceli and burke | https://doi.org/10.6017/ital.v35i4.9540 62 13. what technology or technology skill would you most like to see added to your library? the role of the library in the digital economy article the role of the library in the digital economy serhii zharinov information technology and libraries | december 2020 https://doi.org/10.6017/ital.v39i4.12457 serhii zharinov (serhii.zharinov@gmail.com) is researcher, state scientific and technical library of ukraine. © 2020. abstract the gradual transition to a digital economy requires all business entities to adapt to the new environmental conditions that are taking place through their digital transformation. these tasks are especially relevant for scientific libraries, as digital technologies make changes in the main subject field of their activities, the processes of creating, storing, and information disseminating. 
in order to find directions for the transformation of scientific libraries and determine their role in the digital economy, a study of the features of digital transformation and of the experience of the digital transformation of foreign libraries was conducted. management of research data, which is implemented through the creation of current research information systems (cris), was found to be one of the most promising areas of the digital transformation of libraries. the problem area of this direction and ways of engaging libraries in it have also been analyzed in the work. introduction the transition to a digital economy contributes to the even greater penetration of digital technologies into our lives and the emergence of new conditions of competition and new trends in organizations’ development. big data, machine learning, and artificial intelligence are becoming common tools implemented by the pioneers of digital transformation in their activities.1 significant changes in the main functions of libraries, the storage and dissemination of information, caused by the development of digital technologies affect the operational activities of libraries, the requests of users and partners to the library, and the ways to meet them. in the process of adapting to these changes, the role of libraries in the digital economy is changing. this study is designed to find current areas of library development and to determine the role of the library in the digital economy. achieving this goal requires study of the “digital economy” concept and the peculiarities of the digital transformation of organizations in order to better understand the role of the library in it; research on the development of libraries to determine which directions best fit the new role of the library in the digital economy; and identification of obstacles to the development of this area and ways to engage libraries in it. the concept of the “digital economy” the transition to an information society and digital economy will gradually change all industries, and all companies must change accordingly.2 taking advantage of the digital economy is the main driving force of innovation, competitiveness, and economic development of a country.3 the transition to a digital economy is not instant but occurs over many years. the topic emerged at the end of the twentieth century but in recent years has experienced rapid growth. in the web of science (wos) citation database, publications with this term in the title began to appear in 1996 (figure 1). figure 1. the number of publications in the wos citation database for the query “digital economy.” one of the first books devoted entirely to the study of the digital economy concept is the work of don tapscott, published in 1996.
in this book, the author understands the digital economy as an economy in which the use of digital computing technologies in economic activity becomes its dominant component.4 thomas mesenbourg, an american statistician and economist, identified in 2000 the three main components of the digital economy: e-business, e-commerce, and e-business infrastructure.5 a number of works on the development of indicators to assess the state of the digital economy, in particular the work of philippe barbet and nathalie coutinet, are based on the analysis of these components.6 alnoor bhimani, in his 2003 paper “digitization and accounting change,” defined the digital economy as “the digital interrelationships and dependencies between emerging communication and information technologies, data transfers along predefined channels and emerging platforms, and related contingencies within and across institutional and organizational entities.”7 bo carlsson’s 2004 article described the digital economy as a dynamic state of the economy characterized by the constant emergence of new activities based on the use of the internet and new forms of communication between different authors of ideas, whose communication allows them to generate new activities.8 in 2009, john hand defined the digital economy as the new design or use of information and communication technologies that help transform the lives of people, society, or business.9 carmen nadia ciocoiu, in her 2011 article, explained the digital economy as a state of the economy where, due to technology, knowledge and networking begin to play a more important role than capital in a postindustrial society.10 in a 2014 article, lesya kit defined the digital economy as an element of the network economy, characterized by the transformation of all spheres of the economy by transferring information resources and knowledge to a computer platform for further use.11 ukrainian scientists mykhailo voinarenko and larysa skorobohata, in a 2015 study of network tools, gave the following definition of the digital economy: “the digital economy, unlike the internet economy, assumes that all economic processes (except for the production of goods) take place independently of the real world. goods and services do not have a physical medium but are ‘electronic.’”12 yurii pivovarov, director of the ukrainian association for innovation development (uaid), gives the following definition: “digital economy is any activity related to information technology. and in this case, it is important to separate the terms: digital economy and it sphere. after all, it is not about the development of it companies, but about the consumption of services or goods they provide—online commerce, e-government, etc.—using digital information technology.”13 taking into account the above, in this study the digital economy is defined as a digital infrastructure that encompasses all business entities and their activities. the transition to the digital economy is the process of creating conditions for the digital transformation of organizations, the creation of digital infrastructure, and the gradual involvement of various economic entities and sectors of the economy in that digital infrastructure.
one of the first practical and political manifestations of the transition to the digital economy was the european commission’s index of digital economy and society (desi), first published in 2014. the main components of the index are communications, human capital, internet use, digital integration, and digital public services. among european countries in 2019, there is significant progress in the digitalization of business and in the interaction of society with the state.14 for ukraine, the first step towards the digital economy was the digital economy and development concept of ukraine, which defines the understanding of the digital economy, the direction and principles of transition to it.15 thus, for active representatives of the public sector, this concept is a signal that the development of structures and organizations should be based not on improving operational efficiency, but on transformation in accordance with the requirements of industry 4.0. confirmation of the seriousness of the ukrainian government’s intentions in this direction is the creation of the ministry of digital transformation in 2019 and the digitization of the latest public services through online services.16 one of the priority challenges which needs to be solved at the stage of transition to the digital economy is the development of skills in working with digital technologies in the entire population . this is relevant not only for ukraine, but also for the european union. in europe, a third of the active workforce does not have basic skills in working with digital technologies; in ukraine, 15.1 percent of ukrainians do not have digital skills, and the share of the working population with below-average digital skills is 37.9 percent.17 information technology and libraries december 2020 the role of the library in the digital economy | zharinov 4 part of the solution to this challenge in ukraine is entrusted to the “digital education” project, implemented by the ministry of digital transformation (osvita.diia.gov.ua), which through the mini-series created by him for different target audiences should form digital literacy in the population of ukraine. features of digital transformation developed digital skills in the population make the digital transformation of organizations not just a competitive advantage, but a prerequisite for their survival. thus, the larger the target audience is accustomed to the benefits of the digital economy, the more actively the organization is to adapt to new requirements and customer needs, to the new competitive environment. digital transformation of the organization is a complex process that is not limited to the implementation of software in the company’s activities or automation of certain components of production. it includes changes to all elements of the company, including methods of manufacturing and customer service, the organization’s strategy and business model, approaches , and management methods. according to a study by mckinsey, the integration of new technologies into a company's operations can reduce profits in 45 percent of cases.18 therefore, it is extremely important to have a comprehensive approach to digital transformation, understanding the changes being implemented, choosing the method of their implementation, and gradually involving all structural units and business processes in the process of transformation. 
the boston consulting group study identified six factors necessary for the effective use of the benefits of modern technologies:19 • connectivity of analytical data; • integration of technologies and automation; • analysis of results and application of conclusions; • strategic partnership; • competent specialists in all departments; and • flexible structure and culture. mckinsey consultants draw attention to the low percentage of successful digital transformation practices and based on the successful experience of 83 companies form five categories of recommendations that can contribute to successful digitalization:20 • involvement of leaders experienced in digitalization; • development of digital staff skills; • creating conditions for the use of digital skills by staff; • digitization of tools and working procedures of the company; and • establishing digital communication and ensuring the availability of information. experts at the institute of digital transformation identify four main stages of digital transformation in the company:21 1. research, analysis and understanding of customer experience. 2. involvement of the team in the process of digital transformation and implementation of corporate culture, which contributes to this process. 3. building an effective operating model based on modern systems. 4. transformation of the business model of the organization. https://osvita.diia.gov.ua/ information technology and libraries december 2020 the role of the library in the digital economy | zharinov 5 the “integrated model of digital transformation” study identifies one of the key factors of successful digital transformation, focusing on priority digital projects, the development and implementation of which should be engaged in specific organizational teams. the authors identify three main functional activities for digital transformation teams, the implementation of which provides a gradual comprehensive renewal of the company, namely: the creation and implementation of digital strategy, digital activity management, digitization of operational activities.22 in their study, ukrainian scientists natalia kraus, oleksandr holoborodko, and kateryna kraus determine that the general pattern for all digital economy projects is their focus on a specific consumer and comprehensive use of available information about the latter and the conditions of project effectiveness.23 initially, the project is pre-tested on a small scale, and only after obtaining satisfactory results from the testing of new principles of activity in a narrow target audience is the project scaled to a wider range of potential users. all this reduces the risks associated with digital transformation. eliminating unnecessary changes and false hypotheses on a small scale allows to avoid overspending at the stage of a comprehensive transformation of the entire enterprise. therefore, the process of effective digital transformation should begin with the involvement of experienced leaders in the field of digital transformation, analysis of the weaknesses of the organization, and building of a plan for its comprehensive transformation, which is divided into individual projects implemented by individual qualified teams with a gradual increase in the volume of these projects, while confirming their effectiveness on a small scale. the process of digital transformation should be accompanied by constant training of employees in digital skills. 
the goal of digital transformation is to build an efficient, high-performing company that can quickly adapt to new environmental conditions, which is achieved through the introduction of digital technologies and of new methods and tools of organizational management. directions of library development in the digital economy based on the study of the digital economy concept and the peculiarities of digital transformation, a review of library development in the digital economy was conducted to find the library’s place in digital infrastructure and to identify potential projects that can be implemented in an individual library as part of its comprehensive transformation plan. the main task is to determine the new role of the library in the digital economy and the areas of activity that best correspond to it. the search for directions in the development of the library in response to the spread of digital technology began at the end of the last century. one of the first concepts to reflect the impact of the internet on the library sector is the concept of the digital library, published in 1999.24 in 2006, the concept of “library 2.0” emerged, which is based on the use of web 2.0 technologies: dynamic sites, users as data authors, open-source software, api interfaces, and data added to one database being immediately fed to partner databases.25 the spread of social networks and mobile technologies, and their successful use in library practice, led to the formation of the concept of “library 3.0.”26 the development of open source, cloud services, big data, augmented reality, context-aware, and other technologies has influenced library activities, which is reflected in “library 4.0.”27 researchers, scholars, and the professional community continued to develop the concepts of the modern library, drawing on the experience of implementing changes in library activities and taking into account the development of other areas, and in 2020 articles began to appear that described the concept of “library 5.0,” based on a personalized approach to students, support of each student during the whole period of study, development of the skills necessary for learning, and a set of other supporting actions integrated into the educational process.28 in determining the current role of the library in the digital economy, it is necessary to pay attention to a study by denys solovianenko, who identifies research and educational infrastructure as one of the key elements of scientific libraries of the twenty-first century.29 olga stepanenko considers libraries part of the information and communication infrastructure, the development of which is one of the main tasks of transforming the socioeconomic environment in accordance with the needs of the digital economy; this infrastructure supports high efficiency of stakeholders and the pace of digitalization of the state economy, which occurs through the development of its constituent elements.30 that traditional library services are being replaced by digital infrastructure is demonstrated, using the example of the moravian library, in a study by michal indrak and lenka pokorna published in april 2020.31 projects that contribute to the library’s adaptation to the conditions of the digital economy, implemented in the environment of public libraries, include: digitization of library collections (including historical heritage) and the creation of a database of full-text documents; providing free access to the
internet via library computers and wi-fi; organization of online customer service, development of services that do not require a physical presence in the library; organization of events for the development of digital skills of users, work with information.32 under such conditions, the role of the librarian as a specialist in the field of information changes from being a custodian to being an intermediary, a distributor.33 one of the main objectives of library activity in the digital economy becomes overcoming a digital divide, dissemination of knowledge about modern technologies and innovations, the assistance of their use by the community, development of digital skills in all users of the library.34 an example of the digital public library is the digital north library project in canada, which resulted in the creation of the inuvialuit digital library (https://inuvialuitdigitallibrary.ca). the project lasted four years, bringing together researchers from different universities and the community in the region, who together digitized cultural heritage documents and created metadata. the library now has more than 5,200 digital resources collected in 49 catalogues. the implementation of this project provides access to library services and information to a significant number of people living in remote areas of northern canada and unable to visit libraries (https://sites.google.com/ualberta.ca/dln/home?authuser=0, https://inuvialuitdigitallibrary.ca).35 other representatives of modern digital libraries, one of the main tasks of which is the preservation of cultural heritage and the spread of national culture, are the british library (https://www.bl.uk), the hispanic digital library—biblioteca nacional de españa (http://www.bne.es), gallica digital library in france (https://gallica.bnf.fr), the german digital library—deutsche digitale bibliothek (https://www.deutsche-digitale-bibliothek.de), and the european library (https://www.europeana.eu). another direction was the development of analytical skills in information retrieval. academic libraries, operating with their competencies in information retrieval and information technology, which refined the results of the analysis were able to better identify trends in academia and expand cooperation with teachers to update their curricula.36 libraries become active participants https://inuvialuitdigitallibrary.ca/ https://sites.google.com/ualberta.ca/dln/home?authuser=0 https://inuvialuitdigitallibrary.ca/ https://www.bl.uk/ http://www.bne.es/ https://gallica.bnf.fr/ https://www.deutsche-digitale-bibliothek.de/ https://www.europeana.eu/ information technology and libraries december 2020 the role of the library in the digital economy | zharinov 7 in the process of teaching, learning, and assessment of acquired knowledge in educational institutions. t. o. kolesnikova, in her research of models of library development, substantiates the expediency of creating information intelligence centers for the implementation of the latest scientific advances in training and production processes, the involvement of libraries in the activities of higher educational establishments in the educational process, and the creation of centralized repositories as directions of development for university libraries of ukraine.37 one of the advantages of the development and dissemination of digital technologies is the possibility of forming individual curricula for students. 
involvement of university libraries in this area is one of the new directions of their activity in the digital economy.38 one of the important areas of operation for departmental and scientific-technical libraries that contributes to increasing the innovative potential of the country is activity in the area of intellectual property. consulting services in the field of intellectual property, information support for scientists, creation of electronic patent information databases in the public domain, and other related services are important components of library work in many countries.39 another important component of libraries’ transformation is the deepening of their role in scientific communication; expanding the boundaries of the use of information technology in order to integrate scientific information into a single network; and the creation and management of the information technology infrastructure of science.40 the presence of libraries on social networks has become an important component of their digital transformation. on the one hand, libraries have thus created another source of information dissemination and expanded the number of service delivery channels, for the implementation of which they have developed online training videos and interactive help services.41 on the other hand, social networks have become a marketing tool to engage the audience with the digital fund of the library and its online services. an additional important component of the presence of libraries on social networks was the establishment of contacts and exchange of ideas with other professional organizations, which contributed to the further expansion of the network of library partners.42 another area of activity that libraries take on in the digital economy is the management of research data, which is confirmed by the significant number of publications on this topic in professional scientific and research journals for 2017–18.43 joining this area allows libraries to become part of the scientific digital information and communication infrastructure, the creation of which is one of the main tasks of digital transformation on the way to the digital economy.44 the development of this area contributes to the digitalization of the scientific and information sphere; the systematization and structuring of all scientific research data has a positive effect on the effectiveness of research and on the level of scientific novelty of the results of intellectual activity. the ukrainian institute of the future, together with the digital agency of ukraine, considers digital transformation to be the integration of modern digital technologies into all spheres of business. the introduction of modern technologies (artificial intelligence, blockchain, cobots, digital twins, iiot platforms, and others) into the production process will lead to the transition to industry 4.0. according to their forecasts, the key competence in industry 4.0 should be data processing and analytics.45 research information is an integral part of this competence, so the development of this area is one of the most promising for the library in the digital economy.
the tools used in the management of research data are called current research information systems, abbreviated as cris. in ukraine, there is no such system connected to the international community.46 the change of the library’s role from a repository of data to its manager, the alignment of the functions and tasks of a cris with the key requirements of the digital economy, and the advantages of such systems, together with the fact that they are still not used in ukraine, make this area extremely relevant for research and a promising area of work for scientific libraries, so it is considered more thoroughly below. problems in research data management the global experience of research information management shows several problems in the process of research data management. some of them are related to the processes of workflow organization, control, and reporting. this is due to the use of several poorly coordinated systems to organize the work of scientists. data sets from different systems without metadata are very difficult to combine into a single system, and it is almost impossible to automate the process. all this is manifested in insufficient information support for the decision-making process in the field of science, both at the state level and at the level of individual structures. this situation can lead to wrong management decisions, to overspending on similar, duplicate projects, and to increasing the cost of recruiting and finding scientists with relevant experience for research and of finding the equipment needed for research. cris, which began to appear in europe in the 1990s, are designed to overcome these shortcomings and promote the effective organization of scientific work. such systems are now widespread throughout the world, with a total of about five hundred, mainly concentrated in europe and india. however, there is currently no research information management system in ukraine that meets international standards and integrates with international scientific databases. this omission slows down ukraine’s integration into the international scientific community. the solution to this problem may be the creation of the national electronic scientific information system uris (ukrainian research information system).47 the development of this system is an initiative of the ministry of education and science of ukraine. it is based on combining data from ukrainian scientific institutions with data from crossref and other organizations, as well as on ensuring integration with other international cris systems through the use of the cerif standard. future developers of the system face a number of challenges, both specific ones and ones already studied by foreign scientists. a significant number of studies in this area are designed to overcome the problem of lack of access to research data, as well as to solve problems of data standardization and openness. in the global experience, the management of collection processes and the development of structured data sets, their distribution on a commercial basis, and ways of benefiting from providing them in open access are investigated. the mechanisms of financing these processes are studied; in particular, effective ways of attracting patronage funds are analyzed. the possibilities of licensing the received data sets and their distribution, and the approaches and tools that can be most effective for the library, are determined.
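to make the kind of data integration described above more concrete, the following is a minimal, purely illustrative sketch, not taken from the uris project, of how publication metadata for a known doi can be retrieved from the public crossref rest api in python. the endpoint and response structure follow crossref’s public documentation; the function name and the choice of fields are assumptions made for this example.

```python
# illustrative sketch: fetching publication metadata from the public Crossref REST API.
# assumes the "requests" package is installed; the function name and selected fields
# are hypothetical choices for this example, not part of any particular CRIS.
import requests

def fetch_crossref_record(doi: str) -> dict:
    """Return a small, normalized metadata record for one DOI."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=30)
    resp.raise_for_status()
    work = resp.json()["message"]  # Crossref wraps the work record in "message"
    return {
        "doi": work.get("DOI"),
        "title": (work.get("title") or [""])[0],
        "journal": (work.get("container-title") or [""])[0],
        "year": (work.get("issued", {}).get("date-parts", [[None]])[0] or [None])[0],
        "authors": [
            f"{a.get('given', '')} {a.get('family', '')}".strip()
            for a in work.get("author", [])
        ],
    }

if __name__ == "__main__":
    # example: a DOI cited in the endnotes of this article
    print(fetch_crossref_record("10.1016/j.acalib.2015.08.020"))
```

in a real cris, a record retrieved this way would still have to be mapped to a cerif-compatible schema and merged with records from institutional sources; that mapping is beyond the scope of this sketch.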
among the studies mentioned above, alice wise describes the experience of settling some legal aspects by clarifying the use of the site in the license agreement, which covers the conditions of access to information and search in it while maintaining a certain level of anonymity.48 the problem of data consistency is related to the lack of uniform standards for information retention that would cover the format of the data, the metadata itself, and the methods of their generation and use. thus, the use of different standards and formats in repositories and archives leads to problems with data consistency for researchers, which, in turn, affects the quality of service delivery and makes it impossible to use multiple data sets together.49 another important problem for the dissemination of research data is the lack of tools and components in the libraries and repositories of higher educational establishments and scientific institutions. it is worth developing the infrastructure so that at the end of their projects, in addition to the research results, scientists publish the research data they used and generated. this approach will be convenient both for authors (in case they need to reuse the research data) and for other scientists (because they will have access to data that can be used in their own research).50 the development of the necessary tools is quite relevant, especially because, according to international surveys, researcher-practitioners are in favor of sharing the data they create with other researchers and of the licensed use of other people’s datasets in conducting their own research.51 another reason for the low prevalence of research data is that datasets have less of an impact on a researcher’s reputation and rating than publications.52 this is partly due to the lack of citation-tracking infrastructure for datasets, in contrast to the publication of research results, and the lack of standards for storing and publishing data. prestigious scientific journals have been struggling with this problem for several years. for example, the american economic review requires authors whose articles contain empirical work, modelling, or experimental work to provide information about research data in sufficient volume for replication.53 nature and science require authors to preserve research data and provide them at the request of the editors of the journals.54 one of the reasons for the underdeveloped infrastructure in research data management is the weak policy of disseminating free access to this data, as a result of which even the small share of usable scientific data that does exist remains closed by license agreements and cannot be used by other scientists.55 open science initiatives related to publications have been operating in the scientific field for a long time, but their extension to research data remains insufficient. the development of the uris system will provide management of scientific information, will solve the problems highlighted in the scientific works cited above, will promote the efficient use of funds, will simplify the process of finding data for conducting research, and will discipline research; it will therefore have a positive impact on the entire economy of ukraine. library and research information management library involvement in the development process for scientific information management systems will be an important future direction of their work.
such systems, which could include all the necessary information about scientific research, will contribute to the renewal and development of the library sphere of ukraine and will promote the transition of the state to a digital economy. the creation of the uris system is designed to provide access to research data generated by both ukrainian and foreign scientists. such a system can ensure the development of cooperation in the field of research, the intensification of knowledge exchange, and interaction through the open exchange of scientific data and the integration of ukrainian scientific infrastructure into the world scientific and information space. according to surveys conducted by the international organizations eurocris and oclc, of the 172 respondents working in the field of research information management, 83 percent said that libraries play an important role in the development of open science, copyright, and the deposit of research results. the share of libraries that play a major role in this direction was 90 percent. almost 68 percent of respondents noted the significant contribution of libraries in filling in the metadata needed to correctly identify the work of researchers in various databases; 60 percent noted the important role of libraries in verifying the correctness of metadata filled in by researchers; and almost 49 percent of respondents assess the role of libraries as the main one in the management of research data (figure 4). figure 4. the proportion of organizations among 172 users of cris systems that assess the role of libraries in the management of research information as basic or supporting.56 (the activity categories shown in the figure are: financial support for rim; project management; maintaining or servicing technical operations; impact assessment and reporting; strategic development, management and planning; creating internal reports for departments; system configuration; outreach and communication; initiating rim adoption; research data management; metadata validation workflows; metadata entry; training and support; and open access, copyright and deposit.) at the same time, the activity of libraries in the direction of assistance in the information management of scientific research can take various forms, which should be adopted by the scientific libraries of ukraine; some of these forms will also be useful to public libraries, which can become science ambassadors in their communities. based on the experience of foreign libraries, we have identified areas of activity in which the library can join the management of research information. one of the main directions for libraries that cooperate with cris users, or are themselves the organizers of such systems, is the introduction and support of open science. historically, libraries support open science because they provide access to scientific papers, but they can further expand their activities.
using open data resources and promoting them among the scientific community, involving scientific users in disseminating their own research results on the principles of open science, supporting users in disseminating their publications, creating conditions for increasing the citation of scientific papers, tracking information about user publications, and creating and supporting public profiles of scientists in scientific and professional resources and scientific social networks: all of this will help to encourage researchers to engage in open science and to take advantage of its benefits. the analysis of world experience shows that in the activity of scientific libraries there is a significant intensification of support for the strategic goals of the structures that finance their activities and to which they are subordinated. libraries are moving away from the usual customer service and expanding their activities through the use of their own assets and the introduction of new, modern tools. such libraries try to promote the development of their parent structures and to increase modern competencies in order to better meet the needs and goals of these institutions. by introducing and implementing various management development tools, libraries synchronize their strategy with the strategy of the parent structure to achieve a synergistic effect. the next important direction of library development is their socialization. wanting to get rid of the antiquated understanding of the word library, many of them conduct campaigns aimed at changing the image of the library in the imagination of users, communities, and society. an important component of this systemic step is to build relationships with the target audience, creating user communities around the library that are not only its users but also supporters, friends, and promoters. building relationships with members of the scientific community allows libraries to reduce resistance to the changes that come with the introduction of scientific information management systems and to influence users positively so that they introduce new tools into their usual activities, receive benefits, and become an active part of the process of structuring the scientific space. recently, work with metadata has undergone some changes. the need for identification and structuring of data in the world scientific space means that metadata are already filled in not only by libraries but also by other organizations that produce, issue, and publish scientific results and scientific literature. scientists are beginning to make more active use of modern standards in the field of information in order to promote their own work. libraries, in turn, take on the role of consultant or contractor with many years of experience working with metadata and sufficient knowledge in this area. on the other hand, the filling in of metadata by users frees up the time of librarians and creates conditions for them to perform other functions, such as information management and the creation of automated data collection and management systems integrated with scientific databases, both ukrainian and international. another area of research information management is the direct management of this process. thus, cris are developed and implemented with the contribution of scientific libraries in different countries of the world.
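as a purely illustrative aside, the following sketch shows one simple way a library might check that records entered by researchers carry the minimum metadata needed for correct identification before they are loaded into a cris. the field names and rules are hypothetical and are not drawn from cerif, uris, or any specific system.

```python
# illustrative sketch only: a minimal completeness check for publication metadata
# before ingest into a research information system. field names and rules are
# hypothetical, not taken from CERIF, URIS, or any particular CRIS.
import re

REQUIRED_FIELDS = ("title", "authors", "year", "doi")
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")  # rough shape of a DOI, not a full validator

def validate_record(record: dict) -> list:
    """Return a list of human-readable problems; an empty list means the record passes."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"missing required field: {field}")
    doi = record.get("doi", "")
    if doi and not DOI_PATTERN.match(doi):
        problems.append(f"doi does not look well formed: {doi}")
    year = record.get("year")
    if year and not (1900 <= int(year) <= 2100):
        problems.append(f"implausible publication year: {year}")
    return problems

if __name__ == "__main__":
    example = {
        "title": "imagining library 4.0",
        "authors": ["younghee noh"],
        "year": 2015,
        "doi": "10.1016/j.acalib.2015.08.020",
    }
    print(validate_record(example) or "record passes the minimal check")
```

in practice, a library would validate against the schema of the target system and rely on persistent identifiers (such as dois and orcid ids) rather than ad hoc rules; the point here is only to show the kind of routine metadata check that such work involves.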
such systems allow libraries to combine disparate data obtained from different sources, compile scientific reports, evaluate the effectiveness of the scientific activities of the institution, create profiles of scientific institutions and scientists, develop research networks, and so on. scientists and students can find the results of scientific research and look for partners and sources of funding for research. research managers have access to up-to-date scientific information, which allows them to assess more accurately the productivity and influence of individual scientists, research groups, and institutions. business representatives get access to up-to-date information on promising scientific developments, and the public gains a way to monitor whether research is being conducted effectively. conclusions ukraine is on the path to a digital economy, characterized by the penetration of new technologies into all areas of human activity, the simplification of access to information, goods, and services, the blurring of the geographical boundaries of companies, an increasing share of automated and robotic production units, and the strengthening role of the creation and use of databases. these changes affect all sectors of the economy, and all organizations, without exception, need to adapt accordingly. a rapid response to relevant changes helps to increase competitiveness both at the level of individual organizations and at the level of the state economy. adaptation to the conditions of the digital economy occurs through digital transformation, a complex process that requires a review of all business processes of the organization and radically changes its business model. the digital transformation of the organization takes place through the involvement of management that is competent in digitization, updating management methods, developing digital skills, establishing efficient production and services, implementing digital tools and building digital communication, implementing individual development projects, and adapting to new user needs. the digital transformation of the economy occurs through the transformation of its individual sectors, creating conditions for the transformation of their representatives. one of the first steps in the process of transition to the digital economy is the establishment of digital information and communication infrastructure. libraries are representatives of the information sphere and were the main operators of information in the analogue era. significant changes in the subject area of their activities require the search for a new role for libraries. modern projects and directions of library development are integral elements of transformation to the conditions of the digital economy. completing this complex transformation will allow libraries to update their management methods, the range of services, and the channels of their provision; change their fixed assets through digitization, structuring of data, and creation of metadata; affect approaches to communication with users and cooperation with both domestic and international partners; change the functions and positioning of the library; and will enable them to become effective information operator-managers. in the digital economy, the role of the library is changing from passively collecting and storing information to actively managing it.
one of the areas of development that most comprehensively meets this role is the management of research data, which is implemented through the creation of cris systems. thus, the main asset of libraries is a digital, structured database, which is automatically and regularly updated, the main focus of which is to support the decision-making process. the library becomes an assistant in conducting research, finding funding, partners, fixed assets and information; a partner in the strategic management of both scientific organizations and the state at the level of committees and ministries. information technology and libraries december 2020 the role of the library in the digital economy | zharinov 13 the development of this area in ukraine requires solving a number of technical, administrative, and managerial questions that are relevant not only in ukraine, but also around the world. in particular, libraries need to address the issue of data integration and consistency, its accessibility and openness, copyright, and personal data issues. solving the problems of creation and operation of cris systems in ukraine are promising areas for future research. endnotes 1 andriy dobrynin, konstantin chernykh, vasyl kupriyanovsky, pavlo kupriyanovsky and serhiy sinyagov, “tsifrovaya ekonomika—razlichnyie puti k effektivnomu primeneniyu tehnologiy (bim, plm, cad, iot, smart city, big data i drugie),” international journal of open information technologies 4, no. 1 (2016): 4–10, https://cyberleninka.ru/article/n/tsifrovayaekonomika-razlichnye-puti-k-effektivnomu-primeneniyu-tehnologiy-bim-plm-cad-iot-smartcity-big-data-i-drugie. 2 jurgen meffert, volodymyr kulagin, and alexander suharevskiy, digital @ scale: nastolnaya kniga po tsifrovizatsii biznesa (moscow: alpina, 2019). 3 victoria apalkova, “kontseptsiia rozvytku tsyfrovoi ekonomiky v yevrosoiuzi ta perspektyvy ukrainy,” visnyk dnipropetrovskoho universytetu. seriia «menedzhment innovatsii» 23, no. 4 (2015): 9–18, http://nbuv.gov.ua/ujrn/vdumi_2015_23_4_4. 4 don tapscott, the digital economy: promise and peril in the age of networked intelligence (new york: mcgraw-hill, 1996). 5 thomas l. mesenbourg, measuring the digital economy (washington, dc: bureau of the census, 2001). 6 philippe barbet and nathalie coutinet, “measuring the digital economy: state-of-the-art developments and future prospects,” communications & strategies, no. 42 (2001): 153, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.576.1856&rep=rep1&type=pdf . 7 alnoor bhimani, “digitization and accounting change,” in management accounting in the digital economy, edited by alnoor bhimani, 1-12 (london: oxford university press, 2003), https://doi.org/10.1093/0199260389.003.0001. 8 bo carlsson, “the digital economy: what is m=new and what is not?,” structural change and economic dynamics 15, no. 3 (september 2004): 245–64, https://doi.org/10.1016/j.strueco.2004.02.001. 9 john hand, “building digital economy—the research councils programme and the vision,” lecture notes of the institute for computer sciences, social informatics and telecommunications engineering 16, (2009): 3, https://doi.org/10.1007/978-3-642-11284-3_1. 10 carmen nadia ciocoiu, “integration digital economy and green economy: opportunities for sustainable development,” theoretical and empirical researches in urban management 6, no. 1 (2011): 33–43, https://www.researchgate.net/publication/227346561. 
https://cyberleninka.ru/article/n/tsifrovaya-ekonomika-razlichnye-puti-k-effektivnomu-primeneniyu-tehnologiy-bim-plm-cad-iot-smart-city-big-data-i-drugie https://cyberleninka.ru/article/n/tsifrovaya-ekonomika-razlichnye-puti-k-effektivnomu-primeneniyu-tehnologiy-bim-plm-cad-iot-smart-city-big-data-i-drugie https://cyberleninka.ru/article/n/tsifrovaya-ekonomika-razlichnye-puti-k-effektivnomu-primeneniyu-tehnologiy-bim-plm-cad-iot-smart-city-big-data-i-drugie http://nbuv.gov.ua/ujrn/vdumi_2015_23_4_4 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.576.1856&rep=rep1&type=pdf https://doi.org/10.1093/0199260389.003.0001 https://doi.org/10.1016/j.strueco.2004.02.001 https://doi.org/10.1007/978-3-642-11284-3_1 https://www.researchgate.net/publication/227346561 information technology and libraries december 2020 the role of the library in the digital economy | zharinov 14 11 lesya zenoviivna kit, “evoliutsiia merezhevoi ekonomiky,” visnyk khmelnytskoho natsionalnoho universytetu, ekonomichni nauky, no. 3 (2014): 187–94, http://nbuv.gov.ua/ujrn/vchnu_ekon_2014_3%282%29__42. 12 mykhailo voinarenko and larissa skorobohata, “merezhevi instrumenty kapitalizatsii informatsiino-intelektualnoho potentsialu ta innovatsii,” visnyk khmelnytskoho natsionalnoho universytetu, . ekonomichni nauky, no. 3 (2015): 18–24, http://elar.khnu.km.ua/jspui/handle/123456789/4259. 13 yurii pivovarov, “ukraina perehodut na “cifrovu economic,” sccho ce oznachae,” edited by miroslav liskovuch. ukrinform (january 21, 2020). https://www.ukrinform.ua/rubricsociety/2385945-ukraina-perehodit-na-cifrovu-ekonomiku-so-ce-oznacae.html. 14 european commission, “digital economy and society index,” brussels, belgium, https://ec.europa.eu/commission/news/digital-economy-and-society-index-2019-jun-11_en. 15 kabinet ministriv ukrainu, “pro skhvalennia kontseptsii rozvytku tsyfrovoi ekonomiky ta suspilstva ukrainy na 2018–2020 roky ta zatverdzhennia planu zakhodiv shchodo yii realizatsii,” (kyiv: 2018), https://zakon.rada.gov.ua/laws/show/67-2018-%d1%80. 16 kabinet ministriv ukrainu, “pytannia ministerstva tsyfrovoi transformatsii,” (kyiv: 2019), https://zakon.rada.gov.ua/laws/show/856-2019-%d0%bf. 17 piatuy, “biblioteky stanut pershymy oflain-khabamy: mintsyfry zapustyt kursy z tsyfrovoi osvity,” https://www.5.ua/suspilstvo/biblioteky-stanut-pershymy-oflain-khabamy-mintsyfryzapustyt-kursy-z-tsyfrovoi-osvity-206206.html. 18 jacques bughin, jonathan deaki, and barbara o’beirne, “digital transformation: improving the odds of success,” mckinsey & company, https://www.mckinsey.com/businessfunctions/mckinsey-digital/our-insights/digital-transformation-improving-the-odds-ofsuccess. 19 domynyk fyld, shylpa patel, and henry leon, “kak dostich tsifrovoy zrelosti,” the boston consulting group inc. (2018), https://www.thinkwithgoogle.com/_qs/documents/5685/ru_adwords_marketing___sales_89 1609_mastering_digital_marketing_maturity.pdf. 20 hortense de la boutetière, alberto montagner, and angelika reich, “unlocking success in digital transformations,” mckinsey & company, https://www.mckinsey.com/businessfunctions/organization/our-insights/unlocking-success-in-digital-transformations. 21 top lea, “tsyfrova transformatsiia biznesu: navishcho vona potribna i shche 14 pytan,” businessviews, https://businessviews.com.ua/ru/business/id/cifrova-transformacijabiznesu-navischo-vona-potribna-i-sche-14-pitan-2046. 
22 vasily kupriyanovsky, andrey dobrynin, sergey sinyagov, and dmitry namiot, “tselostnaya model transformatsii v tsifrovoy ekonomike—kak stat tsifrovyimi liderami,” international journal of open information technologies 5, no. 1 (2017): 26–33, http://nbuv.gov.ua/ujrn/vchnu_ekon_2014_3%282%29__42 http://elar.khnu.km.ua/jspui/handle/123456789/4259 https://www.ukrinform.ua/rubric-society/2385945-ukraina-perehodit-na-cifrovu-ekonomiku-so-ce-oznacae.html https://www.ukrinform.ua/rubric-society/2385945-ukraina-perehodit-na-cifrovu-ekonomiku-so-ce-oznacae.html https://ec.europa.eu/commission/news/digital-economy-and-society-index-2019-jun-11_en https://zakon.rada.gov.ua/laws/show/67-2018-%d1%80 https://zakon.rada.gov.ua/laws/show/856-2019-%d0%bf https://www.5.ua/suspilstvo/biblioteky-stanut-pershymy-oflain-khabamy-mintsyfry-zapustyt-kursy-z-tsyfrovoi-osvity-206206.html https://www.5.ua/suspilstvo/biblioteky-stanut-pershymy-oflain-khabamy-mintsyfry-zapustyt-kursy-z-tsyfrovoi-osvity-206206.html https://www.mckinsey.com/business-functions/mckinsey-digital/our-insights/digital-transformation-improving-the-odds-of-success https://www.mckinsey.com/business-functions/mckinsey-digital/our-insights/digital-transformation-improving-the-odds-of-success https://www.mckinsey.com/business-functions/mckinsey-digital/our-insights/digital-transformation-improving-the-odds-of-success https://www.thinkwithgoogle.com/_qs/documents/5685/ru_adwords_marketing___sales_891609_mastering_digital_marketing_maturity.pdf https://www.thinkwithgoogle.com/_qs/documents/5685/ru_adwords_marketing___sales_891609_mastering_digital_marketing_maturity.pdf https://www.mckinsey.com/business-functions/organization/our-insights/unlocking-success-in-digital-transformations https://www.mckinsey.com/business-functions/organization/our-insights/unlocking-success-in-digital-transformations https://businessviews.com.ua/ru/business/id/cifrova-transformacija-biznesu-navischo-vona-potribna-i-sche-14-pitan-2046 https://businessviews.com.ua/ru/business/id/cifrova-transformacija-biznesu-navischo-vona-potribna-i-sche-14-pitan-2046 information technology and libraries december 2020 the role of the library in the digital economy | zharinov 15 https://cyberleninka.ru/article/n/tselostnaya-model-transformatsii-v-tsifrovoy-ekonomikekak-stat-tsifrovymi-liderami. 23 nataliia kraus, alexander holoborodko, and kateryna kraus, “tsyfrova ekonomika: trendy ta perspektyvy avanhardnoho kharakteru rozvytku,” efektyvna ekonomika no. 1 (2018): 1–7, http://www.economy.nayka.com.ua/pdf/1_2018/8.pdf. 24 david bawden and ian rowlands, “digital libraries: assumptions and concepts,” international journal of libraries and information studies (libri), no. 49 (1999): 181–91, https://doi.org/10.1515/libr.1999.49.4.181. 25 jack m. maness, “library 2.0: the next generation of web-based library services,” logos 13, no. 3 (2006): 139–45, https://doi.org/10.2959/logo.2006.17.3.139. 26 woody evans, building library 3.0: issues in creating a culture of participation (oxford: chandos publishing, 2009). 27 younghee noh, “imagining library 4.0: creating a model for future libraries,” the journal of academic librarianship 41, no. 6 (november 2015): 786–97, https://doi.org/10.1016/j.acalib.2015.08.020. 28 helle guldberg et al., “library 5.0,” septentrio conference series, uit the arctic university of norway, no. 3 (2020), https://doi.org/10.7557/5.5378. 29 denys solovianenko, “akademichni biblioteky u novomu sotsiotekhnichnomu vymiri. chastyna chetverta. 
suchasnyi riven dyskursu akademichnoho bibliotekoznavstva ta postup e-nauky,” bibliotechnyi visnyk no.1 (2011): 8–24, http://journals.uran.ua/bv/article/view/2011.1.02. 30 olga petrivna stepanenko, “perspektyvni napriamy tsyfrovoi transformatsii v konteksti rozbudovy tsyfrovoi ekonomiky,” in modeliuvannia ta informatsiini systemy v ekonomitsi : zb. nauk. pr., edited by v. k. halitsyn, (kyiv: kneu, 2017), 120–31, https://ir.kneu.edu.ua/bitstream/handle/2010/23788/120131.pdf?sequence=1&isallowed=y. 31 michal indrák and lenka pokorná, “analysis of digital transformation of services in a research library,” global knowledge, memory and communication (2020), https://doi.org/10.1108/gkmc-09-2019-0118. 32 irina sergeevna koroleva, “biblioteka—optimalnaya model vzaimodeystviya s polzovatelyami v usloviyah tsifrovoy ekonomiki,” informatsionno-bibliotechnyie sistemyi, resursyi i tehnologii no. 1 (2020): 57–64, https://doi.org/10.20913/2618-7515-2020-1-57-64. 33 james currall and michael moss, “we are archivists, but are we ok?”, records management journal 18, no. 1 (2008): 69–91, https://doi.org/10.1108/09565690810858532. 34 kirralie houghton, marcus foth and evonne miller, “the local library across the digital and physical city: opportunities for economic development,” commonwealth journal of local governance no. 15 (2014): 39–60, https://doi.org/10.5130/cjlg.v0i0.4062. https://cyberleninka.ru/article/n/tselostnaya-model-transformatsii-v-tsifrovoy-ekonomike-kak-stat-tsifrovymi-liderami https://cyberleninka.ru/article/n/tselostnaya-model-transformatsii-v-tsifrovoy-ekonomike-kak-stat-tsifrovymi-liderami http://www.economy.nayka.com.ua/pdf/1_2018/8.pdf https://doi.org/10.1515/libr.1999.49.4.181 https://doi.org/10.2959/logo.2006.17.3.139 https://doi.org/10.1016/j.acalib.2015.08.020 https://doi.org/10.7557/5.5378 http://journals.uran.ua/bv/article/view/2011.1.02 https://ir.kneu.edu.ua/bitstream/handle/2010/23788/120-131.pdf?sequence=1&isallowed=y https://ir.kneu.edu.ua/bitstream/handle/2010/23788/120-131.pdf?sequence=1&isallowed=y https://doi.org/10.1108/gkmc-09-2019-0118 https://doi.org/10.20913/2618-7515-2020-1-57-64 https://doi.org/10.1108/09565690810858532 https://doi.org/10.5130/cjlg.v0i0.4062 information technology and libraries december 2020 the role of the library in the digital economy | zharinov 16 35 sharon farnel and ali shiri, “community-driven knowledge organization for cultural heritage digital libraries: the case of the inuvialuit settlement region,” advances in classification research online no. 1 (2019): 9–12, https://doi.org/10.7152/acro.v29i1.15453. 36 elizabeth tait, konstantina martzoukou, and peter reid, “libraries for the future: the role of it utilities in the transformation of academic libraries,” palgrave communications no. 2 (2016): 1–9, https://doi.org/10.1057/palcomms.2016.70. 37 tatiana alexandrovna kolesnykova, “suchasna biblioteka vnz: modeli rozvytku v umovakh informatyzatsii,” bibliotekoznavstvo. dokumentoznavstvo. informolohiia no. 4 (2009): 57–62, http://nbuv.gov.ua/ujrn/bdi_2009_4_10. 38 ekaterina kudrina and karina ivina, “digital environment as a new challenge for the university library,”bulletin of kemerovo state university. series: humanities and social sciences 2, no. 10 (2019): 126–34, https://doi.org/10.21603/2542-1840-2019-3-2-126-134. 39 anna kochetkova, “tsyfrovi biblioteky yak oznaka xxi stolittia,” svitohliad no. 6 (2009): 68–73, https://www.mao.kiev.ua/biblio/jscans/svitogliad/svit-2009-20-6/svit-2009-20-6-68kochetkova.pdf. 
40 victoria alexandrovna kopanieva, “naukova biblioteka: vid e-katalohu do e-nauky,” bibliotekoznavstvo. dokumentoznavstvo. informolohiia no. 6 (2016): 4–10, http://nbuv.gov.ua/ujrn/bdi_2016_3_3. 41 christy r. stevens, “reference reviewed and re-envisioned: revamping librarian and deskcentric services with libstars and libanswers,” the journal of academic librarianship 39, no. 2 (march 2013): 202–14, https://doi.org/10.1016/j.acalib.2012.11.006. 42 samuel kai-wah chu and helen s du, “social networking tools for academic libraries,” journal of librarianship and information science 45, no. 1 (february 17, 2012): 64–75, https://doi.org/10.1177/0961000611434361. 43 acrl research planning and review committee, “2018 top trends in academic libraries a review of the trends and issues affecting academic libraries in higher education,” c&rl news 79, no.6 (2018): 286–300. https://doi.org/10.5860/crln.79.6.286. 44 currall and moss, “we are archivists, but are we ok?”, 69–91, https://doi.org/10.1108/09565690810858532. 45 valerii fishchuk et al., “ukraina 2030e— kraina z rozvynutoiu tsyfrovoiu ekonomikoiu,” ukrainskyi instytut maibutnoho, 2018, https://strategy.uifuture.org/kraina-z-rozvinutoyucifrovoyu-ekonomikoyu.html. 46 eurocris, “search the directory of research information system (dris),” https://dspacecris.eurocris.org/cris/explore/dris. 47 mon, “mon zapustylo novyi poshukovyi servis dlia naukovtsiv—vin bezkoshtovnyi ta bazuietsia na vidkrytykh danykh z usoho svituю,” https://mon.gov.ua/ua/news/mon https://doi.org/10.7152/acro.v29i1.15453 https://doi.org/10.1057/palcomms.2016.70 http://nbuv.gov.ua/ujrn/bdi_2009_4_10 https://doi.org/10.21603/2542-1840-2019-3-2-126-134 https://www.mao.kiev.ua/biblio/jscans/svitogliad/svit-2009-20-6/svit-2009-20-6-68-kochetkova.pdf https://www.mao.kiev.ua/biblio/jscans/svitogliad/svit-2009-20-6/svit-2009-20-6-68-kochetkova.pdf http://nbuv.gov.ua/ujrn/bdi_2016_3_3 https://doi.org/10.1016/j.acalib.2012.11.006 https://doi.org/10.1177/0961000611434361 https://doi.org/10.5860/crln.79.6.286 https://doi.org/10.1108/09565690810858532 https://strategy.uifuture.org/kraina-z-rozvinutoyu-cifrovoyu-ekonomikoyu.html https://strategy.uifuture.org/kraina-z-rozvinutoyu-cifrovoyu-ekonomikoyu.html https://dspacecris.eurocris.org/cris/explore/dris https://mon.gov.ua/ua/news/mon-zapustilo-novij-poshukovij-servis-dlya-naukovciv-vin-bezkoshtovnij-ta-bazuyetsya-na-vidkritih-danih-z-usogo-svitu information technology and libraries december 2020 the role of the library in the digital economy | zharinov 17 zapustilo-novij-poshukovij-servis-dlya-naukovciv-vin-bezkoshtovnij-ta-bazuyetsya-navidkritih-danih-z-usogo-svitu. 48 nancy herther et al., “text and data mining contracts: the issues and needs,” proceedings of the charleston library conference, 2016, https://doi.org/10.5703/1288284316233. 49 karen hogenboom and michele hayslett, “pioneers in the wild west: managing data collections.” portal: libraries and the academy 17, no. 2 (2017): 295–319, https://doi.org/10.1353/pla.2017.0018. 50 philip young et al., “library support for text and data mining,” a report for the university libraries at virginia tech, 2017, http://bit.ly/2fccowu. 51 carol tenopir et al., “data sharing by scientists: practices and perceptions,” plos one 6 (2011), no. 6, https://doi.org/10.1371/journal.pone.0021101. 
president's message: lita now
andrew k. pace
information technology and libraries | march 2009

andrew k. pace (pacea@oclc.org) is lita president 2008/2009 and executive director, networked library services at oclc inc. in dublin, ohio.

at the time of this writing, my term as lita president is half over; by the time of publication, i will be in the home stretch—a phrase that, to me, always connotes relief and satisfaction that is never truly realized. i hope that this time between ala conferences is a time of reflection for the lita board, committees, interest groups, and the membership at large. various strategic planning sessions are, i hope, leading us down a path of renewal and regeneration of the division. of course, the world around us will have its effect—in particular, a political and economic effect.

first, the politics. i was asked recently to give my opinion about where the new administration should focus its attention regarding library technology. i had very little time to think of a pithy answer to this question, so i answered with my gut that the united states needs to continue its investment in it infrastructure so that we are on par with other industrialized nations while also lending aid to countries that are lagging behind. furthermore, i thought it an apt time to redress issues of data privacy and retention. the latter is often far from our minds in a world more connected, increasingly through wireless technology, and with a user base that, as one privacy expert put it, would happily trade a dna sample for an extra value meal.
i will resist the urge to write at greater length a treatise on the bill of rights and its status in 2008. i will hope, however, that lita's technology and access and legislation and regulation committees will feel reinvigorated post–election and post–inauguration to look carefully at the issues of it policy. our penchant for new tools should always be guided and tempered by the implementation and support of policies that rationalize their use.

as for the economy, it is our new backdrop. one anecdotal view of this is the number of e-mails i've received from committee appointees apologizing that they will not be able to attend ala conferences as planned because of the economic downturn and local cuts to library budgets. libraries themselves are in a paradoxical situation—increasing demand for the free services that libraries offer while simultaneously facing massive budget cuts that support the very collections and programs people are demanding. what can we do? well, i would suggest that we look at library technology through a lens of efficiency and cost savings, not just from a perspective of what is cool or trendy. when it comes to running systems, we need to keep our focus on end-user satisfaction while considering total cost of ownership. and if i may be selfish for a moment, i hope that we will not abandon our professional networks and volunteer activities. while we all make sacrifices of time, money, and talent to support our profession, it is often tempting when economic times are hard to isolate ourselves from the professional networks that sustain us in times of plenty.

politics and economics? though i often enjoy being cynical, i also try to make lemonade from lemons whenever i can. i think there are opportunities for libraries to get their own economic bailout in supporting public works and emphasizing our role in contributing to the public good. we should turn our "woe-are-we" tendencies that decry budget cuts and low salaries into championed stories of "what libraries have done for you lately." and we should go back to the roots of it, no matter how mythical or anachronistic, and think about what we can do technically to improve systemwide efficiencies. i encourage the membership to stay involved and reengage, whether through direct participation in lita activities or through a closer following of the activities in the ala office of information technology policy (oitp, www.ala.org/ala/aboutala/offices/oitp) and the ala washington office itself. there is much to follow in the world that affects our profession, and so many are doing the heavy lifting for us. all we need to do sometimes is pay attention. make fun of me if you want for stealing a campaign phrase from richard nixon, but i kept coming back to it in my head. in short, library information technology—now more than ever.

letter from the editor: reviewers wanted
kenneth j. varnum
information technology and libraries | march 2021
https://doi.org/10.6017/ital.v40i1.13xxx

together with one of the other journals published by ala's core division, information technology and libraries (ital) and library leadership and management (ll&m) invite applications for peer reviewers. serving as a reviewer is a great opportunity for individuals from all types of libraries and with a wide variety of experience to contribute to scholarship within our chosen profession. we are seeking the broadest pool of reviewers possible.
reviewers for both journals are expected to have an interest in or experience with the journal's topics, as described below. reviewers should expect to review 2-4 articles a year and should provide thoughtful and actionable comments to authors and the editor. reviewers will work with the editor, associate editor, and/or editorial board of the corresponding journal. see the job description for ital reviewers for more details about this new role.

we welcome applications from individuals at libraries of all types, levels of experience, locations, perspectives, and voices, especially those from underrepresented groups. reviewers will be selected to maximize the diversity of representation across these areas, so if you're not sure if you should apply, please do!

increasing the pool of reviewers for information technology and libraries is part of the editorial board's desire to provide equitable treatment to submitted articles and will enable us to follow a more typical process for peer-reviewed journals: a two-reviewer double-blind process. that will be a welcome and, frankly, overdue change to ital's current process, in which submitted articles are typically reviewed by one person. expanding the number of reviewers across the breadth of subject areas our journal covers will foster a more rigorous yet more open review process.

should you be more interested in the policy side of this journal, please watch out for a call for volunteers for the ital editorial board. that process will start in april.

* * * * * * *

as this issue of the journal goes online, covid as a global health crisis has just entered its second year. i'm constantly reminded of the duality of our collective ability to show resilience and exhibit fragility as we continue to endure this period. when i wrote the letter from the editor a year ago (https://doi.org/10.6017/ital.v39i1.12137), i focused on the imminent vote to establish a new ala division, core, as the most important question facing me. how quickly things changed! by the time the march 2020 issue was published, everything was different. wherever you are, however you have adapted to the situation, i hope you are well and, like me, are turning from wondering when this period will end, to wondering what "normal" will be in the post-pandemic world.

kenneth j. varnum, editor
varnum@umich.edu
march 2021

leadership and infrastructure and futures…oh my!
letter from the core president: leadership, infrastructure, futures
christopher cronin
information technology and libraries | december 2020
https://doi.org/10.6017/ital.v39i4.13027

christopher cronin (cjc2260@columbia.edu) is core president and associate university librarian for collections, columbia university. © 2020.

i am so pleased to be able to welcome all ital subscribers to core: leadership, infrastructure, futures! this issue marks the first of ital since the election of core's inaugural leadership.
a merger of what was formerly three separate ala divisions—the association of library collections & technical services (alcts), library & information technology association (lita), and the library leadership & management association (llama)—core is an experiment of sorts. it is, in fact, multiple experiments in unification, in collaboration, in compromise, in survival. while initially born out of a sheer fight or flight response to financial imperatives and the need for organizational effectiveness, developing core as a concept and as a model for an enduring professional association very quickly became the real motivation for those of us deeply embedded in its planning. core is very deliberately not an all-caps acronym representing a single subset of practitioners within the library profession. it is instead an assertion of our collective position at the center of our profession. it is a place where all those working in libraries, archives, museums, historical societies—information and cultural heritage broadly—will find reward and value in membership and a professional home. all organizations need effective leaders, strong infrastructure, and a vision for the future. and that is what core strives to build with and for its members.

while i welcome ital's readers into core, i also welcome core's membership into ital. no longer publications of their former divisions, all three journals within core have an opportunity to reconsider their mandates. as with all things, audience matters. ital's readership has now expanded dramatically, and those new readers must be invited into ital's world just as much as ital has been invited into theirs. as we embark on this first year of the new division, we do so with a sense of altogether newness more than of a mere refresh, and a sense of still becoming more than a sense of having always been. and who doesn't want to reinvent themselves every once in a while? start over. move away from the bits that aren't working so well, prop up those other bits that we know deserve more, and venture into some previously uncharted territory. how will being part of this effort, and of an expanded division, reframe ital's mandate? the importance of information technology has never been more apparent.

it is not lost on me that we do this work in core during a year of unprecedented tumult. in 2020, a murderous global pandemic was met with unrelenting political strife, pervasive distribution of misinformation and untruths, devastating weather disasters, record-setting unemployment, heightened attention on an array of omnipresent social justice issues, and a racial reckoning that demands we look both inward and outward for real change. individually and collectively, we grieve so many losses—loss of life, loss of income, loss of savings, loss of homes, loss of dignity, loss of certainty, loss of control, loss of physical contact. and throughout all of these challenges, what have we relied on more this year than technology? technology kept us productive and engaged. it provided a focal point for communication and connection. it provided venues for advocacy, expression, inspiration, and, as a counterpoint to that pervasive distribution of misinformation, it provided mechanisms to amplify the voices of the oppressed and marginalized. for some, but unfortunately not all, technology also kept us employed.
and as the physical doors of our organizations closed, technology provided us with new ways to invite our users in, to continue to meet their information needs, and to exceed all of our expectations for what was possible even with closed physical doors. and yet our reliance on and celebration of technology in this moment has also placed another critical spotlight on the devastating impact of digital poverty on those who continue to lack access, and by extension also a spotlight on our privilege. in her parting words to you in the final issue of ital as a lita journal (https://doi.org/10.6017/ital.v39i3.12687), evviva weinraub lajoie, the last president of lita, wrote:

we may have always known that inequities existed, that the system was structured to make sure that some folks were never able to get access to the better goods and services, but for many, this pandemic is the first time we have had those systemic inequities held up to our noses and been asked, "what are you going to do to change this?" balancing those priorities will require us to lean on our professional networks and organizations to be more and to do more. i believe that together, we can make core stand up to that challenge.

i believe we will do this, too, and with a spirit of reinvention that is guided by principles and values that don't just inspire membership but also improve our professional lives and experience in tangible ways. it was a privilege to have served as the final president of alcts and such a humbling and daunting responsibility to now transition into serving as core's first. it is a responsibility i do not take lightly, particularly in this moment when so much is demanded of us. as we strive for equity and inclusion, we do so knowing that we are only as strong as every member's ability to bring their whole selves to this work. we must work together to make our professional home everything we need it to be and to help those who need us. it is yours, it is theirs, it is ours.

an evidence-based review of academic web search engines, 2014-2016: implications for librarians' practice and research agenda
jody condit fagan
https://doi.org/10.6017/ital.v36i2.9718

abstract

academic web search engines have become central to scholarly research. while the fitness of google scholar for research purposes has been examined repeatedly, microsoft academic and google books have not received much attention. recent studies have much to tell us about google scholar's coverage of the sciences and its utility for evaluating researcher impact. but other aspects have been understudied, such as coverage of the arts and humanities, books, and non-western, non-english publications. user research has also tapered off. a small number of articles hint at the opportunity for librarians to become expert advisors concerning scholarly communication made possible or enhanced by these platforms. this article seeks to summarize research concerning google scholar, google books, and microsoft academic from the past three years with a mind to informing practice and setting a research agenda. selected literature from earlier time periods is included to illuminate key findings and to help shape the proposed research agenda, especially in understudied areas.
introduction

recent pew internet surveys indicate an overwhelming majority of american adults see themselves as lifelong learners who like to "gather as much information as [they] can" when they encounter something unfamiliar (horrigan 2016). although significant barriers to access remain, the open access movement and search engine giants have made full text more available than ever.1 the general public may not begin with an academic search engine, but google may direct them to google scholar or google books. within academia, students and faculty rely heavily on academic web search engines (especially google scholar) for research; among academic researchers in high-income areas, academic search engines recently surpassed abstracts & indexes as a starting place for research (inger and gardner 2016, 85, fig. 4). given these trends, academic librarians have a professional obligation to understand the role of academic web search engines as part of the research process.

jody condit fagan (faganjc@jmu.edu) is professor and director of technology, james madison university, harrisonburg, va.
1 khabsa and giles estimate "almost 1 in 4 of web accessible scholarly documents are freely and publicly available" (2014, 5).

two recent events also point to the need for a review of research. legal decisions in 2016 confirmed google's right to make copies of books for its index without paying or even obtaining permission from copyright holders, solidifying the company's opportunity to shape the online experience with respect to books. meanwhile, microsoft rebooted their academic web search engine, now called microsoft academic. at the same time, information scientists, librarians, and other academics conducted research into the performance and utility of academic web search engines. this article seeks to review the last three years of research concerning academic web search engines, make recommendations related to the practice of librarianship, and propose a research agenda.

methodology

a literature review was conducted to find articles, conference presentations, and books about the use or utility of google books, google scholar, and microsoft academic for scholarly use, including comparisons with other search tools. because of the pace of technological change, the focus was on recent studies (2014 through 2016, inclusive). a search was conducted on "google books" in ebsco's library and information science and technology abstracts (lista) on december 19, 2016, limited to 2014-2016. of the 46 results found, most were related to legal activity. only four items related to the tool's use for research. these four titles were entered into google scholar to look for citing references, but no additional relevant citations were found. in the relevant articles found, the literature reviews testified to the general lack of studies of google books as a research tool (abrizah and thelwall 2014; weiss 2016), with a few exceptions concerning early reviews of metadata, scanning, and coverage problems (weiss 2016). a search on "google books" in combination with "evaluation or review or comparison" was also submitted to jmu's discovery service,2 limited to 2014-2016. forty-nine items were found and from these, three relevant citations were added; these were also entered into google scholar to look for citing references. however, no additional relevant citations were found.
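the screening workflow described here (database searches, citing-reference chasing in google scholar, and removal of duplicates before tallying) is easy to script once result sets are exported. the sketch below is a minimal illustration of the deduplication step only, assuming exported records with title and year fields; it is not the author's actual procedure, and the sample records and field names are invented for the example.

```python
# illustrative sketch of the deduplication step: collapse citations exported
# from several databases onto a normalized (title, year) key before counting
# unique items. the records below are invented for the example.
import re

def dedup_key(citation):
    # lowercase, strip punctuation, and collapse whitespace in the title
    title = re.sub(r"[^a-z0-9 ]", " ", citation["title"].lower())
    return (" ".join(title.split()), citation["year"])

def merge(*result_sets):
    merged = {}
    for source, citations in result_sets:
        for citation in citations:
            entry = merged.setdefault(dedup_key(citation),
                                      {"citation": citation, "sources": set()})
            entry["sources"].add(source)
    return merged

lista_hits = [{"title": "Google Books as a research tool?", "year": 2016}]
discovery_hits = [{"title": "google books as a research tool", "year": 2016}]

unique = merge(("lista", lista_hits), ("discovery", discovery_hits))
print(len(unique))  # 1: both exports collapse onto the same key
```

normalizing on title and year is crude, but it is usually enough to collapse the same citation exported from two databases; borderline matches still need the kind of manual screening described above.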
thus, a total of seven citations from 2014-2016 were found with relevant information concerning google books. earlier citations from the articles’ bibliographies were also reviewed when research was based on previous work, and to inform the development of a fuller research agenda. a search on “microsoft academic” in lista on february 3, 2017 netted fourteen citations from 2014-2016. only seven seemed to focus on evaluation of the tool for research purposes. a search on “microsoft academic” in combination with terms “evaluation or review or comparison” was also submitted to jmu’s discovery service, limited to 2014-2016. eighteen items were found but no additional citations were added, either because they had already been found or were not relevant. the seven titles found in lista were searched in google scholar for citing references; four additional relevant citations were found, plus a paper relevant to google scholar not 2 jmu’s version of ebsco discovery service contained 453,754,281 items at the time of writing and is carefully vetted to contain items of curricular relevance to the jmu community (fagan and gaines 2016). information technology and libraries | june 2017 9 previously discovered (weideman 2015). thus, a total of eleven citations were found with relevant information for this review concerning microsoft academic. because of this small number, several articles prior to 2014 were included in this review for historical context. an initial search was performed on “google scholar” in lista on november 19, 2016, limited to 2014-2016. this netted 159 results, of which 24 items were relevant. a search on “google scholar” in combination with terms “evaluation or review or comparison” was also submitted to jmu’s discovery tool limited to 2014-2016, and eleven relevant citations were added. items older than 2014 that were repeatedly cited or that formed the basis of recent research were retrieved for historical context. finally, relevant articles were submitted to google scholar, which netted an additional 41 relevant citations. altogether, 70 citations were found to articles with relevant information for this review concerning google scholar in 2014-2016. readers interested in literature reviews covering google scholar studies prior to 2014 are directed to (gray et al. 2012; erb and sica 2015; harzing and alakangas 2016b). findings google books google books (https://books.google.com) contains about 30 million books, approaching the library of congress’s 37 million, but far shy of google’s estimate of 130 million books in existence (wu 2015), which google intends to continue indexing (jackson 2010). content in google books includes publisher-supplied, self-published, and author-supplied content (harper 2016) as well as the results of the famous google books library project. started in december 2004 as the “google print” project,3 the project involved over 40 libraries digitizing works from their collections, with google indexing and performing ocr to make them available in google books (weiss 2016; mays 2015). scholars have noted many errors with google books metadata, including misspellings, inaccurate dates, and inaccurate subject classifications (harper 2016; weiss 2016). google does not release information about the database’s coverage, including which books are indexed or which libraries’ collections are included (abrizah and thelwall 2014). researchers have suggested the database covers mostly u.s. and english-language books (abrizah and thelwall 2014; weiss 2016). 
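because google does not publish coverage lists, the studies cited here probed coverage by searching known titles one at a time. a small script can make the same kind of spot check repeatable; the sketch below uses the public google books volumes api (assumed to remain available in its current form, with the usual quota limits) to ask whether a given title is findable in the index. it illustrates the probing idea and is not a method used by any of the studies above.

```python
# illustrative sketch: spot-check whether a title can be found in google books,
# in the spirit of the manual coverage probes described above. the volumes
# endpoint and its "q" parameter belong to the public google books api;
# responses and quotas may change, so treat this as a starting point only.
import json
import urllib.parse
import urllib.request

def probe_google_books(title):
    query = urllib.parse.quote(f'intitle:"{title}"')
    url = ("https://www.googleapis.com/books/v1/volumes"
           f"?q={query}&maxResults=5")
    with urllib.request.urlopen(url) as response:
        data = json.load(response)
    total = data.get("totalItems", 0)
    titles = [item.get("volumeInfo", {}).get("title", "")
              for item in data.get("items", [])]
    return total, titles

if __name__ == "__main__":
    total, titles = probe_google_books("digital libraries")
    print(total, titles[:3])
```

a zero hit count does not prove a book is absent from the index (metadata errors of the kind noted above can hide it), so programmatic probes are best read alongside manual checks.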
the conveniences of google books include limits by the type of book availability (e.g. free ebooks vs. google e-books), document type, and date. the detail view of a book allows magnification, hyperlinked tables of contents, buying and “find in a library” options, “my library,” and user history (whitmer 2015). google books also offers textbook rental (harper 2016) and limited print-on-demand services for out-of-print books (mays 2015; boumenot 2015). in april 2016, the supreme court affirmed google’s right to make copies for its index without paying or even obtaining permission from copyright holders (authors guild 2016; los angeles times 2016). scanning of library books and “snippet view” was deemed fair use: “the purpose of the copying is highly transformative, the public display of text is limited, and the revelations do 3 https://www.google.com/googlebooks/about/history.html an evidence-based review of academic web search engines, 2014-2016| fagan | https://doi.org/10.6017/ital.v36i2.9718 10 not provide a significant market substitute for the protected aspects of the originals” (u.s. court of appeals for the second circuit 2015). literature concerning high-level implications of google books suggests the tool is having a profound effect on research and scholarship. the tool has been credited for serving as “a huge laboratory” for indexing, interpretation, working with document image repositories, and other activities (jones 2010). at the same time, the academic community has expressed concerns about google books’s effects on social justice and how its full-text search capability may change the very nature of discovery (hoffmann 2014; hoffmann 2016; szpiech 2014). one study found that books are far more prevalently cited in wikipedia than are research articles (kousha and thelwall 2017). yet investigations of google books’ coverage and utility as a research tool seem to be sorely lacking. as weiss noted, “no critical studies seem to exist on the effect that google books might have on the contemporary reference experience” (weiss 2016, 293). furthermore, no information was found concerning how many users are taking advantage of google books; the tool was noticeably absent from surveys such as (inger and gardner's (2016) and from research centers such as the pew internet research project. in a largely descriptive review, harper (2016) bemoaned google books’ lack of integration with link resolvers and discovery tools, and judged it lacking in relevant material for the health sciences, because so much of the content is older. she also noted the majority of books scanned are in english, which could skew scholarship. the non-english skew of google books was also lamented by weiss, who noted an “underrepresentation of spanish and overestimation of french and german (or even japanese for that matter)” especially as compared to the number of spanish speakers in the united states (weiss 2016, 286-306). whitmer (2015) and mays (2015) provided practical information about how google books can be used as a reference tool. whitmer presented major google books features and challenged librarians to teach google books during library instruction. mays conducted a cursory search on the 1871 chicago fire and described the primary documents she retrieved as “pure gold,” including records of city council meetings, notes from insurance companies, reports from relief societies, church sermons on the fire, and personal memoirs (mays 2015, 22). 
mays also described google books as a godsend to genealogists for finding local records (e.g. police departments, labor unions, public schools). in her experience, the geographic regions surrounding the forty participating google books library project libraries are “better represented than other areas” (mays 2015, 25). mays concludes, “its poor indexing and search capabilities are overshadowed by the ease of its fulltext search capabilities and the wonderful ephemera that enriches its holdings far beyond mere ‘books’” (mays 2015, 26). abrizah and thelwall (2014) investigated whether google books and google scholar provided “good impact data for books published in non-western countries.” they used a comprehensive list of arts, humanities, and social sciences books (n=1,357) from the five main university presses in information technology and libraries | june 2017 11 malaysia 1961-2013. they found only 23% of the books were cited in google books4 and 37% in google scholar (p. 2502). the overlap was small: only 15% were cited in both google scholar and google books. english-language books were more likely to be cited in google books; 40% of english language books were cited versus 16% malay. examining the top 20 books cited in google books, researchers found them to be mostly written in english (95% in google books vs 29% in the sample), and published by university of malaysia press (60% in google books vs 26% in the sample) (2505). the authors concluded that due to the low overlap between google scholar and google books, searching both engines was required to find the most citations to academic books. kousha and thelwall (2015; 2011) compared google books with thomson reuters book citation index (bkci) to examine its suitability for scholarly impact assessment and found google books to have a clear advantage over bkci in the total number of citations found within the arts and humanities, but not for the social sciences or sciences. they advised combining results from bkci with google books when performing research impact assessment for the arts and humanities and social sciences, but not using google books for the sciences, “because of the lower regard for books among scientists and the lower proportion of google books citations compared to bkci citations for science and medicine” (kousha and thelwall 2015, 317). microsoft academic microsoft academic (https://academic.microsoft.com) is an entirely new software product as of 2016. therefore, the studies cited prior to 2016 refer to entirely different search engines than the one currently available. however, a historical account of the tool and reviewers’ opinions was deemed helpful for informing a fuller picture of academic web search engines and pointing to a research agenda. microsoft academic was born as windows live academic in 2006 (carlson 2006), was renamed live search academic after a first year of struggle (jacsó 2008), and was scrapped two years later after the company recognized it did not have sufficient development support in the united states (jacsó 2011). microsoft asia research group launched a beta tool called libra in 2009, which redirected to the “microsoft academic search” service by 2011. early reviews of the 2011 edition of microsoft academic search were promising, although the tool clearly lacked the quantity of data searched by google scholar (jacsó 2011; hands 2012). there were a few studies involving microsoft academic search in 2014. 
ortega and aguillo (2014) compared microsoft academic search and google scholar citations for research evaluation and concluded “microsoft academic search is better for disciplinary studies than for analyses at institutional and individual levels. on the other hand, google scholar citations is a good tool for individual assessment because it draws on a wider variety of documents and citations” (1155). 4 google books does not support citation searching; the researchers searched for the book title to manually find citations to a book. an evidence-based review of academic web search engines, 2014-2016| fagan | https://doi.org/10.6017/ital.v36i2.9718 12 as part of a comparative investigation of an automatic method for citation snowballing using microsoft academic search, choong et al. (2014) manually searched for a sample of 949 citations to journal or conference articles cited from 20 systematic reviews. they found microsoft academic search contained 78% of the cited articles and noted its utility for testing automated methods due to its free api and no blocks to automated access. the researchers also tested their method against google scholar, but noted “computer-access restrictions prevented a robust comparison” (n.p.). also in 2014, orduna-malea et al. (2014) attempted a longitudinal study of disciplines, journals, and organizations in microsoft academic search only to find the database had not been updated since 2013. furthermore they found the indexing to be incomplete and still in process, meaning microsoft academic search’s presentation of information about any particular publication, organization, or author was distorted. despite this finding, mas was included in two studies of scholar profiles. ortega (2015) compared scholar profiles across google scholar, microsoft academic search, research gate, academia.edu, and mendeley, and found little overlap across the sites. they also found social and usage indicators did not consistently correlate with bibliometric indicators, except on the researchgate platform. social and usage indicators were “influenced by their own social sites,” while bibliometric indicators seemed more stable across all services (13). ward et al. (2015) still included microsoft academic search in their discussion of scholarly profiles as part of the social media network, noting microsoft academic search was painfully time-consuming to work with in terms of consolidating data, correcting items, and adding missing items. in september 2016, hug et al. demonstrated the utility of the new microsoft academic api by conducting a comparative evaluation of normalized data from microsoft academic and scopus (hug, ochsner, and braendle 2016). they noted microsoft academic has “grown massively from 83 million publication records in 2015 to 140 million in 2016” (10). the microsoft academic api offers rich, structured metadata with the exception of document type. they found all attributes containing text were normalized and that identifiers were available for all entities, including references, supporting bibliometricians’ needs for data retrieval, handling, and processing. in addition to the lack of document type, the researchers also found the “fields of study” to be too granular and dynamic, and their hierarchies incoherent. they also desired the ability to use the doi to build api requests. nevertheless, the advantages of microsoft academic’s metadata and api retrieval suggested to hug et al. 
that microsoft academic was superior to google scholar for calculating research impact indicators and bibliometrics in general. in october 2016, harzing and alakangas compared publication and citation coverage of the new microsoft academic with google scholar, scopus, and web of science using a sample of 145 academics at the university of melbourne (harzing and alakangas 2016a) including observations from 20-40 faculty each in the humanities, social sciences, engineering, sciences, and life sciences. they discovered microsoft academic had improved substantially since their previous study (harzing 2016b), increasing 9.6% for a comparison sample in comparison with 1.4%, 2%, and 1.7% growth in google scholar, scopus, and web of science (n.p.). the researchers noted a few information technology and libraries | june 2017 13 problems with data quality, “although the microsoft academic team have indicated they are working on a resolution” (n.p.). on average, the researchers found that microsoft academic found 59% as many citations as google scholar, 97% as many citations as scopus, and 108% as many citations as web of science. google scholar had the top counts for each disciplinary area, followed by scopus except in the social sciences and humanities, where microsoft academic ranked second. the researchers explained that microsoft academic “only includes citation records if it can validate both citing and cited papers as credible,” as established through a machine-learningbased system, and discussed an emerging metric of “estimated citation count” also provided by microsoft academic. the researchers concluded that microsoft academic is promising to be “an excellent alternative for citation analysis” and suggested microsoft should work to improve coverage of books and grey literature. google scholar google scholar was released in beta form in november 2004, and was expanded to include judicial case law in 2009. while google scholar has received much attention in academia, it seems to be regarded by google as a niche product: in 2011 google removed scholar from the list of top services and list of “more” services, relegating it to the “even more” list. in 2014, the scholar team consisted of just nine people (levy 2014). describing google scholar in an introductory manner is not helped by google’s vague documentation, which simply says it “includes scholarly articles from a wide variety of sources in all fields of research, all languages, all countries, and over all time periods.”5 the “wide variety of sources” includes “journal papers, conference papers, technical reports, or their drafts, dissertations, pre-prints, post-prints, or abstracts,” as well as court opinions and patents, but not “news or magazine articles, book reviews, and editorials.” books and dissertations uploaded to google book search are “automatically” included in scholar. google says abstracts are key, noting “sites that show login pages, error pages, or bare bibliographic data without abstracts will not be considered for inclusion and may be removed from google scholar.” studies of google scholar can be divided in to three major categories of focus: investigating the coverage of google scholar; the use and utility of google scholar as part of the research process; and google scholar’s utility for bibliographic measurement, including evaluating the productivity of individual researchers and the impact of journals. 
there is some overlap across these categories, because studies of google scholar seem to involve three questions: 1) what is being searched? 2) how does the search function? and 3) to what extent can the user usefully accomplish her task? the coverage of google scholar scholars want to know what “scholarship” is covered by google scholar, but the documentation merely states that it indexes “papers, not journals”6 and challenges researchers to investigate 5 https://scholar.google.com/intl/en/scholar/inclusion.html 6 https://www.google.com/intl/en/scholar/help.html#coverage an evidence-based review of academic web search engines, 2014-2016| fagan | https://doi.org/10.6017/ital.v36i2.9718 14 google scholar’s coverage empirically despite google scholar’s notoriously challenging technical limitations. while some limitations of google scholar have been corrected over the years, longstanding logistical hurdles involved with studying google scholar’s coverage have been well-documented for over a decade (shultz 2007; bonato 2016; haddaway et al. 2015; levay et al. 2016), and include: • search queries are limited to 256 characters • not being able to retrieve more than 1,000 results • not being able to display more than 20 results per page • not being able to download batches of results (e.g. to load into citation management software) • duplicate citations (beyond the multiple article “versions”), requiring manual screening • retrieving different results with advanced and basic searches • no designation of the format of items (e.g. conference papers) • minimal sort options for results • basic boolean operators only7 • illogical interpretation of boolean operators: esophagus or oesophagus and oesophagus or esophagus return different numbers of results (boeker, vach, and motschall 2013) • non-disclosure of the algorithm by which search results are sorted. additionally, one study reported experiencing an automated block to the researcher’s ip address after the export of approximately 180 citations or 180 individual searches (haddaway et al. 2015, 14). furthermore, the research excellence framework was unable to use google scholar to assess the quality of research in uk higher education institutions, because of researchers’ inability to agree with google on a “suitable process for bulk access to their citation information, due to arrangements that google scholar have in place with publishers” (research excellence framework 2013, 1562). such barriers can limit what can be studied and also cost researchers significant time in terms of downloading (prins et al. 2016) and cleaning citations (levay et al. 2016). despite these hurdles, research activity analyzing the coverage of google scholar has continued in the past two years, often building off previous studies. this section will first discuss google scholar’s size and ranking, followed by its coverage of articles and citations, then its coverage of books, grey literature, and open access and institutional repositories. google scholar size and ranking in a 2014 study, khabsa and giles estimated there were at least 114 million english-language scholarly documents on the web, of which google scholar had “nearly 100 million.” another study by orduna-malea, ayllón, martín-martín, and lópez-cózar (2015) estimated that the total number 7 e.g., no nesting of logical subexpressions deeper than one level (boeker, vach, and motschall 2013) and no truncation operators. 
information technology and libraries | june 2017 15 of documents indexed by google scholar, without any language restriction, was between 160 and 165 million. by comparison, in 2016 the author’s discovery tool contained about 168 million items in academic journals, conference materials, dissertations, and reviews.8 google scholar’s presence in the information marketplace has influenced vendors to increase the discoverability of their content, including pushing for the display of abstracts and/or the first page of articles (levy 2014). proquest and gale indexes were added to google scholar in 2015 (quint 2016). martín-martín et al. (2016b) noted that google scholar’s agreements with big publishers come at a price: “the impossibility of offering an api,” which would support bibliometricians’ research (54). google scholar’s results ranking “aims to rank documents the way researchers do, weighing the full text of each document, where it was published, who it was written by, as well as how often and how recently it has been cited in other scholarly literature.”9 martín-martín and his colleagues (2017, 159) conducted a large, longitudinal study of null query results in google scholar and found a strong correlation between result list ranking and times cited. the influence of citations is so strong that when the researchers performed the same search process four months later, 14.7% of documents were missing in the second sample, causing them to conclude even a change of one or two citations could lead to a document being excluded or included from the top 1,000 results (157). using citation counts as a major part of the ranking algorithm has been hypothesized to produce the “matthew effect,” where “work that is already influential becomes even more widely known by virtue of being the first hit from a google scholar search, whereas possibly meritorious but obscure academic work is buried at the bottom” (antell et al. 2013, 281). google scholar has been shown to heavily bias its ranking toward english-language publications even when there are highly cited non-english publications in the result set, although selection of interface language may influence the ranking. martin-martin and his colleagues noted that google scholar seems to use the domain of the document’s hosting web site as a proxy for language, meaning that “some documents written in english but with their primary version hosted in nonanglophone countries’ web domains do appear in lower positions in spite of receiving a large number of citations” (martin-martin et al. 2017, 161). this effect is shown dramatically in figure 3 of their paper. google scholar coverage: articles and citations the coverage of articles, journals, and citations by google scholar has been commonly examined by using brute force methods to retrieve a sample of items from google scholar and possibly one or more of its competitors. (studies discussed in this section are listed in table 1). the goal is usually to determine how well google scholar’s database compares to traditional research databases, usually in a specific field. core methodology involves importing citations into software such as publish or perish (harzing 2016a), cleaning the data, then performing statistical tests, 8 the discovery tool does not contain all available metadata but has been carefully vetted (fagan and gaines 2016). 
9 https://www.google.com/intl/en/scholar/about.html an evidence-based review of academic web search engines, 2014-2016| fagan | https://doi.org/10.6017/ital.v36i2.9718 16 expert review, or both. haddaway (2015) and moed et al. (2016) have written articles specifically discussing methodological aspects. recent studies repeatedly find that google scholar’s coverage meets or exceeds that of other search tools, no matter what is identified by target samples, including journals, articles, and citations (karlsson 2014; harzing 2014; harzing 2016b; harzing and alakangas 2016b; moed, barilan, and halevi 2016; prins et al. 2016; wildgaard 2015; ciccone and vickery 2015). in only three studies did google scholar find fewer items, and the meaningful difference was minimal.10 science disciplines were the most studied in google scholar, including agriculture, astronomy, chemistry, computer science, ecology, environmental science, fisheries, geosciences, mathematics, medicine, molecular biology, oceanography, physics, and public health. social sciences studied include education (prins et al. 2016), economics (harzing 2014), geography (ştirbu et al. 2015, 322-329), information science (winter, zadpoor, and dodou 2014; harzing 2016b), and psychology (pitol and de groote 2014). studies related to the arts or humanities 2014-2016 included an analysis of open access journals in music (testa 2016) and a comparison between google scholar and web of science for research evaluation within education, pedagogical sciences, and anthropology11 (prins et al. 2016). wildgaard (2015) and bornmann et al. (2016) included samples of humanities scholars as part of bibliometric studies, but did not discuss disciplinary aspects related to coverage. prior to 2014, the only study found related to the arts and humanities compared google scholar with historical abstracts (kirkwood jr. and kirkwood 2011). google scholar’s coverage has been growing over time (meier and conkling 2008; harzing 2014; winter, zadpoor, and dodou 2014; bartol and mackiewicz-talarczyk 2015, 531; orduña-malea and delgado lópez-cózar 2014) with recent increases in older articles (winter, zadpoor, and dodou 2014; harzing and alakangas 2016b), leading some to question whether this supports the documented trend of increased citation of older literature (martín-martín et al. 2016c; varshney 2012). winter et al. noted that in 2005 web of science yielded more citations than google scholar for about two-thirds of their sample, but for the same sample in 2013, google scholar found more citations than web of science, with only 6.8% of citations not retrieved by google scholar (winter, zadpoor, and dodou 2014, 1560). the unique citations of web of science were “typically documents before the digital age and conference proceedings not available online” (winter, zadpoor, and dodou 2014, 1560). harzing and alakangas’s (2016b) large-scale longitudinal comparison of google scholar, scopus, and web of science suggested that google scholar’s retroactive expansion has stabilized and now all three databases are growing at similar rates. 10 for example, bramer, giustini, and kramer (2016a) found slightly more of their 4,795 references from systematic reviews in embase (97.5%) than in google scholar (97.2%). in testa (2016), the music database rilm indexed two more of the 84 oa journals than google scholar (which indexed at least one article from 93% of the journals). 
finally, in a study using citations to the most-cited article of all time as a sample, web of science found more citations than did google scholar (winter, zadpoor, and dodou 2014).
11 prins et al. classified anthropology as part of the humanities.

google scholar also seems to cover both the oldest and the most recent publications. unlike traditional abstracts and indexes, google scholar is not limited by starting year, so as publishers post tables of contents of their earliest journals online, google scholar discovers those sources (antell et al. 2013, 281). trapp (2016) reported the number of citations to a highly-cited physics paper after the first 11 days of publication to be 67 in web of science, 72 in scopus, and 462 in google scholar (trapp 2016, 4). in a study of 800 citations to nobelists in multiple fields, harzing found that "google scholar could effectively be 9–12 months ahead of web of science in terms of publication and citation coverage" (2013, 1073).

an increasing proportion of journal articles in google scholar are freely available in full text. a large-scale, longitudinal study of highly-cited articles 1950-2013 found 40% of article citations in the sample were freely available in full text (martín-martín et al. 2014). another large-sample study found 61% of articles in their sample from 2004–2014 could be freely accessed (jamali and nabavi 2015). in both studies, nih.gov and researchgate were the top two full-text providers.

google scholar's coverage of major publisher content varies; having some coverage of a publisher does not imply all articles or journals from that publisher are covered. in a sample of 222 citations compared across google scholar, scopus, and web of science, google scholar contained all of the springer titles, as many elsevier titles as scopus, and the most articles by wolters kluwer and john wiley. however, among the three databases, google scholar contained the fewest articles by bmj and nature (rothfus et al. 2016).

study | sample | results
(bartol and mackiewicz-talarczyk 2015) | documents retrieved in response to searches on crops and fibers in article titles, 1994-2013 (samples varied by crop) | google scholar returned more documents for each crop. for example, "hemp" retrieved 644 results in google scholar, 493 in scopus, and 318 in web of science; google scholar demonstrated higher yearly growth of records over time.
(bramer, giustini, and kramer 2016b) | references from a pool of systematic reviewer searches in medicine (n=4,795) | google scholar found 97.2%, embase 97.5%, and medline 92.3% of all references; when using search strategies, embase retrieved 81.6%, medline 72.6%, and google scholar 72.8%.
(ciccone and vickery 2015) | 183 user searches randomly selected from ncsu libraries' 2013 summon search logs (n=137) | no significant difference between the performance of google scholar, summon, and eds for known-item searches; "google scholar outperformed both discovery services for topical searches."
(harzing 2014) | publications and citation metrics for 20 nobelists in chemistry, economics, medicine, and physics, 2012-2013 (samples varied) | google scholar coverage is now "increasing at a stable rate" and provides "comprehensive coverage across a wide set of disciplines for articles published in the last four decades" (575).
(harzing 2016b) | citations from one researcher (n=126) | microsoft academic found all books and journal articles covered by google scholar; google scholar found 35 additional publications including book chapters, white papers, and conference papers.
(harzing and alakangas 2016a) | samples from (harzing and alakangas 2016b, 802) (samples varied by faculty) | google scholar provided higher "true" citation counts than microsoft academic, but microsoft academic "estimated" citation counts were 12% higher than google scholar for life sciences and equivalent for the sciences.
(harzing and alakangas 2016b) | citations of the works of 145 faculty among 37 scholarly disciplines at the university of melbourne (samples varied by faculty) | for the top faculty member, google scholar had 519 total papers (compared with 309 in both web of science and scopus); google scholar had 16,507 citations (compared with 11,287 in web of science and 11,740 in scopus).
(hilbert et al. 2015) | documents published by 76 information scientists in german-speaking countries (n=1,017) | google scholar covered 63%, scopus 31%, bibsonomy 24%, mendeley 19%, web of science 15%, and citeulike 8%.
(jamali and nabavi 2015) | items published between 2004 and 2014 (n=8,310) | 61% of articles were freely available; of these, 81% were publisher versions and 14% were pre-prints; researchgate was the top full-text source, netting 10.5% of full-text sources, followed by ncbi.nlm.nih.gov (6.5%).
(karlsson 2014) | journals from ten different fields (n=30) | google scholar retrieved documents from all the selected journals; summon only retrieved documents from 14 out of 30 journals.
(lee et al. 2015) | journal articles housed in florida state university's institutional repository (n=170) | metadata found in google for 46% of items and in google scholar for 75% of items; google scholar found 78% of available full text and found full text for six items with no full text in the ir.
(martín-martín et al. 2014) | items highly cited by google scholar (n=64,000) | 40% could be freely accessed using google scholar; nih.gov and researchgate were the top two full-text providers.
(moed, bar-ilan, and halevi 2016) | citations to 36 highly cited articles in 12 scientific-scholarly english-language journals (n=about 7,000) | 47% of sources were in both google scholar and scopus; 47% were in google scholar only; 6% were in scopus only; the unique google scholar citations came most often from google books, springer, ssrn, researchgate, acm digital library, arxiv, and aclweb.org.
(prins et al. 2016) | article citations in the field of education and pedagogies, and citations to 328 articles in anthropology (n=774) | google scholar found 22,887 citations in education & pedagogical science compared to web of science's 8,870, and 8,092 in anthropology compared with web of science's 1,097.
(ştirbu et al. 2015) | number of citations resulting from two geographical topic searches (samples varied) | google scholar found 2,732 geographical references for one search, whereas web of science found only 275, georef 97, and francis 45; for sedimentation, google scholar found 1,855 geographical references compared to web of science's 606, georef's 1,265, and francis's 33; google scholar overlapped web of science by 67% and 82% for the two searches, and georef by 57% and 62%.
(testa 2016) | open access journals in music (n=84) | google scholar indexed at least one article from 93% of oa journals; rilm indexed two additional journals.
(wildgaard 2015) | publications from researchers in astronomy, environmental science, philosophy, and public health (n=512) | publication count from web of science was 2-4 times lower for all disciplines than google scholar; citation count was up to 13 times lower in web of science than in google scholar.
(winter, zadpoor, and dodou 2014) | growth of citations to 2 classic articles (1995-2013) and 56 science and social science articles in google scholar, 2005-2013 (samples varied) | total citation counts 21% higher in web of science than google scholar for lowry (1951), but google scholar 17% higher than web of science for garfield (1955) and 102% higher for the 56 research articles; google scholar showed a significant retroactive expansion to all articles compared to negligible retroactive growth in web of science.
table 1. studies investigating google scholar's coverage of journal articles and citations, 2014-2016.

google scholar coverage: books

many studies mentioned that books, including google books, are sometimes included in google scholar results. jamali and nabavi (2015) found 13% of their sample of 8,310 citations from google scholar were books, while martín-martín et al. (2014) had found that 18% of their sample of 64,000 citations from google scholar were books. within the field of anthropology, prins (2016) found books to generate the most citation impact in google scholar (41% of books in their sample were cited in google scholar) compared to articles (21% of articles were cited in google scholar). in education, 31% of articles and 25% of books were cited by google scholar (3). abrizah and thelwall found only 37% of their sample of 1,357 arts, humanities, and social sciences books from the five main university presses in malaysia had been cited in google scholar (23% of the books had been cited in google books) (abrizah and thelwall 2014, 2502). the overlap was small: 15% had impact in both google scholar and google books. the authors concluded that due to the low overlap between google scholar and google books, searching both engines is required to find the most citations to academic books. english books were significantly more likely to be cited in google scholar (48% vs. 32%), as were edited books (53% vs. 36%). they surmised edited books' citation advantage was due to the use of book chapters in social sciences. they found arts and humanities books more likely to be cited in google scholar than social sciences books (40% vs. 34%) (abrizah and thelwall 2014, 2503).

google scholar coverage: grey literature

grey literature refers to documents not published commercially, including theses, reports, conference papers, government information, and poster sessions. haddaway et al. (2015) was the only empirical study found focused on grey literature. they discovered that between 8% and 39% of full-text search results from google scholar were grey literature, with the greatest concentration of citations from grey literature on page 80 of results for full-text searches and page 35 for title searches.
they concluded “the high proportion of grey literature that is missed by google scholar means it is not a viable alternative to hand searching for grey literature as a standalone tool” (2015, 14). for one of the systematic reviews in their sample, none of the 84 grey literature articles cited were found within the exported google scholar search results. the only other investigation of grey literature found was bonato (2016), who after conducting a very limited number of searches on one specific topic and a search for a known item, concluded google scholar to be “deficient.” in conclusion, despite much offhand praise for google scholar’s grey literature coverage (erb and sica 2015; antell et al. 2013), the topic has been little studied and when it has, grey literature results have not been prominent. google scholar coverage: open access and institutional repository content erb and sica touted google scholar’s access to “free content that might not be available through a library’s subscription services,” including open access journals and institutional repository coverage (2015, 48). recent research has dug deeper into both these content areas. an evidence-based review of academic web search engines, 2014-2016| fagan | https://doi.org/10.6017/ital.v36i2.9718 22 in general, oa articles have been shown to net more citations than non-oa articles, as koler-povh, južnic, and turk (2014) showed within the field of civil engineering. across their sample of 2,026 scholarly articles in 14 journals, all indexed in web of science, scopus, and google scholar, oa articles received an average of 43 citations while non-oa articles were cited 29 times (1039). google scholar did a better job discovering those citations; in google scholar the median of citations of oa articles was always higher than that for non-oa articles, wheras this was true in web of science for only 10 of the 14 journals and in scopus for 11 of the 14 journals (1040). similarly, chen (2014) found google scholar to index far more oa journals than scopus and web of science, especially “gold oa.”12 google scholar’s advantage should not be assumed across all disciplines, however; testa (2016) found both google scholar and rilm to provide good coverage of oa journals in music, with google scholar indexing at least one article from 93% of the 84 oa journals in the sample. but the bibliographic database rilm indexed two more oa journals than google scholar. google scholar indexing of repositories may be critical for success, but results vary by ir platform and whether the ir metadata has been structured according to google’s guidelines. in a random sample from shodhganga, india’s central etd database, weideman (2015) found not one article had been indexed in full text by google scholar, although in many cases the metadata was indexed, leading the author to identify needed changes to the way shodhganga stores etds.13 likewise, chen (2014) found that neither google scholar nor google appears to index baidu wenku, a major full-text archive and social networking site in china similar to researchgate, and orduña-malea and lópez-cózar (2015) found that latin american repositories are not very visible in google or google scholar due to limitations of the description schemas chosen as well as search engine reliability. 
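one concrete, checkable piece of "structuring ir metadata according to google's guidelines" is whether each landing page exposes the highwire press-style citation_* meta tags that google scholar's inclusion documentation describes. the sketch below is only an illustration of generating such tags for a hypothetical repository record: the tag names follow that documentation, while the record values and helper function are invented for the example.

```python
# minimal sketch: emit highwire press-style meta tags of the kind google scholar's
# inclusion guidelines describe for article or thesis landing pages.
# the record below is hypothetical; only the tag names come from those guidelines.
from html import escape

record = {
    "citation_title": "an example electronic thesis",
    "citation_author": ["doe, jane", "roe, richard"],
    "citation_publication_date": "2016/05/01",
    "citation_pdf_url": "https://repository.example.edu/etd/123/fulltext.pdf",
}

def meta_tags(rec):
    """return one <meta> element per value; multi-valued fields (authors) repeat the tag."""
    tags = []
    for name, value in rec.items():
        for v in (value if isinstance(value, list) else [value]):
            tags.append(f'<meta name="{escape(name)}" content="{escape(v)}">')
    return "\n".join(tags)

print(meta_tags(record))
```

tags of this kind are the sort of per-item metadata that the html landing pages weideman recommends would carry.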
in yang’s (2016) study of texas tech’s dspace ir, google was the only search engine that indexed, discovered, or linked to pdf files supplemented with metadata; google scholar did not discover or provide links to the ir’s pdf files, and was less successful at discovering metadata. when google scholar is able to index ir content, it may be responsible for significant traffic. in a study of four major u.s. universities’ institutional repositories (three dspace, one contentdm) involving a dataset of 57,087 unique urls and 413,786 records, researchers found that 48%–66% of referrals came from google scholar (obrien et al. 2016, 870). the importance of google scholar in contrast to google was noted by lee et al. (2015), who conducted title searches on 170 journal articles housed in florida state university’s institutional repository (using bepress’s digital commons platform), 100 of which existed in full text in the ir. links to the ir were found in google results for 45.9% of the 170 items, and in google scholar for 74.7% of the 170 items. furthermore, google scholar linked to the full text for 78% of the 100 cases where full text was available, and even provided links to freely available full text for six items that did not have full 12 oa articles on publisher web sites, whether the journal itself is oa or not (chen 2014) 13 most notably, the need to store thesis documents as one pdf file instead of divided into multiple, separate files, to create html landing pages as per google’s recommendations, and to submit the addresses of these pages to google scholar. information technology and libraries | june 2017 23 text in the ir. however, the researchers also noted “relying on either google or google scholar individually cannot ensure full access to scholarly works housed in oa irs.” in their study, among the 104 fully open access items there was an overlap in results of only 57.5%; google provided links to 20 items not found with google scholar, and google scholar provided links to 25 items not found with google (lee et al. 2015, 15). google scholar results note the number of “versions” available for each item. in a study of 982 science article citations (including both oa and non-oa) in irs, pitol and degroote found 56% of citations had between four and nine google scholar versions (2014, 603) almost 90% of the citations shown were the publisher version, but of these, only 14.3% were freely available in full text on the publisher web site. meanwhile, 70% percent of the items had at least one free full-text version available through a “hidden” google scholar version. the author’s experience in retrieving full text for this review indicates this issue still exists, but research would be needed to formulate reliable recommendations for users. use and utility of google scholar as part of the research process studies were found concerning google scholar’s popularity with users and their reasons for preferring it (or not) over other tools. another group of studies examined issues related to the utility of google scholar for research processes, including issues related to messy metadata. finally, a cluster of articles focused specifically on using google scholar for systematic reviews. popularity and user preferences several studies have shown google scholar to be well-known to scholarly communities. 
a survey of 3,500 scholars from 95 countries found that over 60% of 3,500 scientists and engineers and over 70% of respondents in the social sciences, arts, and humanities were aware of google scholar and used it regularly (van noorden 2014). in a large-scale journal-reader survey, inger and gardner (2016) found that among academic researchers in high-income areas, academic search engines surpassed abstracts and indexes as a starting place for research (2016, 85, figure 4). in low-income areas, google use exceeded google scholar use for academic research. major library link resolver software offers reports of full-text requests broken down by referrer. inger and gardner (2016) showed a large variance across subjects for whether people prefer google or google scholar: “people in the social sciences, education, law, and business use google scholar more to find journal articles. however, people working in the humanities and religion and theology prefer to use google” (88). humanities scholar use of google over google scholar was also found by kemman et al. (2013); google, google images, google scholar, and youtube were used more than jstor or other library databases, even though humanities scholars’ trust in google and google scholar was lower. user research since 2014 concerning google scholar has focused on graduate students. results suggest scholar is used regularly but the tool is only partially sufficient. in their study of 20 engineering masters’ students’ use of abstracts and indexes, johnson and simonsen (2015) found an evidence-based review of academic web search engines, 2014-2016| fagan | https://doi.org/10.6017/ital.v36i2.9718 24 that half their sample (n=20) had used google scholar the last time they located an article using specific search terms or criteria. google was the second most-used source at 20%, followed by abstracting and indexing services (15%). graduate students describe google scholar with nuance and refer to it as a specific part of their process. in bøyum and aabø’s (2015) interviews with eight phd business students and wu and chen’s (2014, 381) interviews with 32 graduate students drawn from multiple academic disciplines, the majority described using library databases and google scholar for different purposes depending on the context. graduate students in both studies were well aware of google scholar’s use for citation searching. bøyum and aabø’s (2015) subjects described library resources as more “academically robust” than google or google scholar. wu and chen’s (2014) interviewees praised google scholar for its wider coverage and convenience, but lamented the uncertain quality, sometimes inaccessible full text, too many results, lack of sorting function (document type or date), finding documents from different disciplines, and duplicate citations. google scholar was seen by their subjects as useful during early stages of information seeking. in contrast to general assumptions, more than half the students (wu and chen 2014, 381) interviewed reported browsing more than 3 pages’ worth of google scholar results. about half of interviewees reported looking at cited documents to find more, however students had mixed opinions about whether the citing documents turned out to be relevant. google scholar’s “my library” feature, introduced in 2013, now competes with other bibliographic citation management software. 
in a survey of 344 (mostly graduate) students, conrad, leonard, and somerville found google scholar was the most-used (47%), followed by endnote (37%) and zotero (19%) (2015, 572). follow-up interviews with 13 of the students revealed that a few students used multiple tools; for example, one participant noted he/she used endnote for sharing data with lab partners and others "across the community"; mendeley for her own personal thesis work, where she needs to "build a whole body of literature"; and google scholar citations for "quick reference lists that i may not need for a second or third time."

messy metadata
many studies have suggested google scholar's metadata is "messy." although none in the period of study examined this phenomenon in conjunction with relative user performance, the issues found could affect scholarship. a 2016 study itemized the most common mistakes in google scholar resulting from its extraction process: 1) incorrect title identification; 2) missing or incorrectly assigned authors; 3) book reviews indexed as books; 4) failing to group versions of the same document, which inflates citation counts; 5) grouping different editions of books, which deflates citation counts; 6) attributing citations to documents that did not cite them, or missing citations that did; and 7) duplicate author profiles (martín-martín et al. 2016b). the authors concluded that "in an academic big data environment, these errors (which we deem affect less than 10% of the records in the database) are of no great consequence, and do not affect the core system performance significantly" (54). two of these issues have been studied specifically: duplicate citations and missing publication dates. the rate of duplicate citations in google scholar has ranged upwards of 2.93% (haddaway et al. 2015) and 5% (winter, zadpoor, and dodou 2014, 1562), which can be compared to a .05% duplicate citation rate in web of science (haddaway et al. 2015, 13). haddaway found the main reasons for duplication include "typographical errors, including punctuation and formatting differences; capitalization differences (google scholar only), incomplete titles, and the fact that google scholar scans citations within reference lists and may include those as well as the citing article" (2015, 13); a small, purely illustrative normalization sketch of such near-duplicates appears below. the issue of missing publication dates varies greatly across samples. dates were found to be missing 9% of the time in winter et al.'s study, although the rate varied by publication type: 4% of journals, 15% of theses, and 41% of the unknown document types (winter, zadpoor, and dodou 2014, 1562). however, martín-martín et al. studied a sample of 32,680 highly cited documents and found that web of science and google scholar agreed on publication dates 96.7% of the time, with an idiosyncratically large proportion of those mismatches in 2012 and 2013 (2017, 159).

utility for research processes
prior to 2014, studies such as asher, duke, and wilson's (2012) evaluated google scholar's utility as a general research tool, often in comparison with discovery tools. since 2014, the only such study found was namei and young's comparison of summon, google scholar, and google using 299 known-item queries. they found google scholar and summon returned relevant results 74% of the time; google returned relevant results 91% of the time. for "scholarly formats," they found summon returned relevant results 76% of the time; google scholar, 79%; and google, 91% (2015, 526-527).
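the duplication causes haddaway lists (punctuation and formatting differences, capitalization, incomplete titles) are exactly what researchers normalize away before de-duplicating exported google scholar records. the sketch below is purely illustrative, with invented records and rules; it is not any study's actual cleaning procedure.

```python
# illustrative sketch: normalize exported citation titles so that records differing
# only in punctuation, capitalization, or truncation collapse to one entry.
# the sample records and rules are hypothetical, not a published study's method.
import re
from collections import defaultdict

records = [
    {"title": "Google Scholar, Scopus and the Web of Science: a longitudinal comparison"},
    {"title": "google scholar scopus and the web of science a longitudinal comparison"},
    {"title": "Google Scholar, Scopus and the Web of Science: a longitudinal..."},  # truncated
]

def normalize(title):
    """lowercase, drop punctuation, collapse whitespace, strip a trailing ellipsis."""
    t = title.lower().replace("...", " ")
    t = re.sub(r"[^\w\s]", " ", t)        # punctuation and formatting differences
    return re.sub(r"\s+", " ", t).strip()  # whitespace differences

groups = defaultdict(list)
for rec in records:
    key = normalize(rec["title"])
    # treat a record whose normalized title is a prefix of an existing key as the same work
    match = next((k for k in groups if k.startswith(key) or key.startswith(k)), key)
    groups[match].append(rec)

print(f"{len(records)} exported records -> {len(groups)} unique works")
```

real cleaning typically also compares authors and years; the title step above only illustrates the general idea behind catching such near-duplicates.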
the remainder of studies in this category focused specifically on systematic reviews, perhaps because such reviews are so time-consuming. authors develop search strategies carefully, execute them in multiple databases, and document their search methods and results carefully. some prestigious journals are beginning to require similar rigor for any original research article, not just systematic reviews (cals and kotz 2016). information provided by professional organizations about the use of google scholar for systematic reviews seems inconsistent: the cochrane handbook for systematic reviews of interventions lists google scholar among sources for searching, but none of the five “highlighted reviews” on the cochrane web site at the time of this article’s writing used google scholar in their methodologies. the uk organization national institute for health and care excellence’s manual (national institute for health and care excellence (nice)) only mentions google scholar in an appendix of search sources under “conference abstracts.” a study by gehanno et al. (2013) found google scholar contained 100% of the references from 29 systematic reviews, and suggested google scholar could be the first choice for systematic reviews or meta-analyses. this finding prompted a slew of follow-up studies in the next three years. an an evidence-based review of academic web search engines, 2014-2016| fagan | https://doi.org/10.6017/ital.v36i2.9718 26 immediate response by giustini and boulos (2013) pointed out that systematic reviews are not performed by searching for article titles as with gehanno et al.’s method, but through search strategies. when they tried to replicate a systematic review’s topical search strategy in google scholar, the citations were not easily discovered. in addition the authors were not able to find all the papers from a given systematic review even by title searching. haddaway et al. also found imperfect coverage: for one of the seven reviews examined, 31.5% of citations could not be found (2015, 11). haddaway also noted that special characters and fonts (as with chemical symbols) can cause poor matching when such characters are part of article titles. recent literature concurs that it is still necessary to search multiple databases when conducting a systematic review, including abstracts and indexes, no matter how good google scholar’s coverage seems to be. no one database’s coverage is complete, including google scholar (thielen et al. 2016), and practical recall of google scholar is exceptionally low due to the 1,000 result limit, yet at the same time, google scholar’s lack of precision is costly in terms of researchers’ time (bramer, giustini, and kramer 2016b; haddaway et al. 2015). the challenges limiting study of google scholar’s coverage also bedevil those wishing to use it for reviews, especially the 1,000 result retrieval limit, lack of batch export, and lack of exported abstracts (levay et al. 2016). additionally, google scholar’s changing content, unknown algorithm and updating practices, search inconsistencies, limited boolean functions, and 256-character query limit prevent the tool from accommodating the detailed, reproducible search methodologies required by systematic reviews (bonato 2016; haddaway et al. 2015; giustini and boulos 2013). bonato noted google scholar retrieved different results with advanced and basic searches; could not determine the format of items (e.g. 
conference papers); and found other inconsistent results.14 bonato also lamented the lack of any kind of document type limit. despite the limitations and logistical challenges, practitioners and scholars are finding solid reasons for including academic web search engines as part of most systematic review methodologies (cals and kotz 2016). stansfield et al. noted that “relevant literature for lowand middle-income countries, such as working and policy papers, is often not included in databases,” and that google scholar finds additional journal articles and grey literature not indexed in databases (2016, 191). for eight systematic reviews by eppi-center, “over a quarter of relevant citations were found from websites and internet search engines” (stansfield, dickson, and bangpan 2016, 2). specific tools and practices have been recommended when using search engines within the context of systematic reviews. software is available to record search strategies and results (harzing and alakangas 2016b; haddaway 2015). haddaway suggests the use of snapshot tools (haddaway 2015) to record the first 1,000 google scholar records rather than the typical assessment of the first 50 search results as had been done in the past: “this change in practice 14 bonato (2016) found zero hits for conference papers when limiting by year 2015-2016, but found two papers presented at a 2015 meeting. information technology and libraries | june 2017 27 could significantly improve both the transparency and coverage of systematic reviews, especially with respect to their grey literature components.” (haddaway et al. 2015, 15). both haddaway (2015) and cochrane recommend that review authors print or save locally electronic copies of the full text or relevant details rather than bookmarking web sites, “in case the record of the trial is removed or altered at a later stage” (higgins and green 2011). new methods for searching, downloading, and integrating academic search engine results into review procedures using free software to increase transparency, repeatability, and efficiency have been proposed by haddaway and his colleagues (2015). google scholar citations and metrics google scholar citations and metrics are not academic search engines, but this article included them because these products are interwoven into the fabric of the google scholar database. google scholar citations, launched in late 2011 (martín-martín et al. 2016b, 12) groups citations by author, while google metrics (launch date uncertain) provides similar data for articles and journals. readers interested in an in-depth literature review of google scholar citations for earlier years (2005-2012) are directed to (thelwall and kousha 2015b). in his comprehensive review of more recent literature about using google scholar citations for citation analysis, waltman (2016) described several themes. google scholar’s coverage of many fields is significantly broader than web of science and scopus, and this seems to be continuing to improve over time. however studies regularly report google scholar’s inaccuracies, content gaps, phantom data, easily manipulatable citation counts, lack of transparency, and limitations for empirical bibliometric studies. 
as discussed in the coverage section, google scholar’s citation database is competitive with other major databases such as web of science and has been growing dramatically in the last few years (winter, zadpoor, and dodou 2014; harzing and alakangas 2016b; harzing 2014) but has recently stabilized (harzing and alakangas 2016b). more and more studies are concluding that google scholar will report more comprehensive information about citation impact than web of science or scopus. across a sample of articles from many years of one science journal, trapp (2016) found the proportion of articles with zero citations was 37% for web of science, 29% for scopus, and 19% for google scholar. some of google scholar’s superiority for citation analysis in the social sciences and humanities is due to its inclusion of book content, software, and additional journals (prins et al. 2016; bornmann et al. 2016). bornmann et al. (2016) noted citations to all ten of a research institute’s ten books published in 2009 were found in google scholar, whereas web of science found citations for only two books. furthermore they found data in google scholar for 55 of the total of 71 of the institute’s book chapters. for the four conference proceedings they could identify in google scholar, there were 100 citations, of which 65 could be found in google scholar. the comparative success of google scholar for citation impact varies by discipline, however: (levay et al. 2016) found web of science to be more reliable than google scholar, quicker for an evidence-based review of academic web search engines, 2014-2016| fagan | https://doi.org/10.6017/ital.v36i2.9718 28 downloading results, and better for retrieving 100% of the most important publications in public health. despite google scholar’s growth, using all three major tools (scopus, web of science, and google scholar) still seems to be necessary for evaluating researcher productivity. rothfus (2016) compared web of science, scopus, and google scholar citation counts for evaluating the impact of the canadian network of observational drug effect studies (cnodes), as represented by a sample of 222 citations from five articles. attempting to determine citation metrics for the cnodes research team yielded different results for every article when using the three tools. they found that “using three tools (web of science, scopus, google scholar) to determine citation metrics as indicators of research performance and impact provided varying results, with poor overall agreement among the three” (237). major academic libraries’ web sites often explain how to find one’s h-index in all three (suiter and moulaison 2015). researchers have also noted the disadvantages of google scholar for citation impact studies. google scholar is costly in terms of researcher time. levay et al. (2016) estimated the cost of “administering results” from web of science to be 4 hours versus 75 hours for google scholar. administering results includes using the search tool to search, download, and add records to bibliographic citation software, and removing duplicate citations. duplicate citations are often mentioned as a problem (prins et al. 
2016), although moed (2016) suggested the double counting by google scholar would occur only if the level of analysis is on target sources, not if it is on target articles.15 downloaded citation samples can still suffer from double counts, however: harzing and alakangas described how cleaning "a fairly extreme case" in their study reduced the number of papers from 244 to 106 (2016b). google scholar also does not identify self-citations, which can dramatically influence the meaning of results (prins et al. 2016). furthermore, researchers have shown it is possible to corrupt google scholar citations by uploading obviously false documents (delgado lópez-cózar, robinson-garcía, and torres-salinas 2014). while the researchers noted traditional citation indexes can also be defrauded, google's products are less transparent and abuses may not be easily detected. google did not respond to the research team when contacted and simply deleted the false documents to which it had been alerted without reporting the situation to the affected authors, and the researchers concluded: "this lack of transparency is the main obstacle when considering google scholar and its by-products for research evaluation purposes" (453). because these disadvantages do not outweigh google scholar's seemingly broader coverage, many articles investigate workarounds for using google scholar more effectively when evaluating research impact.
15 "if a document is, for instance, first published in arxiv, and a next version later in a journal j, citations to the two versions are aggregated. in google scholar metrics, in which arxiv is included as a source, this document (assuming that its citation count exceed the h5 value of arxiv and journal j) is listed both under arxiv and under journal j, with the same, aggregate citation count" (moed 2016, 29).
harzing and alakangas (2016b) recommend the hia index16, which is corrected for career length and co-authorship patterns, as the citation metric of choice for a fair comparison of google scholar with other tools. bornmann et al. (2016) investigated a method to normalize data and reduce errors when using google scholar data to evaluate citations in the social sciences and humanities. researcher profiles can also be used to find other scholars by topic. in a 2014 survey of researchers (n=8,554), dagienė and krapavickaitė found that 22% used a third-party service such as google scholar or microsoft academic to produce lists of their scholarly activities and 63% reported their scholarly record was freely available on the web (2016, 158, 161). google scholar ranked only second to microsoft word as the most frequently used software to maintain academic activity records (160). martín-martín et al. (2016b) examined 814 authors in the field of bibliometrics using google scholar citations, researcherid, researchgate, mendeley, and twitter. google scholar was the most used social research sharing platform, followed by researchgate, with researcherid gaining wider acceptance among authors deemed "core" to the field. only about one-third of the authors created a twitter profile, and many mendeley and researcherid profiles were found empty. the study found google scholar academic profiles' distinctive advantages to be automatic updates and its high growth rate, with disadvantages of scarce quality control, inherited metadata mistakes from google scholar, and its manipulatability.
overall, martín-martín and colleagues concluded that google scholar "should be the preferred source for relational and comparative analyses in which the emphasis is put on author clusters" (57). google scholar metrics provides citation information for articles and journals. in a sample of 1,000 journals, orduña-malea and delgado lópez-cózar found that "despite all the technical and methodological problems," google scholar metrics provides sound and reliable journal rankings (2014, 2365). google scholar metrics seems to be an annual publication; the 2016 edition contains 5,734 publications and 12 language rankings. russian, korean, polish, ukrainian, and indonesian were added in 2016, while italian and dutch were removed for unknown reasons (martín-martín et al. 2016a). researchers also found that many discussion papers and working papers were removed in 2016. english-language publications are broken into subject areas and disciplines. google scholar metrics often, but not always, creates separate entries for each language in which a journal is published. bibliometricians call for google scholar metrics to display the total number of documents published in the publications indexed and the total number of citations received: "these are the two essential parameters that make it possible to assess the reliability and accuracy of any bibliometric indicator" (13). adding country and language of publication and self-citation rates is among the other improvements listed by lópez-cózar and colleagues.
16 harzing and alakangas (2016b) define the hia as the hi norm/academic age. academic age refers to the number of years elapsed since first publication. to calculate hi norm, one divides the number of citations by the number of authors for that paper, and then calculates the h-index of the normalized citation count.

informing practice
the glaring lack of research related to the coverage of arts and humanities scholarship, limited research on book coverage, and relaunch of microsoft academic make it impossible to form a general recommendation regarding the use of academic web search engines for serious research. until the ambiguity of arts and humanities coverage is clarified, and until academic web search engines are transparent and stable, traditional bibliographic databases still seem essential for systematic reviews, citation analysis, and other rigorous literature search purposes. discipline-specific databases also have features such as controlled vocabulary, industry classification codes, and peer review indicators that make scholars more efficient and effective. nevertheless, the increasing relevance of academic search engines and solid coverage of the sciences and social sciences make it essential for librarians to become expert with google scholar, google books, and microsoft academic. for some scholarly tasks, academic search engines may be superior: for example, when looking up doi numbers for this paper's bibliography, the most efficient process seemed to be a google search on the article title plus the term "doi," and the most likely site to display in the results was researchgate.17 librarians and scholars should champion these tools as an important part of an efficient, effective scholarly research process (walsh 2015), while also acknowledging the gaps in coverage, biases, metadata issues, and missing features available in other databases.
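footnote 16's arithmetic is compact enough to show directly. the sketch below computes hi norm and hia for a hypothetical publication list, following the footnote's definitions (divide each paper's citations by its author count, take the h-index of those normalized counts, then divide by academic age); all numbers are invented.

```python
# minimal sketch of the hIa calculation described in footnote 16, using invented data.
papers = [  # (citations, number_of_authors) for one hypothetical researcher
    (40, 2), (25, 5), (18, 1), (9, 3), (6, 2), (2, 1),
]
academic_age = 10  # years since first publication (hypothetical)

def h_index(counts):
    """largest h such that at least h values are >= h."""
    ranked = sorted(counts, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)

normalized = [citations / authors for citations, authors in papers]  # correct for co-authorship
hi_norm = h_index(normalized)   # h-index of the normalized citation counts
hia = hi_norm / academic_age    # correct for career length

print(f"hI,norm = {hi_norm}, hIa = {hia:.2f}")
```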
academic web search engines could form the centerpiece for instruction sessions surrounding the scholarly network, as shown by “cited by” features, author profiles, and full-text sources. traditional abstracts and indexes could then be presented on the basis of their strengths. at some point, explaining how to access full text will likely no longer focus on the link resolver but on the many possible document versions a user might encounter (e.g. pre-prints or editions of books) and how to make an informed choice. in the meantime, even though web search engines and repositories may retrieve copious full text outside library subscriptions, college students should still be made aware of the library’s collections and services such as interlibrary loan. when considering google scholar’s weaknesses, it’s important to keep in mind chen’s observation that we may not have a tool available that does any better (antell et al. 2013). while google scholar may be biased toward english-language publications, so are many bibliographic databases. overall, google scholar seems to have increased the visibility of international research (bartol and mackiewicz-talarczyk 2015). while google scholar’s coverage of grey literature has been shown to be somewhat uneven (bonato 2016; haddaway et al. 2015), it seems to include more diversity among relevant document types than many abstracts and indexes (ştirbu et al. 2015; bartol and mackiewicz-talarczyk 2015). although the rigors of systematic reviews may contraindicate the tool’s use as a single source, it adds value to search results from other databases (bramer, giustini, and kramer 2016a). user preferences and priorities should also be taken into account; google 17 because the authority of researchgate is ambiguous, in such cases i then looked up the doi using google to find the publisher’s version. in some cases, the doi was not displayed on the publisher’s result page (e.g., https://muse.jhu.edu/article/197091). information technology and libraries | june 2017 31 scholar results have been said to contain “clutter,” but many researchers have found the noise in google scholar tolerable given its other benefits (ştirbu et al. 2015). google books purportedly contains about 30 million items, focused on u.s.-published and englishlanguage books. but its coverage is hit-or-miss, surprising mays (2015) with an unexpected wealth of primary sources but disappointing harper (2016) with limited coverage of academic health sciences books. recent court decisions have enabled google to continue progressing toward their goal of full-text indexing and making snippet views available for the google-estimated universe of 130 million books, which suggests its utility may increase. google books is not integrated with link resolvers or discovery tools but has been found useful for providing information about scholarly research impact, especially for the arts, humanities, and social sciences. as re-launched in 2016, microsoft academic shows real potential to compete with google scholar in coverage and utility for finding journal articles. as of february 2017 its index contains 120 million citations. in contrast to the mystery of google scholar’s black-box algorithms and restrictive limitations, microsoft academic uses an open-system approach and offers an api. microsoft academic appears to have less coverage of books and grey literature compared with google scholar. research is badly needed about the coverage and utility of both google books and microsoft academic. 
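for librarians who want to look at microsoft academic programmatically, a heavily hedged sketch follows. it assumes the evaluate method, the expr/count/attributes parameters, and the Ti/Y/CC attribute codes described in the academic knowledge api documentation of this period; the endpoint url, query expression, and subscription key shown are placeholders to verify against microsoft's current documentation rather than guaranteed working values.

```python
# hedged sketch: querying the academic knowledge api's evaluate method as documented
# around 2016-2017. endpoint, parameter names, and attribute codes (Ti = title,
# Y = year, CC = citation count) are assumptions to verify; key and query are placeholders.
import requests

ENDPOINT = "https://api.labs.cognitive.microsoft.com/academic/v1.0/evaluate"  # assumption
params = {
    "expr": "Composite(AA.AuN=='anne-wil harzing')",  # structured query expression (example)
    "count": 10,
    "attributes": "Ti,Y,CC",
}
headers = {"Ocp-Apim-Subscription-Key": "YOUR-KEY-HERE"}  # placeholder

response = requests.get(ENDPOINT, params=params, headers=headers, timeout=30)
response.raise_for_status()
for entity in response.json().get("entities", []):
    print(entity.get("Y"), entity.get("CC"), entity.get("Ti"))
```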
google scholar continues to evolve, launching a new algorithm for known-item searching in 201618 that appears to work very well. google scholar does not reveal how many items it searches but studies have suggested 160 million documents have been indexed. studies have shown the google scholar relevance algorithm to be heavily influenced by citation counts and language of publication. google scholar has been so heavily researched and is such a “black box” that more attention would seem to have diminishing returns, except in the area of coverage of and utility for arts and humanities research. librarians may find these takeaways useful for working with or teaching google scholar: • little is known about coverage of arts and humanities by google scholar. • recent studies repeatedly find that in the sciences and social sciences google scholar covers as much if not more than library databases, has more recent coverage, and frequently provides access to full text without the need for library subscriptions. • although the number of studies is limited, google scholar seems excellent at retrieving known scholarly items compared with discovery tools. • using proper accent marks in the title when searching for non-english language items appears to be important. 18 google scholar’s blog notes that in january 2016, a change was made so “scholar now automatically identifies queries that are likely to be looking for a specific paper” technically speaking, “it tries hard to find the intended paper and a version that that particular user is able to read” https://scholar.googleblog.com/. an evidence-based review of academic web search engines, 2014-2016| fagan | https://doi.org/10.6017/ital.v36i2.9718 32 • finding full text for non-english journal articles may require searching google scholar in the original language. • while google scholar may include results from google books, it appears both tools should be used rather than assuming google books will appear in google scholar. • while google scholar does include grey literature, these results do not usually rank highly. • google scholar and google must both be used to effectively search across institutional repository content. • free full text may be buried underneath the “all x versions” links because the publisher’s web site is usually the dominant version presented to the user. the right-hand column links may help ameliorate this situation, but not reliably. • google scholar is well-known in most academic communities and used regularly; however, it is seldom the only tool used, with scholars continuing to use other web search tools, library abstracts and indexes, and published web sites as well. • experts in writing systematic reviews recommend google scholar be included as a search tool along with traditional abstracts and indexes, using software to record the search process and results. • for evaluating research impact, google scholar may be superior to web of science or scopus, but using all three tools still seems necessary. • as with any database, citation metadata should be verified against the publisher’s data; with google scholar, publication dates should receive deliberate attention. • when google scholar covers some of a major publisher’s content, that does not imply it covers all of that publisher’s content. • google scholar metrics appears to provide reliable journal rankings. research agenda this review of the literature also provides direction for future research concerning academic web search engines. 
because this review focused on 2014-2016, researchers may need to review studies from earlier periods for methodological ideas and previous findings, noting that dramatic changes in search engine coverage and behavior can occur within only a few years.19 across the studies, some general best practices were observed. when comparing the coverage of academic web search engines, their utility for establishing research impact, or other bibliometric studies, researchers should strongly consider using software such as publish or perish, and to design their research approach with previous methodologies in mind. information scientists have charted a set of clear disciplinary methods; there is no need to start from scratch. even when 19 for example ştirbu found that google scholar overlapped georef by 57% and 62% (ştirbu et al. 2015, 328), compared with a finding by neuhaus in 2006 where scholar overlapped with georef by 26% (2006, 133). information technology and libraries | june 2017 33 performing a large-scale quantitative assessment such as (kousha and thelwall 2015), manually examining and discussing a subset of the sample seems helpful for checking assumptions and for enhancing the meaning of the findings to the reader. some researchers examined the “top 20” or “top 10” results qualitatively (kousha and thelwall 2015), while others took a random sample from within their large-study sample (kousha, thelwall, and rezaie 2011). academic search engines for arts and humanities research research into the use of academic web search engines within arts and humanities fields is sorely needed. surveys show humanities scholars use both google and google scholar (inger and gardner 2016; kemman, kleppe, and scagliola 2013; van noorden 2014). during interviews of 20 historians by martin and quan-haase (2016) concerning serendipity, five mentioned google books and google scholar as important for recreating serendipity of the physical library online. almost all arts and humanities scholars search the internet for researchers and their activities, and commonly expressed the belief that having a complete list of research activities online improves public awareness (dagienė and krapavickaitė 2016). mays’s (2015) practical advice and the few recent studies on citation impact of google books for these disciplines point to the enormous potential for this tool’s use. articles describing opportunities for new online searching habits of humanities scholars have not always included google scholar (huistra and mellink 2016). wu and chen’s interviews with humanities graduate students suggested their behavior and preferences were different from science and technology students, doing more known-item searching and struggling with “semantically ambiguous keywords” that retrieved irrelevant results (2014, 381). platform preferences seem to have a disciplinary aspect: hammarfelt’s (2014) investigation of altmetrics in the humanities suggests mendeley and twitter should be included along with google scholar when examining citation impact of humanities research, while a 2014 nature survey suggests researchgate is much less popular in the social sciences and humanities than in the sciences (van noorden 2014). in summary, arts and humanities scholars are active users of academic web search engines and related tools, but their preferences and behavior, and the relative success of google scholar as a research tool cannot be inferred from the vast literature focused on the sciences. 
advice from librarians and scholars about the strengths and limitations of academic web search engines in these fields would be incredibly useful. specific examples of needed research, and related studies to reference for methodological ideas: • similar to the studies that have been done in the sciences, how well do academic search engines cover the arts and humanities? an emphasis on formats important to the discipline would be important (prins et al. 2016). • how does the quality of search results compare between academic search engines and traditional library databases for arts and humanities topics? to what extent can the user usefully accomplish her task? (ruppel 2009)? an evidence-based review of academic web search engines, 2014-2016| fagan | https://doi.org/10.6017/ital.v36i2.9718 34 • to what extent do academic search engines support the research process for scholarship distinctive to arts and humanities disciplines (e.g. historiographies, review essays)? • in academic search engines, how visible is the arts and humanities literature found in institutional repositories (pitol and de groote 2014)? specific aspects of academic search engine coverage this review suggests that broad studies of academic search engine coverage may have reached a saturation point. however, specific aspects of coverage need additional investigation: • grey literature: although google scholar’s inclusion of grey literature is frequently mentioned as valuable, empirical studies evaluating its coverage are scarce. additional research following the methodology of haddaway (2015) could investigate the bibliographies of literature other than systematic reviews, investigate various disciplines, or use a sample of valuable known items (similar to kousha, thelwall, and rezaie’s (2011) methodology for books). • non-western, non-english language literature: for further investigation of the repeated finding of non-western, non-english language bias (abrizah and thelwall 2014; cavacini 2015), comparisons to library abstracts and indexes would be helpful for providing context. to what extent is this bias present in traditional research tools? hilbert et al. found the coverage of their sample increased for english language in both web of science and scopus, and “to a lesser extent” in google scholar (2015, 260). • books: any investigations of book coverage in microsoft academic and google scholar would be welcome. very few 2014-2016 studies focused on books in google scholar, and even looking in earlier years turned up little research. georgas (2015) compared google with a federated search tool for finding books, so her study may be a useful reference. kousha et al. (2011) found three times as many citations in google scholar than in scopus to a sample of 1,000 academic books. the authors concluded “there are substantial numbers of citations to academic books from google books and google scholar, and it therefore may be possible to use these potential sources to help evaluate research in bookoriented disciplines” (kousha, thelwall, and rezaie 2011, 2157). 
• institutional repositories: yang (2016) recommended that “librarians of digital resources conduct research on their local digital repositories, as the indexing effects and discovery rates on metadata or associated text files may be different case by case,” and the studies found 2014-2016 show that ir platform and metadata schema dramatically affect discovery, with some irs nearly invisible (weideman 2015; chen 2014; orduña-malea and lópez-cózar 2015; yang 2016) and others somewhat findable by google scholar (lee et al. 2015; obrien et al. 2016). askey and arlitsch (2015) have explained how google scholar’s decisions regarding metadata schema can dramatically affect results.20 libraries who 20 for example, google’s rejection of dublin core. information technology and libraries | june 2017 35 would like their institutional repositories to serve as social sharing platforms for research should consider conducting a study similar to (martín-martín et al. 2016b). finally, a study of ir journal article visibility in academic web search engines could be extremely informative. • full-text retrieval: the indexing coverage of academic search engines relates to the retrieval of full text, which is another area ripe for more research studies, especially in light of the impressive quantity of full text that can be retrieved without user authentication. johnson and simonsen (2015) found that more of the engineering students they surveyed obtained scholarly articles from a free download or getting a pdf from a colleague at another institution than used the library’s subscription. meanwhile, libraries continue to pay for costly subscription resources. monitoring this situation is essential for strategic decision-making. quint (2016) and karlsson (2014) have suggested strategies for libraries and vendors to support broader access to subscription full text through creative licensing and per-item fee approaches. institutional repositories have had mixed results in changing scholars’ habits (both contributors and searchers) but are demonstrably contributing to the presence of full text in the academic search engine experience. when will academic users find a good-enough selection of full-text articles that they no longer need the expanded full text paid for by their institutions? google books similarly to microsoft academic, google books as a search tool also needs dedicated research from librarians and information scientists about its coverage, utility, and/or adoption. a purposeful comparison with other large digital repositories such as hathitrust (https://www.hathitrust.org) would be a boon to practitioners and the public. while hathitrust is transparent about its coverage (https://www.hathitrust.org/statistics_visualizations), specific areas of google books’ coverage have been called into question. weiss (2016) suggested a gap in google books exists from about 1915-1965 “because many publishers either have let it fall out of print, or the book is orphaned and no one wants to go through the trouble of tracking down the copyright owners” and found that copies in google books “will likely be locked down and thus unreadable, or visible only as a snippet, at best” (303). has this situation changed since the court rulings concerning the legality of snippet view? longitudinal studies in the growth of google books similar to (harzing 2014) could illuminate this and other questions about google books’s ability to deliver content. uneven coverage of content types, geography, and language should be investigated. 
mays noted a possible geographical imbalance within the united states (mays 2015, 26). others noted significant language and international imbalances, and large disciplinary differences (weiss 2016; abrizah and thelwall 2014; kousha and thelwall 2015). weiss and others suggest google books' coverage imbalance has enormous social implications: "google and other [massive digital libraries] have essentially canonized the books they have scanned and contribute to the marginalization of those left unscanned" (301). therefore, more holistic quantitative investigations of the types of information in google books and possible skewness would be welcome. finally, chen's study (2012) comparing the coverage of google books and worldcat could be repeated to provide longitudinal information. the utility of google books for research purposes also needs further investigation. books are far more prevalently cited in wikipedia than are research articles (thelwall and kousha 2015a). examining samples of wikipedia articles' citation lists for the prevalence of google books could reveal how dominant a force google books has become in that space. on a more philosophical level, investigating the ways google books might transform scholarly processes would be useful. szpiech (2014) considered how the google books version of a medieval manuscript transformed his relationship with texts, causing a rupture "produced by my new power to extract words and information from a text without being subject to its order, scale, or authority" (78). he hypothesized readers approach google books texts as consumers, rather than learners, whereby "the critical sense of the gestalt" is at risk of being forgotten (84). have other researchers experienced what he describes?

microsoft academic
given the stated openness of microsoft's new academic web search engine,21 the closed nature of google scholar, and the promising findings of bibliometricians (harzing 2016b; harzing and alakangas 2016a), librarians and information scientists should embark on a thorough review of microsoft academic with enthusiasm similar to that with which they approached google scholar. the search engine's coverage, utility for research, and suitability for bibliometric analysis22 all need to be examined. microsoft academic's abilities for supporting scholarly social networking would also be of interest, perhaps using ward et al. (2015) as a theoretical groundwork. the tool's coverage and utility for various disciplines and research purposes is a wide-open field for highly useful research.

professional and instructional approaches based on user research
to inform instructional approaches, more study on user behavior is needed, perhaps repeating herrera's (2011) study with google scholar and microsoft academic. in light of the recent focus on graduate students, research concerning the use of academic web search engines by undergraduates, community college students, high school students, and other groups would be welcome. using an interview or focus group generates exploratory findings that could be tested through surveys with a larger, more representative sample of the population of interest. studying searching behaviors has been common; can librarians design creative studies to investigate reading, engagement, and reflection when web search engines are used as part of the process?
is there a way to study whether the “matthew effect” (antell et al. 2013, 281), the aging citation 21 microsoft’s faq says the company is “adopting an open approach in developing the service, and we invite community participation. we like to think what we have developed is a community property. as such, we are opening up our academic knowledge as a downloadable dataset” and offers the academic knowledge api (https://www.microsoft.com/cognitive-services/en-us/academic-knowledge-api). 22 see jacsó (2011) for methodology. information technology and libraries | june 2017 37 phenomenon (verstak et al. 2014; martín-martín et al. 2016a; davis and cochran 2015), or other epistemological hypotheses are influencing scholarship patterns? a bold study could be performed to examine differences in quality outcomes between samples of students using primarily academic search engines versus traditional library search tools. exploratory studies in this area could begin by surveying students about their use of search tools for research methods courses or asking them to record their research process in a journal, and correlating the findings with their grades on the final research product. three specific areas of user research needed are the use of scholarly social network platforms, researcher profiles, and the influence of these on scholarly collaboration and research (ward, bejarano, and dudás 2015, 178); the performance of google’s relatively new known-item search23 (compared with microsoft academic’s known-item search abilities), and searching in non-english languages. regarding the latter, albarillo’s (2016) method which he applied to library databases could be repeated with google scholar, microsoft academic, and google books. finally, to continue their strong track record as experts in navigating the landscape of digital scholarship, librarians need to research assumptions regarding best practices for scholarly logistics. for example, searching google for article titles plus the term “doi,” then scanning the results list for researchgate was found by this study’s author to most efficiently provide doi numbers: but is this a reliable approach? does researchgate have sufficient accuracy to be recommended as the optimal tool for this task? what is the most efficient way for a scholar to locate full text for a citation? are academic search engines’ bibliographic citation management software export tools competitive with third-party commercial tools such as refworks? another area needing investigation is the visibility of links to free full text in google scholar. pitol and degroote found that 70% percent of the items in their study had at least one free full-text version available through a “hidden” google scholar version (2014, 603), and this author’s work on this review article indicates this problem still exists — but to what extent? also, when free full text exists in multiple repositories (e.g. researchgate, digital commons, academic.edu), which are the most trustworthy and practically useful for scholars? librarians should discuss the answers to these questions and be ready to provide expert advice to users. conclusion with so many users opting to use academic web search engines for research, librarians need to investigate the performance of microsoft academic, google books, and of google scholar for the arts and humanities, and to re-think library services and collections in light of these tools’ strengths and limitations. 
the evolution of web indexing and increasing free access to full text should be monitored in conjunction with library collection development. to remain relevant to 23 google scholar’s blog notes that in january 2016, a change was made so “scholar now automatically identifies queries that are likely to be looking for a specific paper” technically speaking, “it tries hard to find the intended paper and a version that that particular user is able to read” https://scholar.googleblog.com/. an evidence-based review of academic web search engines, 2014-2016| fagan | https://doi.org/10.6017/ital.v36i2.9718 38 modern researchers, librarians should continue to strengthen their knowledge of and expertise with public academic web search engines, full-text repositories, and scholarly networks. bibliography abrizah, a., and mike thelwall. 2014. "can the impact of nonwestern academic books be measured? an investigation of google books and google scholar for malaysia." journal of the association for information science & technology 65 (12): 2498-2508. https://doi.org/10.1002/asi.23145. albarillo, frans. 2016. "evaluating language functionality in library databases." international information & library review 48 (1): 1-10. https://doi.org/10.1080/10572317.2016.1146036. antell, karen, molly strothmann, xiaotian chen, and kevin o’kelly. 2013. "cross-examining google scholar." reference & user services quarterly 52 (4): 279-282. https://doi.org/10.5860/rusq.52n4.279. asher, andrew d., lynda m. duke, and suzanne wilson. 2012. "paths of discovery: comparing the search effectiveness of ebsco discovery service, summon, google scholar, and conventional library resources." college & research libraries 74(5):464-488. https://doi.org/10.5860/crl374. askey, dale, and kenning arlitsch. 2015. "heeding the signals: applying web best practices when google recommends." journal of library administration 55 (1): 49-59. https://doi.org/10.1080/01930826.2014.978685. authors guild. "authors guild v. google." accessed january 1, 2016, https://www.authorsguild.org/where-we-stand/authors-guild-v-google/. bartol, tomaž, and maria mackiewicz-talarczyk. 2015. "bibliometric analysis of publishing trends in fiber crops in google scholar, scopus, and web of science." journal of natural fibers 12 (6): 531. https://doi.org/10.1080/15440478.2014.972000. boeker, martin, werner vach, and edith motschall. 2013. "google scholar as replacement for systematic literature searches: good relative recall and precision are not enough." bmc medical research methodology 13 (1): 1. bonato, sarah. 2016. "google scholar and scopus for finding gray literature publications." journal of the medical library association 104 (3): 252-254. https://doi.org/10.3163/15365050.104.3.021. bornmann, lutz, andreas thor, werner marx, and hermann schier. 2016. "the application of bibliometrics to research evaluation in the humanities and social sciences: an exploratory study using normalized google scholar data for the publications of a research institute." information technology and libraries | june 2017 39 journal of the association for information science & technology 67 (11): 2778-2789. https://doi.org/10.1002/asi.23627. boumenot, diane. "printing a book from google books." one rhode island family. last modified december 3, 2015, accessed january 1, 2017. https://onerhodeislandfamily.com/2015/12/03/printing-a-book-from-google-books/. bøyum, idunn, and svanhild aabø. 2015. "the information practices of business phd students." new library world 116 (3): 187-200. 
https://doi.org/10.1108/nlw-06-2014-0073. bramer, wichor m., dean giustini, and bianca m. r. kramer. 2016. "comparing the coverage, recall, and precision of searches for 120 systematic reviews in embase, medline, and google scholar: a prospective study." systematic reviews 5(39):1-7. https://doi.org/10.1186/s13643-016-0215-7. cals, j. w., and d. kotz. 2016. "literature review in biomedical research: useful search engines beyond pubmed." journal of clinical epidemiology 71: 115-117. https://doi.org/10.1016/j.jclinepi.2015.10.012. carlson, scott. 2006. "challenging google, microsoft unveils a search tool for scholarly articles." chronicle of higher education 52 (33). cavacini, antonio. 2015. "what is the best database for computer science journal articles?" scientometrics 102 (3): 2059-2071. https://doi.org/10.1007/s11192-014-1506-1. chen, xiaotian. 2012. "google books and worldcat: a comparison of their content." online information review 36 (4): 507-516. https://doi.org/10.1108/14684521211254031. ———. 2014. "open access in 2013: reaching the 50% milestone." serials review 40 (1): 21-27. https://doi.org/10.1080/00987913.2014.895556. choong, miew keen, filippo galgani, adam g. dunn, and guy tsafnat. 2014. "automatic evidence retrieval for systematic reviews." journal of medical internet research 16 (10): 1-1. https://doi.org/10.2196/jmir.3369. ciccone, karen, and john vickery. 2015. "summon, ebsco discovery service, and google scholar: a comparison of search performance using user queries." evidence based library & information practice 10 (1): 34-49. https://ejournals.library.ualberta.ca/index.php/eblip/article/view/23845. conrad, lettie y., elisabeth leonard, and mary m. somerville. 2015. "new pathways in scholarly discovery: understanding the next generation of researcher tools." paper presented at the association of college and research libraries annual conference, march 25-27, portland, or. https://pdfs.semanticscholar.org/3cb1/315476ccf9b443c01eb9b1d175ae3b0a5b4e.pdf. an evidence-based review of academic web search engines, 2014-2016| fagan | https://doi.org/10.6017/ital.v36i2.9718 40 dagienė, eleonora, and danutė krapavickaitė. 2016. "how researchers manage their academic activities." learned publishing 29(3):155-163. https://doi.org/10.1002/leap.1030. davis, philip m., and angela cochran. 2015. "cited half-life of the journal literature." arxiv preprint arxiv:1504.07479. https://arxiv.org/abs/1504.07479. delgado lópez-cózar, emilio, nicolás robinson-garcía, and daniel torres-salinas. 2014. "the google scholar experiment: how to index false papers and manipulate bibliometric indicators." journal of the association for information science & technology 65 (3): 446-454. https://doi.org/10.1002/asi.23056. erb, brian, and rob sica. 2015. "flagship database for literature searching or flelpful auxiliary?" charleston advisor 17 (2): 47-50. https://doi.org/10.5260/chara.17.2.47. fagan, jody condit, and david gaines. 2016. "take charge of eds: vet your content." presentation to the ebsco users' group, boston, ma, may 10-11. gehanno, jean-françois, laetitia rollin, and stefan darmoni. 2013. "is the coverage of google scholar enough to be used alone for systematic reviews." bmc medical informatics and decision making 13 (1): 1. https://doi.org/10.1186/1472-6947-13-7. georgas, helen. 2015. "google vs. the library (part iii): assessing the quality of sources found by undergraduates." portal: libraries and the academy 15 (1): 133-161. https://doi.org/10.1353/pla.2015.0012. 
giustini, dean, and maged n. kamel boulos. 2013. "google scholar is not enough to be used alone for systematic reviews." online journal of public health informatics 5 (2). https://doi.org/10.5210/ojphi.v5i2.4623. gray, jerry e., michelle c. hamilton, alexandra hauser, margaret m. janz, justin p. peters, and fiona taggart. 2012. "scholarish: google scholar and its value to the sciences." issues in science and technology librarianship 70 (summer). https://doi.org/10.1002/asi.21372/full. haddaway, neal r. 2015. "the use of web-scraping software in searching for grey literature." grey journal 11 (3): 186-190. haddaway, neal robert, alexandra mary collins, deborah coughlin, and stuart kirk. 2015. "the role of google scholar in evidence reviews and its applicability to grey literature searching." plos one 10 (9): e0138237. https://doi.org/10.1371/journal.pone.0138237. hammarfelt, björn. 2014. "using altmetrics for assessing research impact in the humanities." scientometrics 101 (2): 1419-1430. https://doi.org/10.1007/s11192-014-1261-3. hands, africa. 2012. "microsoft academic search – http://academic.research.microsoft.com." technical services quarterly 29 (3): 251-252. https://doi.org/10.1080/07317131.2012.682026. information technology and libraries | june 2017 41 harper, sarah fletcher. 2016. "google books review." journal of electronic resources in medical libraries 13 (1): 2-7. https://doi.org/10.1080/15424065.2016.1142835. harzing, anne-wil. 2013. "a preliminary test of google scholar as a source for citation data: a longitudinal study of nobel prize winners." scientometrics 94 (3): 1057-1075. https://doi.org/10.1007/s11192-012-0777-7. ———. 2014. "a longitudinal study of google scholar coverage between 2012 and 2013." scientometrics 98 (1): 565-575. https://doi.org/10.1007/s11192-013-0975-y. ———. 2016a. publish or perish. vol. 5. http://www.harzing.com/resources/publish-or-perish. ———. 2016b. "microsoft academic (search): a phoenix arisen from the ashes?" scientometrics 108 (3): 1637-1647.https://doi.org/10.1007/s11192-016-2026-y. harzing, anne-wil, and satu alakangas. 2016a. "microsoft academic: is the phoenix getting wings?" scientometrics: 1-13. harzing, anne-wil, and satu alakangas. 2016b. "google scholar, scopus and the web of science: a longitudinal and cross-disciplinary comparison." scientometrics 106 (2): 787-804. https://doi.org/10.1007/s11192-015-1798-9. herrera, gail. 2011. "google scholar users and user behaviors: an exploratory study." college & research libraries 72 (4): 316-331. https://doi.org/10.5860/crl-125rl. higgins, julian, and s. green, eds. 2011. cochrane handbook for systematic reviews of interventions. version 5.1.0 ed.: the cochrane collaboration. http://handbook.cochrane.org/. hilbert, fee, julia barth, julia gremm, daniel gros, jessica haiter, maria henkel, wilhelm reinhardt, and wolfgang g. stock. 2015. "coverage of academic citation databases compared with coverage of scientific social media." online information review 39 (2): 255-264. https://doi.org/10.1108/oir-07-2014-0159. hoffmann, anna lauren. 2014. "google books as infrastructure of in/justice: towards a sociotechnical account of rawlsian justice, information, and technology." theses and dissertations. paper 530. http://dc.uwm.edu/etd/530/. ———. 2016. "google books, libraries, and self-respect: information justice beyond distributions." the library 86 (1). https://doi.org/10.1086/684141. horrigan, john b. "lifelong learning and technology." 
pew research center, last modified march 22, 2016, accessed february 7, 2017, http://www.pewinternet.org/2016/03/22/lifelonglearning-and-technology/. hug, sven e., michael ochsner, and martin p. braendle. 2016. "citation analysis with microsoft academic." arxiv preprint arxiv:1609.05354.https://arxiv.org/abs/1609.05354. an evidence-based review of academic web search engines, 2014-2016| fagan | https://doi.org/10.6017/ital.v36i2.9718 42 huistra, hieke, and bram mellink. 2016. "phrasing history: selecting sources in digital repositories." historical methods: a journal of quantitative and interdisciplinary history 49 (4): 220-229. https://doi.org/10.1093/llc/fqw002. inger, simon, and tracy gardner. 2016. "how readers discover content in scholarly publications." information services & use 36 (1): 81-97. https://doi.org/10.3233/isu-160800. jackson, joab. 2010. "google: 129 million different books have been published." pc world, august 6, 2010. http://www.pcworld.com/article/202803/google_129_million_different_books_have_been_pu blished.html. jacsó, p. 2008. "live search academic." peter’s digital reference shelf, april. jacsó, péter. 2011. "the pros and cons of microsoft academic search from a bibliometric perspective." online information review 35 (6): 983-997. https://doi.org/10.1108/14684521111210788. jamali, hamid r., and majid nabavi. 2015. "open access and sources of full-text articles in google scholar in different subject fields." scientometrics 105 (3): 1635-1651. https://doi.org/10.1007/s11192-015-1642-2. johnson, paula c., and jennifer e. simonsen. 2015. "do engineering master's students know what they don't know?" library review 64 (1): 36-57. https://doi.org/10.1108/lr-05-2014-0052. jones, edgar. 2010. "google books as a general research collection." library resources & technical services 54 (2): 77-89. https://doi.org/10.5860/lrts.54n2.77. karlsson, niklas. 2014. "the crossroads of academic electronic availability: how well does google scholar measure up against a university-based metadata system in 2014?" current science 107 (10): 1661-1665. http://www.currentscience.ac.in/volumes/107/10/1661.pdf. kemman, max, martijn kleppe, and stef scagliola. 2013. "just google it-digital research practices of humanities scholars." arxiv preprint arxiv:1309.2434. https://arxiv.org/abs/1309.2434. khabsa, madian, and c. lee giles. 2014. "the number of scholarly documents on the public web." plos one 9 (5): https://doi.org/10.1371/journal.pone.0093949 kirkwood jr., hal, and monica c. kirkwood. 2011. "historical research." online 35 (4): 28-32. koler-povh, teja, primož južnic, and goran turk. 2014. "impact of open access on citation of scholarly publications in the field of civil engineering." scientometrics 98 (2): 1033-1045. https://doi.org/10.1007/s11192-013-1101-x. kousha, kayvan, mike thelwall, and somayeh rezaie. 2011. "assessing the citation impact of books: the role of google books, google scholar, and scopus." journal of the american society information technology and libraries | june 2017 43 for information science and technology 62 (11): 2147-2164. https://doi.org/10.1002/asi.21608. kousha, kayvan, and mike thelwall. 2017. "are wikipedia citations important evidence of the impact of scholarly articles and books?" journal of the association for information science and technology. 68(3):762-779. https://doi.org/10.1002/asi.23694. kousha, kayvan, and mike thelwall. 2015. "an automatic method for extracting citations from google books." 
journal of the association for information science & technology 66 (2): 309320. https://doi.org/10.1002/asi.23170. lee, jongwook, gary burnett, micah vandegrift, hoon baeg jung, and richard morris. 2015. "availability and accessibility in an open access institutional repository: a case study." information research 20 (1): 334-349. levay, paul, nicola ainsworth, rachel kettle, and antony morgan. 2016. "identifying evidence for public health guidance: a comparison of citation searching with web of science and google scholar." research synthesis methods 7 (1): 34-45. https://doi.org/10.1002/jrsm.1158. levy, steven. "making the world’s problem solvers 10% more efficient." backchannel. last modified october 17, 2014, accessed january 14, 2016, https://medium.com/backchannel/the-gentleman-who-made-scholar-d71289d9a82d. los angeles times. 2016. "google, books and 'fair use'." los angeles times, april 19, 2016. http://www.latimes.com/opinion/editorials/la-ed-google-book-search-20160419-story.html martin, kim, and anabel quan-haase. 2016. "the role of agency in historians’ experiences of serendipity in physical and digital information environments." journal of documentation 72 (6): 1008-1026. https://doi.org/10.1108/jd-11-2015-0144. martín-martín, alberto, juan manuel ayllón, enrique orduña-malea, and emilio delgado lópezcózar. 2016a. "2016 google scholar metrics released: a matter of languages... and something else." arxiv preprint arxiv:1607.06260. https://arxiv.org/abs/1607.06260. martín-martín, alberto, enrique orduña-malea, juan m. ayllón, and emilio delgado lópez-cózar. 2016b. "the counting house: measuring those who count. presence of bibliometrics, scientometrics, informetrics, webometrics and altmetrics in the google scholar citations, researcherid, researchgate, mendeley & twitter." arxiv preprint arxiv:1602.02412. https://arxiv.org/abs/1602.02412. martín-martín, alberto, enrique orduña-malea, juan manuel ayllón, and emilio delgado lópezcózar. 2014. "does google scholar contain all highly cited documents (1950-2013)?" arxiv preprint arxiv:1410.8464. https://arxiv.org/abs/1410.8464. an evidence-based review of academic web search engines, 2014-2016| fagan | https://doi.org/10.6017/ital.v36i2.9718 44 martín-martín, alberto, enrique orduña-malea, juan ayllón, and emilio delgado lópez-cózar. 2016c. "back to the past: on the shoulders of an academic search engine giant." scientometrics 107 (3): 1477-1487. https://doi.org/10.1007/s11192-016-1917-2. martín-martín, alberto, enrique orduña-malea, anne-wil harzing, and emilio delgado lópezcózar. 2017. "can we use google scholar to identify highly-cited documents?" journal of informetrics 11 (1): 152-163. https://doi.org/10.1016/j.joi.2016.11.008. mays, dorothy a. 2015. "google books: far more than just books." public libraries 54 (5): 23-26. http://publiclibrariesonline.org/2015/10/far-more-than-just-books/ meier, john j., and thomas w. conkling. 2008. "google scholar’s coverage of the engineering literature: an empirical study." the journal of academic librarianship 34 (3): 196-201. https://doi.org/10.1016/j.acalib.2008.03.002. moed, henk f., judit bar-ilan, and gali halevi. 2016. "a new methodology for comparing google scholar and scopus." arxiv preprint arxiv:1512.05741.https://arxiv.org/abs/1512.05741. namei, elizabeth, and christal a. young. 2015. "measuring our relevancy: comparing results in a web-scale discovery tool, google & google scholar." 
paper presented at the association of college and research libraries annual conference, march 25-27, portland, or. http://www.ala.org/acrl/sites/ala.org.acrl/files/content/conferences/confsandpreconfs/201 5/namei_young.pdf national institute for health and care excellence (nice). "developing nice guidelines: the manual." last modified april 2016, accessed november 27, 2016. https://www.nice.org.uk/process/pmg20. neuhaus, chris, ellen neuhaus, alan asher, and clint wrede. 2006. "the depth and breadth of google scholar: an empirical study." portal: libraries and the academy 6 (2): 127-141. https://doi.org/10.1353/pla.2006.0026. obrien, patrick, kenning arlitsch, leila sterman, jeff mixter, jonathan wheeler, and susan borda. 2016. "undercounting file downloads from institutional repositories." journal of library administration 56 (7): 854-874. https://doi.org/10.1080/01930826.2016.1216224. orduña-malea, enrique, and emilio delgado lópez-cózar. 2014. "google scholar metrics evolution: an analysis according to languages." scientometrics 98 (3): 2353-2367. https://doi.org/10.1007/s11192-013-1164-8. orduña-malea, enrique, and emilio delgado lópez-cózar. 2015. "the dark side of open access in google and google scholar: the case of latin-american repositories." scientometrics 102 (1): 829-846. https://doi.org/10.1007/s11192-014-1369-5. orduña-malea, enrique, alberto martín-martín, juan m. ayllon, and emilio delgado lópez-cózar. 2014. "the silent fading of an academic search engine: the case of microsoft academic information technology and libraries | june 2017 45 search." online information review 38(7):936-953. https://doi.org/10.1108/oir-07-20140169. ortega, josé luis. 2015. "relationship between altmetric and bibliometric indicators across academic social sites: the case of csic's members." journal of informetrics 9 (1): 39-49. https://doi.org/10.1016/j.joi.2014.11.004. ortega, josé luis, and isidro f. aguillo. 2014. "microsoft academic search and google scholar citations: comparative analysis of author profiles." journal of the association for information science & technology 65 (6): 1149-1156. https://doi.org/10.1002/asi.23036. pitol, scott p., and sandra l. de groote. 2014. "google scholar versions: do more versions of an article mean greater impact?" library hi tech 32 (4): 594-611. https://doi.org/0.1108/lht05-2014-0039. prins, ad a. m., rodrigo costas, thed n. van leeuwen, and paul f. wouters. 2016. "using google scholar in research evaluation of humanities and social science programs: a comparison with web of science data." research evaluation 25 (3): 264-270. https://doi.org/10.1093/reseval/rvv049. quint, barbara. 2016. "find and fetch: completing the course." information today 33 (3): 17-17. rothfus, melissa, ingrid s. sketris, robyn traynor, melissa helwig, and samuel a. stewart. 2016. "measuring knowledge translation uptake using citation metrics: a case study of a pancanadian network of pharmacoepidemiology researchers." science & technology libraries 35 (3): 228-240. https://doi.org/10.1080/0194262x.2016.1192008. ruppel, margie. 2009. "google scholar, social work abstracts (ebsco), and psycinfo (ebsco)." charleston advisor 10 (3): 5-11. shultz, m. 2007. "comparing test searches in pubmed and google scholar." journal of the medical library association : jmla 95 (4): 442-445. https://doi.org/10.3163/1536-5050.95.4.442. stansfield, claire, kelly dickson, and mukdarut bangpan. 2016. 
"exploring issues in the conduct of website searching and other online sources for systematic reviews: how can we be systematic?" systematic reviews 5 (1): 191. https://doi.org/10.1186/s13643-016-0371-9. ştirbu, simona, paul thirion, serge schmitz, gentiane haesbroeck, and ninfa greco. 2015. "the utility of google scholar when searching geographical literature: comparison with three commercial bibliographic databases." the journal of academic librarianship 41 (3): 322-329. https://doi.org/10.1016/j.acalib.2015.02.013. suiter, amy m., and heather lea moulaison. 2015. "supporting scholars: an analysis of academic library websites' documentation on metrics and impact." the journal of academic librarianship 41 (6): 814-820. https://doi.org/10.1016/j.acalib.2015.09.004. an evidence-based review of academic web search engines, 2014-2016| fagan | https://doi.org/10.6017/ital.v36i2.9718 46 szpiech, ryan. 2014. "cracking the code: reflections on manuscripts in the age of digital books." digital philology: a journal of medieval cultures 3(1): 75-100. https://doi.org/10.1353/dph.2014.0010. testa, matthew. 2016. "availability and discoverability of open-access journals in music." music reference services quarterly 19 (1): 1-17. https://doi.org/10.1080/10588167.2016.1130386. thelwall, mike, and kayvan kousha. 2015b. "web indicators for research evaluation. part 1: citations and links to academic articles from the web." el profesional de la información 24 (5): 587-606.https://doi.org/10.3145/epi.2015.sep.08. thielen, frederick w., ghislaine van mastrigt, l. t. burgers, wichor m. bramer, marian h. j. m. majoie, sylvia m. a. a. evers, and jos kleijnen. 2016. "how to prepare a systematic review of economic evaluations for clinical practice guidelines: database selection and search strategy development (part 2/3)." expert review of pharmacoeconomics & outcomes research: 1-17. https://doi.org/10.1080/14737167.2016.1246962. trapp, jamie. 2016. "web of science, scopus, and google scholar citation rates: a case study of medical physics and biomedical engineering: what gets cited and what doesn't?" australasian physical & engineering sciences in medicine. 39(4): 817-823. https://doi.org/10.1007/s13246-016-0478-2. van noorden, r. 2014. "online collaboration: scientists and the social network." nature 512 (7513): 126-129. https://doi.org/10.1038/512126a. varshney, lav r. 2012. "the google effect in doctoral theses." scientometrics 92 (3): 785-793. https://doi.org/10.1007/s11192-012-0654-4. verstak, alex, anurag acharya, helder suzuki, sean henderson, mikhail iakhiaev, cliff chiung yu lin, and namit shetty. 2014. "on the shoulders of giants: the growing impact of older articles." arxiv preprint arxiv:1411.0275. https://arxiv.org/abs/1411.0275. walsh, andrew. 2015. "beyond "good" and "bad": google as a crucial component of information literacy." in the complete guide to using google in libraries, edited by carol smallwood, 3-12. new york: rowman & littlefield. waltman, ludo. 2016. "a review of the literature on citation impact indicators." journal of informetrics 10 (2): 365-391. https://doi.org/10.1016/j.joi.2016.02.007. ward, judit, william bejarano, and anikó dudás. 2015. "scholarly social media profiles and libraries: a review." liber quarterly 24 (4): 174–204.https://doi.org/10.18352/lq.9958. weideman, melius. 2015. "etd visibility: a study on the exposure of indian etds to the google scholar crawler." 
paper presented at etd 2015: 18th international symposium on electronic theses and dissertations, new delhi, india, november 4-6. http://www.web information technology and libraries | june 2017 47 visibility.co.za/0168-conference-paper-2015-weideman-etd-theses-dissertation-india-googlescholar-crawler.pdf. weiss, andrew. 2016. "examining massive digital libraries (mdls) and their impact on reference services." reference librarian 57 (4): 286-306. https://doi.org/10.1080/02763877.2016.1145614. whitmer, susan. 2015. "google books: shamed by snobs, a resource for the rest of us." in the complete guide to using google in libraries, edited by carol smallwood, 241-250. new york: rowman & littlefield. wildgaard, lorna. 2015. "a comparison of 17 author-level bibliometric indicators for researchers in astronomy, environmental science, philosophy and public health in web of science and google scholar." scientometrics 104 (3): 873-906. https://doi.org/10.1007/s11192-015-1608-4. winter, joost, amir zadpoor, and dimitra dodou. 2014. "the expansion of google scholar versus web of science: a longitudinal study." scientometrics 98 (2): 1547-1565. https://doi.org/10.1007/s11192-013-1089-2. wu, tim. 2015. "whatever happened to google books?" the new yorker, september 11, 2015. wu, ming-der, and shih-chuan chen. 2014. "graduate students appreciate google scholar, but still find use for libraries." electronic library 32 (3): 375-389. https://doi.org/10.1108/el-082012-0102. yang, le. 2016. "making search engines notice: an exploratory study on discoverability of dspace metadata and pdf files." journal of web librarianship 10 (3): 147-160. https://doi.org/10.1080/19322909.2016.1172539. editorial board thoughts: libraries as makerspace? tod colegrove information technology and libraries | march 2013 2 recently there has been tremendous interest in “makerspace” and its potential in libraries: from middle school and public libraries to academic and special libraries, the topic seems very much top of mind. a number of libraries across the country have been actively expanding makerspace within the physical library and exploring its impact; as head of one such library, i can report that reactions to the associated changes have been quite polarized. those from the supported membership of the library have been uniformly positive, with new and established users as well as principal donors immediately recognizing and embracing its potential to enhance learning and catalyze innovation; interestingly, the minority of individuals that recoil at the idea have been either long-term librarians or library staff members. i suspect the polarization may be more a function of confusion over what makerspace actually is. this piece offers a brief overview of the landscape of makerspace—a glimpse into how its practice can dramatically enhance traditional library offerings, revitalizing the library as a center of learning. been happening for thousands of years . . . 
dale dougherty, founder of make magazine and maker faire, at the “maker monday” event of the 2013 american library association midwinter meeting framed the question simply, “whether making belongs in libraries or whether libraries can contribute to making.” more than one audience member may have been surprised when he continued, “it’s already been happening for hundreds of years—maybe thousands.”1 the o’reilly/darpa makerspace playbook describes the overall goals and concept of makerspace (emphasis added): “by helping schools and communities everywhere establish makerspaces, we expect to build your makerspace users' literacy in design, science, technology, engineering, art, and math. . . . we see making as a gateway to deeper engagement in science and engineering but also art and design. makerspaces share some aspects of the shop class, home economics class, the art studio and science lab. in effect, a makerspace is a physical mashup of these different places that allows projects to integrate these different kinds of skills.”2 building users’ literacies across multiple domains and a gateway to deeper engagement? surely these are core values of the library; one might even suspect that to some degree libraries have long been makerspace. a familiar example of maker activity in libraries might include digital media: still/video photography and audio mastering and remixing. youmedia network, funded by the macarthur patrick “tod” colegrove (pcolegrove@unr.edu), a lita member, is head of the delamare science & engineering library at the university of nevada, reno, nevada. mailto:pcolegrove@unr.edu editorial board thoughts: libraries as makerspace? | colegrove 3 institute through the institute of museum and library services, is a recent example of such effort aimed at creating transformative spaces; engaged in exploring, expressing, and creating with digital media, youth are encouraged to “hang out, mess around, and geek out.” a more pedestrian example is found in the support of users with first-time learning or refreshing of computer programming skills. as recently as the 1980s, the singular option the library had was to maintain a collection of print texts. through the 1990s and into the early 2000s, that support improved dramatically as publishers distributed code examples and ancillary documents on accompanying cd or dvd media, saving the reader the effort of manually typing in code examples. the associated collections grew rapidly, even as the overhead associated with the maintenance and weeding of a collection that was more and more rapidly obsoleted grew more. today, e-book versions combined with ready availability of computer workstations within the library, and the rapidly growing availability of web-based tutorials and support communities, render a potent combination that customers of the library can use to quickly acquire the ability to create or “make” custom applications. with the migration of the supporting print collections online, the library can contemplate further support in the physical spaces opened up. open working areas and whiteboard walls can further amplify the collaborative nature of such making; the library might even consider adding popular hardware development platforms to its collection of lendable technology, enabling those interested to check out a development kit rather than purchase on their own. 
after all, in a very real sense that is what libraries do—and have done, for thousands of years: buy sometimes expensive technology tailored to the needs and interest of the local community and make it available on a shared basis. makerspace: a continuum along with outreach opportunities, the exploration of how such examples can be extended to encompass more of the interests supported by the library is the essence of the maker movement in libraries. makerspace encompasses a continuum of activity that includes “co-working,” “hackerspace,” and “fab lab”; the common thread running through each is a focus on making rather than merely consuming. it is important to note that although the terms are often incorrectly used as if they were synonymous, in practice they are very different: for example, a fab lab is about fabrication. realized, it is a workshop designed around personal manufacture of physical items— typically equipped with computer controlled equipment such as laser cutters, multiple axis computer numerical controlled (cnc) milling machines, and 3d printers. in contrast, a “hackerspace” is more focused on computers and technology, attracting computer programmers and web designers, although interests begin to overlap significantly with the fab lab for those interested in robotics. co-working space is a natural evolution for participants of the hackerspace; a shared working environment offering much of the benefit of the social and collaborative aspects of the informal hackerspace, while maintaining a focus on work. as opposed to the hobbyist that might be attracted to a hackerspace, co-working space attracts independent contractors and professionals that may work from home. information technology and libraries | march 2013 4 it is important to note that it is entirely possible for a single makerspace to house all three subtypes and be part hackerspace, fab lab, and co-working space. can it be a library at the same time? to some extent, these activities are likely already ongoing within your library, albeit informally; by recognizing and embracing the passions driving those participating in the activity, the library can become central to the greater community of practice. serving the community’s needs more directly, opportunities for outreach will multiply even as it enables the library to develop a laser-sharp focus on the needs of that community. depending on constraints and the community of support, the library may also be well-served by forming collaborative ties with other local makerspace; having local partners can dramatically improve the options available to the library in day-to-day practice, and better inform the library as it takes well-chosen incremental steps. with hackerspace/co-working/fab lab resources aligned with the traditional resources of the library, engagement with one can lead naturally to the other in an explosion of innovation and creativity. 
renaissance in addition to supporting the work of the solitary reader, “today's libraries are incubators, collaboratories, the modern equivalent of the seventeenth-century coffeehouse: part information market, part knowledge warehouse, with some workshop thrown in for good measure.”3 consider some of the transformative synergies that are already being realized in libraries experimenting with makerspace across the country: • a child reading about robots able to go hands-on with robotics toolkits, even borrowing the kit for an extended period of time along with the book that piqued the interest; surely such access enables the child to develop a powerful sense of agency from early childhood, including a perception of self as being productive and much more than a consumer. • students or researchers trying to understand or make sense of a chemical model or novel protein strand able not only to visualize and manipulate the subject on a two-dimensional screen, but to relatively quickly print a real-world model to be able and tangibly explore the subject from all angles. • individuals synthesizing knowledge across disciplinary boundaries able to interact with members of communities of practice in a non-threatening environment; learning, developing, and testing ideas—developing rapid prototypes in software or physical media, with a librarian at the ready to assist with resources and dispense advice regarding intellectual property opportunities or concerns. the american libraries association estimates that as of this printing there are approximately 121,169 libraries of all kinds in the united states today; if even a small percentage recognize and begin to realize the full impact that makerspace in the library can have, the future looks bright indeed. editorial board thoughts: libraries as makerspace? | colegrove 5 references 1. dale dougherty, “the new stacks: the maker movement comes to libraries” (presentation at the midwinter meeting of the american library association, seattle, washington, january 28, 2013). http://alamw13.ala.org/node/10004. 2. michele hlubinka et al., makerspace playbook, december 2012, accessed february 13, 2012, http://makerspace.com/playbook. 3. alex soojung-kim pang, "if libraries did not exist, it would be necessary to invent them," contemplative computing, february 6, 2012, http://www.contemplativecomputing.org/2012/02/if-libraries-did-not-exist-it-would-benecessary-to-invent-them.html. http://alamw13.ala.org/node/10004 http://makerspace.com/playbook http://www.contemplativecomputing.org/2012/02/if-libraries-did-not-exist-it-would-be-necessary-to-invent-them.html http://www.contemplativecomputing.org/2012/02/if-libraries-did-not-exist-it-would-be-necessary-to-invent-them.html expanding and improving our library’s virtual chat service: discovering best practices when demand increases article expanding and improving our library’s virtual chat service discovering best practices when demand increases parker fruehan and diana hellyar information technology and libraries | september 2021 https://doi.org/10.6017/ital.v40i3.13117 parker fruehan (fruehanp1@southernct.edu) is assistant librarian, hilton c. buley library, southern connecticut state university. diana hellyar (hellyard1@southernct.edu) is assistant librarian, hilton c. buley library, southern connecticut state university. © 2021. abstract with the onset of the covid-19 pandemic and the ensuing shutdown of the library building for several months, there was a sudden need to adjust how the hilton c. 
buley library at southern connecticut state university (scsu) delivered its services. overnight, the library’s virtual chat service went from a convenient way to reach a librarian to the primary method by which library patrons contacted the library for help. in this article, the authors will discuss what was learned during this time and how the service has been adjusted to meet user needs. best practices and future improvements will be discussed. background the buley library started using springshare's libchat service in january 2015. the chat service was accessible as a button in the header of all the library webpages, and the wording would change depending on the availability of a librarian. at buley library, the chat service is only staffed by our faculty librarians. there were other chat buttons on various individual libguides for either specific librarians or for the general library chat. chat was monitored at the research & information desk by the librarian on duty. the first librarian of the day would log into the shared chat account on the reference desk computer. while each librarian had their own account, using a shared account meant that the librarians could easily hand off a chat interaction during a shift change. while the reference desk was typically busy, librarians would only receive a small number of chats per day. between 2015 and 2019, the library saw an average of 250 chats per year. due to the low usage, there was little focus on libchat training for librarians. for more complicated questions, librarians would often recommend that chat users call, email, or schedule an in-person appointment. since libchat was only monitored while librarians were at the reference desk, it was easy to let it become a secondary mode of reference interaction, particularly if there was a surge of in-person reference questions at any given time. due to the covid-19 pandemic, the library quickly shifted from mostly in-person to solely online services. suddenly, libchat was the virtual reference desk and the main mode of patron interaction. despite this change in how the library interacted with the campus, there was only a slight increase in chat usage in the first two months of the closure. in april 2020, we started to explore our options with libchat in the hopes of increasing visibility and usage. mailto:fruehanp1@southernct.edu mailto:hellyard1@southernct.edu information technology and libraries september 2021 expanding and improving our library’s virtual chat service | fruehan and hellyar 2 evaluating chat widget options considering technical implementation the publicly accessible chat interface is made available completely within a webpage, requiring no clients, external applications, or plugins to make it functional. springshare calls this component the libchat widget, and provides a prepackaged set of website code necessary to create the chat interface. within the libchat system there are a few options for widget placement and presentation. at the time of writing, springshare offers four widget types in its libchat product: in-page chat, button pop-out, slide-out tab, and floating.1 when the service is offline, the system replaces the chat interface with a link to library faqs and the option to submit a question for follow-up. at buley library, prior to the covid-19 pandemic shutdown, the button pop-out was the main widget type used to enter a chat session (see fig. 1). figure 1. previous library website header with chat pop-out button in upper right-hand corner. 
the pop-out button works by opening a separate pop-up window with the chat interface. this allows the user to navigate to other pages in the previous window without disconnecting from the session. one challenge to the pop-up window method is that many web browsers block pop-up windows by default, requiring a user to recognize and override this setting. another option used mainly on librarian profiles and subject guides is the in-page chat, which embeds the chat interface directly on an existing webpage. many times, these chat widgets are connected to a particular user rather than the queue monitored by all librarians. the user will interact with the chat operator in this dedicated section of the webpage. if a user navigates to a different page in the same window or tab it will disconnect from the chat session. these widget options are easiest when considering web design expertise and time commitment involved in implementation. both the button pop-out and in-page chat can be accomplished with a user having access to a what you see is what you get, or wysiyg, editor on the webpage and the ability to copy and paste a few lines of html code. it does not require any custom then the json objects must be looped through and displayed as desired. alternately, as in the script below, the json objects may be placed into an array for sorting. the following is a simple example of a script that displays all of the available data with each item in its own paragraph. this script also sorts the links alphabetically. while rss returns a maximum of thirty-one entries, json allows a maximum of one hundred. the exact number of items returned may be modified through the count parameter at the end of the url. at the ithaca college library, we chose to use json because at the time, delicious did not offer the convenient tagrolls, and the results returned by rss were displayed in reverse chronological order and truncated at thirty-one items. currently, we have a single php page that can display any delicious result set within our library website template. librarians generate links with parameters that designate a page title, a comma-delimited list of desired tags, and whether or not item descriptions should be displayed. for example, www.ithacalibrary.com/research/delish_feed. php?label=biology%20films&tag=bio logy,biologyi¬es=yes will return a page that looks like figure 2. the advantage of this approach is that librarians can easily generate webpages on the fly and send the url to their faculty members or add it to a subject guide or other webpage. the php script only has to read the “$_get” variables from the url and then query delicious for this content. xml delicious offers an application programming interface (api) that returns xml results from queries passed to delicious through https. for instance, the request https://api.del.icio.us/v1/posts/ recent?&tag=biology returns an xml document listing the fifteen most recent posts tagged as “biology” for a given account. unlike either the rss or the json methods, the xml api offers a means of retrieving all of the posts for a given tag by allowing requests such as https://api.del.icio.us/v1/ posts/all?&tag=biology. this type of request is labor intensive for the delicious server, so it is best to cache the results of such a query for future use. this involves the user writing the results of a request to a file on the server and then checking to see if such an archived file exists before issuing another request. 
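this caching pattern can be sketched in a few lines of php; the cache path, lifetime, and credentials below are illustrative placeholders rather than values from any of the scripts described here:

<?php
// sketch of the file-based caching described above: reuse an archived copy of an
// expensive delicious xml api request when a recent one exists. the cache path,
// lifetime, and credentials are placeholders, not values from the original scripts.
function fetch_delicious($queryurl, $username, $password) {
    // authenticated request against the xml api, using the same curl pattern
    // shown in the following section
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $queryurl);
    curl_setopt($ch, CURLOPT_USERPWD, $username . ":" . $password);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $result = curl_exec($ch);
    curl_close($ch);
    return $result;
}

$queryurl  = "https://api.del.icio.us/v1/posts/all?&tag=biology";
$cachefile = "cache/posts_biology.xml";  // archived result of an earlier request
$lifetime  = 86400;                      // reuse the archive for one day

if (file_exists($cachefile) && (time() - filemtime($cachefile)) < $lifetime) {
    $posts = file_get_contents($cachefile);     // a recent archive exists: use it
} else {
    $posts = fetch_delicious($queryurl, "username", "password");
    file_put_contents($cachefile, $posts);      // write the results to the server
}
?>

with an arrangement like this, the labor-intensive posts/all request is issued at most once per cache lifetime; the deliciousposts utility mentioned next packages the same idea.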
a php utility called deliciousposts, which provides caching functionality, is available for free.6 note that the username is not part of the request and must be supplied separately. unlike the public rss or json feeds, using the xml api requires users to log in to their own account. from a script, this can be accomplished using the php curl function:

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $queryurl);
curl_setopt($ch, CURLOPT_USERPWD, $username . ":" . $password);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$posts = curl_exec($ch);
curl_close($ch);

this code logs into a delicious account, passes it a query url, and makes the results of the query available as a string in the variable $posts. the content of $posts can then be processed as desired to create web content. one way of doing this is to use an xslt stylesheet to transform the results into html, which can then be printed to the browser:

/* create a new dom document from your stylesheet */
$xsl = new domdocument;
$xsl->load("mystylesheet.xsl");
/* set up the xslt processor */
$xp = new xsltprocessor;
$xp->importstylesheet($xsl);
/* create another dom document from the contents of the $posts variable */
$doc = new domdocument;
$doc->loadxml($posts);
/* perform the xslt transformation and output the resulting html */
$html = $xp->transformtoxml($doc);
echo $html;

conclusion delicious is a great tool for quickly and easily saving bookmarks. it also offers some very simple tools such as linkrolls and tagrolls to add delicious content to a website. but to exert more control over this data, the user must interact with the delicious api or feeds. we have outlined three different ways to accomplish this: rss is a familiar option and a good choice if the data is to be used in a feed reader, or if only the most recent items need be shown. json is perhaps the fastest method, but requires some basic scripting knowledge and can only display one hundred results. the xml option involves more programming but allows an unlimited number of results to be returned. all of these methods facilitate the use of delicious data within an existing website. references 1. delicious, tools, http://delicious.com/help/tools (accessed nov. 7, 2008). 2. linkrolls may be found from your delicious account by clicking settings > linkrolls, or directly by going to http://delicious.com/help/linkrolls (accessed nov. 7, 2008). 3. tagrolls may be found from your delicious account by clicking settings > tagrolls or directly by going to http://delicious.com/help/tagrolls (accessed nov. 7, 2008). 4. martin jansen and clay loveless, "pear::package::xml_rss," http://pear.php.net/package/xml_rss (accessed november 7, 2008). 5. introducing json, http://json.org (accessed nov. 7, 2008). 6. ron gilmour, "deliciousposts," http://rongilmour.info/software/deliciousposts (accessed nov. 7, 2008). a computer-accessed microfiche library r. g. j. zimmermann: department of engineering-economic systems, stanford university, stanford, california. at the time this article was written, the author was a member of the technical staff, space photography laboratory, california institute of technology, pasadena, california. this paper describes a user-interactive system for the selection and display of pictorial information stored on microfiche cards in a computer-controlled viewer. the system is designed to provide rapid access to photographic and graphical data.
it is intended to provide a library of photographs of planetary bodies and is currently being used to store selected martian and lunar photography. introduction information is often most usefully stored in pictorial form. photography, for example, has become an important means of recording data, especially in the sciences. a major reason for this importance is that photographs can be used to record information collected by instruments and not normally observable by the unaided eye. such photographs, especially in large quantities, may present a barrier to their use because of the inconvenience of reproducing and handling them. it is apparent that a system to compactly store and to speed access to these photographs would be very useful. such a system, utilizing a microfiche viewer directly controlled by a user-interactive computer program, has been developed to support a library of photographs taken from space. in the past fifteen years, the national aeronautics and space administration has conducted many missions to photograph planetary bodies. these missions have provided millions of pictures of the earth, moon, and mars. a large number of additional pictures are expected to be taken in the near future. the space photography laboratory of the california institute of technology is establishing, under nasa auspices, a microfiche library of a selection of these photographs. the library currently contains the photographs of mars taken by the mariner 9 spacecraft as well as lunar photographs taken by the lunar orbiter series. the library is expected to be expanded as time and resources permit. it has been operating, with various versions of the control program, since june 1972. the program is currently being further developed by mr. david neff and miss laura horner of the space photography laboratory at the california institute of technology. hardware the photographs are kept on 105-by-148mm microfiche cards, sixty frames to a card. this format provides the least reduction of any standard microfiche format and was used to retain the highest possible resolution. the cards are displayed by a microfiche viewer (image systems, culver city, california) which can store up to about 700 cards and has the capability of selecting a card and displaying any frame on it within a maximum of about four seconds. (throughout this paper, "viewer" will be used to refer to the microfiche viewing device.) the viewer can be equipped with a computer interface which allows the picture display to be directly computer controlled. an installation consists of the viewer with interface, any standard input/output (i/o) terminal, and the control program, running, in this case, on a time-shared computer. the terminal is used for communication with the control program. the user enters all commands by typing on the terminal keyboard. the viewer is designed to be plugged in between the computer and i/o terminal. the computer transmits all information on the circuit to which normally (without the viewer) only the terminal is attached. this information includes the viewer picture display control codes which are recognized and intercepted by the viewer. all other information is passed on to the terminal. no further special equipment is necessary. the system described has been implemented on a digital equipment corporation system 10 medium-scale computer with a time-sharing operating system. the program is written mainly in fortran with some assembly language subroutines.
it runs in 12k words (36 bits/word) of core memory. the program will not run without conversion on any computer other than the dec system 10. software the control program is user-interactive, that is, it accepts information and commands from the user. these commands allow him to indicate what he desires and to control the action taken by the program. the program permits the user to indicate what characteristics he wishes the pictures to have, selects the pictures that satisfy his criteria, and then allows him to control the display of the selected pictures and to obtain any additional information he may need to interpret the pictures. to guide the user, instructions for use of the system, as well as other information the user may need, are displayed on the viewer as they are required. all user responses are extensively checked for validity. any uninterpretable response is rejected with a message indicating the source of the trouble, and may be reentered in corrected form. it is always possible to return to a previous state, so it is impossible to make a "catastrophic" error. in designing the system, particular attention was paid to integrating the viewer and computer to utilize the unique capabilities of each. for example, most instructions are presented on the viewer where they can be shown quickly and can be scanned easily by the user. only short messages need to be sent and received by the i/o terminal. data base a picture is described by a number of characteristics, called parameters. for every picture stored in the viewer, the value for each of these parameters is stored in a disc file. in this application, parameters are mainly used to describe characteristics that are available without analyzing the picture for content. in science, these are the experimental conditions, such as viewing and lighting conditions for space photography. because space photographs are taken by missions with different objectives and equipment, it was necessary to design a library system to include pictures with widely varying selection characteristics. in order to accommodate sets of pictures with widely differing characteristics, without wasting storage space or requiring the elimination of useful descriptors, the computer storage has been structured to allow pictures to be grouped into picture sets, each of which is described by its own set of parameters. conversely, any group of pictures for which the same selection parameters are used forms a picture set. the characteristics of each such set of pictures are also stored and the program reconfigures itself to these characteristics whenever a new picture set is encountered. such an organization allows the control program to be used on groups of totally different kinds of pictures. operation in selecting a picture set the user is guided along a series of decisions presented on the viewer. at each step the control program directs the viewer to display a frame with a set of possible choices. the user enters his response on the i/o terminal and the control program uses this response to determine which frame the viewer should be commanded to display next. when the user has selected a set, he is shown the available parameters and appropriate values for these parameters. after he has specified acceptable values for the parameters he is interested in, the computer program compares these values with the known values in its records for the picture set.
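the matching just described can be illustrated with a short php sketch; the parameter names and numeric ranges below are invented examples, not values from the mariner or lunar orbiter files:

<?php
// illustrative sketch of the selection idea: a picture's stored parameter values
// are compared against the ranges a user has specified. all names and numbers
// here are invented for illustration.
$record = array(
    "latitude"      => -14.5,
    "longitude"     => 184.0,
    "sun elevation" => 23.0,
);

// user specification: for each parameter of interest, an acceptable range
$specs = array(
    "latitude"      => array(-30.0, 0.0),
    "sun elevation" => array(10.0, 40.0),
);

function satisfies($record, $specs) {
    foreach ($specs as $param => $range) {
        list($low, $high) = $range;
        // a picture is rejected as soon as one specified parameter falls outside its range
        if (!isset($record[$param]) || $record[$param] < $low || $record[$param] > $high) {
            return false;
        }
    }
    return true;  // every specified parameter is acceptable; unspecified ones are ignored
}

var_dump(satisfies($record, $specs));  // bool(true) for the values above
?>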
the pictures selected by the program are then available for display. as will be described, the user may, at any time, select another picture set or change his parameter specifications. he may also indicate which pictures of those selected by the computer during the comparison search he wishes to have remain available after the next comparison search. this allows comparison of pictures in different picture sets. appendix 1 shows an example of a typical search. the action of the control program can be separated into five phases of operation, each with a distinct function. the functions of three of these phases involve user interaction. transfer between phases may also be accomplished by user command. a different group of commands is employed for each of the user-interactive phases. in addition, there is a group of commands which may be used any time a user response is requested; they are listed in appendixes 3 and 4. there are no required commands or sequences of commands. the user proceeds from one phase to another as he desires. in each phase allowing user interaction, the user can enter any valid command at any time. figure 1 shows the phases and possible transfers between phases. a more detailed description of what occurs in each phase will be given after the data organization is described.
fig. 1. phases and control transfers: picture set selection, parameter specification, search optimization, comparison search, and picture display and information access. bold lines enclose user-interactive phases; arrows indicate possible directions of control transfer; bold arrows are control transfers made by user commands.
description of software data base organization as has been stated, the pictures of the library are grouped into picture sets. the data base may contain any number of picture sets. each such set has a picture file associated with it. this picture file is on disc storage and contains all the known information stored for a set of pictures. each picture in the set has an associated picture record in the file. in addition, the first record in a picture file, known as the format record, contains all the file specific information about that file. whenever a new picture file is called for, the format record for that file is read from disc storage into main memory and kept for reference. figure 2 shows the organizational structure of the data base.
fig. 2. picture file organization: picture files (as many as required), each with a format record and picture records.
picture records consist of a fixed- and a variable-length portion. the variable-length portion contains the known values, for the associated picture, of the specification parameters. since the number of parameters can vary from file to file, the length of this portion varies from file to file. (however, all picture records within a particular file have the same length and form.) the maximum number of parameters for a system is determined by array dimensions set when the program is compiled. currently these dimensions are set for a maximum of fifty parameters for any file in the system. the fixed-length portion contains (generally) the same type of information for all files. it includes the information needed to display a picture and to obtain interpretive information.
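one way to picture this organization is as a nested structure. the sketch below uses php arrays purely for illustration; every field name and value in it is invented rather than drawn from an actual picture file, and the fixed-portion fields echo those listed in table 1 below.

<?php
// rough sketch of one picture file as described above: a format record that
// describes the file, followed by picture records that each carry a fixed-length
// portion (display information) and a variable-length portion (parameter values).
$picturefile = array(
    "format record" => array(
        "parameter names" => array("latitude", "longitude", "sun elevation"),
        // one ten-letter description per parameter, as described below
        "descriptions"    => array("latitude  ", "longitude ", "sun elev  "),
    ),
    "picture records" => array(
        array(
            "fixed"    => array("fiche code" => "c0412", "picture number" => 1,
                                "unit number" => 1, "id number" => "4212-37"),
            "variable" => array(-14.5, 184.0, 23.0),  // values in parameter order
        ),
        // ... one record per picture in the set
    ),
);
?>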
when, during the comparison search, a picture is selected on the basis of information in the variable data, the fixed-length portion is copied into a table and kept for use during the picture display phase. each selected picture is represented by an entry in this table. the contents of the fixed-length portion are presented in table 1. as an example, the contents of a picture record for the mariner 9 photographs are given in appendix 5.
table 1. the fixed-length portion of a picture record (field and use).
fiche code: control code output by the control program to the viewer to display the frame associated with this picture record.
file name: the file name of the picture file; this and the picture number uniquely identify the picture record and allow it, and specifically the contents of the variable portion, to be refound.
picture number: a sequence number assigned each picture record in the file in increasing order.
unit number: the viewer that the picture associated with this picture record is stored in.
id number: the identification number referred to by the user. if the picture has been given an id number by which it is commonly known, it will be kept in this field.
auxiliary codes (3 fields): viewer control codes for frames containing different versions of, or auxiliary data for, the picture. the actual contents of these fields vary with the picture file as determined from the contents of the format record of that file.
a picture file's format record describes the file by all characteristics that are allowed to vary from file to file. the format records for all picture files have the same form; each is divided into a number of fields supplying information for a particular function. these fields can be separated into two categories: those which describe the picture records and those which apply to the file as a whole. for fields of the first type, each parameter has an entry in the field. for example, one such field contains the location, in a picture record, of the value for each of the parameters. another field has a ten-letter description of each parameter. see appendix 2 for a description of the format field. operation of the control program the following is a brief technical description of the control program; detailed documentation is available. the control program is modularly constructed. each phase consists of a major subroutine and its subsidiary subroutines. at the completion of a phase, control is transferred to a main program which determines which phase is to be performed next and transfers control to it. the user-interactive (interrogation) subroutines ask for a user response, attempt to interpret the response and perform the desired function, then ask for another response. an important subroutine used by all the interrogation subroutines collects the characters of the user response into groups of similar characters to form alphabetic keywords, numbers, punctuation marks, relational operators, etc. when an interrogation subroutine is ready for a user request, it calls this "scanning" subroutine. the scanning subroutine outputs an asterisk, indicating it is ready, to the user i/o terminal. the scanning subroutine supplies the groups of characters, along with a description of the group, to the interrogation subroutine. the interrogation subroutine then attempts to interpret the character groups by comparing them with acceptable responses. if the response is not in one of the acceptable forms, an error message is given to the user and he can try again.
some commands do not need to be interpreted by the interrogation subroutines; the function they request is the same throughout the program. these are called immediate commands and are listed in appendix 3. these commands are interpreted, and their functions performed, by the scanning subroutine.

picture set selection

in selecting a picture set the user is asked to make a series of decisions. for each decision, a frame listing the possible choices is displayed on the viewer. all possible decisions form an inverted tree structure (see figure 3). the user may also return to a previous decision point. the tree structure is implemented in a table in computer storage. there is an entry in this table corresponding to each decision point in the tree.

[fig. 3. example of a tree. the diagram shows a decision tree whose branches include a (martian): aa orbital, with flyby choices for mariner iv, mariner vi and vii, and mariner 9, and ab surface (viking); b (lunar): ba orbital (apollo hand held, apollo metric, apollo pan, lunar orbiter, ranger) and bb surface (apollo, surveyor); c (venus flyby); and d (mercury flyby).]

when a decision is made, the entry corresponding to the new decision point is obtained. an entry at the bottom of the tree identifies the picture file associated with the picture set selected. in general, an entry contains: (1) the viewer control code of the frame displaying the choices; (2) a pointer to the entry from which this node was reached; (3) the number of possible decisions which can be made at this decision point (to check for valid decisions); and (4) pointers to the entries for the decision points reached.

parameter specification

once the user has made a decision selecting a set of pictures, he is presented with a list of the available parameters and acceptable values for them. for each parameter in which the user is interested, he specifies the parameter number and the values or range of values acceptable to him. this information is stored in two tables which are referred to when the comparison search is made. one table, the parameter table, contains an entry for each parameter specified. this table is cleared whenever a new picture set is called for. an entry in the table includes: (1) the parameter number; (2) a code indicating which of several methods is to be used in processing the parameter; (3) a code providing information on how the user-specified values are to be interpreted; and (4) a pointer to the location in a second table, the values table, where the first of the specified values is stored. all additional values are placed in the values table following the addressed value. the processing code (number (2) above) allows each parameter to be processed by a unique method. a standard method for a given parameter is kept in a field of the format record; the user can also specify a method other than the standard one. if an entry already exists for a just-entered parameter, the old entry is updated rather than a new one created.
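as a rough sketch only, the structures below restate in python the two in-core tables just described: the decision-tree table used during picture set selection, and the parameter table pointing into a shared values table. all class and field names are invented for illustration; the original program's storage layout is not reproduced.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class DecisionPoint:
    """one entry in the picture-set selection tree."""
    frame_code: str                      # viewer control code of the frame listing the choices
    parent: Optional["DecisionPoint"]    # pointer to the entry from which this node was reached
    children: list = field(default_factory=list)  # entries for the decision points reached
    picture_file: Optional[str] = None   # set only for entries at the bottom of the tree

    def choose(self, n: int) -> "DecisionPoint":
        # the stored number of possible decisions is used to check for valid decisions
        if not 0 <= n < len(self.children):
            raise ValueError("invalid decision")
        return self.children[n]

@dataclass
class ParameterEntry:
    """one entry in the parameter table."""
    number: int          # parameter number
    method: str          # code selecting the processing method (standard or user-chosen)
    interpretation: str  # how the user-specified values are to be read (single value, range, ...)
    first_value: int     # location in the values table of the first specified value
    value_count: int     # additional values follow the addressed value

values_table: list[float] = []

def specify(parameters: dict[int, ParameterEntry], number: int, values,
            method="standard", interpretation="range"):
    """add or update an entry; an existing entry for the parameter is updated, not duplicated."""
    start = len(values_table)
    values_table.extend(values)
    parameters[number] = ParameterEntry(number, method, interpretation, start, len(values))
```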
search optimization

this phase determines the most efficient way to conduct the comparison search from among a set of alternatives. whenever possible, the search is restricted to only a part of the picture file. for each picture file there is a number of parameters for which additional information is available. specifically, if a list of pictures ordered by increasing value of a parameter is available, the pictures which have a particular value of that parameter can be found more quickly through this list than by searching through the whole file for that value of the parameter. if the position, in this ordered list, of the picture at the low end of a range of values (of the parameter it is ordered on) can be found easily, the search can be started at this point and need only be continued until the picture at the high end has been reached. note that the picture records for the intervening pictures must nonetheless be compared with the user specifications, since the restriction is made on the basis of only one parameter whereas more than one may have been specified.

a binary search is the method used to find the first picture in a range of values in the list. to use this method, of a set of n picture records the n/2th is chosen and its value of the parameter is compared with the desired one. since the list of records is in order of the value of this parameter, it is clear in which half of the list a picture with the desired value of the parameter would have to be. this interval can then be divided and the process continued until the remaining interval consists of only one picture.

the main picture file is itself usually arranged in order of at least one parameter. for other parameters, control lists of picture numbers ordered by value of these parameters can be used for binary searches. however, it is not practical to create these lists for all parameters, as they require a fair amount of storage. an entry in such a list contains two words: the value of the parameter and the picture number of the corresponding picture. the picture number is a sequence number which determines the position of the picture record relative to the beginning of the picture file. each picture file has a table in its format record containing identifiers for the parameters for which the binary search technique can be used. if more than one of these has been specified (as stored in the parameter table), it must be determined which parameter restricts the search the most. to do this the upper and lower limits of the specified values of each such parameter are found (from the values table), and from this the expected number of picture records to be compared is computed. this number is multiplied by a factor indicating the speed of the type of search to be used relative to the speed of the simplest type of search. the parameter with the lowest expected elapsed time of search is selected for the search.

comparison search

for each picture to be compared, the appropriate picture record is found and the specified parameter values are compared with those in the picture record. a control list, selected in the search optimization phase, may be used to determine which picture records are to be compared. for each selected picture an entry containing a portion of the picture record is made in a picture table. the picture table has a limited capacity which is set when the program is compiled; for our application there is currently room for up to 100 entries. if the picture table is filled before the search is finished, the search is suspended and can be continued by a command in the display phase.
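a minimal sketch of the two ideas in the search optimization phase just described, written in python with invented names: a binary search into an ordered control list to find the candidate pictures within a range of values, and the selection of the cheapest search when several restricting parameters were specified (expected records multiplied by a relative speed factor). this is an illustration of the technique, not the original code.

```python
import bisect

def candidate_range(control_list, low, high):
    """control_list is a list of (value, picture_number) pairs ordered by value.
    a binary search finds the first picture at or above the low end of the range;
    the scan then continues only until the high end has been passed."""
    start = bisect.bisect_left(control_list, (low, -1))
    candidates = []
    for value, picture_number in control_list[start:]:
        if value > high:
            break
        candidates.append(picture_number)
    return candidates

def pick_search(options):
    """options: list of (expected_records, relative_speed_factor) per restricting parameter;
    the parameter with the lowest expected elapsed time is selected for the search."""
    return min(range(len(options)), key=lambda i: options[i][0] * options[i][1])

if __name__ == "__main__":
    # a hypothetical control list ordered by orbit number
    orbit_list = [(100, 7), (150, 2), (222, 11), (222, 19), (300, 4)]
    print(candidate_range(orbit_list, 200, 250))   # [11, 19]
    print(pick_search([(1022, 1.0), (120, 2.5)]))  # 1: 120 * 2.5 = 300 beats 1022 * 1.0
```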
picture display, information access

this phase accepts commands to control display of the selected pictures and provide access to interpretive information. the picture table entries provide the information needed, either directly or by referring back to the picture record. any of the selected pictures can be viewed at any time. in addition, the user can "mark" preferred pictures to differentiate them from the others. these marked pictures are set apart in the sense that many viewing and information access commands refer optionally to only these pictures.

the pictures themselves are the primary source of information, but the user will often want information that is not available from the picture in order to interpret the picture. there are commands that request the control program to type out on the i/o terminal the information in a picture record. these commands optionally refer to the picture currently displayed, the marked pictures, or all the selected pictures. other commands call for the display of data frames associated with a picture. these frames can contain large volumes of data that need not be kept in computer storage. the viewer control codes for these frames are kept in the picture table. the keyword commands to display data frames can vary from file to file; the valid commands for a file are kept in the file's format record. there are other commands to transfer control to other phases and to keep desired pictures available for display with those selected by the next comparison search. there is also a provision for adding file-specific commands to perform any other function. the commands and their functions are listed in appendix 3.

performance and costs

a typical simple search consisting of logging in, picture selection, parameter specification, search, and display might take five to ten minutes and cost one to two dollars for compute time. most of this is time spent by the user in entering commands. command execution is usually almost immediate, as it does not involve a major amount of computation. most of the compute time is accumulated during the comparison search phase. to search through the entire mariner 9 picture file of around 7,000 pictures (about 200,000 words) takes about forty seconds elapsed time and costs about two dollars. a more typical search, however, will allow some search optimization and cost about thirty cents with an elapsed time of ten seconds. of course, these figures should only be used as estimates, even for other dec system 10 systems, as elapsed time depends on system load and this, as well as the rates charged, varies considerably. total monthly compute costs for a system depend entirely on use. likewise, storage costs depend on actual storage space used. for the 200,000-word mariner 9 file our cost is about seventy-five dollars per month. only the most-used picture files actually need be kept on disc; the rest can be copied from magnetic tape if they are needed. all files are backed up on magnetic tape in any case. the rates listed in this paper are those charged by our campus time-sharing system; dec system 10 computer time is available from commercial firms at somewhat higher rates.

the cost for a microfiche viewer with computer interface (image systems, culver city, california, model 201) is around $7,000. a thirty-characters-per-second i/o terminal sells for $1,500 and leases for $90 per month. in addition, an installation may require a microfiche camera and other photographic equipment and supplies. photographic services are also available from the viewer manufacturer.
the hardware cost for an independent system implemented on a minicomputer with 12k to 20k of core and five million words of disc memory is estimated at an additional $30,000 (exclusive of development and photographic costs).

implementing a library system

in implementing a library system to use the hardware and software described in this paper, two major areas of effort are required. first, the pictorial information must be converted to microfiche format; that is, it must be photographed, or possibly rephotographed if already in photographic form. in addition, a computer data base must be created. if information about the photographs is already available in computer-readable form, this involves writing a program to convert the data to the structure required by the control program. if this type of information is not available, the pictures may need to be investigated and the information coded, and presumably punched onto computer cards, for further processing. the major difficulties we encountered were coordinating the photographic and data base generation tasks, achieving the high resolution we required to retain the detail of the original photographs, and using early versions of the microfiche viewer (which had a tendency to jam cards).

conclusion

a system for rapid access to pictorial information, the computer accessed microfiche library (caml), has been described. caml has been designed to integrate, in an easy-to-use system, the storage capacity and capability for fast retrieval of a special microfiche viewer with the manipulating ability and speed of a computer. it is believed that this system will help overcome the barriers to the full utilization of photographs in large quantities, as well as have applications in the retrieval of other types of pictorial information.

acknowledgments

the work described in this paper was supported by nasa grant #ngr 05-002-117. the author is grateful to dr. bruce murray and the staff of the space photography laboratory at caltech for their support and advice; he also wishes to acknowledge the efforts of mr. james fuhrman, who assisted in the programming task and contributed many valuable ideas.

appendix 1

the following is an example of a typical search. numbers in the left margin indicate when a new frame is displayed on the viewer; these were added later to clarify the interaction between viewer and terminal. user responses and commands are identified by lines beginning with an asterisk. (the control program types asterisks when it is ready for input.) in this demonstration, most keywords were completely typed out. it is possible, however, to abbreviate any keyword to the shortest form that will be unique among the acceptable keywords. after the user enters a standard "log in" procedure to identify his account number and verify that he is an authorized user of this account, the control program is automatically initiated. the viewer displays a picture (1) of the installation and the user is asked to enter his name. the name, charges, and time of use will later be added to an accounting file.
[the terminal transcript of the demonstration session appears here in the original; only its outline is recoverable. the user logs in and enters his name, selects the file mmix, specifies the parameters orbit 222, camera a, and latitude -45 to 45, and types "done"; the program reports the number of pictures to process and that 2 pictures have been selected. display-phase commands such as mark, type parameters, respecify, restart, charges, and help then appear, together with the parameter values typed out for selected pictures and an example of the error message given for an unrecognized keyword; the session ends with instructions to turn off the viewer, terminal, and coupler, and log-off.]

the user now enters the picture set selection phase. in the current system, only two files (picture sets) are stored and the user is simply presented with a frame (2) listing the file names and giving a short description of what is contained in each. the user types the desired file name (mmix, the mariner 9 mars photographs) and thus enters the parameter specification phase. the available selection parameters and acceptable values are now shown (3). the user specifies some parameters [the remainder of this walkthrough falls on pages not present in this extract].

if not used, file name is assumed to refer to the file last searched. if the parameters are not enumerated, those specified for the picture selection are typed out. the parameters to be typed out can be enumerated or the specification parameters called for. if neither of these is done, the values of all parameters are typed out. parameters typed out are identified by column headings.

phase transfer commands
command: function
respecify: allows respecification of selection parameters. only those parameters which are reentered are changed; previously specified parameters retain their values.
search: similar to respecify, except only those pictures in the present list are candidates for selection. this is more efficient than again searching through all the pictures.
continue: if the search was terminated before all pictures had been processed, the search is continued from where it had been suspended.
restart: to view another set of pictures (all specified parameter values are deleted).

appendix 5
mariner 9 picture records
field number: field
fixed-length portion
1: fiche code
2: data code
3: file name
4: id number (das)
5: unit #
6: picture number
7: footprint code
8: unused
variable portion
9: das time
10: orbit
11: latitude
12: longitude
13: solar lighting angle
14: phase angle
15: viewing angle
16: slant range
17: camera
18: resolution
19: local time
20: filter
21: exposure time
22: roll and file of filter version on roll film
23-28: comments (content descriptors)

techniques for special processing of data within bibliographic text

paula goossens: royal library albert i, brussels, belgium.

an analysis of the codification practices of bibliographic descriptions reveals a multiplicity of ways to solve the problem of the special processing of certain characters within a bibliographic element. to obtain a clear insight into this subject, a review of the techniques used in different systems is given. the basic principles of each technique are stated, examples are given, and advantages and disadvantages are weighed. simple local applications as well as more ambitious shared cataloging projects are considered.

introduction

effective library automation should be based on a one-time manual input of the bibliographic descriptions, with multiple output functions. these objectives may be met by introducing a logical coding technique. the higher the requirements of the output, the more sophisticated the storage coding has to be. in most cases a simple identification of the bibliographic elements is not sufficient. the requirement of a minimum of flexibility in filing and printing operations necessitates the ability to locate certain groups of characters within these elements. it is our aim, in this article, to give a review of the techniques solving this last problem.

as an introduction, the basic bibliographic element coding methods are roughly schematized in the first section. according to the precision in the element identification, a distinction is made between two groups, called respectively field level and subfield level systems. the second section contains discussions on the techniques for special processing of data within bibliographic text. three basic groups are treated: the duplication method, the internal coding techniques, and the automatic handling techniques. the different studies are illustrated with examples of existing systems. for the field level projects we confined ourselves to some important german and belgian applications. in the choice of the subfield level systems, which are marc ii based, we tried to be more complete. most of the cited applications, for practical reasons, only concern the treatment of monographs. this cannot be seen as a limitation because the methods discussed are very general by nature and may be used for other material. each system which has recourse to different special processing techniques is discussed in terms of each of these techniques, enabling one to get a realistic overview of the problem. in the last section, a table of the systems versus the techniques used is given. the material studied in this paper provided us with the necessary background for building an internal coding technique in our internal processing format.
bibliographic element codification methods

field level systems

the most rudimentary projects of catalog automation are limited to a coarse division of the bibliographic description into broad fields. these are marked by specially supplied codes and cover the basic elements of author, title, imprint, collation, etc. in some of the field level systems, a bibliographic element may be further differentiated according to a more specific content designation, or according to a function identification. for instance, the author element can be split up into personal name and corporate name, or a distinction can be made between a main entry, an added entry, a reference, etc. this approach supports only the treatment of each identified bibliographic element as a whole for all necessary processing operations, filing and printing included. this explains why, in certain applications, some of the bibliographic elements are duplicated, under a variant form, according to the subsequent treatments reflected in the output functions. details on this will be discussed later. here we only mention as examples the deutsche bibliographie and the project developed at the university of bochum.1-4

it is evident that these procedures are limited in their possibilities and are not economical if applied to very voluminous bibliographic files. for this reason, at the same time, more sophisticated systems, using internal coding techniques, came into existence. these allow one to perform separate operations within a bibliographic element, based on a special indication of certain character strings within the text. as there is an overlap in the types of internal coding techniques used in the field level systems and in the subfield level systems, this problem will be studied later as a whole. we limit ourselves to citing some projects falling under this heading. as german applications we have the deutsche bibliographie and the bikas system.5 in belgium the programs of the quetelet fonds may be mentioned.6,7

subfield level systems

in a subfield level system the basic bibliographic elements, separated into fields, are further subdivided into smaller logical units called subfields. for instance, a personal name is broken into a surname, a forename, a numeration, a title, etc. such a working method provides access to smaller logical units and will greatly facilitate the functions of extraction, suppression, and transposition. thus, more flexibility in the processing of the bibliographic records is obtained. as is well known, the library of congress accomplished the pioneering work in developing the marc ii format: the communications format and the internal processing format.8-11 these will be called marc lc, and a distinction between the two will only be made if necessary. the marc lc project originated in the context of a shared cataloging program and immediately served as a model in different national bibliographies and in public and university libraries. in this paper we will discuss bnb marc of the british national bibliography, the nypl automated bibliographic system of the new york public library, monocle of the library of the university of grenoble, canadian marc, and fbr (forma bibliothecae regiae), the internal processing format of the royal library of belgium.12-21
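purely as an illustration of the contrast just drawn, the sketch below shows a field-level record next to a loosely marc-like subfield-level record with tags, two indicator characters, and coded subfields. the tags and the use of $a for the surname and the short title follow the schematic given later in this article; everything else (the python representation, the indicator values) is invented.

```python
# field-level: each broad element is handled only as a whole.
field_level_record = {
    "author": "mc kelvy",
    "title": "ibm 360 assembler language",
}

# subfield-level: (tag, two indicator characters, {subfield code: value}).
subfield_level_record = [
    ("100", "  ", {"a": "mc kelvy"}),                    # personal name field, $a = surname
    ("245", "  ", {"a": "ibm 360 assembler language"}),  # title field, $a = short title
]

def subfield(record, tag, code):
    """a subfield-level record gives access to smaller logical units, e.g. just the surname."""
    for t, _indicators, subfields in record:
        if t == tag and code in subfields:
            return subfields[code]
    return None

print(subfield(subfield_level_record, "100", "a"))   # mc kelvy
```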
in order to further optimize the coding of a bibliographic description, the library of congress also provided for each field two special codes, called indicators. the function of these indicators differs from field to field. for example, in a personal name one of the indicators describes the type of name, to wit: forename, single surname, multiple surname, and name of family. some of the indicators may act as an internal code.

in spite of the well-considered structuring of the bibliographic data in the subfield level systems, not all library objectives may yet be satisfied. to reduce the remaining limitations, some approaches similar to those elaborated in field level systems are supplied. some (nypl, marc lc internal format, and canadian marc) have, or will have, in a very limited way, recourse to a procedure of duplication of subfields or fields. all cited systems, except nypl, use to a greater or lesser degree internal coding techniques. finally, some subfield level systems automatically solve certain filing problems by computer algorithms; this option was taken by nypl, marc lc, and bnb marc. each of these methods will be discussed in detail in the next section.

techniques for special processing of data

methods for special treatment of words or characters within bibliographic text were for the most part introduced to support exact file arrangement procedures and printing operations. in order to give concrete form to the following explanation, we will illustrate some complex cases. each example contains the printing form and the filing form according to specific cataloging practices for some bibliographic elements. consider the titles in examples 1, 2, and 3, and the surnames in examples 4, 5, and 6.

example 1. printing form: l'automation des bibliotheques; filing form: automation bibliotheques
example 2. printing form: bulletino della r. accademia medica di roma; filing form: bolletino accademia medica roma
example 3. printing form: ibm 360 assembler language; filing form: i b m three hundred sixty assembler language
example 4. printing form: mc kelvy; filing form: mackelvy
example 5. printing form: van de castele; filing form: vandecastele
example 6. printing form: martin du gard; filing form: martin dugard

we do not intend, in this paper, to review the well-known basic rules for building a sort key (the translation of lowercase characters to uppercase, the completion of numerics, etc.). our attention is directed to the character strings that file differently than they are spelled in the printing form. the methods developed to meet these problems are of a very different nature. for reasons of space, not all the examples will be reconsidered in every case; only those most meaningful for the specific application will be chosen.

duplication methods

we briefly repeat that this method consists of the duplication of certain bibliographic elements in variant forms, each of them exactly corresponding to a certain type of treatment. in bochum, the title data are handled in this way. one field, called "sachtitel," contains the filing form of the title followed by the year of edition. another field, named "titelbeschreibung," includes the printing form of the title and the other elements necessary for the identification of a work (statements of authorship, edition statement, imprint, series statement, etc.). to apply this procedure to examples 1, 2, and 3, the different forms of each title respectively have to be stored in a printing field and in a sorting field. analogous procedures are, in a more limited way, employed in the deutsche bibliographie. for instance, in addition to the imprint, the name of the publisher is stored in a separate field to facilitate the creation of publisher indexes.
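a minimal sketch of the duplication approach just described, assuming nothing beyond what the text states: the record simply carries a printing field and, where needed, a separately supplied filing field of the same element (compare bochum's "titelbeschreibung" and "sachtitel" pair); no codes are embedded in the text itself. the field names and the trivial normalization are illustrative only.

```python
# example 1 stored under the duplication method: two fields, no internal codes.
record = {
    "title_print": "l'automation des bibliotheques",   # printing form
    "title_file":  "automation bibliotheques",         # filing form, supplied separately
}

def sort_key(record: dict) -> str:
    # only the basic normalization (here just uppercasing) is applied to the filing form;
    # if no filing form was supplied, the printing form is used unchanged.
    return record.get("title_file", record["title_print"]).upper()

print(sort_key(record))   # AUTOMATION BIBLIOTHEQUES
```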
the technique of the duplication of bibliographic elements has also been considered in subfield level systems. the nypl format furnishes a filing subfield in those fields needed for the creation of the sort key. this special subfield is generally created by program, although in exceptional cases manual input may be necessary. in the filing subfield the text is preceded by a special character indicating whether or not the subfield has been introduced manually. marc lc (internal format) and canadian marc opt for a more flexible approach in which the filing information is specified with the same precision as the other information. the sorting data are stored in complete fields containing, among others, the same subfields as the corresponding original field. because in most subfield level systems the number of different fields is much higher than in field level systems, the duplication method becomes more intricate. provision of a separately coded field for each normal field which may need filing information is excluded. only one filing field is supplied, which is repeatable and stored after the other fields. in order to link the sorting fields with the original fields, specific procedures have been devised. marc lc, for instance, reserves one byte per field, the sorting field code, to announce the presence or the absence of a related sorting field. the link between the fields themselves is placed in a special subfield of the filing field.22 in the supposition that examples 3 and 4 originate from the same bibliographical description, this method may be illustrated schematically as follows:

tag | sorting field code | sequence number | data
100 | x | 1 | $a$mc kelvy
245 | x | 1 | $a$ibm 360 assembler language
880 |   | 1 | $ja$1001$mackelvy
880 |   | 2 | $ja$2451$i b m three hundred sixty assembler language

as is well known, the personal author and title fields are coded respectively as tag 100 and tag 245. tag 880 defines a filing field. in the second column, the letter x identifies the presence of a related sorting field. the third column contains a tag sequence number needed for the unequivocal identification of a field. in the last column the sign $ is a delimiter; the first $ is followed by the different subfield codes, and the other delimiters initiate the subsequent subfields. in tag 100 and 245, the first subfields contain the surname and the short title respectively. in tag 880 the first subfield gives the identification number of the related original field. the further subfield subdivision is exactly the same as in the original fields. in canadian marc a slightly different approach has been worked out. note that in neither of the last two projects has this technique been implemented yet.

for an evaluation of the duplication method, different means of application must be considered. if not systematically used for several bibliographic elements, the method is very easy at input: the cataloger can fill in the data exactly as they are, and no special codes must be embedded in the text. but it is easy to understand that a more frequent need of duplicated data renders the cataloging work very cumbersome. in regard to information processing, this method consumes much storage space. first, a certain percentage of the data is repeated; second, in the most complete approach of the subfield level systems, space is needed for identifying and linking information. for instance, in marc lc, one byte per field is provided containing the sorting field code, even if no filing information at all is present. finally, programming efforts are also burdened by the need for special linking procedures.
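the sketch below restates the linkage scheme of the schematic above in python: the filing (tag 880) field carries, in its first subfield, the tag and sequence number of the original field it sorts for. the list representation and the lookup function are assumptions for illustration; the exact byte layout of marc lc is not reproduced.

```python
# (tag, has_sorting_field, sequence_number, subfields) for the schematic above
fields = [
    ("100", True,  1, ["mc kelvy"]),
    ("245", True,  1, ["ibm 360 assembler language"]),
    ("880", False, 1, ["1001", "mackelvy"]),                  # links to tag 100, sequence 1
    ("880", False, 2, ["2451", "i b m three hundred sixty assembler language"]),  # tag 245, seq 1
]

def sorting_form(tag: str, seq: int):
    """return the filing text linked to (tag, sequence number), or None if no 880 field refers to it."""
    link = f"{tag}{seq}"
    for t, _has_sort, _seq, subfields in fields:
        if t == "880" and subfields[0] == link:
            return subfields[1]
    return None

print(sorting_form("100", 1))   # mackelvy
print(sorting_form("245", 1))   # i b m three hundred sixty assembler language
```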
in order to minimize the use of the duplication technique, the cited systems reduce its application in different ways. bochum simplified its cataloging rules in order to limit its use to title information. as will be explained further, the deutsche bibliographie also has recourse to internal coding techniques. nypl, marc lc, and canadian marc only call on it if other, more efficient methods (see later) fail. they also make an attempt to adapt existing cataloging practices to an unmodified machine handling of nonduplicated and minimally coded data.

internal coding techniques

separators

separators are special codes introduced within the text, identifying the characters to be treated in a special way. a distinction can be made among four procedures.

1. simple separators. with this method, each special action to be performed on a limited character string is indicated by a group of two identical separators, each represented as a single special sign. illustration on examples 2, 3, 4, and 6 gives:

example 2: £bolletino£ ¢bulletino della r. ¢accademia medica ¢di ¢roma
example 3: £i b m three hundred sixty £¢ibm 360 ¢assembler language
example 4: m£a£c¢ ¢kelvy
example 6: martin du¢ ¢gard

the characters enclosed between each group of two corresponding codes £ must be omitted for printing operations. in the same way the characters enclosed between two corresponding codes ¢ are to be ignored in the process of filing. in the case that only the starting position of a special action has to be indicated, one separator is sufficient. for instance, if in example 1 we limit ourselves to coding the first character to be taken into account for filing operations, we have:

example 1: l'/automation des bibliotheques

where a slash is used as the sorting instruction code.

the simple separator method has tempting positive aspects. occupying a minimum of storage space (at most two bytes for each instruction), the technique gives a large range of processing possibilities. indeed, excluding the limitation on the number of special signs available as separators, no other restrictions are imposed. this argument will be rated at its true worth only after evaluation of the multiple function separators method and of the indicator techniques. the major disadvantage of the simple separator method lies in its slowness of exploitation. in fact, for every treatment to be performed, each data element which may contain special codes has to be scanned, character by character, to localize the separators within the text and to enable the execution of the appropriate instructions. for example, in the case of a printing operation, the program has to identify the parts of the text to be considered and to remove all separators. the sluggishness of execution was for some, as for canadian marc, a reason to disapprove of this method.23 as already mentioned, another handicap in cataloging applications is the loss of a number of characters caused by their use as special codes: each character needed as a separator cannot be used as an ordinary character in the text. for bochum this was a motive to reject this method.

many of the field level systems with internal codes have recourse to simple separators. we mention the deutsche bibliographie, in which some separators indicate the keywords serving for automatic creation of indexes and others give the necessary commands for font changes in photocomposition applications. in order to reduce the number of special signs, the deutsche bibliographie also duplicates certain bibliographic data. bikas uses simple separators for filing purposes. the technique is also employed in subfield level systems: in monocle each title field contains a slash, indicating the first character to be taken into account for filing.
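a sketch of the simple-separator scanning just described, assuming the coding of example 4 given above: one pass over the stored text drops whatever lies between a matching pair of codes, with £...£ marking non-printing text and ¢...¢ marking non-filing text. the function names are invented; a real implementation would also apply the remaining sort-key rules.

```python
import re

def strip_spans(text: str, code: str) -> str:
    # remove the spans enclosed by the given code, then any remaining separator characters
    text = re.sub(re.escape(code) + ".*?" + re.escape(code), "", text)
    return text.replace("£", "").replace("¢", "")

def printing_form(stored: str) -> str:
    return strip_spans(stored, "£")      # text between £...£ is omitted for printing

def filing_form(stored: str) -> str:
    return strip_spans(stored, "¢")      # text between ¢...¢ is ignored for filing

stored = "m£a£c¢ ¢kelvy"                 # example 4 as coded above
print(printing_form(stored))             # mc kelvy
print(filing_form(stored))               # mackelvy
```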
2. multiple function separators. designed by the british, the technique of the multiple function separators was adopted in monocle. the basic idea consists of the use of a single separator character for instructing multiple actions. in the case of monocle these actions are printing only, filing only, and both printing and filing. in order to give concrete form to this method we apply it to examples 3, 4, and 6, using a vertical bar as the special code.

example 3: |ibm 360 |i b m three hundred sixty |assembler language
example 4: m|c |ac|kelvy
example 6: martin du| ||gard

the so-called three-bar filing system divides a data element into the following parts:

data to be filed and printed | data to be printed only | data to be filed only | data to be filed and printed

in comparison with the simple separator technique, this method has the advantage of needing fewer special characters. a gain of storage space cannot be assumed directly: as is the case in example 6, if only one special instruction is needed, the set of three separators must still be used. on the other hand, one must note that a repetition of identical groups of multiple function separators within one data element must be avoided. subsequent use of these codes leads to very unclear representations of the text and may cause faulty data storage. this can well be proved if the necessary groups of three bars are inserted in examples 1 and 2. of the studied systems, monocle is the only one to use this method.

3. separators with indicators. as mentioned in the description of subfield level systems, two indicators are added for each field present. in order to speed up the processing time in separator applications, indicators may be exploited. in monocle the presence or the absence of three bars in a subfield is signalled by an indicator at the beginning of the corresponding field. this avoids the systematic search for separators within all the subfields that may contain special codes. the number of indicators being limited, it is self-evident that in certain fields they may already be used for other purposes. as a result, some of the separators will be identified at the beginning of the field and others not. this leads to a certain heterogeneity in the general system concept which complicates the programming efforts. under this heading, we have mentioned the use of indicators only in connection with multiple function separators. note that this procedure could be applied as well in simple separator methods; nevertheless, none of the subfield level systems performs in this fashion because it is not necessary for the particular applications. this method is not followed in the field level systems as no indicators are provided.
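the following sketch illustrates the three-bar convention described under multiple function separators above: the stored text is split on the vertical bar into the four parts (filed and printed | printed only | filed only | filed and printed) and the printing and filing forms are rebuilt from them. it is an illustration only, not monocle's implementation.

```python
def split_three_bar(stored: str):
    """derive (printing form, filing form) from a three-bar coded data element."""
    parts = stored.split("|")
    if len(parts) != 4:                     # no bars present: the whole text files and prints
        return stored, stored
    both_head, print_only, file_only, both_tail = parts
    printing = both_head + print_only + both_tail
    filing = both_head + file_only + both_tail
    return printing, filing

print(split_three_bar("|ibm 360 |i b m three hundred sixty |assembler language"))
# ('ibm 360 assembler language', 'i b m three hundred sixty assembler language')
print(split_three_bar("m|c |ac|kelvy"))
# ('mc kelvy', 'mackelvy')
```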
4. compound separators. a means of avoiding the second disadvantage of the simple separator technique is to represent each separator by a two-character code: the first one, a delimiter, identifies the presence of the separator and is common to each of them; the second one, a normal character, identifies the separator's characteristic. taking the sign £ as delimiter and indicating the functions of nonprinting and nonfiling respectively by the characters a and b, examples 2 and 4 give in this case:

example 2: £abolletino£a £bbulletino della r. £baccademia medica £bdi £broma
example 4: m£aa£ac£b £bkelvy

thus the number of reserved special characters is reduced to one, independent of the number of different types of separators needed. in none of the considered projects is this technique used, probably because of the amount of storage space wasted.

indicators

as the concept of adding indicators in a bibliographic record format is an innovation of marc lc, the methods described under this heading concern only subfield level systems. although at the moment of the creation of marc lc one did not anticipate the systematic use of indicators for filing, its adherents made good use of them for this purpose.

1. personal name type indicator. as mentioned earlier, in marc lc one of the indicators, in the field of a personal name, provides information on the name type. this enables one to realize special file arrangements. for example, in the case of homonyms, the names consisting only of a forename can be filed before identical surnames. using the same indicator, an exact sort sequence can be obtained for single surnames, including prefixes. knowing that the printing form of example 5 is a single surname, the program for building the sort key can ignore the two spaces. the systems derived from marc lc developed analogous indicator codifications adapted to their own requirements. this seems to be an elegant method for solving particular filing problems in personal names. nevertheless, its possibilities are not large enough to give full satisfaction. for instance, example 6 gives a multiple surname with a prefix in the second part of the name; the statement of multiple surname in the indicator does not give enough information to create the exact sort form. because of this shortcoming, monocle had recourse to the technique called "separators with indicators."

2. indicators identifying the beginning of filing text. bnb marc reserves one indicator in the title field for identification of the first character of the title to be considered for filing. this indicator is a digit between zero and nine, giving the number of characters to be skipped at the beginning of the text. applying this technique to example 1, the corresponding filing indicator must have the value three. without having recourse to other working methods, this title sorts as:

example 1: automation des bibliotheques

notice that the article des still remains in the filing form. this procedure has the advantage of being very economical in storage space and in processing time; moreover, the text is not cluttered with extraneous characters. on the other hand, the technique is limited to the indication of nonfiling words at the beginning of a field: the possibility of identifying certain character strings within the text is not provided for. taking examples 2 and 3 we observe that the stated conditions cannot be fulfilled. another negative side is the number of characters to be ignored, which may not exceed nine. also, one indicator must be available for this filing indication. after bnb marc, marc lc and canadian marc also introduced this technique.

3. separators with indicators. the use of indicators in combination with separators has been treated above.
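a minimal sketch of the beginning-of-filing-text indicator just described: a single digit gives the number of leading characters to drop when the sort key is built. the skip count in the usage example is simply computed from the example string as printed in this article; the function name and the trivial normalization are assumptions.

```python
def sort_key(text: str, nonfiling_indicator: int) -> str:
    if not 0 <= nonfiling_indicator <= 9:     # the indicator is a single digit, zero to nine
        raise ValueError("indicator must be a single digit")
    return text[nonfiling_indicator:].upper()

title = "l'automation des bibliotheques"
skip = title.index("automation")              # leading characters not to be filed on
print(sort_key(title, skip))                  # AUTOMATION DES BIBLIOTHEQUES
```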
pointers

a final internal coding technique which seems worth studying is the one developed at the royal library of belgium for the creation of the catalogs of the library of the quetelet fonds, a field level system. the pointer technique is rather intricate at input but has many advantages at output. because there is inadequate documentation of this working method, we will try to give an insight into it by schematizing the procedures to be followed to create the final storage structure. at input, the cataloger inserts the necessary internal codes as simple separators within the text. these codes are extracted by program from the text and placed before it, at the beginning of each field. each separator, now called a pointer characteristic, is supplemented with the absolute beginning address and the length of its action area within the text. in the quetelet fonds the pointer characteristic is represented by one character; the address and the length occupy two bytes each. the complete set of pointers (pointer characteristics, lengths, and addresses) is named the pointer field. this field is incorporated in a sort of directory, starting with the sign "&" identifying the beginning of the field, followed by the length of the directory, the length of the text, and the pointer field itself. this is illustrated in figure 1. note that each field contains the first five bytes, even if no pointers are present. in the quetelet fonds, pointers are used for the following purposes: nonfiling, nonprinting, kwic index, indication of a corporate name in the title of a periodical, etc. examples 2, 3, and 4 should be stored in this system as represented in figure 2.

[fig. 1. structure of directory with pointer technique. the field is laid out as a directory followed by the text: & | ld | lt | pointer field (x ax lx, y ay ly, ...) | text. the codes respectively represent: &: field delimiter; ld: length of directory; lt: length of text; x, y, ...: pointer characteristics; ax, ay, ...: addresses of the beginning of the related action area inside the text; lx, ly, ...: lengths of these action areas.]

the advantages of the pointer technique are numerous. first, we must mention the relative rapidity of the processing of the records: in order to detect a specific pointer, only the directory has to be consulted, and all subsequent instructions can be executed immediately. in contrast with most of the other methods discussed, there is no objection to using pointers for all internal coding purposes needed. this enables one to pursue homogeneity in the storage format, facilitating the development of programs. further, the physical separation of the internal codes and the text allows, in most cases, a direct clean text representation without any reformatting. finally, unpredictable expansions of internal coding processes can easily be added without adaptation of the existing software.

a great disadvantage of the pointer technique lies in the creation of the directory. the storage space occupied by the pointers is also great in comparison with the place occupied by internal codes in other methods. a further handicap is the limitation imposed at input due to the use of simple separators.
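the sketch below illustrates, under stated assumptions, the conversion step described above: separator-coded input is turned into clean text plus a set of pointers, each holding a one-character characteristic, the starting address of its action area in the text, and its length. the python representation (a list of tuples instead of packed bytes) is an assumption; the quetelet fonds byte layout is not reproduced.

```python
def build_directory(coded: str, separators: dict[str, str]):
    """separators maps a separator sign to its pointer characteristic,
    e.g. {"£": "a", "¢": "b"} for non-printing and non-filing spans."""
    text, pointers, open_spans = [], [], {}
    for ch in coded:
        if ch in separators:
            if ch in open_spans:                        # closing separator: emit a pointer
                start = open_spans.pop(ch)
                pointers.append((separators[ch], start, len(text) - start))
            else:                                       # opening separator: remember the address
                open_spans[ch] = len(text)
        else:
            text.append(ch)
    return pointers, "".join(text)

pointers, text = build_directory("m£a£c¢ ¢kelvy", {"£": "a", "¢": "b"})
print(text)       # mac kelvy -- clean text, with no codes embedded
print(pointers)   # [('a', 1, 1), ('b', 3, 1)]: a non-printing "a" and a non-filing space
```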
[fig. 2. pointer technique as applied to bibliographic data: a byte-by-byte representation of examples 2, 3, and 4 in the quetelet fonds format. a represents the pointer characteristic for nonprinting data; b is the pointer characteristic for nonfiling data.]

in spite of these negative arguments, we see a great interest in this method, and wish to give some suggestions in order to relieve or to eliminate some of them. initially we must realize that the creation of a record takes place only once, while the applications are innumerable. the possibility of automatically adding some of the codes may also be considered: data needing special treatment that can be expressed in a consistent set of logical rules can be coded by program, and only exceptions have to be treated manually. in considering the space occupied by the directory, some profit could be gained by trying to reduce the storage space occupied by the addresses and the lengths. there is also a solution to be found in not systematically having to provide pointer field information; one must realize that only a small percentage of the fields may contain such codes. finally, the restrictions at input may be removed by using compound separators. such a change does not have any repercussion on the directory. as far as we know, the pointer technique has not been used in a subfield level system. at our library an internal processing format of the subfield level type, called fbr, is under development, in which a pointer technique based on the foregoing is incorporated.

automatic handling techniques

in order to give a complete review of the methods of handling data within bibliographic text, we must also treat the methods in which both the identification and the special treatment of these data are done during the execution of the output programs. the working method can easily be demonstrated with example 1. only the printing form must be recorded. the program for building the sort key processes a look-up table of nonfiling words including the articles l' and des. the program checks every word of the printing form for a match with one of the words of the nonfiling list. the sort key is built up with all the words which are not present in this table. to treat example 4, an analogous procedure can be worked through. an equivalence list of words for which the filing form differs from the printing form is needed. if, during the construction of the sort key, a match is found with a word in the equivalence list, the correct filing form, stored in this list, is placed in the sort key; the other words are taken in their printing form. in our case, using the equivalence list, mc should be replaced by mac. in order to speed up the look-up procedures, different methods of organization of the look-up tables can be devised.
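a simplified sketch of the automatic handling just described: nothing special is stored with the record, and the sort-key program carries a look-up table of nonfiling words plus an equivalence list giving filing forms that differ from the printing form. the word lists, the splitting of elided articles such as "l'", and the retained space in "mac kelvy" (example 4's published filing form "mackelvy" would need a further rule) are all simplifying assumptions.

```python
import re

NONFILING_WORDS = {"l'", "des"}        # illustrative entries only, not a real table
EQUIVALENCE_LIST = {"mc": "mac"}       # printing form -> filing form

def words_of(text: str):
    # split off elided articles such as "l'" so they can be looked up on their own
    return re.findall(r"[a-z]+'|[a-z0-9.]+", text.lower())

def sort_key(printing_form: str) -> str:
    kept = []
    for word in words_of(printing_form):
        if word in NONFILING_WORDS:
            continue                                   # nonfiling words are dropped
        kept.append(EQUIVALENCE_LIST.get(word, word))  # filing form substituted when listed
    return " ".join(kept).upper()

print(sort_key("l'automation des bibliotheques"))   # AUTOMATION BIBLIOTHEQUES
print(sort_key("mc kelvy"))                         # MAC KELVY
```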
other types of automatic processing techniques can be illustrated by the special filing algorithms constructed for a correct sort of dates. for instance, in order to be able to sort b.c. and a.d. dates in chronological order, the year 0 is replaced by the year 5000; b.c. and a.d. dates are respectively subtracted from or added to this number. thus dates back to 5000 b.c. can be correctly treated. this technique, introduced by nypl, is also used at lc.

the advantages of automatic handling techniques are many. no special arrangements must be made at input: only the bibliographic elements must be introduced under the printing form, and no special codes have to be added. there is no storage space wasted for storing internal codes. as negative aspects we note that not all cataloging rules may be expressed in rigid systematic process steps; examples 2 and 3 illustrate this point. one must also recognize that the special automatic handling programs must be executed repeatedly when a sort key is built up, increasing the processing time. this procedure may give some help for filing purposes, but we can hardly imagine that it really may solve all internal coding problems; think of the instructions to be given for the choice of character type while working with a typesetting machine. the automatic handling technique is very extensively applied in the nypl programs; marc lc has recourse to it for treating dates, and bnb marc for personal names.24 none of the field level systems considered here uses this method.

summary and conclusions

table 1 presents, for the discussed systems, a summary of the methods used for treating data in a bibliographic text. the duplication and indicator techniques have the most adherents. however, we must keep in mind that in most of the systems the duplication of data only represents an extreme solution. on the other hand, indicators are very limited in their possibilities. as far as flexibility and application possibilities are concerned, the simple separators and the pointers present the most interesting prospects. automatic handling techniques may produce good results for use in well-defined fields or subfields. from the evaluations given for the different methods, we conclude that for a special application the choice of a method depends greatly on the objectives, namely the sort of special processing facilities needed, the volume of data to be treated, and the frequency of execution.

table 1. review of the techniques for special processing of data within bibliographic text used or planned in the discussed systems
deutsche bibliographie: duplication; simple separators.
bochum: duplication.
bikas: simple separators.
quetelet fonds: pointers.
marc lc: duplication; personal name type indicator; indicator for beginning of filing text; automatic handling.
bnb marc: personal name type indicator; indicator for beginning of filing text; automatic handling.
nypl: duplication; automatic handling.
monocle: simple separators; multiple function separators; separators with indicators; personal name type indicator.
canadian marc: duplication; personal name type indicator; indicator for beginning of filing text.
fbr: pointers.

references

1. rudolf blum, "die maschinelle herstellung der deutschen bibliographie in bibliothekarischer sicht," zeitschrift für bibliothekswesen und bibliographie 13:303-21 (1966).
2. die zmd in frankfurt am main; herausgegeben von klaus schneider (berlin: beuth-vertrieb gmbh, 1969), p.133-37, 162-67.
3. magnetbanddienst deutsche bibliographie, beschreibung für 7-spur-magnetbänder (frankfurt am main: zentralstelle für maschinelle dokumentation, 1972).
4. ingeborg sobottke, "rationalisierung der alphabetischen katalogisierung," in elektronische datenverarbeitung in der universitätsbibliothek bochum; herausgegeben in verbindung mit der pressestelle der ruhr-universität bochum von günther pflug und bernhard adams (bochum: druck- und verlagshaus schürmann & klagges, 1968), p.24-32.
5. datenerfassung und datenverarbeitung in der universitätsbibliothek bielefeld: eine materialsammlung; hrsg. von elke bonness und harro heim (munich: pullach, 1972).
6. michel bartholomeus, l'aspect informatique de la catalographie automatique (brussels: bibliotheque royale albert ier, 1970).
7. m. bartholomeus and m. hansart, lecture des entrees bibliographiques sous format 80 colonnes et creation de l'enregistrement standard; publication interne: mecono b015a (brussels: bibliotheque royale albert ier, 1969).
8. henriette d. avram, john f. knapp, and lucia j. rather, the marc ii format: a communications format for bibliographic data (washington, d.c.: library of congress, 1968).
9. books, a marc format: specifications for magnetic tapes containing catalog records for books (5th ed.; washington, d.c.: library of congress, 1972).
10. "automation activities in the processing department of the library of congress," library resources & technical services 16:195-239 (spring 1972).
11. l. e. leonard and l. j. rather, internal marc format specifications for books (3d ed.; washington, d.c.: library of congress, 1972).
12. marc record service proposals (bnb documentation service publications no. 1 [london: council of the british national bibliography, ltd., 1968]).
13. marc ii specifications (bnb documentation service publications no. 2 [london: council of the british national bibliography, ltd., 1969]).
14. michael gorman and john e. linford, description of the bnb marc record: a manual of practice (london: council of the british national bibliography, ltd., 1971).
15. edward duncan, "computer filing at the new york public library," in larc reports vol. 3, no. 3 (1970), p.66-72.
16. nypl automated bibliographic system overview, internal report (new york: new york public library, 1972).
17. marc chauveinc, monocle: projet de mise en ordinateur d'une notice catalographique de livre, deuxieme edition (grenoble: bibliotheque universitaire, 1972).
18. marc chauveinc, "monocle," journal of library automation 4:113-28 (sept. 1971).
19. canadian marc (ottawa: national library of canada, 1972).
20. format de communication du marc canadien: monographies (ottawa: bibliotheque nationale du canada, 1973).
21. to be published.
22. private communications (1973).
23. private communications (1972).
24. private communications (1973).

editorial board thoughts: arts into science, technology, engineering, and mathematics: steam, creative abrasion, and the opportunity in libraries today

tod colegrove

information technologies and libraries | march 2017

over the millennia, man's attempt to understand the universe has been an evolution from the broad to the sharply focused. a wide range of distinctly separate disciplines evolved from the overarching natural philosophy, the study of nature, of greco-roman antiquity: anatomy and astronomy through botany, mathematics, and zoology, among many others.
similarly, the arts, humanities, and engineering developed from broad, over-arching interest into tightly focused disciplines that today are distinctly separate. as these legitimate divisions formed, grew, and developed into ever-deepening specialty, they enabled correspondingly deeper study and discovery;1 in response, the supporting collections of the library divided and grew to reflect that increasing complexity.

libraries have long been about the organization of, and access to, information resources. subject classification systems in use today, such as the dewey decimal system, are designed to group like items with like, albeit under broad overarching topics. a perhaps inevitable result for print collections housed under such a classification system is the physical isolation of items and, by extension, of the individuals researching those topics from one another. under the library of congress system, for example, items categorized as "geography" are physically removed from those in "science," and further still from "technology." end-users benefit from the possibility of serendipitous discovery while browsing shelves nearby, even as they are effectively shielded from exposure to distracting topics outside of their immediate focus.

recent years have witnessed a rediscovery of, and renewed interest in, the fundamental role the library can have in the creation of knowledge, learning, and innovation among its members. as collections shift from print to electronic, libraries are increasingly less bound to the physical constraints imposed by their print collections. rather than a continued focus on hyperspecialization and separation, we have the opportunity to rethink the library: exploring novel configurations and services that might better support its community, and embracing emerging roles of trans-disciplinary collaboration and innovation.

tod colegrove (pcolegrove@unr.edu), a member of the ital editorial board, is head of the delamare science & engineering library, university of nevada, reno.

the library as intersection

libraries reflect the institutional and organizational structures of their communities, even as the physical organization of the structures built to house print collections mirrors the classification system in use. academic libraries are perhaps most entrenched in the structural division: rather than intrinsically promoting collaboration and discovery across disciplines, the organization of print collections, and typically the spaces around them, is designed to foster increased focus and specialization. in branch libraries of a college or university this division can reach a pinnacle, specialized almost to the exclusion of other areas of study altogether; libraries and collections devoted exclusively to engineering, science, music, and other topics exist on campuses across the country. amplified by the separation and clustering of faculty and researchers, typically by department and discipline, it becomes entirely possible for individuals to "spend a lifetime working in a particular narrow field and never come into contact with the wider context of his or her study."2

the library is also one of the few places in any community where individuals from a variety of backgrounds and specialties can naturally cross paths with one another. at a college or university, students and faculty from one discipline might otherwise rarely encounter those from other disciplines.
whether public, school, or academic library, outside of the library individuals and groups are typically isolated from one another physically, with little opportunity to interact organically. without active intervention and deliberate effort on the part of the library, opportunities for creative abrasion3 and trans-disciplinary collaboration become virtually nonexistent; its potential to “unleash the creative potential that is latent in a collection of unlikeminded individuals,”4 untapped. leveraged properly, however, the intersection of interests and expertise that occurs naturally within the neutral spaces of the library can become a powerful tool that supports not only research, but creativity and innovation a place where ideas and viewpoints can collide, building on one another: “for most of us, the best chance to innovate lies at the intersection. not only do we have a greater chance of finding remarkable idea combinations there, we will also find many more of them.... the explosion of remarkable ideas is what happened in florence during the renaissance, and it suggests something very important. if we can just reach an intersection of disciplines or cultures, we will have a greater chance of innovating, simply because there are so many unusual ideas to go around.”5 difficult and scary the problem? “stimulating creative abrasion is difficult and scary because we are far more comfortable being with folks like us.”6 and yet a quick review of the literature reveals that knowledge creation, innovation, and success are inextricably linked7, with the fundamental understanding of their connection having undergone a dramatic shift: “knowledge is in fact essential to innovate, and while this might sound obvious today, putting knowledge and innovation and not physical assets at the centre of competitive advantage was a tremendous change.”8 as our libraries move toward embracing an even more active role within our communities, our organizational priorities are undergoing similarly dramatic shifts: support for knowledge creation information technologies and libraries | march 2017 6 and innovation becomes more central, even as physical assets shift toward a supporting, even peripheral, role. libraries, as fundamentally neutral hubs of diverse communities, are uniquely positioned to be able to cultivate creative abrasion within and among their communities, fostering not only knowledge creation, but innovation and success. indeed, the combination of physical, electronic, and staff assets can be the raw stuff by which trans-disciplinary engagement is encouraged. the active cultivation and support of creative abrasion, with direct linkage to desired outcomes, becomes arguably one of the most vital services the library can provide its community. rather than deepening the cycle of hyper-specialization, the emergence of makerspace in our libraries is one example of a trend toward enabling libraries to broaden and embrace that support. building on the intellectual diversity within the spaces of the library, staff members, volunteers, and fellow community members can serve as catalyst, triggering groups to “do something with that variety”9 by engaging across traditional boundaries. 
indeed, “by deliberately creating diverse organizations and explicitly helping team members appreciate thinking-styles different than their own, creative abrasion can result in successful innovation.”10 strategic placement and staff support of makerspace activity can dramatically increase the opportunity for creative abrasion and, by extension, the resulting knowledge creation, creativity and innovation. arts bring a fundamental literacy and resource to stem in recent years, greater emphasis on students acquiring stem (science, technology, engineering, and math) skills has raised the topic to be one of the most central issues in education. considered a key solution to improving the competitiveness of american students on the global stage, the approach of stem education shares the common goal of breaking down the artificial barriers that exist even within the separate disciplines of sciences, technology, engineering, and math in short, increasing the diversity of the learning environment. proponents of steam go further by suggesting that adding art into the mix can bring new energy and language to the table, “sparking curiosity, experimentation, and the desire to discover the unknown in students.” 11 federal agencies such as the u.s. department of education and the national science foundation have funded and underwritten a number of grants, conferences, and workshops in the field, including the seminal forum hosted by the rhode island school of design (risd), “bridging stem to steam: developing new frameworks for art-science-design pedagogy.”12 john maeda, the president of the risd, identifies a direct connection between the approach and the creativity and success of late apple co-founder steve jobs, with steam support “a pathway to enhance u.s. economic competitiveness.”13 proponents go further, arguing the arts bring both a fundamental literacy and resource to the stem disciplines, providing “innovations through analogies, models, skills, structures, techniques, methods, and knowledge.”14 consider the findings of a study of nobel prize winners in the sciences, members of the royal society, and the u.s. national academy of sciences; nobel laureates were: editorial board thoughts | colegrove https://doi.org/10.6017/ital.v36i1.9733 7 twenty-five times as likely as an average scientist to sing, dance, or act; seventeen times as likely to be an artist; twelve times more likely to write poetry and literature; eight times more likely to do woodworking or some other craft; four times as likely to be a musician; and twice as likely to be a photographer.15 from the standpoint of creative abrasion, welcoming the “a” of art into the library support of stem disciplines increases the diversity of the library, and by default the opportunity for creative abrasion. from aristotle and pythagoras through galileo galilei and leonardo da vinci to benjamin franklin, richard feynman, and noam chomsky, a long list of individuals of wideranging genius hints at a potential left largely untapped by our traditional approach. connections between stem disciplines, art, and the innovation arising directly out of their creative abrasion surround us: the electronic screens used on a wide range of technology, including computers, televisions, and cell phones, are the result of a collaboration between a series of painter-scientists and post-impressionist artists such as seurat a combination of red, green, and blue dots generate full-spectrum images in a way not unlike that of the artistic technique of pointillism. 
the electricity to drive that technology is understood, in part, due to early work by franklin even as he lay the foundations of the free public library with the opening of america’s first lending library, and pursued a broad range of parallel interests. the stitches used in medical surgery are the result of nobel laureate alexis carrel taking his knowledge of lace making from a traditional arena into the operating room. prominent american inventors “samuel morse (telegraph) and robert fulton (steam ship) were among the most prominent american artists before they turned to inventing.”16 in short, “increasing success in science is accompanied by developed ability in other fields such as the fine arts.”17 rather than isolated in monastic study, “almost all nobel laureates in the sciences are actively engaged in arts as adults.”18 perhaps surprisingly, rather than being rewarded by an ever-increasing focus and hyper-specialization, genius in the sciences seems tied to individuals’ activity in the arts and crafts. the study’s authors cite three different nobel prize winners, including j. h. van’t hoff’s 1878 speculation that scientific imagination is correlated with creative activities outside of science19; going on to detail similar findings from general studies dating back over a century. of even more seminal interest, the authors point to a similar connection for adolescents/young adults where milgram and colleagues20 found “having at least one persistent and intellectually stimulating hobby is a better predictor of career success in any discipline than iq, standardized test scores, or grades.”21 discussion the connection between individuals holding a multiplicity of interests, trans-disciplinary activity, and success is clear; what is less clear is to what extent we are fostering that connection in our libraries today. the potential is nevertheless tantalizing: a random group of people, thrown together, is not likely to be very creative. by going beyond specialization and wading into the information technologies and libraries | march 2017 8 deeper waters of supporting and cultivating creative abrasion and avocation among the membership of our libraries, we are fostering success and innovation beyond what might otherwise occur. the decision to catalyze and foster the cross-curricular collaboration that is steam22 is squarely in the hands of the library: in the design of its spaces, and in the interactions of the staff of the library with the communities served. we can choose to actively connect and catalyze across traditional boundaries. as the head of a science and engineering library, one of the early adopters of makerspace and actively exploring the possibilities of steam engagement for several years, i have time and again witnessed the leaps of insight and creativity brought about by creative abrasion. from across disciplines members are engaging with the resources of the library and, with our encouragement, one another in an ever-increasing cycle of knowledge creation, innovation, and success. the impact is particularly dramatic among individuals from strongly differing backgrounds and disciplines: for example, when an engineering student, who considers themselves to be expert with a particular technology, witnesses and interacts with an art student using that same technology and accomplishing something truly unexpected, even seemingly magical. 
or when a science student approaching a problem from one perspective realizes a practitioner from a different discipline sees the problem from an entirely different, and yet equally valid, point of view. in each case, it’s as if the worldview of each suddenly melts: shifting and expanding, never to return to its original shape. transformative experiences become the order of the day, even as the informal environment offers a wealth of opportunity to engage with and connect end-users to the more traditional resources of library. by actively seeking out opportunities to bring art into traditionally stem-focused activity, and vice-versa, we are deliberately increasing the diversity of the environment. makerspace services and activities, to the extent they are open and visibly accessible to all, are a natural for the spontaneous development of trans-disciplinary collaboration. within the spaces of the library, opportunities to connect individuals around shared avocational interest might range from music and spontaneous performance areas to spaces salted with lego bricks and jigsaw puzzles; the potential connections between our resources and the members of our communities are as diverse as their interests. indeed, when a practitioner from one discipline can interact and engage with others from across the steam spectrum, the world becomes a richer place – and maybe, just maybe, we can fan the flames of curiosity along the way. references 1. bohm, d., and f. d. peat. 1987. science, order, and creativity: a dramatic new look at the creative roots of science and life. london: bantam. 2. ibid., 18-19. 3. hirshberg, jerry. 1998. the creative priority: driving innovative business in the real world. london: penguin. editorial board thoughts | colegrove https://doi.org/10.6017/ital.v36i1.9733 9 4. leonard-barton, dorothy, and walter c. swap. 1999. when sparks fly: harnessing the power of group creativity. boston, massachusetts: harvard business school press books. 5. johansson, frans. 2004. the medici effect: breakthrough insights at the intersection of ideas, concepts, and cultures. boston, massachusetts: harvard business school press, 20. 6. leonard-barton, dorothy, and walter c. swap. 1999. when sparks fly: harnessing the power of group creativity. boston, massachusetts: harvard business school press books, 25. 7. nonaka, ikujiro. 1994. “a dynamic theory of organizational knowledge creation.” organization science 5 (1): 14–37. 8. correia de sousa, milton. 2006. “the sustainable innovation engine.” vine 36 (4): 398–405, accessed february 14, 2017. https://doi.org/10.1108/03055720610716656. 9. leonard-barton, dorothy, and walter c. swap. 1999. when sparks fly: harnessing the power of group creativity. boston, massachusetts: harvard business school press books, 20. 10. adams, karlyn. 2005. the sources of innovation and creativity. education, september, 2005, 33. https://doi.org/10.1007/978-3-8349-9320-5. 11. jolly, anne. 2014. “stem vs. steam: do the arts belong?” education week teacher. http://www.edweek.org/tm/articles/2014/11/18/ctq-jolly-stem-vssteam.html?qs=stem+vs.+steam. 12. rose, christopher, and brian k. smith. 2011. “bridging stem to steam: developing new frameworks for art-science-design pedagogy.” rhode island school district press release. 13. robelen, erik w. 2011. “steam: experts make case for adding arts to stem.” education week. http://www.bmfenterprises.com/aep-arts/wp-content/uploads/2012/02/ed-week-stemto-steam.pdf. 14. root-bernstein, robert. 2011. 
“the art of scientific and technological innovations – art of science learning.” http://scienceblogs.com/art_of_science_learning/2011/04/11/the-art-ofscientific-and-tech-1/. 15. ibid. 16. ibid. 17. root-bernstein, robert, lindsay allen, leighanna beach, ragini bhadula, justin fast, chelsea hosey, benjamin kremkow, et al. 2008. “arts foster scientific success: avocations of nobel, national academy, royal society, and sigma xi members.” journal of psychology of science and technology. https://doi.org/10.1891/1939-7054.1.2.51. information technologies and libraries | march 2017 10 18. ibid. 19. van’t hoff, jacobus henricus. 1967. “imagination in science,” molecular biology, biochemistry and biophysics, translated by g. f. springer, 1, springer-verlag, pp. 1-18 20. milgram, roberta m., and eunsook hong. 1997. "out-of-school activities in gifted adolescents as a predictor of vocational choice and work." journal of secondary gifted education 8, no. 3: 111. education research complete, ebscohost (accessed february 26, 2017). 21. root-bernstein, robert, lindsay allen, leighanna beach, ragini bhadula, justin fast, chelsea hosey, benjamin kremkow, et al. 2008. “arts foster scientific success: avocations of nobel, national academy, royal society, and sigma xi members.” journal of psychology of science and technology. https://doi.org/10.1891/1939-7054.1.2.51. 22. land, michelle h. 2013. “full steam ahead: the benefits of integrating the arts into stem.” procedia computer science 20. elsevier masson sas: 547–52. https://doi.org/10.1016/j.procs.2013.09.317. content management systems: trends in academic libraries ruth sara connell information technology and libraries | june 2013 42 abstract academic libraries, and their parent institutions, are increasingly using content management systems (cmss) for website management. in this study, the author surveyed academic library web managers from four-year institutions to discover whether they had adopted cmss, which tools they were using, and their satisfaction with their website management system. other issues, such as institutional control over library website management, were raised. the survey results showed that cms satisfaction levels vary by tool and that many libraries do not have input into the selection of their cms because the determination is made at an institutional level. these findings will be helpful for decision makers involved in the selection of cmss for academic libraries. introduction as library websites have evolved over the years, so has their role and complexity. in the beginning, the purpose of most library websites was to convey basic information, such as hours and policies, to library users. as time passed, more and more library products and services became available online, increasing the size and complexity of library websites. many academic library web designers found that their web authoring tools were no longer adequate for their needs and turned to cmss to help them manage and maintain their sites. for other web designers, the choice was not theirs to make. their institution transitioned to a cms and required the academic library to follow suit, regardless of whether the library staff had a say in the selection of the cms or its suitability for the library environment. the purpose of this study was to examine cms usage within the academic library market and to provide librarians quantitative and qualitative knowledge to help make decisions when considering switching to, or between, cmss. 
in particular, the objectives of this study were to determine (1) the level of saturation of cmss in the academic library community; (2) the most popular cmss within academic libraries, the reasons for the selection of those systems, and satisfaction with those cmss; (3) if there is a relationship between libraries with their own dedicated information technology (it) staff and those with open source (os) systems; and (4) if there is a relationship between institutional characteristics and issues surrounding cms selection. ruth sara connell (ruth.connell@valpo.edu) is associate professor of library services and electronic services librarian, christopher center library services, valparaiso university, valparaiso, in. mailto:ruth.connell@valpo.edu content management systems: trends in academic libraries | connell 43 although this study largely focuses on cms adoption and related issues, the library web designers who responded to the survey were asked to identify what method of web management they use if they do not use a cms and asked about satisfaction with their current system. thus, information regarding cms alternatives (such as adobe’s dreamweaver web content editing software) is also included in the results. as will be discussed in the literature review, cmss have been broadly defined in the past. therefore, for this study participants were informed that only cmss used to manage their primary public website were of interest. specifically, cmss were defined as website management tools through which the appearance and formatting is managed separately from content, so that authors can easily add content regardless of web authoring skills. literature review most of the library literature regarding cms adoption consists of individual case studies describing selection and implementation at specific institutions. there are very few comprehensive surveys of library websites or the personnel in charge of academic library websites to determine trends in cms usage. the published studies including cms usage within academic libraries do not definitively answer whether overall adoption has increased. in 2005 several georgia state university librarians surveyed web librarians at sixty-three of their peer institutions, and of the sixteen responses, six (or 38 percent) reported use of “cms technology to run parts of their web site.” 1 a 2006 study of web managers from wide range of institutions (associates to research) indicated a 26 percent (twenty-four of ninety-four) cms adoption rate.2 a more recent 2008 study of institutions of varying sizes resulted in a little more than half of respondents indicating use of cmss, although the authors note that “people defined cmss very broadly,” 3 including tools like moodle and contentdm, and some of those libraries indicated they did not use the cms to manage their website. a 2012 study by comeaux and schmetzke differs from the others mentioned here in that they reviewed academic library websites of the fifty-six campuses offering ala-accredited graduate degrees (generally larger universities) and used tools and examined page code to try to determine on their own if the libraries used cmss, as opposed to polling librarians at those institutions to ask them to self-identify if they used cmss. they identified nineteen out of fifty-six (34 percent) sites using cmss. the authors offer this caveat, “it is very possible that more sites use cmss than could be readily identified. 
this is particularly true for ‘home-grown’ systems, which are unlikely to leave any readily discernible source code.”4 because of the different methodologies and population groups in these studies, it is not possible to draw conclusions regarding cms adoption rates within academic libraries over time using these results. as mentioned previously, some people define cmss more broadly than others. one example of a product that can be used as a cms, but is not necessarily a cms, is springshare’s libguides. many libraries use libguides as a component of their website to create guides. however, some libraries have utilized the product to develop their whole site, in effect using it as a cms. a case study by two librarians at york college describes why they chose libguides as their cms instead of as a more limited guide creation tool.5 several themes recurred throughout many of the case study articles. one common theme was the issue of lack of control and problems of collaboration between academic libraries and the campus entities controlling website management. amy york, the web services librarian at middle tennessee state university, described the decision to transition to a cms in this way, “and while it was feasible for us to remain outside of the campus cms and yet conform to the campus template, the head of the it web unit was quite adamant that we move into the cms.”6 in a study by bundza et al., several participants who indicated dissatisfaction with website maintenance mentioned “authority and decision-making issues” as well as “turf struggles.”7 other articles expressed more positive collaborative experiences. morehead state university librarians kmetz and bailey noted, “when attending conferences and hearing the stories of other libraries, it became apparent that a typical relationship between librarians and a campus it staff is often much less communicative and much less positive than [ours]. because of the relatively smooth collaborative spirit, a librarian was invited in 2003 to participate in the selection of a cms system.”8 kimberley stephenson also emphasized the advantageous relationships that can develop when a positive approach is used, “rather than simply complaining that staff from other departments do not understand library needs, librarians should respectfully acknowledge that campus web developers want to create a site that attracts users and consider how an attractive site that reflects the university’s brand can be beneficial in promoting library resources and services.”9 however, earlier in the article she does acknowledge that the iterative and collaborative process between the library and their university relations (ur) department was occasionally contentious and that the web services librarian notifies ur staff before making changes to the library homepage.10 another common theme in the literature was the reasoning behind transitioning to a cms. one commonly cited criterion was access control or workflow management, which allows site administrators to assign contributors editorial control over different sections of the site or approve changes before publishing.11 however, although this feature is considered a requirement by many libraries, it has its detractors.
kmetz and bailey indicated that at morehead state university, “approval chains have been viewed as somewhat stifling and potentially draconian, so they have not been activated.”12 these studies greatly informed the questions used and the development of the survey instrument for this study. method in designing the survey instrument, questions were considered based on how they informed the objectives of the study. to simplify analysis, it was important to compile as comprehensive a list of cmss as possible. this list was created by pulling cms names from the literature review, the web4lib discussion list, and the cmsmatrix website (www.cmsmatrix.org). in order to select institutions for distribution, the 2010 carnegie classification of institutions of higher education basic classification lists were used.13 the author chose to focus on three broad classifications: 1. research institutions, consisting of the following carnegie basic classifications: research universities (very high research activity), research universities (high research activity), and dru: doctoral/research universities. 2. master’s institutions, consisting of the following carnegie basic classifications: master’s colleges and universities (larger programs), master’s colleges and universities (medium programs), and master’s colleges and universities (smaller programs). 3. baccalaureate institutions, consisting of the following carnegie basic classifications: baccalaureate colleges—arts & sciences and baccalaureate colleges—diverse fields. the basic classification lists were downloaded into excel with each of the three categories in a different worksheet, and then each institution was assigned a number using the random number generator feature within excel. the institutions were then sorted by those numbers, creating a randomly ordered list within each classification. to determine sample size for a stratified random sampling, ronald powell’s “table for determining sample size from a given population”14 (with a .05 degree of accuracy) was used. each classification’s population was considered separately, and the appropriate sample size chosen from the table. the population size of each of the groups (the total number of institutions within that carnegie classification) and the corresponding sample sizes were • research: population = 297, sample size = 165; • master’s: population = 727, sample size = 248; • baccalaureate: population = 662, sample size = 242. the total number of institutions included in the sample was 655. the author then went through the list of selected institutions and searched online to find their library webpages and the person most likely responsible for the library’s website. during this process, there were some institutions, mostly for-profits, for which a library website could not be found. when this occurred, that institution was eliminated and the next institution on the list used in its place. in some cases, the person responsible for web content was not easily identifiable; in these cases an educated guess was made when possible, or else the director or a general library email address was used. the survey was made available online and distributed via e-mail to the 655 recipients on october 1, 2012. reminders were sent on october 10 and october 18, and the survey was closed on october 26, 2012. out of 655 recipients, 286 responses were received.
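the stratified sampling step just described — randomly ordering institutions within each carnegie stratum and then taking the sample size given by powell's table — can be reproduced compactly in code. the sketch below is only an illustration under stated assumptions: the study performed this step in excel, and the institution lists here are placeholder strings rather than the real carnegie data.

```python
import random

# placeholder populations; in the study these came from the 2010 carnegie basic
# classification lists (research = 297, master's = 727, baccalaureate = 662 institutions)
strata = {
    "research":      [f"research institution {i}" for i in range(297)],
    "masters":       [f"master's institution {i}" for i in range(727)],
    "baccalaureate": [f"baccalaureate institution {i}" for i in range(662)],
}

# sample sizes read from powell's table at a .05 degree of accuracy
sample_sizes = {"research": 165, "masters": 248, "baccalaureate": 242}

random.seed(1)  # fixed seed only so this illustrative draw is repeatable

# draw a simple random sample independently within each stratum
sample = {name: random.sample(pop, sample_sizes[name]) for name, pop in strata.items()}

print(sum(len(chosen) for chosen in sample.values()))  # 655 institutions in total
```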
some of those responses had to be eliminated for various reasons. if two responses were received from one institution, the more complete response was used while the other response was discarded. some responses included only an answer to the first question (the name of the institution, or a declination of that question in favor of the demographic questions) and no other responses; these were also eliminated. once the invalid responses were removed, 265 remained, for a 40 percent response rate. before conducting an analysis of the data, some cleanup and standardization of results was required. for example, a handful of respondents indicated they used a cms and then indicated that their cms was dreamweaver or adobe contribute. these responses were recoded as non-cms responses. likewise, one respondent self-identified as a non-cms user but then listed drupal as his/her web management tool, and this was recoded as a cms response. demographic profile of respondents for the purposes of gathering demographic data, respondents were offered two options. they could provide their institution’s name, which would be used solely to pair their responses with the appropriate carnegie demographic categories (not to identify them or their institution), or they could choose to answer a separate set of questions regarding their size, public/private affiliation, and basic carnegie classification. the basic carnegie classification of the largest response group was master’s with 102 responses (38 percent), then baccalaureate institutions (94 responses or 35 percent), and then research institutions (69 responses or 26 percent). this correlates closely with the distribution percentages, which were 38 percent master’s (248 out of 655), 37 percent baccalaureate (242 out of 655), and 25 percent research (165 out of 655). of the 265 responses, 95 (36 percent) came from academic librarians representing public institutions and 170 (64 percent) from private. of the private institutions, the vast majority (166 responses or 98 percent) were not-for-profit, while 4 (2 percent) were for-profits. to define size, the carnegie size and setting classification was used. very small institutions are defined as less than 1,000 full-time equivalent (fte) enrollment, small is 1,000–2,999 fte, medium is 3,000–9,999 fte, and large is at least 10,000 fte. the largest group of responses came from small institutions (105 responses or 40 percent), then medium (67 responses or 25 percent), large (60 responses or 23 percent), and very small (33 responses or 12 percent). results the first question, asking for institutional identification (or alternative routing to the carnegie classification questions), was the only question for which an answer was required. in addition, because of question logic, some people saw questions that others did not based on how they answered previous questions. thus, the number of responses varies for each question. one of the objectives of this study was to identify whether there were relationships between institutional characteristics and cms selection and management. the results that follow include both descriptive statistics and statistically significant inferential statistics discovered using chi-square and fisher’s exact tests. statistically significant results are labeled as such.
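as a concrete illustration of how those inferential tests can be run, the sketch below applies a chi-square test of independence to the counts reported later in table 2 (non-cms users considering a move to a cms, by carnegie classification) and reproduces the published chi-square of 6.526 with df = 2 and p = .038. the study does not say which statistics package was used, so the scipy calls here are simply one convenient way to check the published figures, not the author's procedure.

```python
from scipy.stats import chi2_contingency, fisher_exact

# counts from table 2: rows = considering a move to a cms (no, yes),
# columns = baccalaureate, master's, research
observed = [
    [11, 11, 4],    # no
    [9, 11, 17],    # yes
]

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.3f}, df = {dof}, p = {p:.3f}")
# prints chi-square = 6.526, df = 2, p = 0.038 -- the values reported with table 2

# the 2x2 tables in the article (e.g., tables 5 and 7) were tested with
# fisher's exact test, which the same module provides
odds_ratio, p_fisher = fisher_exact([[9, 10], [63, 17]])  # table 5 counts; article reports p = .010
print(f"fisher's exact p = {p_fisher:.3f}")
```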
the responses to this survey show that most academic libraries are using a cms to manage their main library website (169 out of 265 responses, or 64 percent). overall, cms users expressed similar (although slightly greater) satisfaction levels with their method of web management (see table 1).
table 1. satisfaction by cms use
                                        uses a cms     does not use a cms
highly satisfied or satisfied — yes     79 (54%)       41 (47%)
highly satisfied or satisfied — no      68 (46%)       46 (53%)
total                                   147 (100%)     87 (100%)
non-cms users non-cms users were asked what software or system they use to govern their site. by far, the most popular system mentioned among the 82 responses was adobe dreamweaver, with 24 (29 percent) users listing it as their only or primary system. some people listed dreamweaver as part of a list of tools used, for example “php / mysql, integrated development environments (php storm, coda), dreamweaver, etc.,” and if all mentions of dreamweaver are included, the number of users rises to 31 (38 percent). some version of “hand coded” was the second most popular answer with 9 responses (11 percent), followed by adobe contribute with 7 (9 percent). many of the “other” responses were hard to classify and were excluded from analysis. some examples include: • ftp to the web • voyager public web browser ezproxy • excel, e-mail, file folders on shared drives among the top three non-cms web management systems, dreamweaver users were most satisfied, selecting highly satisfied or satisfied in 15 out of 24 (63 percent) cases. hand coders were highly satisfied or satisfied in 5 out of 9 cases (56 percent), and adobe contribute users were only highly satisfied or satisfied in 3 out of 7 (43 percent) cases. respondents not using a cms were asked whether they were considering a move to a cms within the next two years. most (59 percent) said yes. research libraries were much more likely to be planning such a move (81 percent) than master’s (50 percent) or baccalaureate (45 percent) libraries (see table 2). a chi-square test rejects the null hypothesis that the consideration of a move to a cms is independent of basic carnegie classification; this difference was significant at the p = 0.038 level.
table 2. non-cms users considering a move to a cms within the next two years, by carnegie classification*
          baccalaureate   master’s     research     total
no        11 (55%)        11 (50%)     4 (19%)      26 (41%)
yes       9 (45%)         11 (50%)     17 (81%)     37 (59%)
total     20 (100%)       22 (100%)    21 (100%)    63 (100%)
chi-square = 6.526, df = 2, p = .038
*excludes “not sure” responses
non-cms users were asked to provide comments related to topics covered in the survey, and here is a sampling of responses received: • cmss cost money that our college cannot count on being available on a yearly basis. • the library doesn't have overall responsibility for the website. university web services manages the entire site; i submit changes to them for inclusion and updates. • we are so small that the time to learn and implement a cms hardly seems worth it. so far this low-tech method has worked for us. • the main university site was moved to a cms in 2008. the library was not included in that move because of the number of pages. i hear rumors that we will be forced into the cms that is under consideration for adoption now.
the library has had zero input in the selection of the new cms. cms users when respondents indicated their library used a cms, they were routed to a series of cms-related questions. the first question asked which cms their library was using. of the 153 responses, the most popular cmss were drupal (40); wordpress (15); libguides (14), which was defined within the survey as a cms “for main library website, not just for guides”; cascade server (12); ektron (6); and modx and plone (5 each). these users were also asked about their overall satisfaction with their systems. among the top four cmss, libguides users were the most satisfied, selecting highly satisfied or satisfied in 12 out of 12 (100 percent) cases. the remaining three systems’ satisfaction ratings (highly satisfied or satisfied) were as follows: wordpress (12 out of 15 cases or 80 percent), drupal (26 out of 38 cases or 68 percent), and cascade server (3 out of 11 cases or 27 percent). when asked whether they would switch systems if given the opportunity, most (61 out of 109 cases or 56 percent) said no. looking at the responses for the top four cmss, responses echo the satisfaction responses. libguides users were least likely to want to switch (0 out of 7 cases or 0 percent), followed by wordpress (1 out of 5 cases or 17 percent), drupal (8 out of 23 cases or 26 percent), and cascade server (3 out of 7 or 43 percent) users. respondents were asked whether their library uses the same cms as their parent institution. most (106 out of 169 cases or 63 percent) said yes. libraries at large institutions (over 10,000 fte) were much less likely (34 percent) than their smaller counterparts to share a cms with their parent institution (see table 3). a chi-square test rejects the null hypothesis that sharing a cms with a parent institution is independent of size: at a significance level of p = 0.001, libraries at smaller institutions are more likely to share a cms with their parent.
table 3. cms users whose libraries use the same cms as their parent institution, by size
          large        medium       small        very small   total
no        23 (66%)     15 (33%)     19 (27%)     6 (35%)      63 (37%)
yes       12 (34%)     31 (67%)     52 (73%)     11 (65%)     106 (63%)
total     35 (100%)    46 (100%)    71 (100%)    17 (100%)    169 (100%)
chi-square = 15.921, df = 3, p = .001
not surprisingly, a similar correlation holds true when comparing shared cmss and simplified basic carnegie classification. baccalaureate and master’s libraries were more likely to share cmss with their institutions (69 percent and 71 percent, respectively) than research libraries (42 percent) (see table 4). at a significance level of p = 0.004, a chi-square test rejects the null hypothesis that sharing a cms with a parent institution is independent of basic carnegie classification.
table 4. cms users whose libraries use the same cms as their parent institution, by carnegie classification
          baccalaureate   master’s     research     total
no        19 (31%)        18 (29%)     26 (58%)     63 (37%)
yes       43 (69%)        44 (71%)     19 (42%)     106 (63%)
total     62 (100%)       62 (100%)    45 (100%)    169 (100%)
chi-square = 11.057, df = 2, p = .004
when participants responded that their library shared a cms with the parent institution, they were asked a follow-up question about whether the library made the transition with the parent institution. most (80 out of 99 cases or 81 percent) said yes, the transition was made together. however, private institutions were more likely to have made the switch together (88 percent) than public (63 percent) (see table 5). a fisher’s exact test rejects the null hypothesis that transition to a cms is independent of institutional control: at a significance level of p = 0.010, private institutions are more likely than public to move to a cms in concert.
table 5. users whose libraries and parent institutions use the same cms: transition by public/private control*
                          private       public       total
switched independently    9 (13%)       10 (37%)     19 (19%)
switched together         63 (88%)      17 (63%)     80 (81%)
total                     72 (101%)**   27 (100%)    99 (100%)
fisher’s exact test: p = .010
*excludes responses where people indicated “other”
**due to rounding, total is greater than 100%
similarly, a relationship existed between transition to a cms and basic carnegie classification. baccalaureate institutions (93 percent) were more likely than master’s (80 percent), which were more likely than research institutions (53 percent), to make the transition together (see table 6). a chi-square test rejects the null hypothesis that the transition to a cms is independent of basic carnegie classification: at a significance level of p = 0.002, higher-degree-granting institutions are less likely to make the transition together.
table 6. users whose libraries and parent institutions use the same cms: transition by carnegie classification*
                          baccalaureate   master’s      research     total
switched independently    3 (7%)          8 (21%)       8 (47%)      19 (19%)
switched together         40 (93%)        31 (80%)      9 (53%)      80 (81%)
total                     43 (100%)       39 (101%)**   17 (100%)    99 (100%)
chi-square = 12.693, df = 2, p = .002
*excludes responses where people indicated “other”
**due to rounding, total is greater than 100%
this study indicates that for libraries that transitioned to a cms with their parent institution, the transition was usually forced. out of the 88 libraries that transitioned together and indicated whether they were given a choice, only 8 libraries (9 percent) had a say in whether to make that transition. and even though academic libraries were usually forced to transition with their institution, they did not usually have representation on campus-wide cms selection committees. only 25 percent (22 out of 87) of respondents indicated that their library had a seat at the table during cms selection.
when comparing cms satisfaction ratings among libraries that were represented on cms selection committees versus those that had no representation, it is not surprising that those with representation were more satisfied (13 out of 22 cases or 59 percent) than those without (21 out of 59 cases or 36 percent). the same holds true for those libraries given a choice whether to transition. those given a choice were satisfied more often (6 out of 8 cases or 75 percent) than those forced to transition (21 out of 71 cases or 30 percent). respondents who said that they were not on the same cms as their institution were asked why they chose a different system. many of the responses indicated a desire for freedom from the controlling influence of the it or marketing arms of the institution: • we felt drupal offered more flexibility for our needs than cascade, which is what the university at large was using. i've heard more recently that the university may be considering switching to drupal. • university pr controls all aspects of the university cms. we want more freedom. • we are a service-oriented organization, as opposed to a marketing arm. we by necessity need to be different. cms users were asked to provide a list of the three factors most important in their selection of their cms and to rank their list in order of importance. the author standardized the responses, e.g., “price” was recorded as “cost.” the factors listed first, in order of frequency, were ease of use (15), flexibility (10), and cost (6). ignoring the ranking, 38 respondents listed ease of use somewhere in their “top three,” while 23 listed cost and 16 listed flexibility. another objective of this study was to determine if there was a positive correlation between libraries with their own dedicated it staff and those who chose open source cmss. therefore cms users were asked if their library had its own dedicated it staff, and 66 out of 143 libraries (46 percent) said yes. then the cmss used by respondents were translated into two categories, open source or proprietary systems (when a cms listed was unknown, it was coded as a missing value), and a fisher’s exact test was run against all cases that had values for both variables to see if a correlation existed. although those with library it had open source systems more frequently than those without, the difference was not significant (see table 7).
table 7. libraries with their own it personnel, by open source cms
                            has own it    no own it    total
cms is open source — yes    37 (73%)      32 (57%)     69 (65%)
cms is open source — no     14 (28%)      24 (43%)     38 (36%)
total                       51 (101%)*    56 (100%)    107 (101%)*
fisher’s exact test: p = .109
*due to rounding, total is greater than 100%
in another question, people were asked to self-identify if their organization uses an open source cms, and if so, whether they have outsourced any of its implementation or design to an outside vendor. most (61 out of 77 cases or 79 percent) said they had not outsourced implementation or design. one person commented, “no, i don't recommend doing this. the cost is great, you lose the expertise once the consultant leaves, and the maintenance cost goes through the roof. hire someone fulltime or move a current position to be the keeper of the system.” one of the advantages of having a cms is the ability to give multiple people, regardless of their web authoring skills, the opportunity to edit webpages.
therefore, cms users were asked how many web content creators they have within their library. out of 152 responses, the most frequent range cited was 2–5 authors (72 responses or 47 percent), followed by (33 responses or 22 percent) with only one author, 6–10 authors (20 responses or 13 percent), 21–50 authors (16 responses or 11 percent), 11–20 authors (6 responses or 4 percent), and over 50 authors (5 responses or 3 percent). because this question was an open ended response and responses varied greatly, including “over 100 (over 20 are regular contributors)” and “1–3”, standardization was required. when a range or multiple numbers were provided, the largest number was used. respondents were asked whether their library uses a workflow management process requiring page authors to receive approval before publishing content. of the 131 people who responded yes or no, most (88 responses or 67 percent) said no. cms users were asked to provide comments related to topics covered in the survey. many comments mentioned issues of control (or lack thereof), while another common theme was concerns with specific cmss. here is a sampling of responses received: • having dedicated staff is a necessity. there was a time when these tools could be installed and used by a techie generalist. those days are over. a professional content person and a professional cms person are a must if you want your site to look like a professional site... content management systems: trends in academic libraries | connell 53 i'm shocked at how many libraries switched to a cms yet still have a site that looks and feels like it was created 10 years ago. • since the cms was bred in-house by another university department, we do not have control over changing the design or layout. the last time i requested a change, they wanted to charge us. • our university marketing department, which includes the web team, is currently in the process of switching [cmss]. we were not invited to be involved in the selection process for a new cms, although they did receive my unsolicited advice. • we compared costs for open source and licensed systems, and we found the costs to be approximately equivalent based on the development work we would have needed in an open source environment. • the library was not part of the original selection process for the campus' first cms because my position didn't exist at that time. now that we have a dedicated web services position, the library is considered a "power user" in the cms and we are often part of the campus wide discussions about the new cms and strategic planning involving the campus website. • we currently do not have the preferred level of control over our library website; we fought for customization rights for our front page, and won on that front. however, departments on campus do not have permission to install or configure modules, which we hope will change in the future. • there’s a huge disconnect between it /administration and the library regarding unique needs of the library in the context of web-based delivery of information. discussion comparing the results of this study to previous studies indicates that cms usage within academic libraries is rising. the 64 percent cms adoption rate found in this survey, which used a more narrow definition of cms than some previous studies cited in the literature review, is higher than adoption rates in any of said studies. as more libraries make the transition, it is important to know how different cmss have been received among their peers. 
although cms users are slightly more satisfied than non-cms users (54 percent vs. 47 percent), the tools used matter. so if a library using dreamweaver to manage their site is given an option of moving with their institution to a cms and that cms is cascade server, they should strongly consider sticking with their current non-cms method based on the respective satisfaction levels reported in this study (63 percent vs. 27 percent). satisfaction levels are important, but should not be considered in a vacuum. for example, although libguides users reported very high satisfaction levels (100 percent were satisfied or very satisfied), users were mostly (11 out of 14 users or 79 percent) small or very small schools, while the remaining three (21percent) were medium schools. no large schools reported using libguides as their cms. libguides may be wonderful for a smaller school without need of much information technology and libraries | june 2013 54 customization or, in some cases, access to technical expertise but may not be a good cms solution for larger institutions. one of the largest issues raised by survey respondents was libraries’ control, or lack thereof, when moving to a campus-selected cms. given the complexity of academic libraries websites, library representation on campus-wide cms selection committees is warranted. not only are libraries more satisfied with the results when given a say in the selection, but libraries have special needs when it comes to website design that other campus units do not. including library representation ensures those needs are met. some of the respondents’ comments regarding lack of control over their sites are disturbing to libraries being forced or considering a move to a campus cms. clearly, having to pay another campus department to make changes to the library site is not an attractive option for most libraries. nor should libraries have to fight for the right or ability to customize their home pages. developing good working relationships with the decision makers may help prevent some of these problems, but likely not all. this study indicates that it is not uncommon for academic libraries to be forced into cmss, regardless of the cmss acceptability to the library environment. conclusion the adoption of cmss to manage academic libraries’ websites is increasing, but not all cmss are created equal. when given input into switching website management tools, library staff have many factors to take into consideration. these include, but are not limited to, in-house technical expertise, desirability of open source solutions, satisfaction of peer libraries with considered systems, and library specific needs, such as workflow management and customization requirements. ideally, libraries would always be partners at the table when campus-wide cms decisions are being made, but this study shows that this does not happen in most cases. if a library suspects that it is likely to be required to move to a campus-selected system, its staff should be alert for news of impending changes so that they can work to be involved at the beginning of the process to be able to provide input. a transition to a bad cms can have long-term negative effects on the library, its users, and staff. a library’s website is its virtual “branch” and vitally important to the functioning of the library. the management of such an important component of the library should not be left to chance. references 1. doug goans, guy leach, and teri m. 
vogel, “beyond html: developing and re-imagining library web guides in a content management system,” library hi tech 24, no. 1 (2006): 29–53, doi:10.1108/07378830610652095. 2. ruth sara connell, “survey of web developers in academic libraries,” the journal of academic librarianship 34, no. 2 (march 2008): 121–129, doi:10.1016/j.acalib.2007.12.005. http://dx.doi.org/10.1016/j.acalib.2007.12.005 content management systems: trends in academic libraries | connell 55 3. maira bundza, patricia fravel vander meer, and maria a. perez-stable, “work of the web weavers: web development in academic libraries,” journal of web librarianship 3, no. 3 (july 2009): 239–62. 4. david comeaux and axel schmetzke, “accessibility of academic library web sites in north america—current status and trends (2002–2012).” library hi tech 31, no. 1 (january 28, 2013): 2. 5. daniel verbit and vickie l. kline, “libguides: a cms for busy librarians,” computers in libraries 31, no. 6 (july 2011): 21–25. 6. amy york, holly hebert, and j. michael lindsay, “transforming the library website: you and the it crowd,” tennessee libraries 62, no. 3 (2012). 7. bundza, vender meer, and perez-stable, “work of the web weavers: web development in academic libraries.” 8. tom kmetz and ray bailey, “migrating a library’s web site to a commercial cms within a campus-wide implementation,” library hi tech 24, no. 1 (2006): 102–14, doi:10.1108/07378830610652130. 9. kimberley stephenson, “sharing control, embracing collaboration: cross-campus partnerships for library website design and management,” journal of electronic resources librarianship 24, no. 2 (april 2012): 91–100. 10. ibid. 11. elizabeth l. black, “selecting a web content management system for an academic library website,” information technology & libraries 30, no. 4 (december 2011): 185–89; andy austin and christopher harris, “welcome to a new paradigm,” library technology reports 44, no. 4 (june 2008): 5–7; holly yu , “chapter 1: library web content management: needs and challenges,” in content and workflow management for library web sites: case studies, ed. holly yu (hersey, pa: information science publishing, 2005), 1–21; wayne powel and chris gill, “web content management systems in higher education,” educause quarterly 26, no. 2 (2003): 43– 50; goans, leach, and vogel, “beyond html.” 12. kmetz and bailey, “migrating a library’s web site.” 13. carnegie foundation for the advancement of teaching, 2010 classification of institutions of higher education, accessed february 4, 2013, http://classifications.carnegiefoundation.org/descriptions/basic.php. 14. ronald r. powell , basic research methods for librarians (greenwood, 1997). http://classifications.carnegiefoundation.org/descriptions/basic.php identifying emerging relationships in healthcare domain journals via citation network analysis kuo-chung chu, hsin-ke lu, and wen-i liu information technology and libraries | march 2018 39 kuo-chung chu (kcchu@ntunhs.edu.tw) is professor, department of information management, and dean, college of health technology, national taipei university of nursing and health sciences; hsin-ke lu (sklu@sce.pccu.edu.tw) is associate professor, department of information management, and dean, school of continuing education, chinese culture university; wen-i liu (wenyi@ntunhs.edu.tw, corresponding author) is professor, department of nursing, and dean, college of nursing, national taipei university of nursing and health sciences. 
abstract online e-journal databases enable scholars to search the literature in a research domain or to cross-search an interdisciplinary field. the key literature can thereby be efficiently mapped. this study builds a web-based citation analysis system consisting of four modules: (1) literature search; (2) statistics; (3) article analysis; and (4) co-citation analysis. the system focuses on the pubmed central dataset and facilitates specific keyword searches in each research domain for authors, journals, and core issues. in addition, we use data mining techniques for co-citation analysis. the results could help researchers develop an in-depth understanding of the research domain. an automated system for co-citation analysis promises to facilitate understanding of the changing trends that affect the journal structure of research domains. the proposed system has the potential to become a value-added database of the healthcare domain, which will benefit researchers. introduction healthcare is a multidisciplinary research domain of medical services provided both inside and outside a hospital or clinical setting. article retrieval for systematic reviews in the domain is much more elusive than retrieval for reviews in clinical medicine because of the interdisciplinary nature of the field and the lack of a significant body of evaluative literature. other connecting research fields consist of the respective research fields of the application domain (i.e., the health sciences, including medicine and nursing).1 in addition, valuable knowledge and methods can be taken from the fields of psychology, the social sciences, economics, ethics, and law. further, the integration of those disciplines is attracting increasing interest.2 researchers may use bibliometrics to evaluate the influence of a paper or describe the relationship between citing and cited papers. citation analysis, one of several possible bibliometric approaches, is more popular than others because of the advent of information technologies.3 citation analysis counts the frequency of cited papers from a set of citing papers to determine the most influential scholars, publications, or universities in a discipline. it can be classified into two basic types: the first type counts only the citations in a paper that are authored by an individual, while the second type analyzes co-citations to identify intellectual links among authors in different articles. this paper focuses on the second type of citation analysis. small defined co-citation analysis as “the frequency with which two items of earlier literature are cited together by the later literature.”4 it is not only the most important type of bibliometric analysis, but also the most sophisticated and popular method. many other methods originate from citation analysis, including document co-citation analysis, bibliographic coupling,5 author co-citation analysis,6 and co-word analysis.7 there are levels of co-citation analysis: document, author, and journal. co-citation could be used to establish a cluster or “core” of earlier literature.8 the pattern of links between documents can establish a structure to highlight the relationship of research areas. citation patterns change when previously less-cited papers are cited more frequently, or old papers are no longer cited.
changing citation patterns imply the possibility of new developments in research areas; furthermore, we can investigate changing patterns to understand the scientific trend within a research domain.9 co-citation analysis can help obtain a global overview of research domains.10 the aim of this paper is to detect emerging issues in the healthcare research domain via citation network analysis. our results can provide a basis for knowledge that researchers can use to construct a search strategy. structural knowledge is intrinsic to problem solving. because of the interdisciplinary nature of the healthcare domain and the broadness of the term, research is performed in several research fields, such as nursing, nursing informatics, long-term care, medical informatics, geriatrics, information technology, telecommunications, and so forth. although electronic journals enable searching by author, article, and journal title using keywords or full text, the results are limited to article content and references and therefore do not provide an in-depth understanding of the knowledge structure in a specific domain. the knowledge structure includes the core journals, core issues, the analysis of research trends, and the changes in focus of researchers. for a novice researcher, however, the literature survey remains a troublesome process in terms of precisely identifying the key articles that highlight the overview concept in a specific domain. the process is complicated and time-consuming, and it limits the number of articles collected for retrospective research. the objective of this paper is to provide information about the challenges and methodology of relevant literature retrieval by systematically reviewing the effectiveness of healthcare strategies. to this end, we build a platform for automatically gathering the full text of ejournals offered by the pubmed central (pmc) database.11 we then analyze the co-citation results to understand the research theme of the domain. methods this paper tries to build a value-added literature database system for co-citation analysis of healthcare research. the results of the analysis will be visually presented to provide the structure of the domain knowledge to increase the productivity of researchers. information technology and libraries | march 2018 41 dataset for co-citation analysis, a data source of related articles on healthcare is required. for this paper, the articles were retrieved from the pmc database using search terms related to the healthcare domain. to build the article analysis system, we used bibliometrics to locate the relevant references while analysis techniques were implemented by the association rule algorithm of data mining. the pmc database, which is produced by the us national institutes of health and is implemented and maintained by the us national center for biotechnology information of the us national library of medicine, provides electronic articles from more than one thousand full-text journals for free. we could understand the publication status from the open access subset (oas) and access to the oai (open archives initiative) protocol for metadata harvesting, which includes the full text in xml and pdf. regarding access permission, pmc offers a dataset of many open access journal articles. this paper used a dedicated xml-formatted dataset (https://www.ncbi.nlm.nih.gov/pmc/tools/oai/). the xml-formatted dataset followed the specification of dtd (document type definition) files, which are sorted by journal title. 
each article has a pmcid (pmc identification), which is useful for data analysis. in addition to the dataset, the pmc also provides several web services to help widely disseminate articles to researchers. pubmed central (pmc) citation database searching module citation module web view users data sourcemiddle-end pre-processeingback-end front-end xml files web serverdb server keyword co-citation module statistical module figure 1. the system architecture of citation analysis with four subsystems. https://www.ncbi.nlm.nih.gov/pmc/tools/oai/ identifying emerging issues in the healthcare domain | chu, lu, and liu 42 https://doi.org/10.6017/ital.v37i1.9595 system architecture our development environment consisted of the following four subsystems: front-end, middle-end, back-end, and pre-processing. the front-end creates a “web view,” a visualization of the results for our web-based co-citation analysis system. the system architecture is shown in figure 1. front-end development subsystem we used adobe dreamweaver cs5 as a visual development tool for the design of web templates. the php programming language was chosen to build the co-citation system that would be used to access and analyze the full-text articles. in terms of the data mining technique, we implemented the apriori algorithm with the php language.12 the results were exported as xml to a charting process, where we used amcharts (https://www.amcharts.com/), to create stock charts, column charts, pie charts, scatter charts, line charts, and so forth. middle-end server subsystem the system architecture was a microsoft windows-based environment with a xampp 2.5 web server platform (https://www.apachefriends.org/download.html). xampp is a cross-platform web development kit that consists of apache, mysql, php, and perl. it works across several operating systems, such as linux, windows, apache, macos, and oracle solaris, and provides ssl encryption, a phpmyadmin database management system, webalizer traffic management and control suite, a mail server (mercury mail transport system), and filezilla ftp server. back-end database subsystem to speed up co-citation analysis, the back-end database system used mysql 5.0.51b with interface phpmyadmin 2.11.7 for easy management of the database. mysql includes the following features: • using c and c++ to code programs, users can develop an application programming interface (api) through visual basic, c, c + +, eiffel, java, perl, php, python, ruby, and tcl languages with the multithreading capability that can be used in multi-cpu systems and easily linked to other databases. • performance of querying articles is quick because sql commands are optimally implemented, providing many additional commands and functions for a user-friendly and flexible operating database. an encryption mechanism is also offered to improve data confidentiality. • mysql can handle a large-scale dataset. the storage capacity is up to 2tb for win32 nts systems and up to 4tb for linux ext3 systems. • it provides the software myodbc as an odbc driver for connecting many programming languages, and it several languages and character sets to achieve localization and internationalization. pre-processing subsystem the pmc provides access to the article via oas, oai services, e-utilities, and ftp. we used ftp to download a compressed (zip) file packaged with a filename following the pattern “articles?-?.xml.tar.gz” on october 28, 2012 (ftp://ftp.ncbi.nlm.nih.gov/pub/pmc), where “?-?” is “0-9” or “a-z”. 
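a minimal python sketch of this pre-processing step is given below: it downloads one bulk archive, unpacks it, and pulls the citation-related fields out of a single .nxml article. the archive name is a hypothetical instance of the "articles?-?.xml.tar.gz" pattern quoted above, and the element names (article-id, journal-title, source, ref) assume a jats-style tag set that may differ from the 2012 snapshot the authors actually processed.

import ftplib
import tarfile
import xml.etree.ElementTree as ET

HOST = "ftp.ncbi.nlm.nih.gov"
REMOTE_DIR = "/pub/pmc"                  # path quoted in the text
ARCHIVE = "articles.A-B.xml.tar.gz"      # hypothetical instance of "articles?-?.xml.tar.gz"

# 1. download one bulk archive of .nxml article files (anonymous ftp)
with ftplib.FTP(HOST) as ftp:
    ftp.login()
    ftp.cwd(REMOTE_DIR)
    with open(ARCHIVE, "wb") as fh:
        ftp.retrbinary("RETR " + ARCHIVE, fh.write)

# 2. unpack; articles are grouped in folders named by abbreviated journal title
with tarfile.open(ARCHIVE, "r:gz") as tar:
    tar.extractall("pmc_articles")

# 3. pull the fields needed for citation analysis out of one article file
def parse_article(path):
    root = ET.parse(path).getroot()
    pmcid = next((el.text for el in root.iter("article-id")
                  if el.get("pub-id-type") == "pmc"), None)
    journal = root.findtext(".//journal-title")
    # cited journal (<source>) and year for each entry in the reference list
    refs = [(ref.findtext(".//source"), ref.findtext(".//year"))
            for ref in root.iter("ref")]
    return pmcid, journal, refs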
the size of the zip file was approximately 6.17gb. after extraction, the size of the articles was approximately 10gb. the 571,890 articles from 3,046 journals were grouped and https://www.amcharts.com/ https://www.apachefriends.org/download.html ftp://ftp.ncbi.nlm.nih.gov/pub/pmc information technology and libraries | march 2018 43 sorted by journal title in a folder labeled with an abbreviated title. an xml file would, for example, be named “aapsj-10-1-2751445.nxml,” where “aapsj” was the abbreviated title of the journal american association of pharmaceutical scientists journal, “10” was the volume of the journal, “1” was number of the issue, and “2751445” was the pmcid. we used related technologies for developing systems that include php language, array usage, and the apriori algorithm to analyze the articles and build the co-citation system.13 finally, several analysis modules were created to build an integrated co-citation system. research procedure the following is our seven-step research procedure to fulfill the integrated co-citation system: 1. parse xml file: select tags for construction of database; choose fields for co-citation analysis (for example, , , and ). 2. present web-based article: design webpage and css style; present web-based xml file by indexing variable . 3. build an abstract database: the database consists of several fields: , , , , , , and . 4. develop searching module: pass the keyword to the method “post” in sql query language and present the search result in the webpage. 5. develop statistical module: the statistical results include number of article and cited articles, the journals and authors cited in all articles, and the number of cited articles. 6. develop citation module: visually present the statistical results in several formats; rank searched journals; rank searched and cited journals in all the articles. 7. develop co-citation module: analyze the association between articles with the apriori algorithm. association rule algorithms the association rule (ar), usually represented by ab, means that the transaction containing item a also contains item b. there are many such rules in most of the dataset, but some were useless. to validate the rules, two indicators, support and confidence, can be applied. support, which means usefulness, is the number of times the rules feature in the transactions, whereas confidence means certainty, which is the probability that b occurs whenever the a occurs. we chose the rules for which the values of both support and confidence were greater than a predefined threshold. for example, a rule stipulating “toastjam” has support of 1.2 percent and confidence of 65 percent, implying that 1.2 percent of the transactions contain “toast” and “jam” and that 65 percent of the transactions containing “toast” also contained “jam.” the principle for generating the ar is based on two features of the documents: (1) find the highfrequency items that set their supports greater than the threshold; (2) for each dataset x and its subnet y, check the rule xy if the support is greater than the threshold, in which the rule xy means that the occurrence in the rule containing x also contains y. 
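for example, both indicators can be computed directly from a list of transactions. the short python sketch below does so for a single candidate rule using made-up market-basket data; it illustrates the definitions only and is not the authors' php implementation.

def support_and_confidence(transactions, a, b):
    """support of {a, b} and confidence of the rule a -> b."""
    n = len(transactions)
    both = sum(1 for t in transactions if a in t and b in t)
    only_a = sum(1 for t in transactions if a in t)
    support = both / n
    confidence = both / only_a if only_a else 0.0
    return support, confidence

# toy transactions echoing the toast/jam example in the text
transactions = [
    {"toast", "jam", "milk"},
    {"toast", "butter"},
    {"toast", "jam"},
    {"milk", "jam"},
]
print(support_and_confidence(transactions, "toast", "jam"))  # (0.5, 0.666...)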
most studies focus on searching high-frequency item sets.14 the most popular approach for identifying the item sets is the apriori algorithm, as shown in figure 2.15 the algorithm rationale is that if the support of item set i is less than or equal to the threshold, i is not a high-frequency item set, and a new item set formed by inserting any item a into i cannot be a high-frequency item set either. according to this rationale, the apriori algorithm is an iteration-based approach. first, it generates candidate item set C1 by calculating the number of occurrences of each attribute and finding the high-frequency item set L1 whose support is greater than the threshold. second, it generates item set C2 by joining L1 to C1, iteratively finding L2 and generating C3, and so on.

1: L1 = {large 1-item sets};
2: for (k = 2; Lk-1 ≠ ∅; k++) do begin
3:   Ck = candidate_gen(Lk-1);
4:   for all transactions t ∈ D do begin  /* generate candidate k-item sets */
5:     Ct = subset(Ck, t);
6:     for all candidates c ∈ Ct do
7:       c_count = c_count + 1;
8:   end
9:   Lk = {c ∈ Ck | c_count ≥ minsupport}
10: end
11: return L = ∪k Lk;

figure 2. the apriori algorithm.

the apriori algorithm is one of the most commonly used methods for ar induction. the candidate_gen algorithm, as shown in figure 3, includes join and prune operations for generating candidate sets.16 steps 1 to 4 generate all possible candidate item sets c from Lk-1. steps 5 to 8 prune candidates whose (k-1)-subsets are not frequent item sets. step 9 returns candidate set Ck to the main algorithm.

1: for each item set x1 ∈ Lk-1
2:   for each item set x2 ∈ Lk-1
3:     c = join(x1[1], x1[2], ..., x1[k-2], x1[k-1], x2[k-1])
4:       where x1[1] = x2[1], ..., x1[k-2] = x2[k-2], x1[k-1] < x2[k-1];
5: for item sets c ∈ Ck do
6:   for all (k-1)-subsets s of c do
7:     if (s ∈ Lk-1) then add c to Ck;
8:     else delete c from Ck;
9: return Ck;

figure 3. the candidate_gen algorithm.

results
we searched the pmc database with keywords "healthcare," "telecare," "ecare," "ehealthcare," and "telemedicine" and located 681 articles with a combined 14,368 references. values were missing from the year field for 4 of the references; this was also the case for 635 of a total of 52,902 authors. figure 4 shows a pie chart of the journal citation analysis for the healthcare keyword search; the top-ranked journal in terms of citations was the british medical journal (bmj), cited approximately 439 times, 18.89 percent of the total, followed by the journal of the american medical association (jama), cited approximately 344 times, 14.80 percent of the total. the trend of healthcare citations from 1852 to 2009 peaked in 2006 at approximately 1,419 citations, with more than half of the total occurring in this year.

figure 4. top-cited journals in the healthcare domain by percentage of total citations (n = 2324).

figure 5 shows a pie chart of the author citations for the same keyword search. the most-cited author was j. w. varni, professor of pediatric cardiology at the university of michigan mott children's hospital in ann arbor, cited approximately 149 times, equivalent to 23.24 percent of the total, followed by d. n. herndon, professor at the department of plastic and hand surgery, friedrich-alexander university of erlangen in germany, cited approximately 73 times, 11.39 percent of the total.
by identifying the affiliations of the topranked authors, researchers can access related information in their field of interest. the co-citation analysis was conducted using the apriori algorithm. the relationship of co-citation journals with a supporting degree greater than 38 from 1852 to 2009 is shown in figure 6. each identifying emerging issues in the healthcare domain | chu, lu, and liu 46 https://doi.org/10.6017/ital.v37i1.9595 journal was denoted by a node, where the node with double circle meant the journal is co-cited with the other in a citing article. bmj, which covers the fields of evidence-based nursing care, obstetrics, healthcare, nursing knowledge and practices, and others, is the core journal of the healthcare domain. figure 5. top-cited authors in journals of the healthcare domain by percentage of total citations (n = 641) figure 6. the relationship of co-citation journals with bmj. information technology and libraries | march 2018 47 to identify the focus of the journal, we analyze the co-citation in three periods. in 1852–1907, journals are not in co-citation relationships; in 1908–61, five candidates had a supporting degree greater than 1 (see table 1); and in 1962–2009, twenty-eight candidates had a supporting degree greater than 14 (see table 2 (for example, bmj and lancet had sixty-eight co-citations). table 1. candidates in co-citation analysis with a supporting degree greater than 1 (1908–61). no journals no. of journals co-cited support 1 publ math inst hung acad sci, publ math 2 3 2 jaoa, j osteopath 2 1 3 antioch rev, j abnorm soc psychol 2 1 4 n engl j med, am surg 2 1 5 arch neurol psychiatry, j neurol psychopathol, z ges neurol psychiat 3 1 table 2. candidates in co-citation analysis with a supporting degree greater than 14 (1962–2009). no journals no. of journals co-cited support 1 bmj, lancet 2 68 2 bmj, jama 2 65 3 jama, med care 2 64 4 bmj, arch intern med 2 61 5 lancet, jama 2 52 6 soc sci med, bmj 2 52 7 jama, arch intern med 2 51 8 lancet, med care 2 50 9 crit care med, prehospital disaster med 2 49 10 n engl j med, bmj 2 49 11 n engl j med, lancet 2 49 12 n engl j med, jama 2 47 13 n engl j med, med care 2 47 14 qual saf health care, bmj 2 47 15 bmj, crit care med 2 42 16 med care, bmj 2 38 17 n engl j med, j bone miner res 2 33 identifying emerging issues in the healthcare domain | chu, lu, and liu 48 https://doi.org/10.6017/ital.v37i1.9595 18 n engl j med, j pediatr surg 2 26 19 lancet, j pediatr surg 2 25 20 jama, nature 2 25 21 lancet, jama, bmj 3 24 22 n engl j med, lancet, bmj 3 21 23 intensive care med, bmj 2 21 24 bmj, n engl j med, jama 3 20 25 n engl j med, jama, lancet 3 20 26 jama, med care, lancet 3 14 27 jama, med care, n engl j med 3 14 28 bmj, jama, lancet, n engl j med 4 14 the link of co-citation journals in three periods from 1852 to 2009 can be summarized as follows: (1) three journals were highly cited but were not in a co-citation relationship in 1852–1907 (see figure 7); (2) five clusters of the healthcare journals in co-citation relationships were found for the years 1908–61 (see figure 8); and (3) 1962–2009 had a distinct cluster of four journals within the healthcare domain (see figure 9). figure 7. the relationship of co-citation journals for the healthcare domain in 1852–1907. information technology and libraries | march 2018 49 figure 8. the relationship of co-citation journals for the healthcare domain in 1908–61. journals with double circles are co-cited with the other in a citing article. 
journals with triple circles are cocited with the other two in a citing article. figure 9. the relationship of co-citation journals for the healthcare domain in 1962–2009. the thick line and circle indicates the journals are co-cited in a citing article. conclusions identifying emerging issues in the healthcare domain | chu, lu, and liu 50 https://doi.org/10.6017/ital.v37i1.9595 this paper presented an automated literature system for co-citation analysis to facilitate understanding of the sequence structure of journal articles cited in the healthcare domain. the system visually presents the results of its analysis to help researchers quickly identify the key articles that provide an overview of the healthcare domain. this paper used the keywords related to healthcare for its analysis and found that bmj is a core journal in the domain. the co-citation analysis found a single cluster within the healthcare domain comprising four journals: bmj, jama, lancet, and the new england journal of medicine. this paper focused on a co-citation analysis of journals. authors, articles, and issues featured in the co-citation analysis can be further studied in an automated way. a period analysis of publication years is also important. further analyses can facilitate understanding of the changes in a research domain and the trend of research issues. in addition, the automatic generation of a map would be a worthwhile topic for the future study. acknowledgements this article was funded by the ministry of science and technology of taiwan (most), formerly known as national science council (nsc), with grant no: nsc 100-2410-h-227-003. for the remaining authors none were declared. all the authors have made significant contributions to the article and agree with its content. there is no known conflict of interest in this study. references 1 a. kitson et al., “what are the core elements of patient-centered care? a narrative review and synthesis of the literature from health policy, medicine and nursing,” journal of advanced nursing 69 (2013): 4–8, https://doi.org/10.1111/j.1365-2648.2012.06064.x. 2 s. j. brownsell et al., “future systems for remote health care,” journal of telemedicine and telecare 5 (1999): 145–48, https://doi.org/10.1258/1357633991933503; b. g. celler, n. h. lovell, and d. k. chan, “the potential impact of home telecare on clinical practice,” medical journal of australia 171 (1999): 518–20; r. walker et al., “what it will take to create new internet initiatives in health care,” journal of medical systems 27 (2003): 95–98, https://doi.org/10.1023/a:1021065330652. 3 i. marshakova-shaikevich, the standard impact factor as an evaluation tool of science fields and scientific journals,” scientometrics 35 (1996): 283–85, https://doi.org/10.1007/bf02018487; i. marshakova-shaikevich, “bibliometric maps of field of science,” information processing & management 41(2005):1536–45, https://doi.org/10.1016/j.ipm.2005.03.027; a. r. ramosrodrí guez and j. ruí z-navarro, “changes in the intellectual structure of strategic management research: a bibliometric study of the strategic management journal, 1980–2000,” strategic management journal 25, no. 10 (2004): 982–1000, https://doi.org/10.1002/smj.397. 4 h. small, “co-citation in the scientific literature: a new measure of the relationship between two documents,” journal of american society for information science 24 (1973): 266–68. 
https://doi.org/10.1111/j.1365-2648.2012.06064.x https://doi.org/10.1258/1357633991933503 https://doi.org/10.1023/a:1021065330652 https://doi.org/10.1007/bf02018487 https://doi.org/10.1016/j.ipm.2005.03.027 https://doi.org/10.1002/smj.397 information technology and libraries | march 2018 51 5 m. m. kessler, “bibliographic coupling between scientific papers,” american documentation 14 (1963): 10–25, https://doi.org/10.1002/asi.5090140103; b. h. weinberg, “bibliographic coupling: a review,” information storage and retrieval 10 (1974): 190–95. 6 h. d. white and b. c. griffith, “author cocitation: a literature measure of intellectual structure,” journal of the american society for information science 32 (1981): 164–70, https://doi.org/10.1002/asi.4630320302. 7 y. ding, g. g. chowdhury, and s. foo, “bibliometric cartography of information retrieval research by using co-word analysis,” information processing & management 37 no. 6 (november 2001): 818–20, https://doi.org/10.1016/s0306-4573(00)00051-0. 8 small, “co-citation,” 266. 9 d. sullivan et al., “understanding rapid theoretical change in particle physics: a month-bymonth co-citation analysis,” scientometrics 2 (1980): 312–16, https://doi.org/10.1007/bf02016351. 10 n. shibata et al., “detecting emerging research fronts based on topological measures in citation networks of scientific publications,” technovation 28 (2008): 762–70, https://doi.org/10.1016/j.technovation.2008.03.009. 11 weinberg, “bibliographic coupling.” 12 white and griffith, “author cocitation.” 13 r. agrawal and r. srikant. “fast algorithm for mining association rules in large databases” (paper, international conference on very large databases [vldb], september 12–15, 1994, santiago de chile). 14 r. agrawal, t. imielinski, and a. swami, “mining association rules between sets of items in large databases” (paper, acm sigmod international conference on management of data, washington, dc, may 25–28, 1993. 15 agrawal and srikant, “fast algorithm,” 3. 16 ibid., 4. https://doi.org/10.1002/asi.5090140103 https://doi.org/10.1002/asi.4630320302 https://doi.org/10.1016/s0306-4573(00)00051-0 https://doi.org/10.1007/bf02016351 https://doi.org/10.1016/j.technovation.2008.03.009 abstract introduction methods dataset system architecture front-end development subsystem middle-end server subsystem back-end database subsystem pre-processing subsystem research procedure association rule algorithms results conclusions acknowledgements references web services and widgets for library information systems | han 87on the clouds: a new way of computing | han 87 shape cloud computing. for example, sun’s well-known slogan “the network is the computer” was established in late 1980s. salesforce.com has been providing on-demand software as a service (saas) for customers since 1999. ibm and microsoft started to deliver web services in the early 2000s. microsoft’s azure service provides an operating system and a set of developer tools and services. google’s popular google docs software provides web-based word-processing, spreadsheet, and presentation applications. google app engine allows system developers to run their python/java applications on google’s infrastructure. sun provides $1 per cpu hour. amazon is well-known for providing web services such as ec2 and s3. yahoo! announced that it would use the apache hadoop framework to allow users to work with thousands of nodes and petabytes (1 million gigabytes) of data. 
these examples demonstrate that cloud computing providers are offering services on every level, from hardware (e.g., amazon and sun), to operating systems (e.g., google and microsoft), to software and service (e.g., google, microsoft, and yahoo!). cloud-computing providers target a variety of end users, from software developers to the general public. for additional information regarding cloud computing models, the university of california (uc) berkeley’s report provides a good comparison of these models by amazon, microsoft, and google.4 as cloud computing providers lower prices and it advancements remove technology barriers—such as virtualization and network bandwidth—cloud computing has moved into the mainstream.5 gartner stated, “organizations are switching from factors related to cloud computing: infinite computing resources available on demand, removing the need to plan ahead; the removal of an up-front costly investment, allowing companies to start small and increase resources when needed; and a system that is pay-for-use on a short-term basis and releases customers when needed (e.g., cpu by hour, storage by day).2 national institute of standards and technology (nist) currently defines cloud computing as “a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g. network, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.”3 as there are several definitions for “utility computing” and “cloud computing,” the author does not intend to suggest a better definition, but rather to list the characteristics of cloud computing. the term “cloud computing” means that ■■ customers do not own network resources, such as hardware, software, systems, or services; ■■ network resources are provided through remote data centers on a subscription basis; and ■■ network resources are delivered as services over the web. this article discusses using cloud computing on an it-infrastructure level, including building virtual server nodes and running a library’s essential computer systems in remote data centers by paying a fee instead of running them on-site. the article reviews current cloud computing services, presents the author’s experience, and discusses advantages and disadvantages of using the new approach. all kinds of clouds major it companies have spent billions of dollars since the 1990s to on the clouds: a new way of computing this article introduces cloud computing and discusses the author’s experience “on the clouds.” the author reviews cloud computing services and providers, then presents his experience of running multiple systems (e.g., integrated library systems, content management systems, and repository software). he evaluates costs, discusses advantages, and addresses some issues about cloud computing. cloud computing fundamentally changes the ways institutions and companies manage their computing needs. libraries can take advantage of cloud computing to start an it project with low cost, to manage computing resources cost-effectively, and to explore new computing possibilities. s cholarly communication and new ways of teaching provide an opportunity for academic institutions to collaborate on providing access to scholarly materials and research data. there is a growing need to handle large amounts of data using computer algorithms that presents challenges to libraries with limited experience in handling nontextual materials. 
because of the current economic crisis, academic institutions need to find ways to acquire and manage computing resources in a cost-effective manner. one of the hottest topics in it is cloud computing. cloud computing is not new to many of us because we have been using some of its services, such as google docs, for years. in his latest book, the big switch: rewiring the world, from edison to google, carr argues that computing will go the way of electricity: purchase when needed, which he calls “utility computing.” his examples include amazon’s ec2 (elastic computing cloud), and s3 (simple storage) services.1 amazon’s chief technology officer proposed the following yan hantutorial yan han (hany@u.library.arizona.edu) is associate librarian, university of arizona libraries, tucson. 88 information technology and libraries | june 201088 information technology and libraries | june 2010 company-owner hardware and software to per-use service-based models.”6 for example, the u.s. government website (http://www.usa .gov/) will soon begin using cloud computing.7 the new york times used amazon’s ec2 and s3 services as well as a hadoop application to provide open access to public domain articles from 1851 to 1922. the times loaded 4 tb of raw tiff images and their derivative 11 million pdfs into amazon’s s3 in twenty-four hours at very reasonable cost.8 this project is very similar to digital library projects run by academic libraries. oclc announced its movement of library management services to the web.9 it is clear that oclc is going to deliver a web-based integrated library system (ils) to provide a new way of running an ils. duraspace, a joint organization by fedora commons and dspace foundation, announced that they would be taking advantage of cloud storage and cloud computing.10 on the clouds computing needs in academic libraries can be placed into two categories: user computing needs and library goals. user computing needs academic libraries usually run hundreds of pcs for students and staff to fulfill their individual needs (e.g., microsoft office, browsers, and image-, audio-, and video-processing applications). library goals a variety of library systems are used to achieve libraries’ goals to support research, learning, and teaching. these systems include the following: ■■ library website: the website may be built on simple html webpages or a content management system such as drupal, joomla, or any home-grown php, perl, asp, or jsp system. ■■ ils: this system provides traditional core library work such as cataloging, acquisition, reporting, accounting, and user management. typical systems include innovative interfaces, sirsidynix, voyager, and opensource software such as koha. ■■ repository system: this system provides submission and access to the institution’s digital collections and scholarship. typical systems include dspace, fedora, eprints, contentdm, and greenstone. ■■ other systems: for example, federated search systems, learning object management systems, interlibrary loan (ill) systems, and reference tracking systems. ■■ public and private storage: staff file-sharing, digitization, and backup. due to differences in end users and functionality, most systems do not use computing resources equally. for example, the ils is input and output intensive and database query intensive, while repository systems require storage ranging from a few gigabytes to dozens of terabytes and substantial network bandwidth. cloud computing brings a fundamental shift in computing. 
it changes the way organizations acquire, configure, manage, and maintain computing resources to achieve their business goals. the availability of cloud computing providers allows organizations to focus on their business and leave general computing maintenance to the major it companies. in the fall of 2008, the author started to research cloud computing providers and how he could implement cloud computing for some library systems to save staff and equipment costs. in january 2009, the author started his plan to build library systems “on the clouds.” the university of arizona libraries (ual) has been a key player in the process of rebuilding higher education in afghanistan since 2001. ual librarian atifa rawan and the author have received multiple grant contracts to build technical infrastructures for afghanistan’s academic libraries. the technical infrastructure includes the following: ■■ afghanistan ils: a bilingual ils based on the open-source system koha.11 ■■ afghanistan digital libraries website (http://www.afghan digitallibraries.org/): originally built on simple html pages, later rebuilt in 2008 using the content management system joomla. ■■ a digitization management system. the author has also developed a japanese ill system (http://gif project.libraryfinder.org) for the north american coordinating council on japanese library resources. these systems had been running on ual’s internal technical infrastructure. these systems run in a complex computing environment, require different modules, and do not use computing resources equally. for example, the afghan ils runs on linux, apache, mysql, and perl. its opac and staff interface run on two different ports. the afghanistan digital libraries website requires linux, apache, mysql, and php. the japanese ill system was written in java and runs on tomcat. there are several reasons why the author moved these systems to the new cloud computing infrastructure: ■■ these systems need to be accessed in a system mode by people who are not ual employees. ■■ system rebooting time can be substantial in this infrastructure because of server setup and it policy. ■■ the current on-site server has web services and widgets for library information systems | han 89on the clouds: a new way of computing | han 89 reached its life expectancy and requires a replacement. by analyzing the complex needs of different systems and considering how to use resources more effectively, the author decided to run all the systems through one cloud computing provider. by comparing the features and the costs, linode (http://www.linode.com/) was chosen because it provides full ssh and root access using virtualization, four data centers in geographically diverse areas, high availability and clustering support, and an option for month-to-month contracts. in addition, other customers have provided positive reviews. in january 2009, the author purchased one node located in fremont, california, for $19.95 per month. an implementation plan (see appendix) was drafted to complete the project in phases. the author owns a virtual server and has access to everything that a physical server provides. in addition, the provider and the user community provided timely help and technical support. 
the migration of systems was straightforward: a linux kernel (debian 4.0) was installed within an hour, domain registration was complete and the domains went active in twenty-four hours, the afghanistan digital libraries’ website (based on joomla) migration was complete within a week, and all supporting tools and libraries (e.g., mysql, tomcat, and java sdk) were installed and configured within a few days. a month later, the afghanistan ils (based on koha) migration was completed. the ill system was also migrated without problem. tests have been performed in all these systems to verify their usability. in summary, the migration of systems was very successful and did not encounter any barriers. it addresses the issues facing us: after the migration, ssh log-ins for users who are not university employees were set up quickly; systems maintenance is managed by the author’s team, and rebooting now only takes about one minute; and there is no need to buy a new server and put it in a temperature and security controlled environment. the hardware is maintained by the provider. the administrative gui for the linux nodes is shown in figure 1. since migration, no downtime because of hardware or other failures caused by the provider has been observed. after migrating all the systems successfully and running them in a reliable mode for a few months, the second phase was implemented (see appendix). another linux node (located in atalanta, georgia) was purchased for backup and monitoring (see figure 2). nagios, an open-source monitoring system, was tested and configured to identify and report problems for the above library systems. nagios provides the following functions: (1) monitoring critical computing components, such as the network, systems, services, and servers; (2) timely alerts delivered via e-mail or cell phone; and (3) report and record logs of outages, events, and alerts. a backup script is also run as a prescheduled job to back up the systems on a regular basis. figure 1. linux node administration web interface figure 2. two linux nodes located in two remote data centers node 1: 64.62.xxx.xxx (fremont, ca) node 2: 74.207.xxx.xxx (atlanta, ga) nagios backup afghan digital libraries website afghan ils interlibrary loan system dspace 90 information technology and libraries | june 201090 information technology and libraries | june 2010 findings and discussions since january 2009, all the systems have been migrated and have been running without any issues caused by the provider. the author is very satisfied with the outcomes and cost. the annual cost of running two nodes is $480 per year, compared to at least $4,000 dollars if the hardware had been run in the library.12 from the author ’s experience, cloud computing provides the following advantages over the traditional way of computing in academic institutions: ■■ cost-effectiveness: from the above example and literature review, it is obvious that using cloud computing to run applications, systems, and it infrastructure saves staff and financial resources. uc berkeley’s report and zawodny’s blog provide a detailed analysis of costs for cpu hours and disk storage.13 ■■ flexibility: cloud computing allows organizations to start a project quickly without worrying about up-front costs. computing resources such as disk storage, cpu, and ram can be added when needed. in this case, the author started on a small scale by purchasing one node and added additional resources later. 
■■ data safety: organizations are able to purchase storage in data centers located thousands of miles away, increasing data safety in case of natural disasters or other factors. this strategy is very difficult to achieve in a traditional off-site backup. ■■ high availability: cloud computing providers such as microsoft, google, and amazon have better resources to provide more up-time than almost any other organizations and companies do. ■■ the ability to handle large amounts of data: cloud computing has a pay-for-use business model that allows academic institutions to analyze terabytes of data using distributed computing over hundreds of computers for a short-time cost. on-demand data storage, high availability and data safety are critical features for academic libraries.14 however, readers should be aware of some technical and business issues: ■■ availability of a service: in several widely reported cases, amazon’s s3 and google gmail were inaccessible for a duration of several hours in 2008. the author believes that the commercial providers have better technical and financial resources to keep more up-time than most academic institutions. for those wanting no single point of failure (e.g., a provider goes out of business), the author suggests storing duplicate data with a different provider or locally. ■■ data confidentiality: most academic libraries have open-access data. this issue can be solved by encrypting data before moving to the clouds. in addition, licensing terms can be negotiated with providers regarding data safety and confidentiality. ■■ data transfer bottlenecks: accessing the digital collections requires considerable network bandwidth, and digital collections are usually optimized for customer access. moving huge amounts of data (e.g., preservation digital images, audios, videos, and data sets) to data centers can be scheduled during off hours (e.g., 1–5 a.m.), or data can be shipped on hard disks to the data centers. ■■ legal jurisdiction: legal jurisdiction creates complex issues for both providers and end users. for example, canadian privacy laws regulate data privacy in public and private sectors. in 2008, the office of the privacy commissioner of canada released a finding that “outsourcing of canada .com email services to u.s.-based firm raises questions for subscribers,” and expressed concerns about public sector privacy protection.15 this brings concerns to both providers and end users, and it was suggested that privacy issues will be very challenging.16 summary the author introduces cloud computing services and providers, presents his experience of running multiple systems such as ils, content management systems, repository software, and the other system “on the clouds” since january 2009. using cloud computing brings significant cost savings and flexibility. however, readers should be aware of technical and business issues. the author is very satisfied with his experience of moving library systems to cloud computing. his experience demonstrates a new way of managing critical computing resources in an academic library setting. the next steps include using cloud computing to meet digital collections’ storage needs. cloud computing brings fundamental changes to organizations managing their computing needs. as major organizations in library fields, such as oclc, started to take advantage of cloud computing, the author believes that cloud computing will play an important role in library it. 
acknowledgments the author thanks usaid and washington state university for providing financial support. the author thanks matthew cleveland’s excellent work “on the clouds.” references 1. nicholars carr, the big switch: rewiring the world, from edison to google web services and widgets for library information systems | han 91on the clouds: a new way of computing | han 91 (london: norton, 2008). 2. werner vogels, “a head in the clouds—the power of infrastructure as a service” (paper presented at the cloud computing and in applications conference (cca ’08), chicago, oct. 22–23, 2008). 3. peter mell and tim grance, “draft nist working definition of cloud computing,” national institute of standards and technology (may 11, 2009), http:// csrc.nist.gov/groups/sns/cloud-computing/index.html (accessed july 22, 2009). 4. michael armbust et al., “above the clouds: a berkeley view of cloud computing,” technical report, university of california, berkeley, eecs department, feb. 10, 2009, http://www.eecs.berkeley .edu/pubs/techrpts/2009/eecs-200928.html (accessed july 1, 2009). 5. eric hand, “head in the clouds: ‘cloud computing’ is being pitched as a new nirvana for scientists drowning in data. but can it deliver?” nature 449, no. 7165 (2007): 963; geoffery fowler and ben worthen, “the internet industry is on a cloud—whatever that may mean,” wall street journal, mar. 26, 2009, http://online.wsj.com/article/ sb123802623665542725.html (accessed july 14, 2009); stephen baker, “google and the wisdom of the clouds,” business week (dec. 14, 2007), http://www.msnbc .msn.com/id/22261846/ (accessed july 8, 2009). 6. gartner, “gartner says worldwide it spending on pace to supass $3.4 trillion in 2008,” press release, aug. 18, 2008, http://www.gartner.com/it/page .jsp?id=742913 (accessed july 7, 2009). 7. wyatt kash, “usa.gov, gobierno usa.gov move into the internet cloud,” government computer news, feb. 23, 2009, http://gcn.com/articles/2009/02/23/ gsa-sites-to-move-to-the-cloud.aspx?s =gcndaily_240209 (accessed july 14, 2009). 8. derek gottfrid, “self-service, prorated super computing fun!” online posting, new york times open, nov. 1, 2007, http://open.blogs .nytimes.com/2007/11/01/self-service -prorated-super-computing-fun/?scp =1&sq=self%20service%20prorated&st =cse (accessed july 8, 2009). 9. oclc online computing library center, “oclc announces strategy to move library management services to web scale,” press release, apr. 23, 2009, http://www.oclc.org/us/en/news/ releases/200927.htm (accessed july 5, 2009). 10. duraspace, “fedora commons and dspace foundation join together to create duraspace organization,” press release, may 12, 2009, http:// duraspace.org/documents/pressrelease .pdf (accessed july 8, 2009). 11. yan han and atifa rawan, “afghanistan digital library initiative: revitalizing an integrated library system,” information technology & libraries 26, no. 4 (2007): 44–46. 12. fowler and worthen, “the internet industry is on a cloud.” 13. jeremy zawodney, “replacing my home backup server with amazon’s s3,” online posting, jeremy zawodny’s blog, oct. 3, 2006, http://jeremy .zawodny.com/blog/archives/007624 .html (accessed june 19, 2009). 14. yan han, “an integrated high availability computing platform,” the electronic library 23, no. 6 (2005): 632–40. 15. 
office of the privacy commissioner of canada, “tabling of privacy commissioner of canada’s 2005–06 annual report on the privacy act: commissioner expresses concerns about public sector privacy protection,” press release, june 20, 2006, http://www.priv.gc.ca/media/ nr-c/2006/nr-c_060620_e.cfm (accessed july 14, 2009); office of the privacy commissioner of canada, “findings under the personal information protection and electronic documents act (pipeda),” (sept. 19, 2008), http://www.priv.gc.ca/cf -dc/2008/394_20080807_e.cfm (accessed july 14, 2009). 16. stephen baker, “google and the wisdom of the clouds,” business week (dec. 14, 2007), http://www.msnbc.msn .com/id/22261846/ (accessed july 8, 2009). appendix. project plan: building ha linux platform using cloud computing project manager: project members: object statement: to build a high availability (ha) linux platform to support multiple systems using cloud computing in six months. scope: the project members should identify cloud computing providers, evaluate the costs, and build a linux platform for computer systems, including afghan ils, afghanistan digital libraries website, repository system, japanese interlibrary loan website, and digitization management system. resources: project deliverable: january 1, 2009—july 1, 2009 92 information technology and libraries | june 201092 information technology and libraries | june 2010 phase i ■■ to build a stable and reliable linux platform to support multiple web applications. the platform needs to consider reliability and high availability in a cost-effective manner ■■ to install needed libraries for the environment ■■ to migrate ils (koha) to this linux platform ■■ to migrate afghan digital libraries’ website (joomla) to this platform ■■ to migrate japanese interlibrary loan website ■■ to migrate digitization management system phase ii ■■ to research and implement a monitoring tool to monitor all web applications as well as os level tools (e.g. tomcat, mysql) ■■ to configure a cron job to run routine things (e.g., backup ) ■■ to research and implement storage (tb) for digitization and access phase iii ■■ to research and build linux clustering steps: 1. os installation: debian 4 2. platform environment: register dns 3. install java 6, tomcat 6, mysql 5, etc. 4. install source control env git 5. install statistics analysis tool (google analytics) 6. install monitoring tool: ganglia or nagios 7. web applications 8. joomla 9. koha 10. monitoring tool 11. digitization management system 12. repository system: dspace, fedora, etc. 13. ha tools/applications note calculation based on the following: ■■ leasing two nodes $20/month: $20 x 2 nodes x 12 months = $480/year ■■ a medium-priced server with backup with a life expectancy of 5 years ($5,000): $1,000/year ■■ 5 percent of system administrator time for managing the server ($60,000 annual salary): $3,000/year ■■ ignore telecommunication cost, utility cost, and space cost. ■■ ignore software developer’s time because it is equal for both options. appendix. project plan: building ha linux platform using cloud computing (cont.) reproduced with permission of the copyright owner. further reproduction prohibited without permission. harvesting information from a library data warehouse su, siew-phek t;needamangala, ashwin information technology and libraries; mar 2000; 19, 1; proquest pg. 
17 harvesting information from a library data warehouse data warehousing technology has been defined by john ladley as "a set of methods, techniques, and tools that are leveraged together and used to produce a vehicle that delivers data to end users on an integrated platform. "1 this concept has been applied increasingly by industries worldwide to develop data warehouses for decision support and knowledge discovery. in the academic sector, several universities have developed data warehouses containing the universities'ftnancial, payroll, personnel, budget, and student data.2 these data warehouses across all industries and academia have met with varying degrees of success. data warehousing technology and its related issues have been widely discussed and published. 3 little has been done, however, on the application of this cutting edge technology in the library environment using library data. i motivation of project daniel boorstin, the former librarian of congress, mentions that "for most of western history, interpretation has far outrun data." 4 however, he points out "that modem tendency is quite the contrary, as we see data outrun meaning." his insights tie directly to many large organizations that long have been rich in data but poor in information and knowledge. library managers are increasingly finding the importance of obtaining a comprehensive and integrated view of the library operations and the services it provides. this view is helpful for the purpose of making decisions on the current operations and for their improvement. due to financial and human constraints for library support, library managers increasingly encounter the need to justify everything they dofor example, the library's operation budget. the most frustrating problem they face is knowing that the information needed is available somewhere in the ocean of data but there is no easy way to obtain it. for example, it is not easy to ascertain whether the materials of a certain subject area, which consumed a lot of financial resources for their acquisition and processing, are either frequently used (i.e., high rate of circulation), seldom used, or not used at all. or, whether they satisfy users' needs. another example, an analysis of the methods of acquisition (firm order vs. approval plan) together with the circulation rate could be used as a factor in deciding the best method of acquiring certain types of material. such information can play a pivotal role in performing collection development and library management more efficiently and effectively. unfortunately, the data needed to make these types of decisions are often scattered in different files maintained siew-phek t. su and ashwin needamangala by a large centralized system, such as notis, that does not provide a general querying facility or by different file/ data management or application systems. this situation makes it very difficult and time-consuming to extract useful information. this is precisely where data warehousing technology comes in. the goal of this research and development work is to apply data warehousing and data mining technologies in the development of a library decision support system (loss) to aid the library management's decision making. the first phase of this work is to establish a data warehouse by importing selected data from separately maintained files presently used in the george a. smathers libraries of the university of florida into a relational database system (microsoft access). 
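to give a flavor of the kind of cross-file question such a warehouse is meant to answer (for instance, whether expensively acquired subject areas actually circulate), the python sketch below runs one such query against sqlite as a stand-in for the access database. the table and column names (acquisitions, circulation, fund, charges) are hypothetical and do not reflect the actual warehouse schema; the predefined and ad hoc queries mentioned below would issue joins of this kind through the gui.

import sqlite3

con = sqlite3.connect(":memory:")  # stand-in for the access warehouse
con.executescript("""
create table acquisitions (bib_id text, fund text, price real);
create table circulation  (bib_id text, charges integer);
insert into acquisitions values ('akr9234', 'chemistry', 85.0), ('akr9235', 'history', 30.0);
insert into circulation  values ('akr9234', 0), ('akr9235', 12);
""")

# spending versus use by fund: which subject areas cost a lot but rarely circulate?
rows = con.execute("""
    select a.fund, sum(a.price) as spent, sum(c.charges) as total_charges
    from acquisitions a join circulation c on a.bib_id = c.bib_id
    group by a.fund order by spent desc
""").fetchall()
for fund, spent, charges in rows:
    print(fund, spent, charges)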
data stored in the existing files were extracted, cleansed, aggregated, and transformed into the relational representation suitable for processing by the relational database management system. a graphical user interface (gui) is developed to allow decision makers to query for the data warehouse's contents using either some predefined queries or ad hoc queries. the second phase is to apply data mining techniques on the library data warehouse for knowledge discovery. this paper covers the first phase of this research and development work. our goal is to develop a general methodology and inexpensive software tools, which can be used by different functional units of a library to import data from different data sources and to tailor different warehouses to meet their local decision needs. for meeting this objective, we do not have to use a very large centralized database management system to establish a single very large data warehouse to support different uses. i local environment the university of florida libraries has a collection of more than two million titles, comprising over three million volumes. it shares a notis-based integrated system with nine other state university system (sus) libraries for acquiring, processing, circulating, and accessing its collection. all ten sus libraries are under the consortium umbrella of the florida center for library automation (fcla). siew-phekt. su (pheksu@mail.uflib.ufl.edu) is associate chair of the central bibliographic services section, resource services department, university of florida libraries, and ashwin needamangala (nsashwin@grove.ufl.edu) is a graduate student at the electrical and computer engineering department, university of florida. harvesting information from a library data warehouse i su and needamangala 17 reproduced with permission of the copyright owner. further reproduction prohibited without permission. i library data sources the university of florida libraries' online database, luis, stores a wealth of data, such as bibliographic data (author, title, subject, publisher information), acquisitions data (price, order information, fund assignment), circulation data (charge out and browse information, withdrawn and inventory information), and owning location data (where item is shelved). these voluminous data are stored in separate files. the notis system as used by the university of florida does not provide a general querying facility for accessing data across different files. extracting any information needed by a decision maker has to be done by writing an application program to access and manipulate these files. this is a tedious task since many application programs would have to be written to meet the different information needs. the challenge of this project is to develop a general methodology and tools for extracting useful data and metadata from these disjointed files, and to bring them into a warehouse that is maintained by a database management system such as microsoft access. the selection of access and pc hardware for this project is motivated by cost consideration. we envision that multiple special purpose warehouses be established on multiple pc systems to provide decision support to different library units. the library decision support system (loss) is developed with the capability of handling and analyzing an established data warehouse. for testing our methodology and software system, we established a warehouse based on twenty thousand monograph titles acquired from our major monograph vendor. 
these titles were published by domestic u.s. publishers and have a high percentage of dlc/dlc records (titles cataloged by the library of congress). they were acquired by firm order and approval plan, the publication coverage is the calendar year 1996-1997. analysis is only on the first item record (future project will include all copy holdings). although the size of the test data used is small, it is sufficient to test our general methodology and the functionality of our software system. fcla d82 tables and key list most of the data from the twenty-thousand-title domain that go into the loss warehouse are obtained from the db2 tables maintained by fcla. fcla developed and maintains the database of a system called ad hoc report request over the web (arrow) to facilitate querying and generating reports on acquisitions activities . the data are stored in 0b2 tables. 5 for our research and development purpose, we needed db2 tables for only the twenty-thousand titles that we identified as our initial project domain. these titles all have an identifiable 035 field in the bibliographic records (zybp1996, zybcip1996, zybp1997 or zybpcip1997). we used the batchbam program developed by gary strawn of northwestern university library to extract and list the unique bibliographic record numbers in separate files for fcla to pick up. 6 using the unique bibliographic record numbers, fcla extracted the 0b2 tables from the arrow database and exported the data to text files. these text files then were transferred to our system using the file transfer protocol (frp) and inserted as tables into the loss warehouse. bibliographic and item records extraction fcla collects and stores complete acquisitions data from the order records as db2 tables. however, only brief bibliographic data and no item record data are available . bibliographic and item record data are essential for inclusion in the loss warehouse in order to create a viable integrated system capable of performing cross-file analysis and querying for the relationships among different types of data. because these required data do not exist in any computer readable form, we designed a method to obtain them. using the identical notis key lists to extract the targeted twenty-thousand bibliographic and item records, we applied a screen scraping technique to scrape the data from the screen and saved them in a flat file. we then wrote a program in microsoft visual basic to clean the scraped data and saved them as text-delimited files that are suitable for importing into the loss warehouse. screen scraping concept screen scraping is a process used to capture data from a host application. it is conventionally a three-part process: • displaying the host screen or data to be scraped. • finding the data to be captured. • capturing the data to a pc or host file, or using it in another windows application. in other words, we can capture particular data on the screen by providing the corresponding screen coordinates to the screen scraping program. numerous commercial applications for screen scraping are available on the market. however, we used an approach slightly different from the conventional one. although we had to capture only certain fields from the notis screen, there were other factors that we had to take into consideration. they are: • the location of the various fields with respect to the screen coordinates changes from record to record . this makes it impossible for us to lock a particular field with a corresponding screen coordinate. 
• the data present on the screen are dynamic because we are working on a "live" database where data are frequently modified. for accurate query results, all the data, especially the item record data where the circulation transactions are housed, need to be captured within a specified time interval so that the data are uniform. this makes the time taken for capturing the data extremely important.
• most of the fields present on the screen needed to be captured.
taking the above factors into consideration, it was decided to capture the entire screen instead of scraping only certain parts of it. this made the process both simpler and faster. the unnecessary fields were filtered out during the cleanup process.

system architecture
the architecture of the ldss system is shown in figure 1 and is followed by a discussion of its components' functions.

notis
notis (northwestern online totally integrated system) was developed at the northwestern university library and introduced in 1970. since its inception, notis has undergone many versions. the university of florida libraries is one of the earliest users of notis. fcla has made many local modifications to the notis system since the uf libraries started using it. as a result, the uf notis differs from the rest of the notis world in many respects. notis can be broken down into four subsystems:
• acquisitions
• cataloging
• circulation
• online public access catalog (opac)
at the university of florida libraries, the notis system runs on an ibm 370 mainframe computer under the os/390 operating system.

host explorer
host explorer is a software program that provides a tcp/ip link to the mainframe computer. it is a terminal emulation program supporting ibm mainframe, as/400, and vax hosts. host explorer delivers an enhanced user environment for all windows nt platforms, windows 95, and windows 3.x desktops. exact tn3270e, tn5250, vt420/320/220/101/100/52, wyse 50/60, and ansi-bbs display is extended to leverage the wealth of the windows desktop. it also supports all tcp/ip-based tn3270 and tn3270e gateways.

figure 1. ldss architecture and its components (notis, host explorer, fcla db2 tables, data cleansing and extraction, warehouse, graphical user interface)

the host explorer program is used as the terminal emulation program in ldss. it also provides vba-compatible basic scripting tools for complete desktop macro development. users can run these macros directly or attach them to keyboard keys, toolbar buttons, and screen hotspots for additional productivity. the function of host explorer in the ldss is very simple. it has to "visit" all screens in the notis system corresponding to each notis number present in the batchbam file and capture all the data on the screens. in order to do this, we wrote a macro that read the notis numbers one at a time from the batchbam file and input each number into the command string of host explorer. the macro essentially performed the following functions:
• read the notis numbers from the batchbam file.
• inserted the notis number into the command string of host explorer.
• toggled the screen capture option in host explorer so that data are scraped from the screen only at necessary times.
• saved all the scraped data into a flat file.
after the macro has been executed, all the data scraped from the notis screens reside in a flat file.
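the macro itself was written in host explorer's basic scripting language; the fragment below is only a minimal sketch of the same driver loop in python, in which send_command() and capture_screen() are hypothetical placeholders for the emulator interaction (host explorer's actual macro api is not reproduced here), and the file names are illustrative.

```python
# a minimal sketch of the scraping driver loop described above; the production macro
# used host explorer's basic scripting. send_command() and capture_screen() are
# hypothetical placeholders for the terminal-emulator calls.

def scrape_notis_screens(batchbam_path, flatfile_path, send_command, capture_screen):
    """visit the notis screen for each record number and append the whole screen to a flat file."""
    with open(batchbam_path) as keys, open(flatfile_path, "w") as out:
        for line in keys:
            notis_number = line.strip()
            if not notis_number:
                continue
            send_command(notis_number)      # put the record number on the emulator command line
            screen_text = capture_screen()  # capture the entire screen, not selected fields
            out.write(screen_text + "\n")
```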
the data present in this file have to be cleansed in order to make them suitable for insertion into the library warehouse. a visual basic program was written to perform this function. the details of this program are given in the next section.

data cleansing and extraction
this component of the ldss is written in the visual basic programming language. its main function is to cleanse the data that have been scraped from the notis screens. the visual basic code saves the cleansed data in a text-delimited format that is recognized by microsoft access. this file is then imported into the library warehouse maintained by microsoft access. the detailed working of the code that performs the cleansing operation is discussed below. the notis screen that comes up for each notis number has several parts that are critical to the working of the code:
• the notis number, present in the top right of the screen (in this case, akr9234).
• the field numbers that have to be extracted, for example 010:: and 035::.
• the delimiters. the "|" symbol is used as the delimiter throughout this code. for example, in the 260 field of a bibliographic record, "|a" delimits the place of publication, "|b" the name of the publisher, and "|c" the date of publication.
we shall now go step by step through the cleansing process. initially we have the flat file containing all the data that have been scraped from the notis screens.
• the entire list of notis numbers from the batchbam file is read into an array called bam_number$.
• the file containing the scraped data is read into a single string called bibrecord$.
• this string is then parsed using the notis numbers from the bam_number$ array.
• we now have a string that contains a single notis record. this string is called single_record$.
• the program runs in a loop until all the records have been read.
• each string is now broken down into several smaller strings based on the field numbers. each of these smaller strings contains data pertaining to the corresponding field number.
• a considerable amount of the data present on the notis screen is unnecessary from the point of view of our project. we need only certain fields from the notis screen, and even from these fields we need the data only from certain delimiters. therefore, we scan each of these smaller strings for a set of delimiters that was predefined for each individual field. the data present in the other delimiters are discarded.
• the data collected from the various fields and their corresponding delimiters are assigned to corresponding variables. some variables contain data from more than one delimiter concatenated together. the reason is as follows: certain fields are present in the database only for informational purposes and will not be used as a criterion field in any query. since these fields will never be queried upon, they do not need to be cleansed as rigorously as the other fields, and we can afford to leave their data as concatenated strings. for example, the catalog_source field, which has data from "|a" and "|c", is of the form "|a dlc |c dlc", while the lang_code field, which has data from "|a" and "|h", is of the form "|a eng |h rus"; the latter we split into two fields, lang_code_1 containing "eng" and lang_code_2 containing "rus".
• finally, the data collected from the various fields are saved in a flat file in the text-delimited format, which microsoft access recognizes.
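the production cleansing program was written in visual basic; the following is only a minimal sketch of the same parsing flow in python, under the assumption that each field sits on its own line of the scraped screen. the field tags, subfield codes, and file handling are illustrative rather than the full production logic.

```python
import csv

# a minimal sketch of the cleansing steps described above (the production code was
# visual basic). the fields and subfield codes kept per field are illustrative only,
# and each field is assumed to sit on its own line of the scraped screen.
FIELDS_TO_KEEP = {"035": ["a"], "040": ["a", "c"], "260": ["a", "b", "c"]}

def parse_record(single_record):
    """break one scraped notis record into {field_tag: {subfield_code: value}}."""
    parsed = {}
    for tag, wanted in FIELDS_TO_KEEP.items():
        start = single_record.find(tag + "::")
        if start == -1:
            continue                                  # field not present on this screen
        field_line = single_record[start:].split("\n", 1)[0]
        subfields = {}
        for piece in field_line.split("|")[1:]:       # "|a dlc |c dlc" -> ["a dlc ", "c dlc"]
            code, _, value = piece.partition(" ")
            if code in wanted:
                subfields[code] = value.strip()
        parsed[tag] = subfields
    return parsed

def cleanse(flatfile_path, bam_numbers, out_path):
    """split the flat file on notis numbers and write one text-delimited row per record."""
    text = open(flatfile_path).read()
    columns = [(tag, code) for tag, codes in FIELDS_TO_KEEP.items() for code in codes]
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        for i, key in enumerate(bam_numbers):
            start = text.find(key)
            if start == -1:
                continue                              # record was not scraped
            end = text.find(bam_numbers[i + 1]) if i + 1 < len(bam_numbers) else len(text)
            record = parse_record(text[start:end])
            writer.writerow([key] + [record.get(tag, {}).get(code, "null")
                                     for tag, code in columns])
```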
a screen dump of the text-delimited file, which is the end result of the cleansing operation, is shown in figure 2. the flat file that we now have can be imported into the library warehouse.

graphical user interface
in order to ease the tasks of the user (i.e., the decision maker) in creating the library warehouse and in querying and analyzing its contents, a graphical user interface tool has been developed. through the gui, the user can enact the following processes or operations through a main menu:
• connection to notis
• screen scraping
• data cleansing and extracting
• importing data
• viewing collected data
• querying
• report generating
the first option opens host explorer and provides a connection to notis. it provides a shortcut to closing or minimizing ldss and opening host explorer. the screen scraping option activates the data scraping process. the data cleansing and extracting option filters out the unnecessary data fields and saves the cleansed data in a text-delimited format. the importing data option imports the data in the text-delimited format into the warehouse. the viewing collected data option allows the user to view the contents of a selected relational table stored in the warehouse.

figure 2. a text-delimited file (sample rows of the cleansed output)

the querying option activates ldss's querying facility, which provides wizards to guide the formulation of different types of queries, as discussed later in this article. the last option, report generating, is for the user to specify the report to be generated.

data mining tool
a very important component of ldss is the data mining tool for discovering association rules that specify the interrelationships of data stored in the warehouse. many data mining tools are now available in the commercial world. for our project, we are investigating the use of a neural-network-based data mining tool developed by li-min fu of the university of florida.7 the tool allows the discovery of association rules based on a set of training data provided to the tool. this part of our research and development work is still in progress. the existing gui and report generation facilities will be expanded to include the use of this mining tool.

library warehouse
fcla exports the data existing in the db2 tables into text files. as a first step toward creating the database, these text files are transferred using ftp and form separate relational tables in the library warehouse. the data that are scraped from the bibliographic and item record screens result in the formation of two more tables.

characteristics
data in the warehouse are snapshots of the original data files. only a subset of the data contents in these files is extracted for querying and analysis, since not all the data are useful for a particular decision-making situation. data are filtered as they pass from the operational environment to the data warehouse environment. this filtering process is particularly necessary when a pc system, which has limited secondary storage and main memory space, is used. once extracted and stored in the warehouse, data are not updateable; they form a read-only database. however, different snapshots of the original files can be imported into the warehouse for querying and analysis.
the results of the analyses of different snapshots can then be compared.

structure
data warehouses have a distinct structure. there are summarization and detail structures that demarcate a data warehouse. the structure of the library data warehouse is shown in figure 3. the different components of the library data warehouse as shown in figure 3 are:
• notis and db2 tables. bibliographic and circulation data are obtained from notis through the screen scraping process and imported into the warehouse. fcla maintains acquisitions data in the form of db2 tables. these are also imported into the warehouse after conversion to a suitable format.
• warehouse. the warehouse consists of several relational tables that are connected by means of relationships. the universal relation approach could have been used to implement the warehouse by using a single table. the argument for using the universal relation approach would be that all the collected data fall under the same domain. but let us examine why this approach would not have been suitable. the different data collected for import into the warehouse were bibliographic data, circulation data, order data, and pay data. if all these data were incorporated into one single table with many attributes, it would not be of any exceptional use, since each set of attributes has its own unique meaning when grouped together as a bibliographic table, circulation table, and so on. for example, if we grouped the circulation data and the pay data together in a single table, it would not make sense. however, the pay data and the circulation data are related through the bib_key. hence, our use of the conventional approach of having several tables connected by means of relationships is more appropriate.
• views. a view in sql terminology is a single table that is derived from other tables. these other tables could be base tables or previously defined views. a view does not necessarily exist in physical form; it is considered a virtual table, in contrast to base tables, which are actually stored in the database. in the context of the ldss, views can be implemented by means of the ad hoc query wizard. the user can define a query/view using the wizard and save it for future use. the user can then define a query on this query/view.
• summarization. the process of implementing views falls under the process of summarization. summarization provides the user with views, which make it easier for users to query the data of their interest.

figure 3. structure of the library data warehouse (notis and the fcla db2 tables feed, through screen scraping and import, the warehouse tables ufbib, ufpay, ufinv, ufcirc, and uford, over which bibliographic, circulation, and pay data views are defined for the user)

as explained above, the specific warehouse we established consists of five tables. table names ending in "_wh" indicate that the table contains current detailed data of the warehouse. current detailed data represent the most recent snapshot of data taken from the notis system. the summarized views are derived from the current detailed data of the warehouse. since the current detailed data of the warehouse are the basic data of the application, only the current detailed data tables are shown in appendix a.
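to make the view idea concrete, here is a minimal sketch that defines a summarized view over the detailed tables, using python's sqlite3 module as a stand-in for the microsoft access warehouse; the table and column names (bibscreen, itemscreen, bib_key, biblio_key, charges) follow the sql examples later in this article, the database file name is illustrative, and the detailed tables are assumed to have been imported already.

```python
import sqlite3

# a minimal sketch using sqlite3 as a stand-in for the access warehouse. table and
# column names follow the sql examples in this article; the file name is illustrative,
# and the detailed tables are assumed to have been imported already.
conn = sqlite3.connect("ldss_warehouse.db")

# a summarized "circulation data view": one row per title with its circulation count,
# derived from the current detailed data rather than stored as a separate table.
conn.execute("""
    CREATE VIEW IF NOT EXISTS circ_summary AS
    SELECT b.bib_key, i.charges
    FROM bibscreen  AS b
    JOIN itemscreen AS i ON i.biblio_key = b.bib_key
""")

# further queries (or further views) can then be defined on top of the saved view.
noncirculated = conn.execute(
    "SELECT COUNT(*) FROM circ_summary WHERE charges = 0").fetchone()[0]
print(noncirculated, "titles have not circulated in this snapshot")
```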
decision support by querying the warehouse
the warehouse contains a set of integrated relational tables whose contents are linked by the common primary key, the bib_key (biblio_key). the data stored across these tables can be traversed by matching the key values associated with their tuples or records. decision makers can issue all sorts of sql-type queries to retrieve useful information from the warehouse. two general types of queries can be distinguished: predefined queries and ad hoc queries. the former type refers to queries that are frequently used by decision makers for accessing information from different snapshots of data imported into the warehouse. the latter type refers to queries that are exploratory in nature: a decision maker suspects that there is some relationship between different types of data and issues a query to verify the existence of such a relationship. alternatively, data mining tools can be applied to analyze the data contents of the warehouse and discover rules of their relationships (or associations).

predefined queries
below are some sample queries posed in english. their corresponding sql queries can be processed using ldss.
1. number and percentage of approval titles circulated and noncirculated.
2. number and percentage of firm order titles circulated and noncirculated.
3. amount of financial resources spent on acquiring noncirculated titles.
4. number and percentage of dlc/dlc cataloging records in circulated and noncirculated titles.
5. number and percentage of "shared" cataloging records in circulated and noncirculated titles.
6. numbers of original and "shared" cataloging records of noncirculated titles.
7. identify the broad subject areas of circulated and noncirculated titles.
8. identify titles that have been circulated "n" number of times and by subjects.
9. number of circulated titles without the 505 field.
each of the above english queries can be realized by a number of sql queries. we shall use the first two english queries and their corresponding sql queries to explain how the data warehouse contents and the querying facility of microsoft access can be used to support decision making. the results of the sql queries are also given. the first english query can be divided into two parts (see figure 4), each realized by a number of sql queries as shown below.

sample query outputs
query 1: number and percentage of approval titles circulated and noncirculated.
result: total approval titles 1172; circulated 980 (83.76%); noncirculated 192 (16.24%).
similar to the above sql queries, we can translate the second english query into a number of sql queries; the result is given below.
query 2: number and percentage of firm order titles circulated and noncirculated.
result: total firm order titles 1829; circulated 1302 (71.18%); noncirculated 527 (28.82%).

report generation
the results of the two predefined english queries can be presented to users in the form of a report:
total titles 3001
approval 1172 (39%): circulated 980 (83.76%), noncirculated 192 (16.24%)
firm order 1829 (61%): circulated 1302 (71.18%), noncirculated 527 (28.82%)
from the above report, we can ascertain that, though 39 percent of the titles were purchased through the approval plan and 61 percent through firm orders, the approval titles have a higher rate of circulation, 83.76 percent, compared to 71.18 percent for firm order titles. it is important to note that the result of the above queries is taken from only one snapshot of the circulation data.
analysis from several snapshots is needed in order to compare the results and arrive at reliable information. we now present a report on the financial resources spent on acquiring and processing noncirculated titles. in order to generate this report, we need the output of queries four and five listed earlier in this article. the corresponding outputs are shown below.
query 4: number and percentage of dlc/dlc cataloging records in circulated and noncirculated titles.
result: total dlc/dlc records 2852; circulated 2179 (76.40%); noncirculated 673 (23.60%).
query 5: number and percentage of "shared" cataloging records in circulated and noncirculated titles.
result: total "shared" records 149; circulated 100 (67.11%); noncirculated 49 (32.89%).
in order to come up with the financial resources, we need to consider several factors that contribute to the amount of financial resources spent. for the sake of simplicity, we consider only the following factors:
1. the cost of cataloging each item with a dlc/dlc record
2. the cost of cataloging each item with a shared record
3. the average price of noncirculated books
4. the average pages of noncirculated books
5. the value of shelf space per centimeter
because the value of the above factors differs from institution to institution and might change according to more efficient workflow and better equipment used, users are required to fill in the values for factors 1, 2, and 5. ldss can compute factors 3 and 4. the financial report, taking into consideration the values of the above factors, could be as shown below:
processing cost of each dlc title = $10.00; 673 x $10.00 = $6,730.00
processing cost of each shared title = $20.00; 49 x $20.00 = $980.00
average price paid per noncirculated item = $48.00; 722 x $48.00 = $34,656.00
average size of book = 288 pages = 3 cm; average cost of 1 cm of shelf space = $0.10; 722 x $0.30 = $216.60
grand total = $42,582.60
again it is important to point out that several snapshots of the circulation data have to be taken to track and compare the different analyses before deriving reliable information.

approval titles circulated
sql query to retrieve the distinct bibliographic keys of all the approval titles:
select distinct bibscreen.bib_key from bibscreen right join payl on bibscreen.bib_key = payl.bib_num where (((payl.fund_key) like "*07*"));
sql query to count the number of approval titles that have been circulated:
select count(appr_title.bib_key) as countofbib_key from (bibscreen inner join appr_title on bibscreen.bib_key = appr_title.bib_key) inner join itemscreen on bibscreen.bib_key = itemscreen.biblio_key where (((itemscreen.charges)>0)) order by count(appr_title.bib_key);
sql query to calculate the percentage:
select cnt_appr_title_circ.countofbib_key, int(([cnt_appr_title_circ]![countofbib_key])*100/count([bibscreen]![bib_key])) as percent_apprcirc from bibscreen, cnt_appr_title_circ group by cnt_appr_title_circ.countofbib_key;

approval titles noncirculated
sql query for counting the number of approval titles that have not been circulated:
select distinct count(appr_title.bib_key) as countofbib_key from (appr_title inner join bibscreen on appr_title.bib_key = bibscreen.bib_key) inner join itemscreen on bibscreen.bib_key = itemscreen.biblio_key where (((itemscreen.charges)=0));
sql query to calculate the percentage:
select cnt_appr_title_noncirc.countofbib_key, int(([cnt_appr_title_noncirc]![countofbib_key])*100/count([bibscreen]![bib_key])) as percent_appr_noncirc from bibscreen, cnt_appr_title_noncirc group by cnt_appr_title_noncirc.countofbib_key;

figure 4. example of an english query divided into two parts

ad hoc queries
alternately, if the user wishes to issue a query that has not been predefined, the ad hoc query wizard can be used. the following example illustrates the use of the ad hoc query wizard. assume the sample query is: how many circulated titles in the english subject area cost more than $35? we now take you on a walk-through of the ad hoc query wizard, starting from the first step until the output is obtained. figure 4 depicts step 1 of the ad hoc query wizard. the sample query mentioned above requires the following fields:
• biblio_key, for a count of all the titles that satisfy the given condition.
• charges, to specify the criterion of "circulated title."
• fund_key, to specify all titles under the "english" subject area.
• paid_amt, to specify all titles that cost more than $35.
step 2 of the ad hoc query wizard (figure 5) allows the user to specify criteria and thereby narrow the search domain. step 3 (figure 6) allows the user to specify any mathematical operations or aggregation functions to be performed. step 4 (figure 7) displays the user-defined query in sql form and allows the user to save the query for future reuse. the output of the query is shown in figure 8; it shows the number of circulated titles in the english subject area that cost more than $35. alternatively, the user might wish to obtain a listing of these 33 titles; figure 9 shows the listing.

figure 4. step 1: ad hoc query wizard
figure 5. step 2: ad hoc query wizard
figure 6. step 3: ad hoc query wizard
figure 7. step 4: ad hoc query wizard
figure 8. query output
figure 9. listing of query output
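for readers who prefer to see the wizard's end product directly, the sketch below shows roughly the sql that step 4 would display for this sample question, run through python's sqlite3 module as a stand-in for access; the fund_key pattern used to identify english-subject funds is illustrative only, since the actual fund codes are not listed in this article.

```python
import sqlite3

# a minimal sketch of the query the ad hoc wizard builds for the sample question above.
# sqlite3 stands in for access; the '%eng%' fund pattern is illustrative only.
conn = sqlite3.connect("ldss_warehouse.db")

sql = """
    SELECT COUNT(DISTINCT b.bib_key)
    FROM bibscreen  AS b
    JOIN itemscreen AS i ON i.biblio_key = b.bib_key
    JOIN payl       AS p ON p.bib_num    = b.bib_key
    WHERE i.charges  > 0            -- "circulated title"
      AND p.fund_key LIKE '%eng%'   -- titles charged to an english-subject fund (illustrative)
      AND p.paid_amt > 35           -- cost more than $35
"""
count = conn.execute(sql).fetchone()[0]
print(count, "circulated english-subject titles cost more than $35")
```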
conclusion
in this article, we presented the design and development of a library decision support system based on data warehousing and data mining concepts and techniques. we described the functions of the components of ldss; the screen scraping and the data cleansing and extraction processes were described in detail. the process of importing data stored in luis as separate data files into the library data warehouse was also described. the data contents of the warehouse can provide a very rich information source to aid the library management in decision making. using the implemented system, a decision maker can use the gui to establish the warehouse and activate the querying facility provided by microsoft access to explore the warehouse contents. many types of queries can be formulated and issued against the database. experimental results indicate that the system is effective and can provide pertinent information for aiding the library management in making decisions. we have fully tested the implemented system using a small sample database. our ongoing work includes the expansion of the database size and the inclusion of a data mining component for association rule discovery. extensions of the existing gui and report generation facilities to accommodate data mining needs are expected.

acknowledgments
we would like to thank professor stanley su for his support and advice on the technical aspect of this project. we would also like to thank donna alsbury for providing us with the db2 data, daniel cromwell for loading the db2 files, and nancy williams and tim hartigan for their helpful comments and valuable discussions on this project.

references and notes
1. john ladley, "operational data stores: building an effective strategy," data warehouse: practical advice from the experts (englewood cliffs, n.j.: prentice hall, 1997).
2. information on harvard university's adapt project, accessed march 8, 2000, www.adapt.harvard.edu/; information on the arizona state university data administration and institutional analysis warehouse, accessed march 8, 2000, www.asu.edu/data_admin/wh-1.html; information on the university of minnesota clarity project, accessed march 8, 2000, www.clarity.umn.edu/; information on the uc san diego darwin project, accessed march 8, 2000, www.act.ucsd.edu/dw/darwin.html; information on university of wisconsin-madison infoaccess, accessed march 8, 2000, http://wiscinfo.doit.wisc.edu/infoaccess/; information on the university of nebraska data warehouse-nulook, accessed march 8, 2000, www.nulook.uneb.edu/.
3. ramon barquin and herbert edelstein, eds., building, using, and managing the data warehouse (englewood cliffs, n.j.: prentice hall, 1997); ramon barquin and herbert edelstein, eds., planning and designing the data warehouse (upper saddle river, n.j.: prentice hall, 1996); joyce bischoff and ted alexander, data warehouse: practical advice from the experts (englewood cliffs, n.j.: prentice hall, 1997); jeff byard and donovan schneider, "the ins and outs (and everything in between) of data warehousing," acm sigmod 1996 tutorial notes, may 1996, accessed march 8, 2000, www.redbrick.com/products/white/pdf/sigmod96.pdf; surajit chaudhuri and umesh dayal, "an overview of data warehousing and olap technology," acm sigmod record 26(1), march 1997, accessed march 8, 2000, www.acm.org/sigmod/record/issues/9703/chaudhuri.ps; b. devlin, data warehouse: from architecture to implementation (reading, mass.: addison-wesley, 1997); u. fayyad and others, eds., advances in knowledge discovery and data mining (cambridge, mass.: the mit pr., 1996); joachim hammer, "data warehousing overview, terminology, and research issues," accessed march 8, 2000, www.cise.ufl.edu/~jhammer/classes/wh-seminar/overview/index.htm; w. h. inmon, building the data warehouse (new york, n.y.: john wiley, 1996); ralph kimball, "dangerous preconceptions," accessed march 8, 2000, www.dbmsmag.com/9608d05.html; ralph kimball, the data warehouse toolkit (new york, n.y.: john wiley, 1996); ralph kimball, "mastering data extraction," dbms magazine, june 1996 (provides an overview of the process of extracting, cleaning, and loading data), accessed march 8, 2000, www.dbmsmag.com/9606d05.html; alberto mendelzon, "bibliography on data warehousing and olap," accessed march 8, 2000, www.cs.toronto.edu/~mendel/dwbib.html.
4. daniel j.
boorstin, "the age of negative discovery," cleopatra's nose: essays on the unexpected (new york: random house, 1994).
5. information on the arrow system, accessed march 8, 2000, www.fcla.edu/system/intro_arrow.html.
6. gary strawn, "batchbaming," accessed march 8, 2000, http://web.uflib.ufl.edu/rs/rsd/batchbam.html.
7. li-min fu, "domrul: learning the domain rules," accessed march 8, 2000, www.cise.ufl.edu/~fu/domrul.html.

appendix a. warehouse data tables
ufcirc_wh (attribute: domain) — bib_key: text(50); status: text(20); enum/chron: text(20); midspine: text(20); temp_locatn: text(20); pieces: number; charges: number; last_use: date/time; browses: number; value: text(20); invnt_date: date/time; created: date/time.
uford_wh (attribute: domain) — id: autonumber; ord_num: text(20); ord_div: number; process_unit: text(20); bib_num: text(20); order_date: date/time; mod_date: date/time; vendor_code: text(20); vndadr_order: text(20); vndadr_claim: text(20); vndadr_return: text(20); vend_title_num: text(20); ord_unit: text(20); rcv_unit: text(20); ord_scope: text(20); pur_ord_prod: text(20); action_int: number; libspec1: text(20); libspec2: text(20); vend_note: text(20); ord_note: text(20); source: text(20); ref: text(20); copyctl_num: number; medium: text(20); piece_cnt: number; div_note: text(20); acr_stat: text(20); rel_stat: text(20); lst_date: date/time; action_date: text(20); libspec3: text(20); libspec4: text(20); encumb_units: number; currency: text(20); est_price: number; encumb_outs: number; fund_key: text(20); fiscal_year: text(20); copies: number; xpay_method: text(20); vol_isu_date: text(20); title_author: text(20); db2_timestamp: date/time.
ufpay_wh (attribute: domain) — inv_key: text(20); ord_num: text(20); ord_div: number; process_unit: text(20); bib_key: text(20); ord_seq_num: number; inv_seq_num: number; status: text(20); create_date: date/time; lst_update: date/time; currency: text(20); paid_amt: number; usd_amt: number; fund_key: text(20); exp_class: text(20); fiscal_year: text(20); copies: number; type_pay: text(10); text: text(20); db2_timestamp: date/time.
ufinv_wh (attribute: domain) — inv_key: text(20); create_date: date/time; mod_date: date/time; approv_stat: text(20); vend_adr_code: text(20); vend_code: text(20); action_date: text(20); vend_inv_date: date/time; approval_date: date/time; approver_id: text(20); vend_inv_num: text(20); inv_tot: number; calc_tot_pymts: number; calc_net_tot_pymts: number; currency: text(20); discount_percent: number; vouch_note: text(20); official_vend: text(20); process_unit: text(20); internal_note: text(20); db2_timestamp: text(20).
ufbib_wh (attribute: domain) — bib_key: text(20); system_control_num: text(50); catalog_source: text(20); lang_code_1: text(20); lang_code_2: text(20); geo_code: text(20); dewey_num: text(20); edition: text(20); pagination: text(20); size: text(20); series_440: text(20); series_490: text(20); content: text(20); subject_1: text(20); subject_2: text(20); subject_3: text(20); authors_1: text(20); authors_2: text(20); authors_3: text(20); series: text(20).

library discovery products: discovering user expectations through failure analysis
irina trapido
information technology and libraries, september 2016

abstract
as the new generation of discovery systems evolve and gain maturity, it is important to continually focus on how users interact with these tools and what areas they find problematic.
this study looks at user interactions within searchworks, a discovery system developed by stanford university libraries, with an emphasis on identifying and analyzing problematic and failed searches. our findings indicate that users still experience difficulties conducting author and subject searches, could benefit from enhanced support for browsing, and expect their overall search experience to be more closely aligned with that on popular web destinations. the article also offers practical recommendations pertaining to metadata, functionality, and scope of the search system that could help address some of the most common problems encountered by the users.

(irina trapido, itrapido@stanford.edu, is electronic resources librarian at stanford university libraries, stanford, california.)

introduction
in recent years, rapid modernization of online catalogs has brought library discovery to the forefront of research efforts in the library community, giving libraries an opportunity to take a fresh look at such important issues as the scope of the library catalog, metadata creation practices, and the future of library discovery in general. while there is an abundance of studies looking at various aspects of planning, implementation, use, and acceptance of these new discovery environments, surprisingly little research focuses specifically on user failure. the present study aims to address this gap by identifying and analyzing potentially problematic or failed searches. it is hoped that focusing on common error patterns will help us gain a better understanding of users' mental models, needs, and expectations that should be considered when designing discovery systems, creating metadata, and interacting with library patrons.

terminology
in this paper, we adopt a broad definition of discovery products as "tools and interfaces that a library implements to provide patrons the ability to search its collections and gain access to materials."1 these products can be further subdivided into the following categories:
• online catalogs (opacs)—patron-facing modules of an integrated library system.
• discovery layers (also referred to as "discovery interfaces" or "next-generation library catalogs")—new catalog interfaces, decoupled from the integrated library system and offering enhanced functionality, such as faceted navigation and relevance-ranked results, as well as the ability to incorporate content from institutional repositories and digital libraries.
• web-scale discovery tools, which, in addition to providing all the interface features and functionality of next-generation catalogs, broaden the scope of discovery by systematically aggregating content from library catalogs, subscription databases, and institutional digital repositories into a central index.

literature review
to identify and investigate problems that end users experience in the course of their regular searching activities, we analyzed digital traces of user interactions with the system recorded in the system's log files. this method, commonly referred to as transaction log analysis, has been a popular way of studying information-seeking in a digital environment since the first online search systems came into existence, allowing researchers to monitor system use and gain insight into the users' search process.
server logs have been used extensively to examine user interactions with web search engines, consistently showing that web searchers tend to engage in short search sessions, enter brief search statements, do not browse the results beyond the first page, and rarely resort to advanced searching.2 a similar picture has emerged from transaction log studies of library catalogs. researchers have found that library users employ the same surface strategies: queries within library discovery tools are equally short and simply constructed;3 the majority of search sessions consist of only one or two actions.4 patrons commonly accept the system's default search settings and rarely take advantage of a rich set of search features traditionally offered by online catalogs, such as boolean searching, index browsing, term truncation, and fielded searching.5 although advanced searching in library discovery layers is uncommon, faceted navigation, a new feature introduced into library catalogs in the mid-2000s, quickly became an integral part of the users' search process. research has shown that facets in library discovery interfaces are used both in conjunction with text searching, as a search refinement tool, and as a way to browse the collection with no search term entered.6 a recent study that analyzed interaction patterns in a faceted library interface at north carolina state university using log data and user experiments demonstrated that users of faceted interfaces tend to issue shorter queries, go through fewer iterations of query reformulation, and scan deeper along the result list than those who use nonfaceted search systems. the authors also concluded that facets increase search accuracy, especially for complex and open-ended tasks, and improve user satisfaction.7 another traditional use of transaction logs has been to gauge the performance of library catalogs, mostly through measuring success and failure rates. while the exact percentage of failed searches varied dramatically depending on the system's search capabilities, interface design, the size of the underlying database, and, most importantly, on the researchers' definition of an unsuccessful search, the conclusion was the same: the incidence of failure in library opacs was extremely high.8 in addition to reporting error rates, these studies also looked at the distribution of errors by search type (title, author, or subject search) and categorized sources of searching failure.
most researchers agreed that typing errors and misspellings accounted for a significant portion of failed searches and were common across all search types.9 subject searching, which remained the most problematic area, often failed because of a mismatch between the search terms chosen by the user and the controlled vocabulary contained in the library records, suggesting that users experienced considerable difficulties in formulating subject queries with library of congress subject headings.10 other errors reported by researchers, such as the selection of the wrong search index or the inclusion of the initial article for title searches, were also caused by users' lack of conceptual understanding of the search process and the system's functions.11 these research findings were reinforced by multiple observational studies and user interviews, which showed that patrons found library catalogs "illogical," "counter-intuitive," and "intimidating,"12 and that patrons were unwilling to learn the intricacies of catalog searching.13 instead, users expected simple, fast, and easy searching across the entire range of library collections, relevance-ranked results that exactly matched what users expected to find, and convenient and seamless transition from discovery to access.14 today's library discovery systems have come a long way: they offer one-stop search for a wide array of library resources, intuitive interfaces that require minimal training to be searched effectively, facets to help users narrow down the result set, and much more.15 but are today's patrons always successful in their searches? usability studies of next-generation catalogs and, more recently, of web-scale discovery systems have pointed to patron difficulties associated with the use of certain facets, mostly because of terminological issues and inconsistencies in the underlying metadata.16 researchers also reported that users had trouble interpreting and evaluating the results of their search;17 users also were confused as to what resources were covered by the search tool.18 our study builds on this line of research by systematically analyzing real-life problematic searches as reported by library users and recorded in transaction logs.

background
stanford university is a private, four-year or above research university offering undergraduate and graduate degrees in a wide range of disciplines to about sixteen thousand students. the study analyzed the use of searchworks, a discovery platform developed by stanford university libraries. searchworks features a single search box with a link to advanced search on every page, relevance-ranked results, faceted navigation, enhanced textual and visual content (summaries, tables of contents, book cover images, etc.), as well as "browse shelf" functionality. searchworks offers searching and browsing of catalog records and digital repository objects in a single interface; however, it does not allow article-level searching. searchworks was developed on the basis of blacklight (projectblacklight.org), an open-source application for searching and interacting with collections of digital objects.19 thanks to blacklight's flexibility and extensibility, searchworks enables discovery across an increasingly diverse range of collections (marc catalog records, archival materials, sound recordings, images, geospatial data, etc.)
and allows to continuously add new features and improvements (e.g., https://library.stanford.edu/blogs/stanford-libraries-blog/2014/09/searchworks-30-released). study objectives the goal of the present study was two-fold. first, we sought to determine how patrons interact with the discovery systems, which features they use and with what frequency. second, this study aimed to identify and analyze problems that users encounter in their search process. method this study used data comprising four years of searchworks use, which was recorded in apache solr logs. the analysis was performed at the aggregate level; no attempts were made to identify individual searchers from the logs. at the preprocessing stage, we created and used a series of perl scripts to clean and parse the data and extract only those transactions where the user entered a search query and/or selected at least one facet value. page views of individual records were excluded from the analysis. the resulting output file contained the following parameters for each transaction: a time stamp, search mode used (basic or advanced), query terms, search index (“all fields,” “author,” “title,” “subject,” etc.), facets selected, and the number of results returned. the query stream was subsequently partitioned into task-based search sessions using a combination of syntactic features (word cooccurrence across multiple transactions) and temporal features (session time-outs: we used fifteen minutes of inactivity as a boundary between search sessions). the analysis was conducted over the following datasets: dataset 1. aggregate data of approximately 6 million search transactions conducted between february 13, 2011, and december 31, 2014. we performed quantitative analysis of this set to identify general patterns of system use. dataset 2. a sample of 5,101 search sessions containing 11,478 failed or potentially problematic interactions performed in the basic search mode and 2,719 sessions containing 3,600 advanced searches, annotated with query intent and potential cause of the problem. the searches were performed during eleven twenty-four-hour periods, representing different years, academic http://projectblacklight.org/ https://library.stanford.edu/blogs/stanford-libraries-blog/2014/09/searchworks-30-released information technology and libraries | september 2016 13 quarters, times of the school year (beginning of the quarter, midterms, finals, breaks), and days of the week. this dataset was analyzed to identify common sources of user failure. dataset 3. user feedback messages submitted to searchworks between january 2011 and december 2014 through the “feedback” link, which appears on every searchworks page. while the majority of feedback messages were error and bug reports, this dataset also contained valuable information about how users employed various features of the discovery layer, what problems they encountered, and what features they felt would improve their search experience. for the manual analysis of dataset 2, all searches within a search session were reconstructed in searchworks and, in some cases, also in external sources such as worldcat, google scholar, and google. 
they were subsequently assigned to one of the following categories: known-item searches (searches for a specific resource by title, combination of title and author, a standard number such as issn or isbn, or a call number), author searches (queries for a specific person or organization responsible for or contributing to a resource), topical searches, browse searches (searches for a subset of the library collection, e.g., “rock operas,” “graphic novels,” “dvds,” etc.), invalid queries, and queries where the search intent could not be established. to identify potentially problematic transactions, the following heuristic was employed: we selected all search sessions where at least one transaction failed to retrieve any records, as well as sessions consisting predominantly of known-item or author searches, where the user repeated or reformulated the query three or more times within a five-minute time frame. we hypothesized that this search pattern could be part of the normal query formulation process for topical searches, but it could serve as an indicator of the user’s dissatisfaction with the results of the initial query for known-item and author searches. we identified seventeen distinct types of problems, which we further aggregated into the following five groups: input errors, absence of the resource from the collection, queries at the wrong level of granularity, erroneous or too restrictive use of limiters, and mismatch between the search terms entered and the library metadata. each search transaction in dataset 2 was manually reviewed and assigned to one or more of these error categories. findings usage patterns our analysis of the aggregate data suggests that keyword searching remains the primary interaction paradigm with the library discovery system, accounting for 76 percent of all searches. however, users also increasingly take advantage of facets both for browsing and refining their searches: the use of facets grew from 25 percent in 2011 to 41 percent in 2014. library discovery products: discovering user expectations through failure analysis |irina trapido |doi:10.6017/ital.v35i2.9190 14 although both the basic and the advanced search modes allow for “fielded” searches, where the user can specify which element of the record to search (author, title, subject, etc.), searchers rarely made use of this feature, relying mostly on the system’s defaults (the “all fields” search option in the basic search mode): users selected a specific search index in less than 25 percent of all basic searches. advanced searching was infrequent and declining (from 11 percent in 2011 to 4 percent in 2014). typically, users engaged in short sessions with a mean session length of 1.5 queries. search queries were brief: 2.9 terms per query on average. single terms made up 23 percent of queries; 26 percent had two terms, and 19 percent had three terms. error patterns the breakdown of errors by category and search mode is shown in figure 1. in the following sections, we describe and analyze different types of errors. figure 1. breakdown of errors by category and search mode input errors input errors accounted for the largest proportion of problematic searches in the basic search mode (29 percent) and for 5 percent of problems in the advanced search. 
while the majority of such errors occurred at the level of individual words (misspellings or typographical errors), entire search statements were also imprecise and erroneous (e.g., “diary of an economic hit man” instead of “confessions of an economic hit man” and “dostoevsky war and peace” instead of “tolstoy war and peace”). it is noteworthy that in 46 percent of all search sessions containing information technology and libraries | september 2016 15 problems of this type, users subsequently entered a corrected query. however, if such errors occurred in a personal name, they were almost half as likely to be corrected. absence of the item sought from the collection queries for materials that were not in the library’s collection accounted for about a quarter of all potentially problematic searches. in the advanced search modality, where the query is matched against a specific search field, such queries typically resulted in zero hits and can hardly be considered failures per se. however, in the default cross-field search, users were often faced with the problem of false hits and had to issue multiple progressively more specific queries to ascertain that the desired resource was absent from the collection. queries at the wrong level of granularity a substantial number of user queries failed because they were posed at the level of specificity not supported by the catalog. such queries accounted for the largest percentage of problematic advanced searches (63 percent), where they consisted almost exclusively of article-level searching: users either tried to locate a specific article (often by copying the entire citation or its part from external sources) or conducted highly specific topical searches more suitable for a fulltext database. in the basic search mode, the proportion of searches at the wrong granularity level was much lower, but still substantial (20 percent). in addition to searches for articles and narrowly defined subject searches, users also attempted to search for other types of more granular content, such as book chapters, individual papers in conference proceedings, poems, songs, etc. erroneous or too restrictive use of limiters another common source of failure was the selection of the wrong search index or a facet that was too restrictive to yield any results. the majority of these errors were purely mechanical: users failed to clear out search refinements from their previous search or entered query terms into the wrong search field. however, our analysis also revealed several conceptual errors, typically stemming from a misunderstanding of the meaning and purpose of certain limiters. for example, “online,” “database,” and “journal/periodical” facets were often perceived by the user as a possible route to article-level content. even seemingly straightforward limiters such as “date” caused confusion, especially when applied to serial publications: users attempted to employ this facet to drill down to the desired journal issue or article, most likely acting on the assumption that the system included article-level metadata. lack of correspondence between the users’ search terms and the library metadata a significant number of problems in this group involved searches for non-english materials. 
when performed in their english transliteration, such queries often failed because of users’ lack of library discovery products: discovering user expectations through failure analysis |irina trapido |doi:10.6017/ital.v35i2.9190 16 familiarity with the transliteration rules established by the library community, whereas searches in the vernacular scripts tended to produce incomplete or no results because not all bibliographic records in the database contained parallel non-roman script fields. author and title searches often failed because of the users’ tendency to enter abbreviated queries. for example, personal name searches where the user truncated the author’s first or middle name to an initial while the bibliographic records only contained this name in its full form were extremely likely to fail. abbreviations were also used in searches for journals, conference proceedings, and occasionally even for book titles (e.g., “ai: a modern approach” instead of “artificial intelligence: a modern approach”). such queries were successful only if the abbreviation used by the searcher was included in the bibliographic records as a variant title. a somewhat related problem occurred when the title of a resource contained a numeral in its spelled out form but was entered as a digit by the user. because these title variations are not always recorded as additional access points in the bibliographic records, the desired item either did not appear in the result set or was buried too deep to be discovered. topical searches within the subject index were also prone to failure, mostly because patrons were unaware that such searches require the use of precise terms from controlled vocabularies and resorted to natural language searching instead. user feedback our analysis of user feedback revealed substantial differences in how various user groups approach the search system and which areas of it they find problematic. students were often frustrated by the absence of spelling suggestions, which, as one user put it, “left the users wander [to?] in the dark” as to the cause of searching failure. this user group also found certain social features desirable: for example, one user suggested that having ratings for books would be helpful in his choice of a good programming book. by contrast, faculty and researchers were more concerned about the lack of the more advanced features, such as cross-reference searching and left-anchored browsing of the title, subject, and author indexes. however, there were several areas that both groups found problematic: students and faculty alike saw the system’s inability to assist in the selection of the correct form of the author’s name as a major barrier to effective author searching and also converged on the need for more granular access to formats of audiovisual materials. discussion scope of the discovery system the results of our analysis point to users’ lack of understanding of what is covered by the discovery layer. users are often unaware of the existence of separate specialized search interfaces for different categories of materials and assume that the library discovery layer offers google-like information technology and libraries | september 2016 17 searching across the entire range of library resource types. 
moreover, they are confused by the multiple search modalities offered by the discovery layer: one of the common misconceptions in searchworks is that the advanced search will allow the user to access additional content rather than offer a different way of searching the same catalog data. in addition to the expanded scope of the discovery tools, there is also a growing expectation of greater depth of coverage. according to our data, searching in a discovery layer occurs at several levels: the entire resource (book, journal title, music recording), its smaller integral units (book chapters, journal articles, individual musical compositions, etc.), and full text. user search strategies the search strategies employed by searchworks users are heavily influenced by their experiences with web search engines. users tend to engage in brief search sessions and use short queries, which is consistent with the general patterns of web searching. they rely on relevance ranking and are often reluctant to examine search results in any depth: if the desired item does not appear within the first few hits, users tend to rework their initial search statement (often with only a minimal change to the search terms) rather than scrolling down to the bottom of the results screen or looking beyond the first page of results. given these search patterns, it is crucial to fine-tune relevance-ranking algorithms to the extent that the most relevant results are displayed not just on the first page but are included in the first few hits. while this is typically the case for unique and specific queries, more general searches could benefit from a relevance-ranking algorithm that would leverage the popularity of a resource as measured by its circulation statistics. adding this dimension to relevance determination would help users make sense of large result sets generated by broad topical queries (e.g., “quantum mechanics,” “linear algebra,” “microeconomics”) by ranking more popular or introductory materials higher than more specialized ones. it could also provide some guidance to the user trying to choose between different editions of the same resource and improve the quality of results of author searches by ranking works created by the author before critical and biographical materials. users’ query formulation strategies are also modeled by google, where making search terms as specific as possible is often the only way to increase the precision of a search. faceted search systems, however, require a different approach: the user is expected to conduct a broad search and subsequently focus it by superimposing facets on the results. qualifying the search upfront through keywords rather than facets is not only ineffective, but may actually lead to failure. for example, a common search pattern is to add the format of a resource as a search term (e.g., “fortune magazine,” “science journal,” “gre e-book,” “nicole lopez dissertation,” “woody allen movies”), and because the format information is coded rather than spelled out in the bibliographic records, such queries either result in zero hits or produce irrelevant results. 
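as a rough illustration of the circulation-informed ranking suggested above, the following python sketch blends a text-relevance score with a logarithmic popularity boost. it is a minimal sketch under assumed field names, sample values, and weighting, not a description of how searchworks or any vendor product actually ranks results.

```python
import math

def boosted_score(text_score, checkouts, weight=0.3):
    """Blend a keyword-relevance score with a circulation-based popularity boost.

    text_score: whatever matching score the search engine already returns.
    checkouts:  the item's historical circulation count (an assumed field).
    """
    popularity = math.log1p(checkouts)      # zero checkouts -> no boost
    return text_score * (1.0 + weight * popularity)

# toy records for the broad query "linear algebra"
candidates = [
    {"title": "linear algebra and its applications", "text_score": 8.2, "checkouts": 310},
    {"title": "topics in abstract linear algebra", "text_score": 8.4, "checkouts": 12},
]
candidates.sort(key=lambda r: boosted_score(r["text_score"], r["checkouts"]), reverse=True)
for r in candidates:
    print(r["title"], round(boosted_score(r["text_score"], r["checkouts"]), 2))
```

the logarithm keeps a handful of heavily circulated titles from drowning out textual relevance, which matters most for the broad topical queries mentioned above.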
in a similar vein, making the query overly restrictive by including the year of publication, publisher, or edition library discovery products: discovering user expectations through failure analysis |irina trapido |doi:10.6017/ital.v35i2.9190 18 information often causes empty retrievals because the library might not have the edition specified by the user or because the query does not match the data in the bibliographic record. thus our study lends further weight to claims that even in today’s reality of sophisticated discovery environments and unmediated searching, library users can still benefit from learning the best search techniques that are specifically tailored to faceted interfaces.20 error tolerance input errors remain one of the major sources of failure in library discovery layers. users have become increasingly reliant on error recovery features that they find elsewhere on the web, such as “did you mean . . . ” suggestions, automatic spelling corrections, and helpful suggestions on how to proceed in situations where the initial search resulted in no hits. but perhaps even more crucial are error-prevention mechanisms, such as query autocomplete, which helps users avoid spelling and typographical errors and provides interactive search assistance and instant feedback during the query formulation process. our visual analysis of the logs from the most recent years revealed an interesting search pattern, where the user enters only the beginning of the search query and then increments it by one or two letters: pr pro proq proque proques proquest such search patterns indicate that users expect the system to offer query expansion options and show the extent to which the query autocomplete feature (currently missing from searchworks) has become an organic part of the users’ search process. topical searching while next-generation discovery systems represent a significant step toward enabling more sophisticated topical discovery, a number of challenges still remain. apart from mechanical errors, such as misspellings and wrong search index selections, the majority of zero-hit topical searches were caused by a mismatch between the user’s query and the vocabulary in the system’s index. in many cases such queries were formulated too narrowly, reflecting the users’ underlying belief that the discovery layer offers full-text searching across all of the library’s resources. in addition to keyword searching, libraries have traditionally offered a more sophisticated and precise way of accessing subject information in the form of library of congress subject headings (lcsh). however, our results indicate that these tools remain largely underused: users took advantage of this feature in only 21 percent of all subject searches in our sample. we also found information technology and libraries | september 2016 19 that 95 percent of lcsh usage came from clicks on subject heading links within individual bibliographic records rather than from “subject” facets, corroborating the results of earlier studies.21 there is a whole range of measures that could help patrons leverage the power of controlled vocabulary searching. they include raising the level of patron familiarity with the lcshs, integrating cross-references for authorized subject terms, enabling more sophisticated facetbased access to subject information by allowing users to manipulate facets independently, and exposing hierarchical and associative relationships among lcshs. 
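the hierarchy-aware assistance described here can be sketched with a tiny hand-coded vocabulary; in a real system the broader (bt), narrower (nt), and related (rt) terms would come from lcsh authority data rather than the illustrative dictionary below.

```python
# Toy subject-authority data: each heading lists its broader (BT), narrower (NT),
# and related (RT) terms, the way an LCSH authority record would. Real data would
# be loaded from authority records, not hard-coded.
VOCAB = {
    "cookery": {"BT": ["home economics"],
                "NT": ["cookery, french", "vegetarian cookery"],
                "RT": ["food habits"]},
    "vegetarian cookery": {"BT": ["cookery"], "NT": [], "RT": ["vegetarianism"]},
}

def expand_subject(term, relations=("BT", "NT", "RT")):
    """Return the heading itself plus its broader/narrower/related headings."""
    entry = VOCAB.get(term.lower(), {})
    expanded = [term.lower()]
    for rel in relations:
        expanded.extend(entry.get(rel, []))
    return expanded

print(expand_subject("Vegetarian cookery"))
# -> ['vegetarian cookery', 'cookery', 'vegetarianism']
```

a discovery layer could present such expansions as clickable suggestions beside the result list, leaving the user in control of whether to broaden or narrow the search.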
ideally, once the user has identified a helpful controlled vocabulary term, it should be possible to expand, refine, or change the focus of a search through broader, narrower, and related terms in the lcsh’s hierarchy as well as to discover various aspects of a topic through browse lists of topical subdivisions or via facets. known-item searching important as it is for the discovery layer to facilitate topical exploration, our data suggests that searchworks remains, first and foremost, a known-item lookup tool. while a typical searchworks user rarely has problems with known-work searches, our analysis of clusters of closely related searches has revealed several situations where users’ known-item search experience could be improved. for example, when the desired resource is not in the library’s collection, the user is rarely left with empty result sets because of automatic word-stemming and cross-field searching. while this is a boon for exploratory searching, it becomes a problem when the user needs to ensure that the item sought is not included in the library’s collection. another common scenario arises when the query is too generic, imprecise, or simply erroneous, or when the search string entered by the user does not match the metadata in the bibliographic record, causing the most relevant resources to be pushed too far down the results list to be discoverable. providing helpful “did you mean . . . ” suggestions could potentially help the user distinguish between these two scenarios. another feature that would substantially benefit the user struggling with the problem of noisy retrievals is highlighting the user’s search terms in retrieved records. displaying search matches could alleviate some of the concerns over lack of transparency as to why seemingly irrelevant results are retrieved, repeatedly expressed in user feedback, as well as expedite the process of relevance assessment. author searching author searching remains problematic because of a convergence of factors: a. misspellings. according to our data, typographical errors and misspellings are by far the most common problem in author searching. when such errors occur in personal names, they are much more difficult to identify than errors in the title, and in the absence of library discovery products: discovering user expectations through failure analysis |irina trapido |doi:10.6017/ital.v35i2.9190 20 index-based spell-checking mechanisms, often require the use of external sources to be corrected. b. mismatch between the form and fullness of the name entered by the user and the form of the name in the bibliographic record. for example, a user’s search for “d. reynolds” will retrieve records where “d” and “reynolds” appear anywhere in the record (or anywhere in the author fields, if the user opts for a more focused “author” search), but will not bring up records where the author’s name is recorded as “reynolds, david.” c. lack of cross-reference searching of the lc name authority file. if the user searches for a variant name represented by a cross-reference on an authority record, she might not be directed to the authorized form of the name. d. lack of name disambiguation, which is especially problematic when the search is for a common name. while the process of name authority control ensures the uniqueness of name headings, it does not necessarily provide information that would help users distinguish between authors. 
for instance, the user often has to know the author’s middle name or date of birth to choose the correct entry, as exemplified by the following choices in the “author” facet resulting from the query “david kelly”:
kelly, david
kelly, david (david d.)
kelly, david (david francis)
kelly, david f.
kelly, david h.
kelly, david patrick
kelly, david st. leger
kelly, david t.
kelly, david, 1929 july 11–
kelly, david, 1929–
kelly, david, 1929–2012
kelly, david, 1938–
kelly, david, 1948–
kelly, david, 1950–
kelly, david, 1959–
e. errors and inaccuracies in the bibliographic records. given the past practice of creating undifferentiated personal-name authority records, it is not uncommon to have one name heading for different authors or contributors. conversely, situations where a single person is identified by multiple headings (largely because some records still contain obsolete or variant forms of a personal name) are also prevalent and may become a significant barrier to effective retrieval as they create multiple facet values for the same author or contributor. f. inability to perform an exhaustive search on the author’s name. a fielded “author” search will miss the records where the name does not appear in the “author” fields but appears elsewhere in the bibliographic record. g. relevance ranking. because search terms occurring in the title have more weight than search terms in the “author” fields, works about an author are ranked higher than works of the author. browsing like many other next-generation discovery systems, searchworks features faceted navigation, which facilitates both general-purpose browsing and more targeted search. in searchworks, facets are displayed from the outset, providing a high-level overview of the collection and jumping-off points for further exploration. rather than having to guess the entry vocabulary, the searcher may just choose from the available facets and explore the entire collection along a specific dimension. however, findings from our manual analysis of the query stream suggest that facets as a browsing tool might not be used to their fullest potential: users often resort to keyword searching when faceted browsing would have been a more optimal strategy. there are at least two factors that contribute to this trend. the first is users’ lack of awareness of this interface feature: it is common for searchworks users to issue queries such as “dissertations,” “theses,” and “newspapers” instead of selecting the appropriate value of the “format” facet. second, many of the facets that could be useful in the discovery process are not available as top-level browsing categories. for example, users expect more granular faceting of audiovisual resources, which would include the ability to browse by content type (“computer games,” “video games”) and genre (“feature films,” “documentaries,” “tv series,” “romantic comedies”). another category of resources commonly accessed by browsing is theses and dissertations. users frequently try to browse dissertations by field or discipline (issuing searches such as “linguistics thesis,” “dissertations aeronautics,” “phd thesis economics,” “biophysics thesis”), by program or department and by the level of study (undergraduate, master’s, doctoral), and could benefit from a set of facets dedicated to these categories.
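one possible, purely illustrative way of softening the keyword-versus-facet mismatch noted above is to recognize common format words in a query and offer the corresponding facet refinement; the mapping and facet values below are invented for the sketch and do not reflect searchworks configuration.

```python
# Hypothetical mapping from format words users type to an indexed facet value.
# Both the facet name and its values are invented for this sketch.
FORMAT_TERMS = {
    "dissertations": ("format", "Thesis/Dissertation"),
    "theses": ("format", "Thesis/Dissertation"),
    "newspapers": ("format", "Newspaper"),
    "movies": ("format", "Video/Film"),
}

def suggest_facets(query):
    """Split a raw query into remaining keywords and suggested facet refinements."""
    keywords, facets = [], []
    for token in query.lower().split():
        if token in FORMAT_TERMS:
            facets.append(FORMAT_TERMS[token])
        else:
            keywords.append(token)
    return " ".join(keywords), facets

print(suggest_facets("woody allen movies"))
# -> ('woody allen', [('format', 'Video/Film')])
```

whether such a rewrite should be applied automatically or only suggested to the user is a design question that log data alone cannot settle.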
browsing for books could be enhanced by additional faceting related to intellectual content, such as genre and literary form (e.g., “fantasy,” “graphic novels,” “autobiography,” “poetry”) and audience (e.g., “children’s books”). users also want to be able to browse for specific subsets of materials on the basis of their location (e.g., permanent reserves at the engineering library). browsing for new acquisitions with the option of limiting to a specific topic is also a highly desirable feature. library discovery products: discovering user expectations through failure analysis |irina trapido |doi:10.6017/ital.v35i2.9190 22 while some browsing categories are common across all types of resources, others only apply to specific types of materials (e.g., music, cartographic/geospatial materials, audiovisual resources, etc.). for example, there is a strong demand among music searchers for systematic browsing by specific musical instruments and their combinations. ideally, the system should offer both an optimal set of initial browse options and intuitive context-specific ways to progressively limit or expand the search. offering such browsing tools may require improvements in system design as well as significant data remediation and enhancement because much of the metadata that could be used to create these browsing categories is often scattered across multiple fixed and variable fields in the bibliographic records, inconsistently recorded, or not present at all. one of the hallmarks of modern discovery tools has been their increased focus on developing tools that would facilitate serendipitous browsing. searchworks was one of the pioneers to offer virtual “browse shelf” feature, which is aimed at emulating browsing the shelves in a physical library. however, because this functionality relies on the classification number, it does not allow browsing of many other important groups of materials, such as multimedia resources, rare books, or archival resources. call-number proximity is only one of the many dimensions that could be leveraged to create more opportunities for serendipitous discoveries. other methods of associating related content might include recommendations based on subject similarity, authorship, keyword associations, forward and backward citations, and use. implications for practice addressing the issues that we identified would involve improvements in several areas: • scope. our findings indicate that library users increasingly perceive the discovery interface as a portal to all of the library’s resources. meeting this need goes far beyond offering the ability to search multiple content sources from a single search box: it is just as important to help users make sense of the results of their search and to provide easy and convenient ways to access the resources that they have discovered. and whatever the scope of the library discovery layer is, it needs to be communicated to the user with maximum clarity. • functionality. users expect a robust and fault-tolerant search system with a rich suite of search-assistance features, such as index-based alternative spelling suggestions, result screens displaying keywords in context, and query auto-completion mechanisms. 
these features, many of which have become deeply embedded into user search processes elsewhere on the web, could prevent or alleviate a substantial number of issues related to problematic user queries (misspellings, typographical errors, imprecise queries, etc.), enable more efficient recovery from errors by guiding users to improved results, and facilitate discovery of foreign-language materials. equally important is the continued focus on relevance ranking algorithms, which ideally should move beyond simple keyword information technology and libraries | september 2016 23 matching techniques toward incorporating social data as well as leveraging the semantics of the query itself and offering more intelligent and possibly more personalized results depending on the context of the search. • metadata. the quality of the user experience in the discovery environments depends as much on the metadata as it does on the functionality of the discovery layer. thus it remains extremely important to ensure consistency, granularity, and uniformity of metadata, especially as libraries are increasingly faced with the problem of integrating heterogeneous pools of metadata into a single discovery tool. conclusions and future directions the analysis of the transaction log data and user feedback has helped us identify several common patterns of search failure, which in turn can reveal important assumptions and expectations that users bring to the library discovery. these expectations pertain primarily to the system’s functionality: in addition to simple, intuitive, and visually appealing interfaces and relevanceranked results, users expect a sophisticated search system that would consistently produce relevant results even for incomplete, inaccurate, or erroneous queries. users also expect a more centralized, comprehensive, and inclusive search environment that would enable more in-depth discovery by offering article-level, chapter-level, and full-text searching. finally, the results of this study have underscored the continued need for a more flexible and adaptive system that would be easy to use for novices while offering advanced functionality and more control over the search process for the “power” users, a system that would provide targeted support for the different types of information behavior (known-item look-up, author searching, topical exploration, browsing) and would facilitate both general inquiry and very specialized searches (e.g., searches for music, cartographic and geospatial materials, digital collections of images, etc.). just like discovery itself, building discovery tools is a dynamic, complex, iterative process that requires intimate knowledge of ever-changing and evolving user needs and expectations. it is hoped that ongoing focus on user problems and frustrations in the new discovery environments can complement other assessment methods by identifying unmet user needs, thus helping create a more holistic and nuanced picture of users’ search and discovery behaviors. references 1. marshall breeding, “library resource discovery products: context, library perspectives, and vendor positions,” library technology reports 50, no. 1 (2014): 5–58. 2. craig silverstein et al., “analysis of a very large web search engine query log,” sigir forum 33, no. 1 (1999): 6–12; bernard j. 
jansen, amanda spink, and tefko saracevic, “real life, real users, and real needs: a study and analysis of user queries on the web,” information library discovery products: discovering user expectations through failure analysis |irina trapido |doi:10.6017/ital.v35i2.9190 24 processing & management 36, no. 2 (2000): 207–27, http://dx.doi.org/10.1016/s03064573(99)00056-4; amanda spink, bernard j. jansen, and h. cenk ozmultu, “use of query reformulation and relevance feedback by excite users,” internet research 10, no. 4 (2000): 317–28; amanda spink et al., “searching the web: the public and their queries,” journal of the american society for information science & technology 52, no. 3 (2001): 226–34; bernard j. jansen and amanda spink, “an analysis of web searching by european allteweb.com users,” information processing & management 41, no. 2 (2005): 361–81, http://dx.doi.org/10.1016/s0306-4573(03)00067-0. 3. cory lown and bradley hemminger, “extracting user interaction information from the transaction logs of a faceted navigation opac,” code4lib 7, june 26, 2009, http://journal.code4lib.org/articles/1633; eng pwey lau and dion ho-lian goh, “in search of query patterns: a case study of a university opac,” information processing & management 42, no. 5 (2006): 1316–29, http://dx.doi.org/10.1016/j.ipm.2006.02.003; heather moulaison, “opac queries at a medium-sized academic library: a transaction log analysis,” library resources & technical services 52, no. 4 (2008): 230–37. 4. william h. mischo et al., “user search activities within an academic library gateway: implications for web-scale discovery systems,” in planning and implementing resource discovery tools in academic libraries, edited by mary pagliero popp and diane dallis, 153–73 (hershey, : information science reference, 2012); xi niu, tao zhang, and hsin-liang chen, “study of user search activities with two discovery tools at an academic library,” international journal of human-computer interaction 30, no. 5 (2014): 422–33, http://dx.doi.org/10.1080/10447318.2013.873281. 5. eng pwey lau and dion ho-lian goh, “in search of query patterns”; niu, zhang, and chen, “study of user search activities with two discovery tools at an academic library.”. 6. lown and hemminger, “extracting user interaction; kristin antelman, emily lynema, and andrew k. pace, “toward a twenty-first century library catalog,” information technology & libraries 25, no. 3 (2006): 128–39; niu, zhang, and chen, “study of user search activities with two discovery tools at an academic library.” 7. xi niu and bradley hemminger, “analyzing the interaction patterns in a faceted search interface,” journal of the association for information science & technology 66, no. 5 (2015): 1030–47, http://dx.doi.org/10.1002/asi.23227. 8. steven d. zink, “monitoring user search success through transaction log analysis: the wolfpac example,” reference services review 19, no. 1 (1991): 49–56; deborah d. blecic et al., “using transaction log analysis to improve opac retrieval results,” college & research libraries 59, no. 
1 (1998): 39–50; holly yu and margo young, “the impact of web search http://dx.doi.org/10.1016/s0306-4573(99)00056-4 http://dx.doi.org/10.1016/s0306-4573(99)00056-4 http://dx.doi.org/10.1016/s0306-4573(03)00067-0 http://journal.code4lib.org/articles/1633 http://dx.doi.org/10.1016/j.ipm.2006.02.003 http://dx.doi.org/10.1080/10447318.2013.873281 http://dx.doi.org/10.1080/10447318.2013.873281 information technology and libraries | september 2016 25 engines on subject searching in opac,” information technology & libraries 23, no. 4 (2004): 168–80; moulaison, “opac queries at a medium-sized academic library.” 9. thomas peters, “when smart people fail,” journal of academic librarianship 15, no. 5 (1989): 267–73; zink, “monitoring user search success through transaction log analysis”; rhonda h. hunter, “successes and failures of patrons searching the online catalog at a large academic library: a transaction log analysis,” reference quarterly (spring 1991): 395–402. 10. karen antell and jie huang, “subject searching success: transaction logs, patron perceptions, and implications for library instruction,” reference & user services quarterly 48, no. 1 (2008): 68–76; hunter, “successes and failures of patrons searching the online catalog at a large academic library”; peters, “when smart people fail.” 11. peters, “when smart people fail.”; moulaison, “opac queries at a medium-sized academic library”; blecic et al., “using transaction log analysis to improve opac retrieval results.” 12. lynn silipigni connaway, debra wilcox johnson, and susan e. searing, “online catalogs from the users’ perspective: the use of focus group interviews,” college & research libraries 58, no. 5 (1997): 403–20, http://dx.doi.org/10.5860/crl.58.5.403. 13. karl v. fast and d. grant campbell, “‘i still like google’: university student perceptions of searching opacs and the web,” asist proceedings 41 (2004): 138–46; eric novotny, “i don’t think i click: a protocol analysis study of use of a library online catalog in the internet age,” college & research libraries 65, no. 6 (2004): 525–37, http://dx.doi.org/10.5860/crl.65.6.525. 14. xi niu et al., “national study of information seeking behavior of academic researchers in the united states,” journal of the american society for information science & technology 61, no. 5 (2010): 869–90, http://dx.doi.org/10.1002/asi.21307; lynn sillipigni connaway, timothy j. dikey, and marie l. radford, “if it is too inconvenient i’m not going after it: convenience as a critical factor in information-seeking behaviors,” library & information science research 33, no. 3 (2011): 179–90; karen calhoun, joanne cantrell, peggy gallagher and janet hawk, online catalogs: what users and librarians want: an oclc report (dublin, oh: oclc online computer library center, 2009). 15. f. william chickering and sharon q. young, “evaluation and comparison of discovery tools: an update,” information technology & libraries 33, no.2 (2014): 5–30, http://dx.doi.org/10.6017/ital.v33i2.3471. 16. william denton and sarah j. coysh, “usability testing of vufind at an academic library,” library hi tech 29, no. 2 (2011): 301–19, http://dx.doi.org/10.1108/07378831111138189; jennifer emanuel, “usability of the vufind next-generation online catalog,” information technology & libraries 30, no. 
1 (2011): 44–52; erin dorris cassidy et al., “student searching http://dx.doi.org/10.5860/crl.58.5.403 http://dx.doi.org/10.5860/crl.65.6.525 http://dx.doi.org/10.1002/asi.21307 http://dx.doi.org/10.6017/ital.v33i2.3471 http://dx.doi.org/10.1108/07378831111138189 library discovery products: discovering user expectations through failure analysis |irina trapido |doi:10.6017/ital.v35i2.9190 26 with ebsco discovery: a usability study,” journal of electronic resources librarianship 26, no. 1 (2014): 17–35, http://dx.doi.org/10.1080/1941126x.2014.877331. 17. sarah c. williams and anita k. foster, “promise fulfilled? an ebsco discovery service usability study,” journal of web librarianship 5, no. 3 (2011): 179–98, http://dx.doi.org/10.1080/19322909.2011.597590; rice majors, “comparative user experiences of next-generation catalogue interfaces,” library trends 61, no. 1 (2012): 186– 207; andrew d. asher, lynda m. duke, and suzanne wilson, “paths of discovery: comparing the search effectiveness of ebsco discovery service, summon, google scholar, and conventional library resources,” college & research libraries 74, no. 5 (2013): 464–88. 18. jody condit fagan et al., “usability test results for a discovery tool in an academic library,” information technology & libraries 31, no. 1 (2012): 83–112; megan johnson, “usability test results for encore in an academic library,” information technology & libraries 32, no. 3 (2013): 59–85. 19. elizabeth (bess) sadler, “project blacklight: a next generation library catalog at a first generation university,” library hi tech 27, no. 1 (2009): 57–67, http://dx.doi.org/10.1108/07378830910942919; bess sadler, “stanford's searchworks: unified discovery for collections?” in more library mashups: exploring new ways to deliver library data, edited by nicole c. engard, 247–260 (london: facet, 2015). 20. andrew d. asher, lynda m. duke and suzanne wilson, “paths of discovery: comparing the search effectiveness of ebsco discovery service, summon, google scholar, and conventional library resources,” college & research libraries 74, no. 5 (2013): 464–88; kelly meadow and james meadow, “search query quality and web-scale discovery: a qualitative and quantitative analysis,” college & undergraduate libraries 19, no. 2–4 (2012): 163–75, http://dx.doi.org/10.1080/10691316.2012.693434. 21. sarah c. williams and anita k. foster, “promise fulfilled? an ebsco discovery service usability study,” journal of web librarianship 5, no. 3 (2011): 179–98, http://dx.doi.org/10.1080/19322909.2011.597590; kathleen bauer and alice peterson-hart, “does faceted display in a library catalog increase use of subject headings?” library hi tech 30, no. 2 (2012), 347–58, http://dx.doi.org/10.1108/07378831211240003. http://dx.doi.org/10.1080/1941126x.2014.877331 http://dx.doi.org/10.1080/19322909.2011.597590 http://dx.doi.org/10.1108/07378830910942919 http://dx.doi.org/10.1080/10691316.2012.693434 http://dx.doi.org/10.1080/19322909.2011.597590 http://dx.doi.org/10.1108/07378831211240003 abstract introduction references jeng it is our flagship: surveying the landscape of digital interactive displays in learning environments lydia zvyagintseva information technology and libraries | june 2018 50 lydia zvyagintseva (lzvyagintseva@epl.ca) is the digital exhibits librarian at the edmonton public library in edmonton, alberta. abstract this paper presents the findings of an environmental scan conducted as part of a digital exhibits intern librarian project at the edmonton public library in 2016. 
as part of the library’s 2016–2018 business plan objective to define the vision for a digital exhibits service, this research project aimed to understand the current landscape of digital displays in learning institutions globally. the resulting study consisted of 39 structured interviews with libraries, museums, galleries, schools, and creative design studios. the environmental scan explored the technical infrastructure of digital displays, their user groups, various uses for the technologies within organizational contexts, the content sources, scheduling models, and resourcing needs for this emergent service. additionally, broader themes surrounding challenges and successes were also included in the study. despite the variety of approaches taken among learning institutions in supporting digital displays, the majority of organizations have expressed a high degree of satisfaction with these technologies. introduction in 2020, the stanley a. milner library, the central branch of the edmonton (alberta) public library (epl) will reopen after extensive renovations to both the interior and exterior of the building. as part of the interior renovations, epl will have installed a large digital interactive display wall modeled after the cube at queensland university of technology (qut) in brisbane, australia. to prepare for the launch of this new technology service, epl hired a digital exhibits intern librarian in 2016, whose role consisted of conducting research to inform the library in defining the vision for a digital display wall serving as a shared community platform for all manner of digitally accessible and interactive exhibits. as a result, the author carried out an environmental scan and a literature review related to digital display, as well as their consequent service contexts. for the purposes of this paper, “digital displays” refers to the technology and hardware used to showcase information, whereas “digital exhibits” refers to content and software used on those displays. wherever the service of running, managing, or using this technology is discussed, it is framed as “digital display service” and concerns both technical and organizational aspects of using this technology in a learning institution. method the data were collected between may 30 and august 20, 2016. a series of structured interviews were conducted by skype, phone, and email. the study population was driven by searching google mailto:lzvyagintseva@epl.ca it is our flagship | zvyagintseva 51 https://doi.org/10.6017/ital.v37i2.9987 and google news for keywords such as “digital interactive and library,” “interactive display,” “public display,” or “visualization wall” to identify organizations that have installed digital displays. a list of the study population was expanded by reviewing websites of creative studios specializing in interactive experiences and through a snowball effect once the interviews had begun. a small number of vendors, consisting primarily of creative agencies specializing in digital interactive services, were also included in the study population. participants were then recruited by email. the goal of this project was to gain a broad understanding of the emergent technology, content, and service model landscape related to digital displays. as a result, structured interviews were deemed to be the most appropriate method of data collection because of their capacity to generate a large amount of qualitative and quantitative data. in total, 39 interviews were conducted. 
a list of interview questions prepared for the interviews is included in appendix a. additionally, a complete list of the study population can also be found in appendix b. predominantly, organizations from canada, the united states, australia, and new zealand are represented in this study. literature review definitions • public displays, a term used in the literature to refer to a particular type of digital display, can refer to “small or large sized screens that are placed indoor . . . or outdoor for public viewing and usage” and which may be interactive to support information browsing and searching activities.”1 in public displays, a large proportion of users are passers-by and thus first-time users.2 in academic environments, these technologies may be referred to as “video walls” and have been characterized as display technologies with little interactivity and input from users, often located in high-traffic, public areas with content prepared ahead of time and scheduled for display according to particular priorities.3 • semi-public displays, on the other hand, can be understood as systems intended to be used by “members of a small, co-located group within a confined physical space, and not general passers-by.”4 in academic environments, they have been referred to as “visualization spaces” or “visualization studios,” and can be defined as workspaces with real-time content displayed for analysis or interpretation, often placed in in libraries or research department units.5 for the purposes of this paper, “digital displays” refers to both public and semi-public displays, as organizations interviewed as part of this study had both types of displays, occasionally simultaneously. • honeypot effect describes how people interacting with an information system, such as a public display, stimulate other users to observe, approach, and engage in interaction with that system.6 this phenomenon extends beyond digital displays to tourism, art, or retail environments, where a site of interest attracts attention of passers-by and draws them to participate in that site. interactivity the area of interactivity with public displays has been studied by many researchers, with three commonly used modes of interaction clearly identified: touch, gesture, and remote modes. information technology and libraries | june 2018 52 • touch (or multi-touch): this is the most common way users interact with personal mobile devices such as smartphones and tablets. multi-touch interaction on public displays should support many individuals interacting with the digital screen simultaneously, since many users expect immediate access and will not take turns. for example, some technologies studied in this report support up to 30 touch points at any given time, while others, like qut’s the cube, allow for a near infinite number of touch points. though studies show that this technique is fast and natural, it also requires additional physical effort from the user.7 while touch interaction using infrared sensors has a high touch recognition rate, its shortcomings have been identified as being expensive and being influenced by light interference, such as light around the touch screen.8 • gesture: this is interaction is through movement of the user’s hands, arms, or entire body, recognized by sensors such as the microsoft kinect or leap motion systems. 
although studies show that this type of interaction is quick and intuitive, it also brings “a cognitive load to the users together with the increased concern of performing gestures in public spaces.”9 specifically, body gestures were found not to be well suited to passing-by interaction, unlike hand gestures, which can be performed while walking. hand gestures also have an acceptable mental, physical and temporal workload.10 research into gesturebased interaction shows that “more movement can negatively influence recall” and is therefore not suited for informational exhibits.11 similarly, people consider gestures to be too much work “when they require two hands and large movements” to execute.12 not surprisingly, research suggests that gestures deemed to be socially acceptable for public spaces are small, unobtrusive ones that mimic everyday actions. they are also more likely to be adopted by users. • remote: these are interactions using another device, such as mobile phones, tablets, virtual-reality headsets, game controllers, and other special devices. connection protocols may include bluetooth, sms messaging, near-field communication, radio-frequency identification, wireless-network connectivity, and other methods. mobile-based interaction with public displays has received a lot of attention in research, media, and commercial environments because this mode allows users to interact from variable distance with minimal physical effort. however, users often find mobile interaction with a public display “too technical and inconvenient” because it requires sophisticated levels of digital literacy in addition to having access to a suitable device.13 some suggest that using personal devices for input also helps “avoid occlusion and offers interaction at a distance” without requiring multi-touch or gesture-based interactions.14 as well, subjects in studies on mobile interaction often indicate their preference for this mode because of its low mental effort and low physical demand. however, it is possible that these studies focused on users with high degrees of digital literacies rather than the general public with varying degrees of access and comfort with mobile technologies. user engagement attracting user attention is not necessarily guaranteed by virtue of having a public display. according to research, the most significant factors that influence user engagement with public digital displays are age, display content, and social context. it is our flagship | zvyagintseva 53 https://doi.org/10.6017/ital.v37i2.9987 age hinrichs found that children were the first to engage in interaction with public displays and would often recruit adults accompanying them toward the installation.15 on the other hand, the hinrichs found adults to be more hesitant in approaching the installation: “they would often look at it from a distance before deciding to explore it further.”16 these findings suggest that designing for children first is an effective strategy for enticing interaction from users of all ages. display content studies on engagement in public digital display environments indicate that both passive and active types of engagement exist with digital displays. the role of emotion in the content displayed also cannot be overlooked. specifically, clinch et al. 
state that people typically pay attention to displays “only when they expected the content to be of interest to them” and that they are “more likely to expect interesting content in a university context rather than within commercial premises.”17 in other words, the context in which the display is situated affects user expectations and primes them for interaction. the dominant communication pattern in existing display and signage systems has been narrowcast, a model in which displays are essentially seen as distribution points for centrally created content without much consideration for users. this model of messaging exists in commercial spaces, such as malls, but also in public areas like transit centers, university campuses, and other spaces where crowds of people may gather or pass by. observational studies indicate that people tend to perceive this type of content as not relevant to them and ignore it.18 for public displays to be engaging to end users, in other words, “there needs to be some kind of reciprocal interaction.”19 in public spaces, interactive displays may be more successful than noninteractive displays in engaging viewers and making city centers livelier and more attractive.20 in terms of precise measures of attention to such displays, studies of average attention time correlate age with responsiveness to digital signage. children (1–14 years) are more receptive than adults and men spend more time observing digital signage than women.21 studies also indicate a significantly higher average attention times for observing dynamic content as compared to static content.22 scholars like buerger suggest that designers of applications for public digital displays should assume that viewers are not willing “to spend more than a few seconds to determine whether a display is of interest.”23 instead, they recommend presenting informational content with minimal text and in such a way that the most important information can be determined in two-to-three seconds. in a museum context, the average interaction time with the digital display was between two and five minutes, which was also the average time people spent exploring analog exhibits.24 dynamic, game-like exhibits at the cube incorporate all the above findings to make interaction interesting, short, and drawing the attention of children first. social context social context is another aspect that has been studied extensively in the field of human-computer interaction, and it provides many valuable lessons for applying evidence-based practices to technology service planning in libraries. many scholars have observed the honeypot effect as related to interaction with digital displays in public settings. this effect describes how users who are actively engaged with the display perform two important functions: they entice passers-by to become actively engaged users themselves, and they demonstrate how to interact with the technology without formal instruction. information technology and libraries | june 2018 54 many argue that a conductive social context can “overcome a poor physical space, but an inappropriate social context can inhibit interaction” even in physical spaces where engagement with the technology is encouraged.25 this finding relates to use of gestures on public displays. researchers also found that contextual social factors such as age and being around others in a public setting do, in fact, influence the choice of multi-touch gestures. 
hinrichs suggests enabling a variety of gestures for each action—accommodating different hand postures and a large number of touch points, for example—to support fluid gesture sequences and social interactions.26 a major deterrent to users’ interaction with large public displays has been identified as the potential for social embarrassment.27 as an implication, the authors suggest positioning the display along thoroughfares of traffic and improving how the interaction principles of the display are communicated implicitly to bystanders, thus continually instructing new users on techniques of interaction.28 findings technical and hardware landscape the average age of public displays was around three years, indicating an early stage of development of this type of service among learning institutions. such technologies first appeared in europe more than 10 years ago (for example, the most widely cited early example of a public display is the citywall in helsinki in 2007).29 however, adoption in north american did not start until around 2013.the median year for the installation of these technologies among organizations studied in this report is 2014. among public institutions represented in the study population, such as public libraries and museums, digital displays were most frequently installed in 2015. while most organizations have only one display space, it was not unusual to find several within a single organization. for example, for the purposes of this study, the researcher has counted the cube as three display spaces, as documentation and promotional literature on the technology cites “3 separate display zones.” as a result, the average number of display spaces in the population of this study is 1.75. the following modes of interaction beyond displaying video content with digital displays have been observed in the study population in descending order of frequency: • sound (79%). while research on human-computer interaction is inconclusive about best practices related to incorporating sound into digital interactive displays, it is clear, among the organizations interviewed in the environmental scan, that sound is a major component of digital exhibits and should not be overlooked. • touch or multi-touch (46%). this finding highlights that screens capable of supporting multi-user interaction is not consistent across the study population. • gesture (25%): these include tools such as microsoft kinect, leap motion, or other systems for detecting movement for interaction. • mobile (14%). while some researchers in the human-computer interaction field suggest mobile is the most effective way to bridge the divide between large public displays, personalization of content, and user engagement, mobile interactivity is not used frequently to engage with digital displays in the study population. one outlier is north carolina state university library, which takes a holistic, “massively responsive design” approach in which responsive web design principles are applied to content that can be it is our flagship | zvyagintseva 55 https://doi.org/10.6017/ital.v37i2.9987 displayed effectively at once online, on digital display walls, and on mobile devices while optimizing institutional resources dedicated to supporting visualization services. further, as in the broader personal computing environment, the microsoft windows operating system dominates display systems, with 61% of the organizations choosing a windows machine to power their digital display. 
a fifth (21%) of all organizations have some form of networked computing infrastructure, such as the cube with its capacity to process exhibit content using 30 servers. instead, the majority (79%) of organizations interviewed have a single computer powering the display. this finding is perhaps not surprising, given that few institutions have dedicated it teams to support a single technology service like the cube. users and use cases understanding primary audiences was also important for this study, as the organizational user base defines the context for digital exhibits. the breakdown of these audiences is summarized in figure 1. for example, the university of oregon ford alumni center’s digital interactive display focuses primarily on showcasing the success of its alumni, with a goal of recruiting new students to the university. however, the interactive exhibits also serve the general public through tours and events on the university of oregon campus. other organizations with digital displays, such as all saints anglican school and the philadelphia museum of art, also target specific audiences, so planning for exhibits may be easier in those contexts than in organizations like the university of waterloo stratford campus, with its display wall at the downtown campus that receives visitor traffic from students, faculty, and the public. 44% 33% 22% types of audience academic public both public and academic information technology and libraries | june 2018 56 figure 1. audience types for digital displays in the study population. digital displays serve various purposes, which depend on the context of the organization in which they exist, their technical functionality, their primary audience, their service design, and other factors. interview participants were asked about the various uses for these technologies at their institutions. a single display could have multiple functions within a single institution. the following list summarizes these multiple uses: 1. educational (67%), such as displaying digital collections, archives, historical maps, and other informational. these activities can be summarized in the words of one participant as “education via browse”—in other words, self-guided discovery rather than formal instruction. 2. fun or entertainment (56%), including art exhibitions, film screenings, games, playful exhibits, and other engaging content to entice users. 3. communication (47%), which can be considered a form of digital signage to promote library or institutional services and marketing content. displays can also deliver presentations and communicate scholarly work. 4. teaching (42%), including formal and semi-formal instruction, workshops, student presentations, and student course-work showcases. 5. events (31%), such as public tours, conferences, guest speakers, special events, galas, and other social activities near or using the display. 6. community engagement (28%), including participation from community members through content contribution, showing local content, using the display technology as an outreach tool, and other strategies to build relationships with user communities. 7. research (22%), where the display functions as a tool that facilitates scholarly activities like data collection, analysis, and peer review. many study participants acknowledged challenges in using digital displays for this purpose and have identified other services that might support this use more effectively. 
content types and management in the words of deakin university librarians, “content is critical, but the message is king,” so it was particularly important for the author to understand the current digital display landscape as it relates to content.30 specifically, the research project encompassed the variety of content used on digital displays as well as how it is created, managed, shared, and received by the audiences of various organizations interviewed in this study. as can be observed in figure 2, all organizations supported 2d content, such as images, video, audio, presentation slides, and other visual and textual material. however, dynamic forms of content, such as social media feeds, interactive maps, and websites were less prevalent. it is our flagship | zvyagintseva 57 https://doi.org/10.6017/ital.v37i2.9987 figure 2. types of content supported by digital displays in the study population. discussions around interest in emergent, immersive, and dynamic 3d content such as games and virtual and augmented reality also came up frequently in the study interviews, and the researcher found that these types of content were supported in only 16 (57%) of the 28 total cases. this number is lower than the total number of interviewees because not all organizations interviewed had content to manage or display. in addition, many organizations recognized that they would likely be exploring ways to present 3d games or immersive environments through their digital display in the near future. not surprisingly, the creative agencies included in this study revealed an awareness and active development of content of this nature, noting “rising demand and interest in 3d and game-like environments.” furthermore, projects involving motion detection, the internet of things, and other sensor-based interactions are also seeing rise in demand, according to study participants. 100 % 61 % 57 % 0 10 20 30 40 50 60 70 80 90 100 content types supported content types static 2d dynamic web dynamic 3d information technology and libraries | june 2018 58 figure 3. content management systems for digital displays. in terms of managing various types of content, 20 (71%) of the organizations interviewed had used some form of content management system (cms), while the rest did not use any tool to manage or organize content. of those organizations that used a cms, 15 (75%) relied on a vendorsupplied system, such as tools by fourwinds interactive, visix, or nec live. the remaining 5 (18%) cms users created a custom solution without going to a vendor. this finding suggests that since the majority of content supported by organizations with digital displays is 2d, current vendor solutions for managing that content are sufficient for the study population at this point. it is unclear how the rise in demand for dynamic, game-like content will be supported by vendors in the coming years. table 1 reflects the distribution of approaches to managing content observed in the study population. 18% 11% 53% 18% 71% content management no system unknown vendor-supplied system in-house created system it is our flagship | zvyagintseva 59 https://doi.org/10.6017/ital.v37i2.9987 table 1. 
content management in study population

content management          responses    %
vendor supplied system      15           54
in-house created system     5            18
no system                   5            18
unknown                     3            10

middleware, automation, and exhibit management middleware can be described as the layer of software between the operating system and applications running on the display, especially in a networked computing environment. for example, most organizations studied in the environmental scan supported a windows environment with a range of exhibit applications, like slideshows, web browsers, and executable files, such as games. middleware can simplify and automate the process of starting up, switching between, and shutting off display applications on a set schedule. as figure 4 demonstrates, the majority of the organizations in the study population (17, or 61%) did not have a middleware solution. however, this group was heterogeneous: 14 organizations (50%) did not require a middleware solution because they ran content semi-permanently or relied on user-supplied content, in which case the display functioned as a teaching tool. the remaining three organizations (11%) manually managed scheduling and switching between exhibit content. in such cases, a middleware solution would be valuable to management of content, especially as the number of applications grows, but it was not present in these organizations. comparatively, 10 organizations (36%) used a custom solution, such as a combination of windows or linux scripts to manage automation and scheduling of content on the display. one organization (3%) did not specify their approach to managing content. these findings suggest that no formalized solution to automating and managing software currently exists among the study population. in addition to organizing content, digital-exhibits services involve scheduling or automating content to meet user needs according to the time of day, special events, or seasonal relevance. as a result, the middleware technology solution supports sustainable management of displays and predictable sharing of content for end users. this environmental scan revealed that digital exhibits and interactive experiences are still in the early days of development. it is possible that new solutions for managing content both at the application and the middleware level may emerge in the coming years, but they are currently limited.

figure 4. middleware solutions in the study population.

sources of content when finding sources of content to be displayed on digital displays, organizations interviewed used multiple strategies simultaneously. table 2 below brings together the findings related to this theme.

table 2. content sources for digital exhibits

content source                %
external/commissioned         64
user-supplied                 64
internal/in-house             50
collaborative with partner    43

for example, many organizations rely on their users to generate and submit material (18, or 64%); others commission vendors to create exhibits for them (18, or 64%). in 50% of all cases, organizations also produce content for exhibits in-house. in other words, most organizations used a combination of all sources to generate content for their digital displays. only a few use a single source of content, such as the semi-permanent historical exhibit at henrico county public library.
others, like the duke media wall, rely entirely on their users to supply content, which employs a “for students by students” model of content creation. additionally, only 12 (43%) of the organizations interviewed had explored or established some form of partnership for creating exhibits. primarily, these partnerships existed with departments, centers, institutes, campus units, and/or students in academic settings, such as the computer science department, faculty of graduate studies, and international studies. other examples of partnerships were with similar civic, educational, cultural, and heritage organizations, such as municipal libraries, historical societies, art galleries, museums, and nonprofits. examples included study participants working with ars electronica, local symphony orchestras, harvard space science, and nasa on digital exhibits. clearly, a variety of approaches were taken in the study population to come up with digital exhibits content. content creation guidelines seven organizations (19%) in the study population shared publicly the content guidelines aimed to simplify the process of engaging users in creating exhibits. these guidelines were analyzed, and key elements were identified that are necessary for users to know in order to contribute in a meaningful way, thereby lowering the barrier to participation. these elements include resolution of the display screen(s), touch capability, ambient light around the display space, required file formats, and maximum file size. a complete list of organizations with such guidelines, along with websites where these guidelines can be found, is included in appendix c. based on the analysis of this limited sample, the bare minimum for community participation guidelines would include clearly outlining • the scope, purpose, audience, and curatorial policy of the digital exhibits service; • the technical specifications, such as the resolution, aspect ratio, and file formats supported by the display; • the design guidelines, such as colors, templates and other visual elements; • the contact information of the digital exhibits coordinator; and • the online or email submission form. it should be noted, however, that such specifications are primarily useful when a cms exists and the content solicited from users is at least somewhat standardized. for example, images, slides, or webpages may be easier for community partners to contribute than video games or 3d interactive content. no examples of guidelines for the latter were observed in the study. content scheduling whereas the middleware section of this study examined the technical approaches to content management and automation, this section explores the frequency of exhibit rotation from a service design perspective. as can be observed in figure 5, no consistent or dominant model for exhibit scheduling has been identified in the study population. generally, approaches to scheduling digital exhibits reflect organizational contexts. for example, museums typically design an exhibit and display it on a permanent basis, while academic institutions change displays of student work or scholarly communication once per semester. the following scheduling models have emerged in the descending order of frequency in the study population. information technology and libraries | june 2018 62 figure 5. content scheduling distribution in the study population. 1. unstructured (29%): no formal approach, policy, or expectation is identified by the organization regarding displaying exhibits. 
this model is largely related to the early stage of service development in this domain, lack of staff capacity to support the service, and/or responsiveness to user needs. one study participant, for example, referred to this loose approach by noting that "no formalized approach and no official policy exists." for example, institutions may have frameworks for what types of content are acceptable but no specific requirements on the content subjects. institutions adopting a lab space model (see figure 6) for digital displays largely belong to this category. in other words, content is created on the fly through workshops, data analysis, and other situations as needed by users. in this case, no formal scheduling is required apart from space reservations.

2. seasonal (29%), which can be defined as a period from three to six months and includes semester-based scheduling in academic institutions. many organizations operate on a quarterly basis, so it would seem logical that content refresh cycles reflect the broader workflow of the organization.

3. permanent (21%): in the cases of museums, permanent exhibits may mean displaying content indefinitely or until the next hardware refresh, which might reconfigure the entire interactive display service. no specific date ranges were cited for this model.

4. monthly (10%): this pattern was observed among academic libraries, with production of "monthly playlists" featuring curated book lists or other monthly specials.

5. weekly (7%): north carolina state university and deakin university libraries aim to have fresh content up once per week; they achieve this in part by formalizing the roles needed to support their digital display and visualization services.

6. daily (4%): only griffith university ensures that new content is available every day on its #seemore display; it does this largely by relying on standardized external and internal inputs, such as weather updates and the university marketing department content.

staffing and skills

one key element of the digital exhibits research project included investigating the staffing models required to support a service of this nature. not surprisingly, the theme around resource needs for digital exhibits emerged in most interviews conducted. several participants have noted that one "can't just throw up content and leave it," while others advised to "have expertise on staff before tech is installed." data gathered shows that the average full-time equivalent (fte) needed to support digital display services in organizations interviewed was 2.97, or around three full-time staff members. in addition, 74% of the organizations studied had maintenance or support contracts with various vendors, including av integrators, cms specialists, creative studios that produced original content, or hardware suppliers. hardware and av integrators typically provided a 12-month contract for technical troubleshooting, while creative studios ensured a 3-month support contract for digital exhibits they designed. the average time to create an original, interactive exhibit was between 9 and 12 months according to the data provided by creative agencies, the cube teams, and learning organizations who have in-house teams creating exhibits regularly.
this length of time varies with the complexity of the interaction designed, the depth of the exhibit "narrative," and the modes of input supported by the exhibit application. additionally, it was important to understand the curatorial labor behind digital exhibits; the author did not necessarily speak with the curator of exhibits, and this work may be carried out by multiple individuals within organizations with digital displays or creative studios. in 20 (57%) of the cases, the person interviewed also curated some or all of the content for the digital display in their respective institutions. in five (14%) of the cases, the individual interviewed was not a curator for any of the content, because there was no need for curation in the first place. for example, displays in these cases were used for analysis or teaching and therefore did not require prepared content. in the rest of the cases (10, or 29%), a creative agency vendor, another member of the team, or a community partner was responsible for the curation of exhibit content. this finding suggests that, while a significant number of organizations outsource the design and curation of exhibits, the majority retain control over this process. therefore, dedicating resources to curation, organization, and management of exhibit content is deemed significant by the organizations represented in the study.

in terms of the capacity to carry out digital display services, skills that have been identified by study participants as being important to supporting work of this nature include the following:

1. technical skills (such as the ability to troubleshoot), general interest in technology, and flexibility and willingness to learn new things (74%)
2. design, visual, and creative sensibility (40%), as this type of work is primarily a visual experience
3. software-development or programming-language knowledge (31%)
4. communication, collaboration, and relationship-building (25%)
5. project management (20%)
6. audiovisual and media skills (14%), as digital exhibits are "as much an av experience as an it experience," according to one study participant
7. curatorial, organizational, and content-management skills (11%)

the most frequent dedicated roles mentioned in the interviews are shown in table 3.

table 3. types of roles significant to digital exhibits work

position                                     responses   %
developer/programmer                         11          31
project manager                              8           23
graphic designer                             6           17
user experience or user interface designer   4           11
it systems administrator                     4           11
av or media specialist                       4           11

the relatively low percentages represented in this table suggest the distribution of the skills mentioned above among various team members, or the combining of multiple skills in a single role, as may be the case in small institutions or those without formalized services with dedicated roles. nevertheless, the presence of specific job titles indicates understanding of the various skill sets needed to run a service that uses digital displays.

challenges and successes

many challenges were identified by study participants related to initiating and supporting a service that uses digital displays for learning. clearly, multiple challenges could be associated with the services related to digital displays within a single organization. however, many successes and lessons learned were also shared by interviewees, often overlapping with identified challenges.
this pattern suggests that some organizations can pursue strategies that address challenges faced by their library or museum colleagues while perhaps lacking resources or capacity in other areas related to this type of service. for example, some organizations have observed a lack of user engagement because of the limited interactivity of the technology solution they used. others have had successful user engagement largely by investing in technology solutions that provide a range of modes of interaction. it is important to learn from both these areas to anticipate possible pain points and to be able to capitalize on successes that lead to industry recognition and engagement from library customers. table 4 summarizes the range of challenges identified.

table 4. challenges related to digital display services

challenge identified     responses   %
technical                14          41
content                  11          33
costs                    11          33
user expectations        11          33
workflow                 10          29
service design           9           26
time                     8           24
organizational culture   8           24
user engagement          7           20

as reflected in table 4, several key challenges have been discussed:

1. technical, such as troubleshooting the technology, keeping up with new technologies or upgrades, and finding software solutions appropriate for the hardware selected.
2. content, such as coming up with original content or curating existing sources. in the words of one participant, "quality and refresh of content is key—it has to be meaningful, interesting, and new." this clearly presents a resource requirement.
3. costs, such as the financial commitment to the service, the unseen costs in putting exhibits together, software licensing, and hardware upgrades.
4. user expectations, such as keeping the service at its full potential and using the maximum functionality of the hardware and software solutions. according to study participants, users "may not want what they think or they say they want," and to some extent, "such technologies are almost an expectation now, and not as exciting for users."
5. workflow or project-management strategies specifically related to emergent multimedia experiences that require new cycles of development and testing.
6. time to plan, source, create, troubleshoot, launch, and improve exhibits.
7. service design, such as thinking holistically about the functions of the technology within the larger organizational structure. as one study participant stated, organizations "cannot disregard the reality of the service being tied to a physical space" in that these types of technologies are both a virtual and physical customer experience.
8. organizational culture and policy, in terms of adapting project-based approaches to planning and resourcing services, getting institutional support, and educating all staff about the purpose, function, and benefits of the service.
9. user engagement, particularly keeping users interested in the exhibits and continually finding new and exciting content. various participants have found that "linger time is between 30 seconds to few minutes" and content being displayed needs to be "something interesting, unique, and succinct, but not a destination in itself."

despite the clear challenges with delivering digital exhibits services, organizations that participated in this study have identified keys to success (see table 5).
table 5. successes and lessons learned in using digital displays

successful approach or lesson identified   responses   %
user engagement and interactivity          16          47
service design                             14          41
"wow" factor                               12          35
organizational leadership                  12          35
technology solution                        10          29
flexibility                                10          29
communication and collaboration            10          29
project management                         9           26
team and skill sets                        9           26

as reflected in table 5, several approaches have been discussed:

• user engagement and interactivity, particularly for those institutions that invested in highly interactive and immersive experiences; the rewards are seen in the interest and enthusiasm of their user groups.
• service design: organizations that have carefully planned the service have found that this technology was successfully serving the needs of their user communities.
• promotion and "wow factor" that has brought attention to the organization and the service. it is not surprising that digital displays are central points on tours of dignitaries, political figures, and external guests. further, many have commented that they "did not imagine a library could be involved in such an innovative experiment," and others have added that their digital displays have "created new conversations that did not exist before."
• leadership and vision at the organizational level, which secures support and resources as well as defines the scope of the service to ensure its sustainability and success: "money is not necessarily the only barrier to doing this service, but risk taking, culture."
• technology solution, where "everything works" and both the organization and users of the service are happy with the functionality, features, and performance of the chosen solution.
• flexibility and willingness to learn new things, including being open to agile project-management methods, taking risks, and continually learning new tools, technologies, and processes as the service matures.
• communication and collaboration, both internally among stakeholders and externally by building community partnerships, new audiences, and user participation in content creation. for example, one study participant noted that the technology "has contributed to giving the museum a new audience of primarily young people and families—a key objective held in 2010 at the commencement of the gallery refurbishments."
• workflow and project management for those embracing new approaches required to bring multiple skill sets together to create engaging new exhibits. as one participant has put it, "these types of approaches require testing, improvement, a new workflow and lifecycle for the projects."
• having the right team with appropriate skills to support the service, though this theme was rated as being less significant than designing services effectively and securing institutional support for the technology service. in other words, study participants noted that having in-house programming or design skills is not enough without a proper definition of success for digital exhibits services.

perceptions

institutional and user reception of digital displays as a service to pursue in learning organizations has been identified as overwhelmingly positive, with 87% of the organizations noting positive feedback. for example, one study participant noted the positive attention received by the wider community for the digital display, stating, "it is our flagship and people are in general impressed by both the potential and some of the existing content."
some participants have gone as far as to say that the reception among users has been "through the roof" and they have "never had a negative feedback comment" about their display. this finding indicates a high degree of satisfaction with such technologies by organizations that pursued a digital display. table 6 further explores the range of perceptions observed in the study.

table 6. perception of digital display services

perception                       responses   %
positive                         20          87
hesitation or uncertainty        7           30
concerns about purpose           4           17
concerns about user engagement   4           17
concerns about costs             3           13
negative                         3           13

a minority (13%) have noted some negative perceptions, largely related to concerns about the costs or functionality of the technology; 30% have observed uncertainty and hesitation on behalf of the staff and users in terms of engagement as well as interrogating its purpose in the organization. for example, one study participant summarizes this mixed sentiment by saying, "the perception is that it's really neat and worthwhile for exploring new ways of teaching, but that the same features and functions could be achieved with less (which we think is a good thing!)." it is helpful to note this trend in perception, as any new service will likely bring a mixture of excitement, hesitation, and occasional opposition. interestingly, these reactions have originated both from the staff of organizations interviewed and their communities of users.

discussion

the findings from this study indicate that the functions of the digital displays are highly dependent on the organizational context in which displays exist. this context, in turn, defines the nature of the services delivered through the digital display. for example, figure 6 can be useful in classifying the various ways digital displays appear in the study population, from research and teaching-oriented lab spaces to public spaces with passive messaging or active immersive game-like digital experiences.

figure 6. types of digital displays in the study population.

as such, visualization walls might belong in the "lab spaces" category that typically appears in academic libraries or research units and do not require content planning and scheduling. what we might call "digital interactive exhibits" tend to appear in museums and galleries with a primarily public audience and may have a permanent, seasonal, or monthly rotation schedule. however, despite a range of approaches taken to provide content and in terms of use of these technologies, many organizations share resourcing needs and challenges, such as troubleshooting the technology solution, creating engaging content, and managing the costs of interactive projects. despite these common concerns, the digital-exhibits services were perceived as being overwhelmingly satisfactory in all types of organizations included in this study because they brought new audiences to the organization and were often seen as "showpieces" in the broader community. the data gathered in the environmental scan demonstrates that there is currently little consistency among digital displays in learning environments. this lack of consistency is seen in content-development methods among study participants, their programming, content management, technology solutions, and even naming of the display (and, by extension, the display service).
for example, this study revealed that no evidently “open platform” for managing content at the application or the middleware level currently exists. a small number of software tools are used by organizations to support digital displays, but their use is in no way standardized, as compared to nearly every other area of library services. there is some indication that digitaldisplay services may become more standardized in the coming years, and more tools, solutions, vendors, and communities of practice will be available. for example, many signage cmss are currently on the market, and the number of game-like immersive experience companies is growing, suggesting extension of these services to libraries in the coming years. only a few software tools exist for creating exhibits, such as intuiface and touchdesigner, though no free, open-source versions of exhibit software are currently available. as well, the growing number of digital exhibits and interactive media companies currently focuses on turnkey—rather than software-as-a-service or platform—solutions. in contrast, some consistency exists in staffing needs and skills required to support the digitalexhibits service. a majority of organizations interviewed agreed that design, software development, systems administration, and project-management skills are needed to ensure digital-exhibits services run sustainably in a learning organization. in addition, lack of public library representation in this study makes it challenging to draw parallels to the library context. adapting museum practices is also not necessarily reliable, as there is rarely a mandate to engage communities and partner on content creation, as there is in libraries. for example, only the el paso (texas) museum of history engages the local community to source and organize content. these findings suggest that digital displays are a growing domain, and more solutions are likely to emerge in the coming years. the cube, compared to the rest of the study population, is a unique service model because it successfully brings together most elements examined in the environmental scan. for example, to ensure continual engagement with the digital display, the cube schedules exhibits on a regular basis and employs user interface designers, systems administrators, software engineers, and project managers. it also extends the content through community engagement, public tours, and stem programming. it has created an in-house middleware solution to simplify exhibit delivery and has chosen unity3d as its platform of choice for exhibit development. limitations only organizations from english-speaking countries were interviewed as part of the environmental scan. it is therefore unclear if access to organizations from non–english-speaking countries would have produced new themes and significantly different results. in addition, as with all environmental scans, the data is limited by the degree of understanding, knowledge, and willingness to share information of the individual being interviewed. particularly, individuals with whom the author spoke may or may not have been technology or service leads for the digital display at their respective institutions. thus, the study participants had a range of understanding of hardware specifications, functionality, and service-design components associated with digital displays. 
for example, having access to technology leads would have likely provided more nuanced responses around the middleware solutions and the underlying technical infrastructure required to support this service. information technology and libraries | june 2018 70 a small number of vendors were also interviewed as part of the environmental scan even though vendors did not necessarily have digital displays or service models parallel to libraries or museums. they are included in appendix b. nevertheless, gathering data from this group was deemed relevant to the study, as creative agencies have formalized staffing models and clearly identified skill sets necessary to support services of this nature. in addition, this group possesses knowledge of best practices, workflows, and project-management processes related to exhibit development. finally, this environmental scan also did not capture any interaction with direct users of digital displays, whose experiences and perceptions of these technologies may or may not support the findings gathered from the organizations interviewed. these limitations were addressed by increasing the sample size of the study within the time and resource constraints of the research project. conclusion the findings of this study show that the functions of digital-display technologies and their related services are highly dependent on the organizational context in which they exist. however, despite a range of approaches taken to provide content and in terms of use of these technologies, many organizations share resourcing needs and challenges, such as troubleshooting the technology solution, creating engaging content, and managing costs of interactive projects. despite these common concerns, digital displays were perceived as being overwhelmingly positive in all types of organizations interviewed in this study, as they brought new audiences to the organization and were often seen as “showpieces” in the broader community. the successes and lessons learned from the study population are meant to provide a broader perspective on this maturing domain as well as help inform planning processes for future digital exhibits in learning organizations. it is our flagship | zvyagintseva 71 https://doi.org/10.6017/ital.v37i2.9987 appendix a. environmental scan questions digital exhibits environmental scan interview questions—museums, libraries, public organizations 1. what are the technical specifications of the digital interactive technology at your institution? 2. who are the primary users of this technology (those interacting with the platform)? is there anyone you thought would use it and isn’t? 3. what are primary uses for the technology (events, presentations, analysis, workshops)? 4. what types content is supported by the technology (video, images, audio, maps, text, games, 3d, all of the above?) 5. where is content created and how is this content managed? 6. what is the schedule for the content and how is it prioritized? 7. can you estimate the fte (full-time equivalent) of staff members involved in supporting this technology/service, both directly and indirectly? what does indirect support for this technology entail? 8. in your experience, what kinds of skills are necessary in order to support this service? 9. have partnerships with other organizations producing content to be exhibited been established or explored? 10. what challenges have you encountered in providing this service? 11. what have been some keys to the successes in supporting this service? 12. 
what has been the biggest success of this service and what has been the biggest disappointment? 13. what is the perception of this technology in institution more broadly? 14. are there any other institutions you suggest we contact to learn more about similar technologies? information technology and libraries | june 2018 72 digital exhibits environmental scan interview questions: vendors 1. what is the relationship between creative studio and hardware/fabrication? do you do everything or work with av integrators instead to put together touch interactives? 2. who have been the primary users of the interactive exhibits and projects you have completed? 3. who writes the use cases when creating a digital interactive exhibit? 4. what types content is supported by the technology (video, images, audio, maps, text, games, 3d, all of the above?) do you see a rise in interest for 3d and game-like environments and do you have internal expertise to support it? 5. where is content created for the exhibits and how is this content managed? who curates? 6. what timespan or lifecycle do you design for? 7. how big is your team? how long to projects typically take to create? 8. what types of expertise do you have in house? what might a project team look like? 9. to what extent is there a goal of sharing knowledge back with the company from clients or users? 10. what challenges have you encountered in providing this service? 11. what have been some keys to the successes in supporting this service? it is our flagship | zvyagintseva 73 https://doi.org/10.6017/ital.v37i2.9987 appendix b: study population in environmental scan organization location date interviewed all saints anglican school merrimac, australia july 25, 2016 anode nashville, tn july 22, 2016 belle & wissell seattle, wa july 26, 2016 bradman museum bowral, australia july 10, 2016 brown university library providence, ri june 3, 2016 university of calgary library and cultural resources calgary, ab june 2, 2016 deakin university library geelong, australia june 14, 2016 university of colorado denver library denver, co june 24, 2016 duke university library durham, nc august 17, 2016 el paso museum of history el paso, tx june 24, 2016 georgia state university library atlanta, ga june 10, 2016 gibson group wellington, new zealand july 16, 2016 henrico county public library henrico, va august 9, 2016 ideum corrales, nm july 26, 2016 indiana university bloomington library bloomington, in may 31, 2016 interactive mechanics philadelphia, pa august 2, 2016 johns hopkins university library baltimore, md june 20, 2016 nashville public library nashville, tn july 22, 2016 north carolina state university library raleigh, nc june 8, 2016 university of north carolina atchapel hill library chapel hill, nc june 2, 2016 university of nebraska omaha omaha, ne june 16, 2016 omaha do space omaha, ne july 11, 2016 university of oregon alumni center eugene, or june 7, 2016 philadelphia museum of art philadelphia, pa august 10, 2016 queensland university of technology brisbane, australia june 30; july 29, 2016; august 16, 2016 société des arts technologiques montreal, qc august 8, 2016 second story portland, or july 28, 2016 st. louis university st. 
louis, mo july 4, 2016 stanford university library stanford, ca july 22, 2016 university of illinois at chicago chicago, il june 22, 2016 university of mary washington fredericksburg, va july 7, 2016 visibull waterloo, on august 12, 2016 university of waterloo stratford campus stratford, on june 22, 2016 yale university center for science and social science information new haven, ct july 13, 2016 information technology and libraries | june 2018 74 appendix c: digital content publishing guidelines organization name guidelines website deakin university library http://www.deakin.edu.au/library/projects/sparking-trueimagination duke university https://wiki.duke.edu/display/lmw/lmw+home griffith university https://intranet.secure.griffith.edu.au/work/digitalsignage/seemore north carolina state university library http://www.lib.ncsu.edu/videowalls university colorado denver http://library.auraria.edu/discoverywall university of calgary library and cultural resources http://lcr.ucalgary.ca/media-walls university of waterloo stratford campus https://uwaterloo.ca/stratford-campus/research/christiemicrotiles-wall http://www.deakin.edu.au/library/projects/sparking-true-imagination http://www.deakin.edu.au/library/projects/sparking-true-imagination https://wiki.duke.edu/display/lmw/lmw+home https://intranet.secure.griffith.edu.au/work/digital-signage/seemore https://intranet.secure.griffith.edu.au/work/digital-signage/seemore http://www.lib.ncsu.edu/videowalls http://library.auraria.edu/discoverywall http://lcr.ucalgary.ca/media-walls https://uwaterloo.ca/stratford-campus/research/christie-microtiles-wall https://uwaterloo.ca/stratford-campus/research/christie-microtiles-wall it is our flagship | zvyagintseva 75 https://doi.org/10.6017/ital.v37i2.9987 references 1 flora salim and usman haque, “urban computing in the wild: a survey on large scale participation and citizen engagement with ubiquitous computing, cyber physical systems, and internet of things,” international journal of human-computer studies 81 (september 2015): 31–48, https://doi.org/10.1016/j.ijhcs.2015.03.003. 2 peter peltonen et al., “it’s mine, don't touch! interactions at a large multi-touch display in a city center,” proceedings of the sigchi conference on human factors in computing systems, florence, italy, april 5–10, 2008, 1285–94, https://doi.org/10.1145/1357054.1357255. 3 shawna sadler, mike nutt, and renee reaume, “managing public video walls in academic library,” (presentation, cni spring 2015 meeting, seattle, washington, april 13-14, 2015), http://dro.deakin.edu.au/eserv/du:30073322/sadler-managing-2015.pdf. 4 peltonen et al., “it’s mine, don't touch!” 5 john brosz, e. patrick rashleigh, and josh boyer. “experiences with high resolution display walls in academic libraries” (presentation, cni fall 2015 meeting, washington, dc, december 13-14, 2015), https://www.cni.org/wp-content/uploads/2015/12/cni_experiences_brosz.pdf; bryan sinclair, jill sexton, and joseph hurley, “visualization on the big screen: hands-on immersive environments designed for student and faculty collaboration” (presentation, cni spring 2015 meeting, seattle, washington, april 13–14, 2015), https://scholarworks.gsu.edu/univ_lib_facpres/29/. 6 niels wouters et al., “uncovering the honeypot effect: how audiences engage with public interactive systems. conference on designing interactive systems,” dis ’16 proceedings of the 2016 acm conference on designing interactive systems, brisbane, australia, june 4–8, 2016, 516, https://doi.org/10.1145/2901790.2901796. 
7 gonzalo parra, joris klerkx, and erik duval, "understanding engagement with interactive public displays: an awareness campaign in the wild," proceedings of the international symposium on pervasive displays, copenhagen, denmark, june 3–4, 2014, 180–85, https://doi.org/10.1145/2611009.2611020; ekaterina kurdyukova, mohammad obaid, and elisabeth andre, "direct, bodily or mobile interaction?," proceedings of the 11th international conference on mobile and ubiquitous multimedia, ulm, germany, december 4–6, 2012, https://doi.org/10.1145/2406367.2406421; tongyan ning et al., "no need to stop: menu techniques for passing by public displays," proceedings of the 2011 annual conference on human factors in computing systems, vancouver, british columbia, https://www.gillesbailly.fr/publis/bailly_chi11.pdf. 8 jung soo lee et al., "a study on digital signage interaction using mobile device," international journal of information and electronics engineering 5 no. 5 (2015): 394–97, https://doi.org/10.7763/ijiee.2015.v5.566. 9 parra et al., "understanding engagement," 181. 10 parra et al., "understanding engagement," 181; robert walter, gilles bailly, and jorg müller, "strikeapose: revealing mid-air gestures on public displays," proceedings of the sigchi conference on human factors in computing systems, paris, france, april 27–may 2, 2013, 841–50, https://doi.org/10.1145/2470654.2470774. 11 philipp panhey et al., "what people really remember: understanding cognitive effects when interacting with large displays," proceedings of the 2015 international conference on interactive tabletops & surfaces, madeira, portugal, november 15–18, 2015, 103–6, https://doi.org/10.1145/2817721.2817732. 12 christopher ackad et al., "an in-the-wild study of learning mid-air gestures to browse hierarchical information at a large interactive public display," proceedings of the 2015 acm international joint conference on pervasive and ubiquitous computing, osaka, japan, september 7–11, 2015, 1227–38, https://doi.org/10.1145/2750858.2807532. 13 parra et al., "understanding engagement," 181; kurdyukova, obaid, and andre, 2012, n.p. 14 jouni vepsäläinen et al., "web-based public-screen gaming: insights from deployments," ieee pervasive computing 15 no. 3 (2016): 40–46, https://ieeexplore.ieee.org/document/7508836/. 15 uta hinrichs, holly schmidt, and sheelagh carpendale, "emdialog: bringing information visualization into the museum," ieee transactions on visualization and computer graphics 14 no. 6 (november 2008): 1181–88, https://doi.org/10.1109/tvcg.2008.127.
16 hinrichs, schmidt, and carpendale, “emdialog.” 17 sarah clinch et al., “reflections on the long-term use of an experimental digital signage system,” proceedings of the 13th international conference on ubiquitous computing, beijing, china, september 17-21, 2011, 133-142. https://doi.org/10.1145/2030112.2030132. 18 elaine m. huang, anna koster, and jan borchers. “overcoming assumptions and uncovering practices: when does the public really look at public displays?,” proceedings of the 6th international conference on pervasive computing, sydney, australia, may 19-22, 2008, 228-243. https://doi.org/10.1007/978-3-540-79576-6_14; jorg muller et al., “looking glass: a field study on noticing interactivity of a shop window,” proceedings of the sigchi conference on human factors in computing systems, austin, texas, may 5-10, 2012, 297-306. https://doi.org/10.1145/2207676.2207718. 19 salim & haque, “urban computing in the wild,” 35 20 mettina veenstra et al., “should public displays be interactive? evaluating the impact of interactivity on audience engagement,” proceedings of the 4th international symposium on pervasive displays, saarbruecken, germany, june 10–12, 2015, 15–21, https://doi.org/10.1145/2757710.2757727. 21 clinch et al., “reflections.” https://doi.org/10.1145/2470654.2470774 https://doi.org/10.1145/2817721.2817732 https://doi.org/10.1145/2750858.2807532 https://ieeexplore.ieee.org/document/7508836/ https://doi.org/10.1109/tvcg.2008.127 https://doi.org/10.1145/2030112.2030132 https://doi.org/10.1007/978-3-540-79576-6_14 https://doi.org/10.1145/2207676.2207718 https://doi.org/10.1145/2757710.2757727 it is our flagship | zvyagintseva 77 https://doi.org/10.6017/ital.v37i2.9987 22 robert ravnik and franc solina, “audience measurement of digital signage: qualitative study in real-world environment using computer vision,” interacting with computers 25, no. 3 (2013), https://doi.org/10.1093/iwc/iws023. 23 neal buerger, “types of public interactive display technologies and how to motivate users to interact,” media informatics advanced seminar on ubiquitous computing, 2011, hausen, doris, conradi, bettina, hang, alina, hennecke, fabiant, kratz, sven, lohmann, sebastian, richter, hendrik, butz, andreas and hussmann, heinrich (eds). university of munich, department of computer science, media informatics group, 2011. https://pdfs.semanticscholar.org/533a/4ef7780403e8072346d574cf288e89fc442d.pdf . 24 c. g. screven, “information design in informal settings: museums and other public spaces,” in information design, ed. robert e. jacobson (cambridge, ma: mit press, 2000), 131–192. 25 parra et al., “understanding engagement,” 181. 26 uta hinrichs and sheelagh carpendale, “gestures in the wild: studying multi-touch gesture sequences on interactive tabletop exhibits,” proceedings of the sigchi conference on human factors in computing systems, vancouver, british columbia, may 7–12, 2011, 3023–32, https://doi.org/10.1145/1978942.1979391. 27 harry brignull and yvonne rogers, “enticing people to interact with large public displays in public spaces,” interact ’03, proceedings of the international conference on human-computer interaction, zurich, switzerland, september 1-5, 2003, 17-24, matthias rauterberg, marino menozzi, and janet wesson (eds.), tokyo: ios press, 2003. http://www.idemployee.id.tue.nl/g.w.m.rauterberg/conferences/interact2003/interact200 3-p17.pdf. 
28 peltonen et al., "it's mine, don't touch!" 29 peltonen et al., "it's mine, don't touch!" 30 anne horn, bernadette lingham, and sue owen, "library learning spaces in the digital age," proceedings of the 35th annual international association of scientific and technological university libraries conference, espoo, finland, june 2–5, 2014, http://docs.lib.purdue.edu/iatul/2014/libraryspace/2.

journal of library automation vol. 2/3 september, 1969

book reviews

information retrieval systems; characteristics, testing, and evaluation, by f. wilfred lancaster. new york, john wiley & sons, 1968. 222 pp. $9.00.

despite the fact that users retrieve the majority of information that they obtain from collections such as libraries by employing author/title listings in catalogs, information scientists consider only subject listings in discussions of information retrieval. this book is no exception. lancaster defines an information retrieval system as informing a user "on the existence (or non-existence) and whereabouts of documents relating to his request." half of his book treats of characteristics and operation of information retrieval systems and half of testing and evaluating such systems. it is the latter half of the book that distinguishes it from other general introductions to the subject. for the testing and evaluation sections of his book, the author draws heavily on his experience gained while working on the cranfield project as well as at the national library of medicine. at the latter he examined a segment of the real world in a major investigation of the medlars system. an interesting finding of the medlars study that he reports in the book, but on which he does not elaborate, is that there was no relationship between recall ratio percentage and precision ratio percentage for 299 searches examined. in his preface the author expresses the hope that his book will be helpful to students and useful to practitioners. however, a principal function of such an introduction is to guide the reader in further pursuit, or retrieval, of information. in this function the book does not succeed, for seven chapters are barren of references, another eight average somewhat more than three, and the remaining chapter boasts fifty-three.
this book will not supplant other general introductions to information retrieval systems, but its discussion of testing and evaluation is a useful introduction. frederick g. kilgour book reviews 177 how to manage your information, by bart e. holm. new york, reinhold book corporation, 1968. 292 pp. $10.00 essential information exceeds the grasp of the keenest minds in all professions. a method of readily obtaining needed resource material can be a particularly knotty problem for those who have no background in appropriate methods of data storage and retrieval. successful operation for many professionals depends directly upon their ability to work out a practical personal system which does not require complex apparatus, excessive cost or time. the purpose of this volume is to help such individuals evaluate their particular needs and design a method of managing information which will be workable and practical. i found the book enjoyable and informative. it immediately recommends itself with its own efficient organization, attractive format, readable style, clever illustrations, and complete indexing. it not only deals with the broad principles necessary for development of a personal information system but also includes specific information of a practical nature on the approach to this problem for professionals in several different fields. the first chapter, which is titled "man the collector", is fascinating to an unsophisticated non-librarian. it outlines the enormous problem of the growth of world-wide information that appears to be proliferating in an almost malignant manner. this served to emphasize a repeatedly stressed cardinal principle: the need to be selective, so that only items of probable real value will be retained. a most valuable chapter for those not experienced in library work relates to the basic principles for retrieval on a single or multiple entry basis. this logically leads into a discussion of how to evaluate the individual's personal need. the operations of specific simple systems, such as optical coincidence, termatrex, keysort and term cards were adequately discussed. individual chapters are devoted to the unique problems that might be encountered by the engineer, the chemist, the physicist, the architect, the doctor, and the archivist, with emphasis on the specific vocabulary needed for proper organization and a brief review of information sources of the various disciplines. the remaining seven chapters deal with proper use of available sources of information, such as keeping current with the literature, use of the modern library, records management, microfilming, and data systems of the present and the future. this volume should be a real value to many who have limited background and are struggling in vain to keep up with the information they need. it can provide practical pointers for those who want to make a serious effort toward establishing and maintaining a system of storage and retrieval of information that does not rely on an all too often faulty memory. ellis a. fuller, m.d. 178 journal of library automation vol. 2/3 september, 1969 the institutes of education union list of periodicals processing system, by j. d. dews and j. m. smethurst. ( symplegades, no. 1). newcastleupon-tyne, oriel press ltd., 1969. 39 pp. sbn ( 69uk) 85362 060 1. 15s. the first half of this small manual is devoted to describing the file .maintenance .and text editing system developed by the university of newcastle-upon-tyne. 
the second half of the text is devoted to the technical specifications of the newcastle file handling system and refers specifically to the english electric-leo marconi kdf 9 computer. the system described is the application of a series of general purpose programs, that provide the capability of storing, adding, deleting, or changing variable length records, to a union list project for a group of libraries. unfortunately this otherwise well designed system has not been able to do away with the manual "typed slips" back-up file which plagues so many other computerized union list projects. also of interest in this processing system is the use of the work developed at the newcastle computer typesetting research project for computer controlled composition of the final output. section two of seminar on the organization and handling of bibliographic records by computer, newcastle-upon-tyne, 1967 edited by nigel s. m. cox and michael w. grose (archon books, hamden, connecticut, 1967) is the preferred description of all aspects of the system except for those who need the program specifications. alan d. hogan computer based information retrieval systems, edited by bernard houghton. camden, conn., archon books, 1969. 136 pp. $5.00. this book contains six papers that their authors presented at a special course in april 1968 at the liverpool school of librarianship. the objective of the course was to survey the major computer based informational retrieval systems operating in the united kingdom for an audience of prospective users and planners. the book is a successful elementary introduction to large information retrieval systems. in the 1940's and early 1950's, such pioneers as w. e. baten, g. cordonnier, calvin mooers and mortimer taube developed new techniques for information retrieval, a phrase which mooers coined. the major innovation in the new development was "coordinate indexing" or the coordination of index terms at the time of searching. coordination employed simpl~ boolean logic -"and," "or," and "but not." coordinate indexing increased flexibility of searching and number of accesses to documents in contrast to the inflexible, pre-coordinated traditional subject catalogs. book reviews 179 it was also characteristic of the early systems that they dealt with relatively small files of documents not under classical bibliographical control -patents, internal reports, and segments of external report literature. with the advent of the computer, it became feasible to apply the new information retrieval techniques to large files of traditional materials, but to date the major effort has been directed toward huge files of journal articles. it is, therefore, no surprise to find that the five chapters in computer based information retrieval systems that describe systems all depict retrieval from files of journal articles. these five systems are medlars, the science citation index (sci) and its peripherals, chemical titles ( ct) and chemical biological activities ( cbac), a burgeoning institution of electrical engineers (lee ) sponsored project in selective dissemination of electronics information, and a minor computer application to production of the british technology index; the three major, operational projects are of united states origin. selective dissemination is a gt·atifying feature of sci, ct, cbas, and the lee project, for sdi applications take advantage of the computer's potential for personalization by servicing individual users on the basis of their individual needs. 
the book is a successful primer that provides a useful introduction to computer based systems for retrieval of journal citations from large files. g. a. somerfield's last chapter, "state of the art of computer based information retrieval systems," is more than its title implies, for the last half of the chapter analyzes desirable improvements yet to be achieved. the first half could well serve as an introduction to the book. recently, several worthwhile primers on information retrieval and retrieval systems have appeared. computer based infotimation retrieval systems is still another to provide the brief, clear, elementary introduction that new students, new users, and new planners find most effective in providing an understanding of an unfamiliar field. frederick g. kilgour modern data processing, by robert r. arnold, harold c. hill and aylmer v. nichols. new york, john wiley and sons, inc., 1969. 370 pp. $8.95 this book is an updated version of the authors' previous book, introduction to data processing, john wiley and sons, 1966. the present volume is designed to be used as an introductory text to the concepts of all facets of data processing. it will not teach people to be programmers or systems analysts but it can be very useful to anyone who would like to learn about data processing without having to become a programmer or systems analyst. the book is well organized and explains, in non-technical terms, highly technical facets of data processing. this book can be used not only 180 journal of library automation vol. 2/3 september, 1969 at the high school level but also at the beginning college level. in it the authors strived and achieved to ·make . available all the latest advancements in the computer science field. in my opinion the authors have achieved then· goal of developing a very good elementary text in data processing. i highly recommend this book to librarians and all others as a basic primer in automation. it will be particularly useful to administrators, as it has an excellent glossary that assist them in their understanding of the data processing vocabulary and jargon. thomas k. burgess solving seo issues in dspace-based digital repositories: a case study and assessment of worldwide repositories article solving seo issues in dspace-based digital repositories a case study and assessment of worldwide repositories matúš formanek information technology and libraries | march 2021 https://doi.org/10.6017/ital.v40i1.12529 matúš formanek (matus.formanek@fhv.uniza.sk) is assistant professor in the department of mediamatics and cultural heritage, faculty of humanities, university of zilina, slovakia. © 2021. abstract this paper discusses the importance of search engine optimization (seo) for digital repositories. we first describe the importance of seo in the academic environment. online systems, such as institutional digital repositories, are established and used to disseminate scientific information. next, we present a case study of our own institution’s dspace repository, performing several seo tests and identifying the potential seo issues through a group of three independent audit tools. in this case study, we attempt to resolve most of the seo problems that appeared within our research and propose solutions to them. after making the necessary adjustments, we were able to improve the quality of seo variables by more than 59% compared to the non-optimized state (a fresh installation of dspace). 
finally, we apply the same software audit tools to a sample of global institutional repositories also based on dspace. in the discussion, we compare the seo sample results with the average score of the semi-optimized dspace repository (from the case study) and make conclusions. introduction and state of art search engine optimization (seo) is a crucial part of the academic electronic environment. all their users attempt to process too much information and need to retrieve information fast and effectively. making academic information findable is essential. digital institutional repository systems, used to disseminate scientific information, must present their content in ways that make it easy for researchers elsewhere to find. in this paper, we describe work conducted in the department of mediamatics and cultural heritage at faculty of humanities, university of zilina to improve the discoverability of materials contained within its dspace institutional repository. in the literature review, we examine definitions of website quality and discuss audit tools. then, beginning our case study, we describe the tools applied at our institution. we next describe the selection process of a suitable set of testing tools, focused on the optimization of seo variables of the selected institutional repository running with dspace software, that will be applied later in the case study. the remainder of the article focuses on the identification and resolution of potential seo issues using the three independent online tools we selected. we aim to resolve as many problems as possible and compare the level of achieved improvement with the default installation of dspace 6.3 software which our digital repository is based on. the primary goal is not only to improve the seo parameters of the discussed system but also to increase the searchability of scientific website content disseminated by dspace-based digital repositories. next, we offer insights into worldwide dspace-based repositories. we will show that dspace is currently one of the most widely used software packages to support and run digital repositories. unfortunately, there are many major seo issues that will be discussed later. the secondary objective of this paper is to use the same set of tools to evaluate the current state of the sample of worldwide digital repositories also based on dspace. we will provide the report based on our own findings. in the discussion, the seo score of the optimized dspace (from th e case study) will be mailto:matus.formanek@fhv.uniza.sk information technology and libraries march 2021 solving seo issues in dspace-based digital repositories | formanek 2 compared with the results of the current state of seo parameters from the worldwide dspace repositories. finally, our work also carries out many relatively innovative approaches related to digital repositories that have not been extensively debated anywhere in the literature yet. literature review to achieve our goal, we started with a review of existing academic papers. drawing from those papers we describe the current state of academic institutions’ presentation through the internet and search engines. in this sense, we focus on website optimization. the internet, as a medium, is still rapidly expanding. 
a massive amount of data is communicated, shared, and available online, as noted by christos ziakos: as a result, billions of websites were created, which made it hard for the average (or even advanced) user to extract useful information from the web efficiently for a specific search. the need for an easier, more efficient way to search for information led to the development of search engines. gradually, search engines began to assess the relevance of every website on their indexes compared to the queries provided to them by the users. they took into consideration several website characteristics and metrics and calculated the value of each website using complex algorithms. the enormous number of websites being indexed from search engines, along with the increasing competition for the first search results, led to studying and implementing various techniques in order for websites to appear more valuable in search engines.1

that description applies equally to academic websites as well as commercial ones. a review of relevant literature suggests that it is very important for academic institutions to carefully consider and apply website optimization. there were around 28,000 universities worldwide in 2010, according to one study that monitored research in the field of worldwide academic webometrics.2 the actual number of universities seems to be very similar in 2020. baka and leyni affirm in their working paper that the success or failure of an academic institution depends on its website: “the work of each university exists only when it encounters and interacts with society. their popularity with the public is steadily growing,” which is directly connected with the institution's presence on the world wide web.3 many authors define the term search engine optimization (seo) as a series of processes that are conducted systematically to improve the volume and quality of traffic from search engines to a specific site by utilizing the working mechanism or algorithm of the search engine. it is a technique of optimizing a website's structure and content to achieve a higher position in search results. the aim is to increase the website's ranking in web search results.4 after an extensive review of the relevant literature, we can conclude that although seo is currently a widely discussed topic, there is very little accessible scientific literature related to seo applications in the field of digital repositories in general, and none at all in the particular subset of dspace-based repositories.

website quality
many authors generally affirm that there is a positive correlation between academic excellence and the complex web presence of an institution. it confirms that website quality is a factor that can give us a predictive or causal relationship with seo performance.5 numerous tools can be employed to measure the quality of websites, test them closely, and produce an seo performance ranking of websites' ability to properly promote their content through search engines. for example, the academic ranking of world universities (the shanghai ranking, http://www.shanghairanking.com) has been established for the top 1,000 universities in the world. website quality is considered by the authors to be the quality of an institution's online presence, its ability to properly promote digital content in search engines and, finally, in combination, its overall web presence.
according to the shanghai ranking list, this is a factor for some “prospective students to decide on whether they will enroll in a specific institute or not.”6 a number of recent studies have also attempted to examine the online presence of academic institutions from various points of view. one of the older studies mentioned that the quality of academic websites is very important for students in the process of enrollment.7 another key aspect is optimized website performance, as well as seo and website security.8

audit tools
if we want to perform any optimization, we need an appropriate software tool to check a website's current ranking. according to g2, the world's largest technology online marketplace, seo software is designed to improve the ranking of websites in search engine results pages without paying the search engine provider for placement. these tools provide seo insights to companies through a variety of different features, helping identify the best strategies to improve a website's search relevance.9 seo audit software could be used by seo specialists or system administrators as well. audit software performs one or more of the following functions in relation to seo: content optimization, keyword research, rank tracking, link building, or backlink monitoring. the software then provides reports on the optimization-related metrics.10 many authors stress the importance of a holistic approach to seo factors (24 factors were tested), while noting that results depend mostly on the most effective ones: for example, the quantity and quality of backlinks, the ssl certificate, and so on, which will be described later in this paper.11 the quality of academic websites is very important for researchers, too. they need to disseminate scientific information and communicate it in effective ways. according to some authors, the topic of academic seo (aseo) has been gaining attention in recent years.12 aseo applies seo principles to the search for academic documents in academic search engines such as google scholar and microsoft academic. in another scientific paper, aseo is considered very similar to traditional seo, where institutions want to make good use of seo to promote digital scientific content on the internet. beel, gipp, and wilde emphasize the importance for researchers to ensure that their publications will receive a high rank on academic search engines.13 by making good use of aseo, researchers will have a higher chance of improving the visibility of their publications and having their work read and cited by more researchers. in recent years, digital institutional repositories (as academic systems) have been used as modern ways of promoting and disseminating digital scientific objects through the internet. digital objects need to reach a wider audience: digital repositories have a website interface, interact with students, teachers, or researchers on a daily basis, and hold numbers of citations, articles, theses, and other research objects. institutional repositories are affected by search engines too, so some improvements to repositories' seo parameters are needed. these factors contribute to a system's rankings. seo on institutional repositories is not considered an absolutely new scientific topic. kelly stressed eight years ago that google is critical in driving traffic to repositories.
he analyzed results from a survey summarizing seo findings for 24 institutional repositories in the united kingdom. the survey results showed that referring platforms were primarily responsible for driving traffic to those institutional repositories, thanks to many hypertext links in referring domains.14 since then, seo analyses of digital repositories have not been a widely discussed topic in the literature. discussing seo on a specific type of digital repository software, dspace, the most widely used and popular software for running digital libraries and repositories, is a relatively novel topic.15 consequently, this paper focuses on that topic, since a dspace-based digital repository is a complex online computer system where some seo parameters can be adjusted. seo audit tools help to identify areas of potential adjustment of those website properties that could help produce higher rankings in search engines (and improve the whole system's visibility).

audit tools selection process
website variables that affect seo can be tested using specialized online software tools. this topic is discussed in detail on a semi-professional level on specialized websites that provide a number of recommendations regarding the use of specific tools as well as evaluations of the tools.16 these tools can keep track of changes in many seo variables. we want to use this approach in our study. however, first we need to choose an appropriate set of these tools. we have found that many seo audit tools mentioned in professional online sources are narrowly specialized.17 for example, they may be focused only on keyword analysis, backlink analysis (for example, ahrefs' free backlink checker), and so on. in our study, we intend to describe a greater number of seo parameters to monitor rather than emphasize only a few selected ones. we also need tools that are fully available online for free. based on these criteria, we immediately excluded several tools from the selection because they provide only austere, simple, or restricted information. many tools were excluded because they were limited to a single test with the requirement of registration or provision of an email address. a number of testing tools were also available only in paid versions. we wanted a set of tools that focus on several aspects of seo analyses and evaluate the quality of websites' seo variables comprehensively. it is important to add that the selected tools' results must be comparable, too. after careful consideration of all possibilities, we finally decided to choose three independent seo audit tools in order to make the approach more transparent. the selected tools met most of the criteria mentioned above. however, it is very important to note that many other software tools surely meet the criteria and could also be suitable for testing purposes. based on the scientific literature review, we were not able to identify specific recommendations in this regard; therefore, we have been inspired by the advice offered on the websites and blogs previously mentioned that are focused primarily on seo. our tool selection is as follows (listed in alphabetical order):

1. seo checker (https://suite.seotesteronline.com/seo-checker) is part of a complex audit software suite called seo tester online suite. seo checker provides tests in the following categories: base, content, speed, and connections to social media.
it tracks, among many other parameters, title coherence, text/code ratio, accessibility of microdata, opengraph metadata, social plugins, in-page and off-page links, quality of links, mobile friendliness of the page, and many other seo and technical website attributes. regarding restrictions, only two sites can be tested within a 24-hour period. the limit increases to four sites per day after free registration with a valid email address. moreover, there is a 14-day trial period during which all hidden functionalities work. in the free version that we used, a complete report can only be viewed, not downloaded or saved.

2. seo site checkup (https://seositecheckup.com/) was selected based on many positive recommendations from the technically oriented expert website traffic radius.18 seo site checkup is described as “a great seo tool that offers more than 40 checks in 6 different categories (common seo issues like missing metadata, keywords, issues related with absence of connections to social media, semantic web, etc.) to serve up a comprehensive report that you can use to improve results and the website's organic traffic. it also gives recommendations to fix critical issues in just a few minutes. as a tool, it is very fast and provides in-depth information about the various seo opportunities and accurate results.”19 seo site checkup is appreciated and recognized as number one among other audit tools ranked by the geekflare website.20 another reason we selected this tool for our testing scenario is the fact that the google search engine offers a link to this tool as the first result after entering the search query “seo testing tool” (excluding paid links). seo site checkup is also the fastest of the selected audit tools, which can be considered another advantage. its disadvantages include the ability to test only one website within 24 hours from one public ip address.

3. woorank (https://woorank.com) is recommended by traffic radius: “woorank offers an in-depth analysis that covers the performance of existing seo strategies, social media and more. the comprehensive report analysis is classified into eight sections for improved readability quotient, and you may also download the report as branded pdf.”21 woorank obtained the third position among the recommended software tools. trustradius gives it a score of 9.2 out of 10, and users rate it 4.67 out of 5 stars based on 51 reviews.22 on the one hand, some results are hidden in the free version, but the final score is shown. on the other hand, woorank has no limit to the number of websites tested per day, but it is the slowest of the selected testing tools.

we selected these three seo audit tools because they work independently, their results are comparable to each other, and they offer a quick way to get comprehensive seo analysis results for a tested site. it should be noted that the results of some performed tests are hidden, but there is general guidance on how to fix some issues. however, the solution always depends on the specific site and the technology used. using three different tools adds objectivity because we do not rely on just one tool and a one-sided view of the seo issue. the three selected testers all display results in the same way: test results are always shown as a summarized score in the range of 0 to 100 points (100 represents the best result).
a very large set of seo parameters and technical website properties is evaluated in all three cases. these tests are usually divided into several categories (for example, common seo issues, performance, security issues, and social media integration). although similar parameters are assessed in all three audit tools, there are still some differences between them. each of the testing tools is unique in a certain area because it also tests a parameter that the others do not deal with or evaluates a website by a different methodology. still, the fact remains that the evaluated seo parameters overlap between the tools. we will not overload this paper with detailed information and technical details of individual partial tests, because they can be easily found on the websites of the given test tools (seo site checkup, seo checker online, woorank). we will just mention the common core of main tests: css minification test, favicon test, google search results preview test, google analytics test, h1 heading tags test, html page size test, image alt test, javascript minification test, javascript error test, keywords usage test, meta description test, meta title test, seo friendly url test, sitemap test, social media test, robots.txt test, url canonicalization test, and url redirects test. another specific group consists of tests related to a particular audit tool. thanks to them we can get a more comprehensive view of the tested area of a website's seo characteristics. for example, seo checker features the following specific tests: title coherence test, unique key words test, h1 coherence test, h2 heading tags test, and facebook popularity test. woorank, as the second tool, extends the basic set of tests with the following: title tag length test, in-page links test, off-page links test, language test, twitter account test, instagram account test, traffic estimations, and traffic rank. of course, there is also a set of tests that are part of two audit tools but that the third one does not deal with, since it is specialized in another area. as we have mentioned, the tools offer a list of suggestions for potential improvement of seo characteristics. the user is informed about an issue, but concrete, site-specific instructions on how to resolve it are not provided. the main benefit of this paper lies in its objective to solve specific seo issues. this work may improve the visibility and searchability of dspace-based institutional repositories. the set of three audit tools described above will be used in the following section. we attempt to identify possible seo issues of the selected institutional repository in the form of a case study. then we aim to fix the identified seo issues, increase the quality of the repository's seo parameters, and demonstrate the potential impact of the performed repairs on website traffic. all traffic measurements will be based on google analytics data.

the institutional repository of the department of mediamatics and cultural heritage (seo case study)
background information
an older version of our digital repository (based on dspace v5.5) was launched by the department of cultural heritage and mediamatics in april 2017. now, in 2021, the repository makes available online over 180 digital objects, most of them open access under creative commons licenses.
the first attempts to create and establish a similar virtual space for digital objects started long ago. several software solutions had been tested for this purpose, for example invenio and eprints, along with dspace. according to opendoar's statistics, eprints and dspace have always been the most popular tools for running digital repositories.23 a few years ago, dspace was chosen as the primary software for running a digital repository. since then, the usage of open-source software has been rising. for example, ubuntu server lts (long term support) is used as the operating system, tomcat 8 is used as the web server, postgresql assumes the role of the database system, etc. all of those software components are part of a complex digital system and are orchestrated in a virtual environment built on an open-source virtualization solution called xcp-ng (in version 8.2). some software components have been swapped for others during the development period. based on our experience, the digital repository's regular visitors were mostly staff and students of the department. we initially did not feel a need to improve the visibility of this system to search engines, an oversight that turned out to be a mistake in the long run. we did not perform any search engine optimization on this repository until november 2019, when we coincidentally discovered several scientific articles dealing with seo in the academic environment. after studying the theoretical background, we initiated the practical application process. we applied theory and our experience with dspace software to an seo troubleshooting process within our local repository. most of the optimizing actions related to solving the major seo issues were performed before november 10, 2019. we will describe the seo adjustments we made and derive a list of recommendations for other institutions based on our own experience.

initial testing of a clean dspace 6 installation
in order to formulate any recommendations related to seo and the administration of dspace digital repositories, it is important to determine and test a starting point. for this purpose, we chose a clean instance of dspace v6.3 with an xml user interface (xmlui), the latest commonly available stable version. this is the same version that we use in this case study and in our production environment. (a newer version, dspace 7 beta 4, was released by atmire on october 13, 2020.)24 no other customization edits were made except a base configuration and necessary url settings. this installation of dspace v6.3 has been tested with the same set of tools mentioned previously. the tests we performed are summarized in table 1, where they are divided into four main seo sections in the first column: common seo issues, social, speed, and security. a test name is shown in the second column. the third column is marked “default installation,” where we display the test results for our clean dspace 6.3 installation. if the tested instance met the criteria of a given test, a green pictogram is shown; when a particular test fails, a red cross is used. the improved state is shown in the fourth column, marked semi-optimized. it reflects many important technical changes made during the seo issue-solving process. these will be discussed and described later in this paper; however, a short note about the considered issue is displayed in each row.
these notes are taken from the tools' result reports. we have used the prefix semi- in the last column because we were not able to resolve all detected seo issues, only most of them. the reasons will be described briefly in the discussion section. where an improving change between states was made, we changed the status pictogram (from a red cross to a green tick) and set the row color to yellow. the changes leading to improvement (i.e., the yellow rows) will also be discussed in detail later. recall that we do not need to overload the main text of this paper with detailed technical information about partial tests, because it can easily be found on the websites of the given test tools. table 1 shows the compared results between the non-optimized and semi-optimized states of the dspace repository. based on table 1, the default instance of dspace with basic http and other default settings received only 58 points out of 100 in seo site checkup, 50.1 points in seo checker, and 32 points in woorank. the average final score is 46.7 points out of 100. although this score could be considered low, the dspace default instance still meets certain basic criteria of seo. in addition, many repository administrators usually do not rely only on a default installation; they make at least some changes in configuration immediately after the initial installation. among other things, the first steps are usually implementing the https protocol, adding a connection to google analytics services, and so on. the improved state is shown in the last column of table 1. whenever we solved an issue, the overall score rose. the semi-optimized repository obtained a higher score compared to the previous column (default installation). the last column represents the final (though semi-optimized) state of technical and seo attributes that we were able to reach at this time. as shown, many seo issues have been solved. we highlighted them in yellow. on the one hand, some issues remain unsolved. on the other hand, the overall seo improvement is more than noticeable, although the final average score has not reached the maximum value (100 points).

table 1. comparison of results between the non-optimized and semi-optimized states of the dspace repository. each row lists the test name, the state of the default installation (before optimization), and the semi-optimized state (after a few optimization steps).

meta title test, title tag length | the title tag is set, but the meta title of the webpage (“dspace home”) has a length of 11 characters, which is too low. | the title tag has been set to “digitálny repozitár katedry mediamatiky a kultúrneho dedičstva” (note: in slovak).
title coherence test | the keywords in the title tag are included in the body of the page. | the title of the page seems optimized.
meta description test | no meta-description tag is set. | a meta-description tag has been set (121 characters).
google search results preview test | “dspace home” is too general. | the title of the page has been changed.
keywords usage test | the keywords are not included in the title and meta-description tags. | a set of appropriate keywords has been added.
unique key words test | the textual content of the page is not optimized. | there is an excellent concentration of keywords in the page; the page includes 382 words, of which 58 are unique.
h1 heading tags test | 8 h1 tags, 6 h2 tags | the h1 tags of the page seem not to be optimized; there are too many h1 tags.
h1 coherence test | the keywords present in the h1 tag are included in the body of the page. | some of the keywords of the h1 tag are not included in the body of the page.
h2 heading tags test | the keywords present in the tag are included in the body of the page.
language test | detected: slovak; declared: missing. | a missing language tag has been implemented.
robots.txt test | no “robots.txt” file has been found. | the “robots.txt” file has been enabled.
sitemap test | no sitemap has been found. | a sitemap has been enabled.
seo friendly url test | the webpage contains urls that are not seo friendly. | the webpage contains urls that are not seo friendly.
image alt test | the webpage does not use “img” tags; it is optimized.
inline css test | the webpage uses inline css styles. | the webpage uses inline css styles.
deprecated html tags test | the webpage does not use deprecated html tags.
google analytics (ga) test | ga is not in use. | ga has been implemented.
favicon test | the default dspace favicon is used. | the favicon has been customized.
js error test | no severe javascript errors were detected. | no severe javascript errors were detected.
social media test | no connection with social media has been detected. | the website is successfully connected with social media (using facebook).
facebook account test | information about the facebook page has been added by schema.org metadata.
facebook popularity (low) | the webpage is promoted enough on facebook.
twitter account test | no connection with twitter has been detected. | information about the twitter account has been added by schema.org metadata.
twittercard test | no twittercard is implemented. | meta-information about the twittercard has been added by opengraph metadata.
instagram account test | no connection with instagram has been detected. | information about the instagram account has been added by schema.org metadata.
microdata (opengraph, schema.org) test | there is no microdata or opengraph/schema.org metadata on the website. | some opengraph and schema.org metadata has been added.
html page size test | the size of the page is excellent (23.65 kb). | the size of the page is excellent (28.84 kb).
text/code ratio test | 10.71% (excellent) | 15.45% (excellent)
html compression/gzip test | no compression is enabled; the size of the html could be reduced by up to 79%. | the webpage is successfully compressed using gzip compression; the html is compressed with 78% size savings.
site loading speed test | loading time is around 1.86 s. | loading time is around 2.39 s.
page objects test | the webpage has fewer than 20 http requests. | the webpage has fewer than 20 http requests.
page cache test (server-side caching) | the pages are not cached. | the pages are not cached.
flash test | the website does not include flash objects.
cdn usage test | the webpage is not serving all resources (images, javascript, and css) from cdns. | the webpage is not serving all resources (images, javascript, and css) from cdns.
image, javascript, css caching tests | data are not cached. | data are not cached.
javascript minification test | javascripts are not minified. | javascript file minification has been enabled in the tomcat configuration.
css minification test | some of the webpage's css resources are not minified. | some of the webpage's css resources are not minified.
nested tables test | the webpage does not use nested tables.
frameset test | the webpage does not use frames.
doctype test | the website has a valid doctype declaration.
url redirects test | 1 url redirect has been detected; it is acceptable.
url canonicalization test | the webpage urls are not canonicalized: https://repozitar.kmkd.uniza.sk/xmlui and https://www.repozitar.kmkd.uniza.sk/xmlui should resolve to the same url, but currently do not.
canonical tag test | no canonical tag has been detected. | the webpage is using a canonical link tag.
https test | the website is not ssl secured. | https has been implemented.
safe browsing test | no malware or phishing activity found.
server signature test | server self-signature for https is off.
directory browsing test | the server has disabled directory browsing.
plaintext emails test | the webpage does not include email addresses in plain text.
mobile friendliness (includes tap targets, no plugin content, font size legibility, mobile viewport) | the webpage is optimized for mobile visitors.
seo site checkup final score | 58/100 | 81/100
seo checker online final score | 50.1/100 | 78.0/100
woorank final score | 32/100 | 65/100
average final score | 46.7/100 | 74.66/100

resolving major seo issues
this section looks at how we resolved the major seo issues that the tools detected. this is the key technical part, because most of the issues highlighted in table 1 were solved and are described here. the following technical and seo adjustments have been implemented and tested in order to improve the average final score by 59.87% (by 27.96 points, from 46.7 to 74.66 points), comparing the fresh installation of dspace against the semi-optimized one. all of the following solution procedures are based on our own experience, experiments, and research carried out in the area of digital repositories and their optimization as virtual spaces. during the solving process, we follow the order of issues stated in table 1 and describe them in more detail in the dspace v6.3 environment with the xml user interface (xmlui). the following procedures may differ slightly if you are using a different version of dspace or another graphic interface (for example, jspui). examples of code are given in monospaced font.

title, description, and keywords tags in a website header
this criterion requires filling in specific metadata (e.g., meta content) fields in the page's html code. search engines process them automatically to find out what the website is about.
to solve these seo issues, change the website title (by default “dspace home”) located in the language translation config file at the path /dspace/webapps/xmlui/i18n/messages_en.xml. find the appropriate key and change its value. all content in this file is fully customizable. next, edit dspace's page structure config file (at the path /themes/mirage/lib/xsl/core/page-structure.xsl) in order to add the following metadata content just below the main tag, each with carefully selected content and length (an illustrative sketch of these tags is given further below):
• a meta-description tag
• a keywords tag
• an author tag
note: do not forget the termination characters />. the keywords should be included in the title and meta-description tags. several other seo parameters are affected by performing these steps, for example the google search results preview test, keywords usage test, unique key words test, and keywords concentration test.

language declaration
the language declaration is very important for search engines to identify the primary language of the website content. if a declared language is missing from a website, you can define it by adding a single statement above the main tag in the page-structure.xsl file; the process is similar to adding the keywords and description tags as explained above, and can be done with vim or another text editor (one possible form is shown in the sketch further below). note: “sk” is the abbreviation for the slovak language as stated in the w3 namespaces. more information is available at https://www.w3.org/tr/xml/.

google analytics, robots.txt and sitemap implementation
the connection between a website and google analytics services enables google analytics to track users' behavior and understand how they interact with the site. it is the basis of web analysis. the “robots.txt” and “sitemap.xml” files are simple text files that are required for search engines to specify the website structure and additional information about it. to enable google analytics services, insert a ua code identifier (the id is a string), obtained from google analytics, into the dspace.cfg config file located in the dspace home folder. in that file, find the key/row named “xmlui.google.analytics.key=” and insert the corresponding ua identifier there. next, it is necessary to uncomment the row with the key “xmlui.controlpanel.activity.max = 250” in the same “dspace.cfg” file. finally, uncomment the corresponding row in the “xmlui.xconf” file located in the path /dspace/config/ and restart the tomcat service. the “robots.txt” file is commonly used and enabled in dspace, but many seo audit tools are not able to detect it successfully because the file is located in a path other than the expected default one. to enable robots.txt file detection, copy the file /dspace/webapps/xmlui/static/robots.txt to the root of the tomcat folder (usually located in the path /var/lib/tomcat8/webapps/root). finally, restart the tomcat web service. a sitemap for a currently running dspace instance is referenced in the “robots.txt” file mentioned above; edit this file and set an appropriate url for the sitemap location.
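the sketches below illustrate the edits described in this section. they are minimal examples, not the repository's actual configuration: every content string, identifier, and url in them is a placeholder.

header metadata added to page-structure.xsl (self-closing tags, as noted above):

    <meta name="description" content="placeholder description of the repository, roughly 120 characters long" />
    <meta name="keywords" content="digital repository, dspace, open access, placeholder keywords" />
    <meta name="author" content="placeholder author or institution name" />

a language declaration can take several equivalent forms; one plausible variant, assuming the html output produced by xmlui, is a content-language meta tag in the same file (a lang attribute on the root html element would serve the same purpose):

    <meta http-equiv="content-language" content="sk" />

the google analytics keys in dspace.cfg look roughly like this (the ua identifier is a placeholder):

    xmlui.google.analytics.key = UA-XXXXXXXX-X
    xmlui.controlpanel.activity.max = 250

the robots.txt relocation and the web server restart can be done with two shell commands (paths and the service name depend on the distribution and the tomcat package in use):

    cp /dspace/webapps/xmlui/static/robots.txt /var/lib/tomcat8/webapps/ROOT/
    sudo systemctl restart tomcat8

inside the copied robots.txt, the sitemap directive simply points to the sitemap url of the running instance, for example (illustrative url):

    Sitemap: https://repozitar.kmkd.uniza.sk/xmlui/sitemap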
enabling connections with social media
this criterion detects a hyperlink (or other metadata) connection between a website and popular social media, such as facebook, twitter, etc. the primary goal is to promote the digital content. this subsection deals with social media connections for a dspace-based repository. simply creating a profile or a site on a social network related to the repository is considered an essential example of good practice. however, an appropriate form of connection between the sites must be created, too. naturally, further endorsement of the system through social networks is another key step. social media-oriented tests are performed by each seo audit tool nowadays. a detected connection with social media can have a big impact on the site's popularity, as well as on the final seo score gained. there are many ways to establish these connections. one is a connection with facebook, instagram, and twitter via a direct link from the homepage: to add a link to a facebook site profile, edit the page-structure file (/dspace/webapps/xmlui/themes/mirage/lib/xsl/core/page-structure.xsl) just below the div tag with id “ds-footer-wrapper”, for example:
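as an illustration, a minimal sketch of a literal html block that could be added at that point in page-structure.xsl; this is a hypothetical example based on the step above, and the social media urls are placeholders rather than the repository's actual profiles:

    <div id="social-media-links">
        <!-- direct links from the repository homepage to its social media profiles -->
        <a href="https://www.facebook.com/your-repository-page">facebook</a>
        <a href="https://twitter.com/your-repository-account">twitter</a>
        <a href="https://www.instagram.com/your-repository-account">instagram</a>
    </div>

because page-structure.xsl is an xslt stylesheet, literal html elements such as these are copied into the generated page as they are; after saving the change, the tomcat service may need to be restarted (or the theme cache refreshed) before the links appear in the footer.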