The Code4Lib Journal – Issue 35, 2017-01-30

Supporting Oral Histories in Islandora

Since 2014, the University of Toronto Scarborough Library’s Digital Scholarship Unit (DSU) has been working on an Islandora-based solution for creating and stewarding oral histories (the Oral Histories solution pack). Although regular updates on this work have been presented at Open Repositories conferences, this is the first article to describe the goals and features of the codebase, as well as its development roadmap. An Islandora-based approach is well suited to the challenges of Oral History, an interdisciplinary methodology whose complex notions of authorship and audience bring a corresponding complexity of use cases, and whose projects face the ever-emergent technical and preservation challenges associated with multimedia and born-digital assets. By leveraging Islandora, those embarking on Oral Histories projects benefit from existing community-supported code. By writing and maintaining the Oral Histories solution pack, the library seeks to build on common ground for those supporting Oral Histories projects and to encourage a sustainable solution and feature set.

Introduction: Institutional Context

The Digital Scholarship Unit (DSU) at the University of Toronto Scarborough (UTSC) Library works with campus IT to design stable, best-practices infrastructure that can support research conducted by and with faculty on campus, as well as digitized and born-digital special collections. The DSU also works with liaison librarians to provide consultation and development services for experimentation and co-curricular pedagogy.
Activities undertaken by the unit include helping stakeholders preserve and manage data, teaching digital scholarship tools, providing training on issues relating to scholarly publishing and open access, coordinating access to repository and data services developed by the central University of Toronto libraries system, and giving advice on metadata development and management. UTSC is 50 years old and prides itself on a relationship with multimedia innovation, having originally been built with a television studio to develop and transmit closed-circuit lectures (UTSC Timeline). Today, several courses at UTSC incorporate the creation of audiovisual materials into assignments completed by undergraduate students, and researchers seek the means to develop and analyze unique collections of video and audio materials under the banner of Oral History. In 2014, the library embarked on an expansion of the services of the Digital Scholarship Unit, and several faculty approached the library seeking support for oral history projects.

Scoping Oral History Support – Requirements

Oral History is defined more as process than as product: “a field of study and a method of gathering, preserving and interpreting the voices and memories of people, communities, and participants in past events” [1]. The breadth of this definition can challenge those tasked with developing the infrastructure, software, and workflows that make this “gathering, preserving and interpreting” work possible. Teams working on oral histories commonly request the ability to:

- relate born-digital and surrogate materials drawn from an archive together in a non-exclusive fashion (graph-type relationships)
- manage materials and administrative processes as required by research ethics boards and granting agencies; materially, this means granular permissions, long-term stewardship, and reliable access and discovery of materials
- describe, transcribe, and translate audio and video files

It is also important to note that these features need to be flexible enough to address the unique research and/or collection-building goals of each project. Relevant materials can be in multiple formats and contain any number of speakers, and research questions can require, for example, that annotation rather than transcription information be appended to objects and indexed for discovery. As projects evolve, it is also possible that new data sources will be included, enriched, and otherwise integrated into pre-existing data sources. In the end, the processes of oral history appear to defy any consistent structure or approach. As a result, a software platform that must address the needs of multiple groups needs to be highly flexible and configurable. The UTSC library found that Islandora provided ways of addressing most of these requirements.

Relating born-digital and surrogate materials together

Oral history project teams often ask for the ability to link audio and video files with other born-digital and surrogate materials for both administrative and historical context. For example, a recording of an interview might be associated with consent forms agreeing to preservation and publication of all or parts of a recording, or an interview may be associated with other files contextualizing the content of the interview for an audience. Recordings might also be linked together to form an exhibition or collection of interviews on a particular topic. These connections may be non-exclusive – a single article or image may, for example, be relevant to two recorded interviews. Using Islandora Solution Packs, it is possible to support ingest and display of a number of content types.
Islandora Solution Packs are Drupal modules that offer content models, workflows, forms, and integration with other solution packs or third-party software projects to serve the needs of particular types of content or a certain user base. Solution packs vary quite widely in their scope and application [2] and often leverage other solution packs or contributed Drupal modules as dependencies. Moreover, the underlying Fedora 3 architecture supports complex, graph-type relationships between objects (functions further enhanced and extended in Fedora 4), and there are interface mechanisms for linking objects together in compounds and collections and for sharing objects in multiple relationships. The ecosystem of solution packs allowed us to utilize a great deal of existing functionality and extend it for our purposes in storing, displaying, and interrelating multimedia content.

Managing materials and administrative processes

A common request from UTSC researchers is that oral histories be available to the source community in which they were collected, an audience that may be located internationally and may have permissions to some materials and not others. An academic audience may seek and use a collection of oral histories differently from members of the general public, and these different modes of access require granular permissions. In Islandora, Fedora’s XACML-based security is highly configurable, as is the role-based Drupal permissions system. Managing not just access but also authorship and administration, Islandora allows for fairly complex use cases that address the (often fraught) notions of authorship inherent in Oral History work. The oral history materials themselves, often video and audio files, can be difficult to preserve and serve reliably: codecs change, and the files tend to be large.
It is widely recognized that the problem of preserving multimedia materials will continue to grow, as the availability of recording equipment makes generating these materials ever easier. The Islandora Video and Audio Solution Packs produce several formats, including an access copy, and can be configured alongside utility modules to create and store technical, administrative, and preservation metadata.

Describing, transcribing, and translating audio and video files

We recognized that our need to flexibly describe and index oral histories could be addressed using Islandora’s form-building modules (XML Forms) and search (Solr) functions. The form-building modules [3] allow for the creation of form templates that can be used to author XML datastreams in a Fedora object. A user can create a new form and associate it with a Content Model in Fedora, and it will become available to users creating a new instance of an object using that Content Model. Once the XML datastream has been created, the GSearch application leverages an XSL stylesheet to transform the new XML into Solr documents and make the information it contains available to Islandora search functions [4]. Indexing new XML elements often requires extending the XSL stylesheet to address elements not covered in the default XSL packaged with solution packs. While the forms and search infrastructure provided a lot of functionality, there was no Islandora mechanism available for researchers who wish to describe, transcribe, or translate interview content and associate it with time codes. Beyond the research applications, UTSC is also bound by the Accessibility for Ontarians with Disabilities Act, which requires compliance with the Web Content Accessibility Guidelines 2.0, including a requirement for text representations of non-text content.
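Extending the index as described above typically means adding a template to the GSearch stylesheet for the new datastream. The fragment below is a minimal sketch only: the datastream ID (ANNOTATION), element name, and Solr field name are hypothetical, and a real foxmlToSolr stylesheet has more surrounding structure.

```xml
<!-- Hypothetical fragment for a GSearch (foxmlToSolr) stylesheet extension.  -->
<!-- Datastream ID, element name, and Solr field name are illustrative only.  -->
<xsl:template match="annotation" mode="oh_annotation">
  <!-- Emit one Solr field per annotation element found in the datastream. -->
  <field name="oh_annotation_t">
    <xsl:value-of select="normalize-space(.)"/>
  </field>
</xsl:template>
```

Fields indexed this way then become available to the Islandora Solr Search module's display and query configuration like any other field.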
Given Islandora’s coverage of many of the other requirements identified as essential to an Oral Histories infrastructure, the group determined that it would be best to develop an additional solution pack providing these transcription, translation, and description features. It was bundled (with some hubris) by the unit as the “Oral History Solution Pack.”

The Islandora Oral History Solution Pack

The Oral History Solution Pack (OHSP) has existed in beta form since 2015 and was recently accepted into the Islandora Foundation Labs repository. The intention is to continue the work needed to bring the module into the Foundation and release it as part of a regular Islandora release. Those familiar with Islandora will recognize the workflow by which one or more source files are transcoded and ingested as datastreams in an Islandora object: for interface ingestion, users are presented with the option of creating an “Oral History” object and asked to provide a file and fill out descriptive metadata. The upload triggers a series of processes that transcode the source file into one or more derivatives, create the relevant datastreams, and provide feedback to the user once the object has been created. The OHSP follows the same pattern, making an oral-history-specific Content Model available to collection policies and leveraging Islandora’s Video and Audio Solution Packs as dependencies to generate derivatives. The module is designed to operate with the rest of Islandora’s released Solution Packs and Utility Modules, and the intention is to provide substantial flexibility in implementation. A major contributor to the concept for implementation is Edward Garrett of Pinedrop, whose transcript Drupal modules were discovered early in the research phase for this project.
Transcript UI leverages the concept of “tiers” (also used in transcription tools such as ELAN) to encode various layers of information about an audio or video file (including transcription, translation, and description).

The Viewer

Edward Garrett was contracted by the University to abstract the viewer functions from his module into the Transcript UI module, which is a dependency and sub-module of the Oral History Solution Pack. Garrett’s viewer design makes it possible to display information tiers. In Figure 1, it is possible to see how an annotation tier and a transcription tier are both available from the drop-down menu; the names and number of tiers are highly customizable. In Figure 2, it is possible to see how individual tier “cues” are highlighted as a video plays. In Figure 3, it is possible to see user options for the closed-captioning functions of the viewer, out-of-the-box elements of the video.js player used at the heart of the viewer and fed using Islandora data. Note that users not using a Bootstrap theme need to include a custom Bootstrap build targeting the oral history elements in order for the viewer to display properly (an example is packaged with the module).

Figure 1. Viewer configuration by user
Figure 2. Highlighting and CC on video
Figure 3. Available CC options in player

The viewer displays captions in the video.js player using a simple WebVTT file loaded through a timed text track element; transcript cues are parsed and synchronised via HTML5 video events and the associated API. A WebVTT file is either provided at ingest or generated from a source XML file. When installing and configuring the OHSP, administrators have the option to turn off the creation of WebVTT files, to enable CC for display, to enable the transcript display, and to display the media and transcript side by side (rather than the default, which is to display the transcript underneath). Administrators can also configure how the speakers and tiers in source files are displayed, as you can see in Figure 4.
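Conceptually, the markup the viewer builds on resembles a standard video.js player with a caption track attached. The sketch below is illustrative only: the object PID and datastream IDs in the paths are placeholders, not the module’s actual URL scheme.

```html
<!-- Illustrative sketch only: PID and datastream paths are placeholders. -->
<video id="oral-history-player" class="video-js vjs-default-skin" controls preload="auto">
  <source src="/islandora/object/example:123/datastream/MP4/view" type="video/mp4">
  <!-- A WebVTT datastream drives the closed captions in the player. -->
  <track kind="captions" src="/islandora/object/example:123/datastream/MEDIATRACK/view"
         srclang="en" label="English">
</video>
```

The synchronised transcript display works on the same data: the cues are parsed and highlighted in time with playback using the HTML5 media events mentioned above.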
At the top, you can see that two custom tiers were used in the source XML for the transcript. Here in the interface, an administrator can define the human-readable name that will be associated in the viewer drop-down menu with the tier provided in the source XML. Similarly, speaker name elements can be glossed with more human-readable names using a typical Drupal convention of machine name|human-readable name. This configuration panel is designed to enable users to flexibly define the nature of their source transcript files.

Figure 4. Configuring transcript tiers and speaker names for display.

Ingest Formats and Workflow

To serve the functions of the viewer, the OHSP supports the addition of enriched source material and produces additional derivatives on ingest. In addition to a source video or audio file, users ingest time-coded content files in either a) a custom XML format or b) the WebVTT format. The XML format is required to support tiers. An XSD schema is provided with the module so that people creating XML files can ensure they are producing a valid format. [5] The custom XML format has a root element containing a number of nested cue elements. Each cue consists of a start element and an end element indicating its timecodes, and an optional speaker element. Each cue can also contain multiple customizable tier elements (such as transcription, translation, and annotation). If tiers are used, they must be used in every cue: if a given tier element exists in one cue, it needs to exist in every other cue, even if it is empty. In addition, the tier elements need to appear in the same order in each cue. The XSD schema does not account for these tier customizations. Tiers are identified by the custom tags used when authoring an input XML file. See an example of the required XML format in Figure 5. Further discussion of methods for creating source XML files is provided in the implementation section.
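Figure 5 is not reproduced here. The sketch below illustrates the structure just described; the element and tier names are chosen for illustration, and the XSD shipped with the module remains the authority on the exact format.

```xml
<!-- Illustrative sketch of a tiered transcript file. Element and tier names
     are examples only; consult the module's XSD for the real format. -->
<cues>
  <cue>
    <speaker>interviewer</speaker>
    <start>0.0</start>
    <end>12.5</end>
    <transcript>Could you tell us when you first arrived on campus?</transcript>
    <annotation>Opening question; establishes the timeline.</annotation>
  </cue>
  <cue>
    <speaker>narrator</speaker>
    <start>12.5</start>
    <end>31.0</end>
    <transcript>I arrived in the fall of 1966, the year the campus opened.</transcript>
    <!-- A tier used anywhere must appear in every cue, even when empty. -->
    <annotation></annotation>
  </cue>
</cues>
```

Note how the two tiers (transcript and annotation) appear in every cue and always in the same order, as the text above requires.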
WebVTT is a W3C standard that provides a very simple time-encoded text format to display captions or subtitles. It was developed to support the HTML5 <track> element for timed text tracks for closed captions and subtitles in the
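A minimal WebVTT file of the kind the module consumes or generates looks like the following (the cue content is illustrative; speaker labels use WebVTT’s standard voice spans):

```vtt
WEBVTT

00:00:00.000 --> 00:00:12.500
<v Interviewer>Could you tell us when you first arrived on campus?

00:00:12.500 --> 00:00:31.000
<v Narrator>I arrived in the fall of 1966, the year the campus opened.
```

Because WebVTT cues carry only one layer of text per cue, the custom XML format described above remains necessary when multiple tiers (transcription, translation, annotation) must be preserved.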