The column of the table that flags recommended fields is drawn by these tests:
[XSL excerpt not preserved; surviving column labels: Creator, Spatial]
As Oregon Digital uses both creator and contributor, I wanted the table to flag a record only if neither was present. After searching Stack Overflow for possible solutions, I changed the above to:
[XSL excerpt not preserved; surviving column labels: Creator/Contributor, Spatial]
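The idea of flagging a record only when neither field is present can be sketched as follows. This is a minimal illustration, not the tool's stylesheet: the flat `<records>` wrapper, the sample records, the `dcterms` namespace binding, and the plain-text report are all assumptions made for the example, and it needs PHP's DOM and XSL extensions.

```php
<?php
// Hypothetical sample feed: a flat <records> wrapper stands in for the OAI-PMH envelope.
$feed = <<<'XML'
<records xmlns:dcterms="http://purl.org/dc/terms/">
  <record><dcterms:title>A</dcterms:title><dcterms:creator>Smith, J.</dcterms:creator></record>
  <record><dcterms:title>B</dcterms:title><dcterms:contributor>Doe, A.</dcterms:contributor></record>
  <record><dcterms:title>C</dcterms:title></record>
</records>
XML;

// Assumed stylesheet: reports a record only when it has neither creator nor contributor.
$sheet = <<<'XSL'
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:dcterms="http://purl.org/dc/terms/"
    exclude-result-prefixes="dcterms">
  <xsl:output method="text"/>
  <xsl:template match="/records">
    <xsl:for-each select="record">
      <xsl:if test="not(dcterms:creator or dcterms:contributor)">
        <xsl:value-of select="dcterms:title"/>
        <xsl:text>: Creator/Contributor missing&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
XSL;

$xml = new DOMDocument();
$xml->loadXML($feed);
$xsl = new DOMDocument();
$xsl->loadXML($sheet);

$proc = new XSLTProcessor();
$proc->importStylesheet($xsl);
$report = $proc->transformToXml($xml);
echo $report;
```

In this sketch only record C is reported, because records A and B each carry one of the two fields. An empty node-set converts to false in XPath, so `not(a or b)` is true only when both are absent.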
I then returned to my new landing page for the Oregon-specific required metadata checker, index_oai_qdc_ore.php. I searched the document for ‘XSL’ to find and update the call for the XSL file to use my newly revised file (lines 6-7):
$xml = new DOMDocument;
if (@$xml->load($feedURL) === false) {
    echo "Please enter a valid feed URL.";
} else {
    $xsl = new DOMDocument;
    if ($mp == 'oai_qdc') {
        $xslpath = 'xsl/check_req_fields_qdc_ore.xsl';
    } elseif ($mp == 'MODS') {
        $xslpath = 'xsl/check_req_fields_mods.xsl';
    } else {
        $xslpath = 'xsl/check_req_fields_dc.xsl';
    }
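The branch in the excerpt could also be written as a lookup table, which may be easier to extend if more feed types are added. The sketch below is only an illustration: the function name `stylesheetFor` is hypothetical, and the paths are the three that appear in the excerpt.

```php
<?php
// Hypothetical helper: maps a metadata prefix to the stylesheet path used in the excerpt.
function stylesheetFor(string $mp): string
{
    $paths = [
        'oai_qdc' => 'xsl/check_req_fields_qdc_ore.xsl',
        'MODS'    => 'xsl/check_req_fields_mods.xsl',
    ];
    // Any other prefix falls back to the Simple Dublin Core checker, as in the excerpt.
    return $paths[$mp] ?? 'xsl/check_req_fields_dc.xsl';
}

echo stylesheetFor('oai_qdc'), PHP_EOL;
echo stylesheetFor('oai_dc'), PHP_EOL;
```

The behavior matches the if/elseif chain for string prefixes; the only design difference is that adding a feed type means adding one array entry rather than another branch.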
I moved on to the mapping checker next. It is structured very similarly to the required data checker, with pairs of PHP and XSL files. I cloned the index_oai_qdc.php and extract_qdc.xsl files in the mapping_checker folder and set about adapting them to work with Oregon Digital’s OAI feed. Figure 4 shows the base pair of Qualified Dublin Core files in red rectangles:
Figure 4. XSL/PHP file pairs in mapping checker
The mapping checker’s XSL (extract_qdc.xsl) uses for-each elements to select Dublin Core elements from each record and build a table display of required vs. recommended data:
[XSL excerpt not preserved]
As with the required data checker, adapting these for-each elements for Oregon required updating the select statements:
[XSL excerpt not preserved]
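As a rough illustration of the for-each pattern described above, the sketch below builds one table row per record and lists every value of a repeatable field in a single cell. It is not taken from extract_qdc.xsl: the flat `<records>` wrapper, the sample data, the `dcterms` namespace binding, and the table markup are assumptions, and it needs PHP's DOM and XSL extensions.

```php
<?php
// Hypothetical sample feed with a repeated spatial element.
$feed = <<<'XML'
<records xmlns:dcterms="http://purl.org/dc/terms/">
  <record>
    <dcterms:identifier>id-1</dcterms:identifier>
    <dcterms:spatial>Portland (Or.)</dcterms:spatial>
    <dcterms:spatial>Oregon</dcterms:spatial>
  </record>
</records>
XML;

// Assumed stylesheet: the select attributes are what would change from feed to feed.
$sheet = <<<'XSL'
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:dcterms="http://purl.org/dc/terms/"
    exclude-result-prefixes="dcterms">
  <xsl:output method="html" indent="no"/>
  <xsl:template match="/records">
    <table>
      <xsl:for-each select="record">
        <tr>
          <td><xsl:value-of select="dcterms:identifier"/></td>
          <td>
            <xsl:for-each select="dcterms:spatial">
              <xsl:value-of select="."/>
              <xsl:if test="position() != last()">; </xsl:if>
            </xsl:for-each>
          </td>
        </tr>
      </xsl:for-each>
    </table>
  </xsl:template>
</xsl:stylesheet>
XSL;

$xml = new DOMDocument();
$xml->loadXML($feed);
$xsl = new DOMDocument();
$xsl->loadXML($sheet);

$proc = new XSLTProcessor();
$proc->importStylesheet($xsl);
$html = $proc->transformToXml($xml);
echo $html;
```

Adapting a stylesheet like this to a different feed is mostly a matter of changing the element names in the `select` attributes, which is the kind of update the paragraph above describes.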
Finally, I had a look at the facet checker. After inspecting it and reading the NCDHC’s documentation, I elected to add the facet checker because it was already coded, didn’t need adaptation, and provides a visual audit tool to help identify inconsistencies in collections. Figure 5 shows a view of subjects used in Utah State Archives’ Salt Lake City (Utah) Fire Department Photographs collection:
Figure 5. Facet viewer demonstrated using subjects in Utah State Archives’ Salt Lake City (Utah) Fire Department Photographs collection
This view shows the same subject headings applied consistently across the records; in a less consistent collection, it would also reveal variations in spelling or application that need remediation.
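The underlying idea of a facet audit, tallying how often each value occurs so that near-duplicates stand out, can be illustrated with a few lines of PHP. This is not the facet viewer's code, and the subject values are invented for the example.

```php
<?php
// Hypothetical subject values harvested from a collection's records.
$subjects = [
    'Fire fighters',
    'Fire engines',
    'Fire fighters',
    'Firefighters',   // spelling variant worth reviewing
    'Fire engines',
];

// Count each distinct heading, then sort so the most-used headings come first.
$counts = array_count_values($subjects);
arsort($counts);

foreach ($counts as $heading => $count) {
    echo $heading, ' (', $count, ')', PHP_EOL;
}
```

A heading that appears only once next to a similar, heavily used heading is a likely candidate for remediation.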
When I updated the top-level index.php to include the newly coded features, I uncommented the Facet Viewer and moved it to the bottom of the list. The pages are called as list items in an unordered list.
Finally, I updated inc/byline.php; this file is called on each of the tool’s PHP pages to provide the footer. I added details to reflect the version changes and updated the MWDL contact information.
Technical challenges/Testing
Before making the tool live for member and staff use, I wanted to test it to make sure the updates were working correctly. A colleague suggested using MAMP, a free personal web server program (see references). University of Utah Library IT support built a virtual machine for me with MAMP installed that was accessible through VMware Fusion (version 8.5.8).
The testing process was fairly slow, primarily because of problems copying the project file structure from my local machine to the correct root directory in the virtual machine (/Applications/MAMP/htdocs, per the MAMP documentation). Both the menu-based copy-and-paste and the drag-and-drop function seemed to be glitchy, although the VMware Fusion documentation indicates they should have worked (see references).
However, once the project had been copied to the virtual machine, I could test it, which led to a few rounds of iterative changes to improve aspects of the tool. For example, I realized I had failed to update the xsl:if test for rights to look for “dcterms:rights”, resulting in false positives for missing rights statements in all Oregon records.
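The false positive is easy to reproduce with XPath alone, since an xsl:if test is an XPath expression. The sketch below is a minimal illustration under stated assumptions: the sample record is invented, and `dc:rights` is assumed to be the element name the unrevised test looked for. It uses PHP's DOMXPath.

```php
<?php
// Hypothetical record that carries its rights statement in the dcterms namespace.
$record = <<<'XML'
<record xmlns:dcterms="http://purl.org/dc/terms/">
  <dcterms:title>Sample</dcterms:title>
  <dcterms:rights>http://rightsstatements.org/vocab/InC/1.0/</dcterms:rights>
</record>
XML;

$doc = new DOMDocument();
$doc->loadXML($record);

$xpath = new DOMXPath($doc);
$xpath->registerNamespace('dc', 'http://purl.org/dc/elements/1.1/');
$xpath->registerNamespace('dcterms', 'http://purl.org/dc/terms/');

// A boolean XPath expression comes back from evaluate() as a PHP bool.
$context = $doc->documentElement;
$missingOld = $xpath->evaluate('not(dc:rights)', $context);      // assumed original test
$missingNew = $xpath->evaluate('not(dcterms:rights)', $context); // corrected test

var_dump($missingOld); // bool(true): rights wrongly reported as missing
var_dump($missingNew); // bool(false)
```

Because the two prefixes are bound to different namespace URIs, a test on `dc:rights` never matches a `dcterms:rights` element, so every record in such a feed is flagged.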
I then uploaded the project to GitLab, created a merge request, and notified University of Utah staff that we were ready to make the tool live. At this time, we decided to move the tool to a newer server and update its URL from dpla-aggregation.sandbox.lib.utah.edu to mwdlmetadata.tools.lib.utah.edu, a more branded URL indicating its publisher and purpose.
Once the tool was live, further testing revealed a dependency that needed updating for the set-selector to work correctly. A colleague in the Utah Digital Infrastructure Development team helped troubleshoot and made this update, and the tool was fully functional.
Further changes/extensions needed
After using the new version of the tool to audit collections, I developed a list of future enhancements and cosmetic issues that need further troubleshooting.
First, I realized the recommended field “language” is missing from the qualified Dublin Core required data checkers. This was true of the original adaptation, so by copying the files, my updated checkers inherited the same flaw.
Highlighting of table headers for recommended fields is also inconsistent in the mapping checker. The tool was originally built to display required fields in bright yellow and recommended fields in pale yellow. For example, I forgot to give the contributor column a CSS class in the mapping checker (extract_qdc_ore.xsl) when I added it, resulting in that column header not having any highlighting at all (line 3):
Identifier |
Creator |
Contributor |
Spatial |
After using the tool on several collections, I also noticed that the Oregon required data checker draws the table cells for each record, even if nothing is flagged as missing. This is not the case for CONTENTdm collections; while it isn’t bothersome for small collections, it is cumbersome for ones with large numbers of records. At the time of this writing, I haven’t had a chance to troubleshoot why this is happening.
It might also be possible to code a single required data checker for all qualified Dublin Core feeds by using more sophisticated xsl:if statements and variable definitions. However, I have observed a marked performance issue with the Oregon required data checker that I suspect may be due to my combined creator/contributor test; this bears further testing to see if separating them out improves performance and if the tool could be streamlined.
I made a subsequent cosmetic update to reflect the current name of Oregon Digital’s repository (Samvera rather than Hydra) and updated index.php to indicate the Simple Dublin Core required metadata checker also works with University of Utah’s Solphal system as well as Islandora systems. I anticipate future updates following the forthcoming revision of the MWDL MAP in 2018.
Conclusion
The project to update the metadata tool was ultimately successful, allowing MWDL staff to rapidly audit Oregon Digital’s incoming collections. Between March and April 2018, we harvested over 50 new collections ranging in size from fewer than 5 records to more than 20,000, and the bulk of the audit process took roughly 2 days. The work was performed by the Metadata Librarian and a part-time undergraduate student worker. Our student metadata assistant began in February 2018 and, with no prior knowledge of or experience auditing library metadata, was able to immediately add value by efficiently delivering feedback about metadata quality. Relying on student workers for first-pass auditing frees the Metadata Librarian for other tasks. Further applications of the tool could include internal quality control projects as well as preparing collections for either platform or metadata migrations.
The profile of repositories used by MWDL’s members continues to diversify; as of this writing, an additional two MWDL members are planning moves from CONTENTdm to Islandora, and I envision continual adaptations and refinements of the tool to meet changing member needs.
Acknowledgements
Huge thanks to Anna Neatrour and Brian McBride at University of Utah for advice, technical assistance, and good humor. A special thanks to the North Carolina Digital Heritage Center for creating the tool and sharing their work, and to Sandra McIntyre for her work on the original MWDL adaptation of the tool.
About the Author
Teresa K. Hebron is the Digital Metadata Librarian at Mountain West Digital Library, based at the University of Utah in Salt Lake City, UT.
Footnotes
[1] https://mwdl.org/getinvolved/oai_queries.php
[2] http://re.cs.uct.ac.za/
[3] http://validator.oaipmh.com
References
DPLA OAI Aggregation Tools project Version 1.0 [Internet]. [Updated 2016 May 25]. North Carolina Digital Heritage Center; [cited 2018 March 15]. Available from: https://github.com/ncdhc/dpla-aggregation-tools
Gregory L, Williams S. 2014. On Being a Hub: Some Details behind Providing Metadata for the Digital Public Library of America. D-Lib Magazine [Internet]. [cited 2018 May 23]; 20:7/8. Available from: http://www.dlib.org/dlib/july14/gregory/07gregory.html
MAMP [Internet]. [Updated 2018]. appsolute; [cited 2018 March 18]. Available from: https://www.mamp.info/en/
MAMP & MAMP Pro 4 Documentation [Internet]. [Updated 2018 May 25]. appsolute; [cited 2018 March 18]. Available from: http://documentation.mamp.info/
McIntyre S. 2015. New Tools for Rapid Auditing of Your Collections [Internet]. [cited 2018 May 23]. Available from: http://mwdlnews.blogspot.com/2015/11/new-tools-for-rapid-auditing-of-your.html
Mountain West Digital Library Dublin Core Application Profile Version 2.0 [Internet]. [Updated 2011 July 20]. Mountain West Digital Library Metadata Task Force; [cited 2018 May 23]. Available from: https://mwdl.org/docs/MWDL_DC_Profile_Version_2.0.pdf
Moving and Copying Files between Virtual Machines and Your Mac [Internet]. [Updated unknown]. VMware; [cited 2018 March 25]. Available from: http://pubs.vmware.com/fusion-7/index.jsp?topic=%2Fcom.vmware.fusion.help.doc%2FGUID-3C0EA5DA-98DD-4835-9C84-354472B25303.html
Neatrour A, Cummings R, McIntyre S. 2016. Regional Aggregation and Discovery of Digital Collections: The Mountain West Digital Library. In: Varnum K, ed. Exploring Discovery: The Front Door to Your Library’s Licensed and Digitized Content. ALA Editions. Available from: https://collections.lib.utah.edu/details?id=713372
Testing against one of several values in XSLT? [Internet]. [Updated 2017 December 6]. Stack Overflow; [cited 2018 March 28]. Available from: https://stackoverflow.com/questions/47679280/testing-against-one-of-several-values-in-xslt
XSLT Tutorial [Internet]. [Updated unknown]. W3Schools.com; [cited 2018 March 20]. Available from: https://www.w3schools.com/xml/xsl_intro.asp
ISSN 1940-5758
This work is licensed under a Creative Commons Attribution 3.0 United States License.