INTERNATIONAL JOURNAL OF LIBRARIANSHIP, 3(1), 107-109 

ISSN:2474-3542 

 
Library Linked Data in the Cloud: OCLC's Experiments with New Models of Resource 
Description. Series: Synthesis Lectures on the Semantic Web: Theory and Technology (Book 
9) By Carol Jean Godby, Shenghui Wang, Jeffrey K. Mixter, Morgan & Claypool Publishers 
©2015, 240p. ISBN: 1627052194, 9781627052191 

 
This book is the ninth in the series Synthesis Lectures on the Semantic Web: Theory and
Technology, published by Morgan & Claypool. It addresses a challenge that libraries as a whole
have been facing: exposing library materials to multiple search engines and establishing a presence
for library metadata on the Web.

This book describes OCLC’s experiments in linked data technology. OCLC’s efforts are
enormous and time-consuming, involving various processes such as conceptual design, data
modeling, metadata mapping, computer programming, and format conversion. Although the
Synthesis Lectures series covers a wider range of topics surrounding the Semantic Web and library
linked data, this book provides a relatively comprehensive overview of how to publish traditional
library metadata to the cloud. It also gives conceptual and technical details, with concrete examples,
to explain these experiments. Because transforming libraries’ data into linked data goes beyond the
library field, some chapters, terms, and explanations may present a challenge to readers. Without a
basic understanding of concepts and topics such as Schema.org markup, FRBR (Functional
Requirements for Bibliographic Records), and library authority files, some readers may have
difficulty weaving the different pieces of information together into a full understanding of these
experiments. The long list of resources in the bibliography is very useful both for those new to
library linked data who need to grasp the basics and for those with a strong interest in the topic
who want to pursue it further. Most of the bibliography entries include a URL, which makes them
convenient to access.

This book is organized into five chapters. The first chapter begins with an introduction to the
difference between the Web of documents and the Semantic Web, which illustrates the fundamental
reason for the worldwide linked data movement. Several sections provide a broad view of the
milestones related to library linked data from 1995 to 2014. They briefly cover core library
linked data achievements (e.g., FRBR, RDA (Resource Description and Access), and WorldCat
Linked Data) that were made through collaboration among the Library of Congress, OCLC, and
other library communities. The chapter also concisely discusses the procedure OCLC followed in
its experiments on converting raw bibliographic records in WorldCat to RDF markup. This
procedure involves mapping MARC records to the Schema.org standard, creating the corresponding
RDF Turtle descriptions, applying FAST (Faceted Application of Subject Terminology) and VIAF
(Virtual International Authority File) IDs, and assigning Cool URIs (Uniform Resource Identifiers).
The figures and tables in this chapter are particularly helpful for understanding the activities
involved in OCLC’s linked data experiments. This chapter does a good job of providing the necessary
background information and basic knowledge that can help readers delve into the following
chapters.
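
For readers who have not seen such a conversion, the general shape of the procedure can be
illustrated with a minimal sketch. It is not OCLC’s pipeline: the simplified record layout (a
dictionary keyed by MARC tag), the function names, and the example.org URIs are all assumptions
made for illustration. In a real conversion the author and subject URIs would be VIAF and FAST
identifiers rather than placeholders.

```python
# Illustrative sketch only. The record layout, the function names, and the
# example.org URIs are assumptions, not OCLC's actual conversion pipeline.

PREFIXES = "@prefix schema: <http://schema.org/> .\n"


def turtle_literal(text):
    """Quote a string as a Turtle literal, escaping backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def record_to_turtle(record, work_uri, author_uri, subject_uris):
    """Map a simplified MARC-like record to a Schema.org description in Turtle.

    record       -- dict of MARC tag -> list of strings
                    (100 = main entry personal name, 245 = title,
                     650 = topical subject heading)
    subject_uris -- dict of subject heading -> URI, standing in for a lookup
                    against an authority file
    """
    title = record["245"][0]
    author_name = record["100"][0]

    lines = [PREFIXES]
    lines.append(f"<{work_uri}> a schema:Book ;")
    lines.append(f"    schema:name {turtle_literal(title)} ;")
    for heading in record.get("650", []):
        uri = subject_uris.get(heading)
        # Use a URI when the heading resolved to an entity, else keep the string.
        obj = f"<{uri}>" if uri else turtle_literal(heading)
        lines.append(f"    schema:about {obj} ;")
    lines.append(f"    schema:author <{author_uri}> .")
    lines.append("")
    lines.append(f"<{author_uri}> a schema:Person ;")
    lines.append(f"    schema:name {turtle_literal(author_name)} .")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = {
        "100": ["Doe, Jane"],
        "245": ["An Example Title"],
        "650": ["Rivers", "Boats"],
    }
    print(record_to_turtle(
        sample,
        work_uri="http://example.org/work/1",
        author_uri="http://example.org/person/1",
        subject_uris={"Rivers": "http://example.org/subject/rivers"},
    ))
```

In this sketch a subject heading that resolves to an identifier becomes a link, while one that does
not remains a plain string, which is the distinction between "things" and "strings" that motivates
the linked data work the chapter introduces.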

Chapter two describes modeling library authority files (e.g., Library of Congress Subject
Headings, FAST, Dewey Decimal Classification, and VIAF) as linked data. It discusses the process
of mapping these files to SKOS (Simple Knowledge Organization System), FOAF (Friend of a
Friend), and FAST, as well as designing and creating the URIs to be used in a set of RDF datasets
for these authority files. The authors also point out the strengths and limitations of modeling with
different standards. For example, SKOS is not rich enough to model persons, while FOAF is a
better fit. Additionally, this chapter introduces the VIAF project and describes how to model
these authority files as RDF Turtle using Schema.org, SKOS, etc. An examination of how the
modeling of these files has developed points to the need to evolve VIAF into an authoritative hub of data.

Chapter three explains in depth how to model and discover the creative works that library
communities have described with MARC for years. Starting with a review of treating creative works
as entities and an explanation of their relationship to the three other entities (i.e., expressions,
manifestations, and items) defined by the FRBR Group 1 conceptual model, several sections detail
modeling FRBR concepts with Schema.org in conjunction with a set of bibliographic vocabularies,
designing the syntax and semantics of URLs in RDF descriptions, and other efforts to model
creative works. Furthermore, the chapter succinctly reviews the data mining algorithms that are
used to discover creative works in WorldCat. This chapter demonstrates the complexity of
modeling and discovering creative works.

Text mining plays an important role in processing unstructured (free-text) information to
generate meaningful information and knowledge. Chapter four first analyzes the need to apply text
mining algorithms to library linked data projects. It then discusses the algorithms OCLC used to
promote strings in MARC records to entities. These algorithms can identify whether strings are
names with identifiers (e.g., appearing in a controlled MARC field), labeled names (not subject to
authority control), names in semi-structured text (e.g., an illustrator name in a MARC 600 field), or
names in unstructured text (e.g., information contained in a MARC 520 field). The chapter also
examines subject/concept matching and explains the methods (e.g., thesaurus matching and
instance-based mapping) that can algorithmically map these concepts to authority files and
vocabularies in the domain of library resource description. Lastly, the chapter introduces
document clustering, which is important for handling large amounts of data. It also analyzes the
pros and cons of different clustering algorithms and points out the future need to handle digital
objects and cultural heritage materials. This chapter may be particularly interesting for library
professionals with some computer programming background.
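
To give a sense of what matching subject strings against a controlled vocabulary can look like,
here is a deliberately simple sketch based on normalized label lookup. It is an assumption about the
general idea, not a reproduction of the algorithms described in the book, and the vocabulary,
concept IDs, and function names are hypothetical.

```python
import unicodedata

# Illustrative sketch only: a naive label lookup against a tiny, made-up
# vocabulary. The methods described in the book may differ substantially.


def normalize(label):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", label)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in no_accents.lower())
    return " ".join(cleaned.split())


def build_index(vocabulary):
    """Map every normalized preferred or alternate label to its concept ID."""
    index = {}
    for concept_id, labels in vocabulary.items():
        for label in labels:
            # Keep the first concept seen if two labels normalize identically.
            index.setdefault(normalize(label), concept_id)
    return index


def match_subjects(strings, index):
    """Return each input string with its matched concept ID, or None."""
    return {s: index.get(normalize(s)) for s in strings}


if __name__ == "__main__":
    # Hypothetical vocabulary: concept ID -> preferred and alternate labels.
    vocabulary = {
        "concept-1": ["Semantic Web", "Web, Semantic"],
        "concept-2": ["Café culture"],
    }
    index = build_index(vocabulary)
    print(match_subjects(["semantic web.", "Cafe Culture", "Unknown topic"], index))
```

Strings that fail this kind of exact lookup are the cases where more elaborate techniques, such as
the instance-based mapping the chapter mentions, become necessary.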

Chapter five briefly summarizes the experiments OCLC has undertaken and indicates future
directions. The directions OCLC is looking toward include collaborating closely with the
Schema.org group, as it provides a set of robust Semantic Web standards, and with other library
communities such as the Library of Congress, which is also experimenting with BIBFRAME
(Bibliographic Framework) and RDA. This chapter shares the lessons OCLC has learned along
the way and the actions it has taken, such as participating in community standards initiatives and
joining forces with others. In addition, this chapter discusses in detail OCLC’s challenges in three
regards (i.e., conceptual, technical, and environmental).

Generally speaking, the detail-oriented description throughout the book is instrumental for
those who are pursuing or will pursue a library data project. The book is a good reference for those
implementing a library linked data project. The challenge of comprehending it can be
overcome by reading it more than once and by doing the related reading. This book clearly
demonstrates the efforts and the contributions OCLC has made through its various experiments to
transform legacy library metadata and increase the visibility and usability of libraries’ creative
works.

Overall, this book is a valuable addition to the literature on library linked data. It can spark
interest and provide an eye-opening experience for beginners, while offering other readers
intriguing ideas and processing specifics that help them understand the details involved in library
linked data projects.
 
 
--- Xiaocan (Lucy) Wang, Emerging Technologies Librarian/Associate Professor, Spiva 
Library, Missouri Southern State University.