OCLC Linked Data Roundtable: Stories from the Front

The session “OCLC Linked Data Roundtable: Stories from the Front” took place during the 2018 ALA Midwinter Meeting on Saturday, February 10. Approximately 75 people attended the session, which included three 10-minute presentations on linked data implementations, followed by a Q&A period.

Sarah Newell (senior product analyst at OCLC) kicked off the session with a presentation titled “Linked Data Prototypes.” Newell detailed a linked data prototyping project underway with three collaborators: University of California, Davis; Cornell University; and the Montana State Library. Based on a combination of monthly meetings, site visits, and directed activities, OCLC is hoping to build a comprehensive linked data tool that addresses the practical workflow needs of its collaborators. The group also continues to test the prototype and expects development to wrap up in June 2018.

Building on past OCLC linked data prototypes, this project aims to address three areas of concern:

  • a lack of connection between MARC strings and linked data entities
  • the creation and editing of linked data entities
  • the creation and editing of relationships between those entities

OCLC’s ultimate goal is to serve as a central hub for these services by creating a tool composed of a reconciliation service, a linked data entity minter, and a linked data relator. At the time of the presentation, OCLC had completed an “entity ecosystem,” a format-agnostic database that supports triple stores as well as other relationship structures and can be indexed with either Apache Solr or Elasticsearch. The project collaborators are now testing a reconciliation service that sits on top of the entity ecosystem and uses an application programming interface (API) to communicate with a subset of WorldCat in order to reconcile linked data entities stored in the prototype tool. The next step in the development process will be to build a service with a Wikibase user interface that allows users to edit existing entities.
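Because the reconciliation service was still a prototype at the time of the talk, its interface was not published; the Python sketch below is purely illustrative, with an invented endpoint, parameters, and response shape, to show the general pattern of sending a MARC string to a reconciliation API and receiving candidate entities back.

```python
import requests

# Hypothetical sketch: the endpoint, parameters, and response shape below are
# illustrative assumptions, not OCLC's published API.
RECONCILE_URL = "https://example.org/entity-ecosystem/reconcile"  # placeholder

def reconcile_heading(heading: str, entity_type: str = "Person") -> list[dict]:
    """Ask a reconciliation service for candidate entities matching a MARC string."""
    response = requests.get(
        RECONCILE_URL,
        params={"query": heading, "type": entity_type},
        timeout=10,
    )
    response.raise_for_status()
    # Assume the service returns a JSON object with a list of candidates,
    # each carrying an identifier, a label, and a match score.
    return response.json().get("candidates", [])

for candidate in reconcile_heading("Austen, Jane, 1775-1817"):
    print(candidate["id"], candidate["label"], candidate["score"])
```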

Those interested in the project can read the OCLC linked data summary. Institutions that would like to become collaborators for the next phase of the project can contact Sarah Newell.

Sally McCallum (chief of Network Development and the MARC Standards Office at the Library of Congress) delivered a presentation titled “Works, Authorities, and Bibliographic Descriptions.” McCallum presented on an ongoing internal project at the Library of Congress (LC) to investigate the creation of BIBFRAME work records. The primary motivations behind the project are to achieve fluid transformation between MARC and BIBFRAME and establish the purpose and scope of title authority records.
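For readers unfamiliar with what a BIBFRAME work description looks like, the following sketch uses the rdflib Python library to build a minimal bf:Work linked to one bf:Instance. The URIs and title are invented for illustration and do not represent LC’s conversion rules or record content.

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF

BF = Namespace("http://id.loc.gov/ontologies/bibframe/")

g = Graph()
g.bind("bf", BF)

# Invented URIs for illustration only.
work = URIRef("http://example.org/works/pride-and-prejudice")
instance = URIRef("http://example.org/instances/pride-and-prejudice-1813")
title = URIRef("http://example.org/titles/pride-and-prejudice")

# A minimal bf:Work with a title, linked to one bf:Instance.
g.add((work, RDF.type, BF.Work))
g.add((title, RDF.type, BF.Title))
g.add((title, BF.mainTitle, Literal("Pride and prejudice", lang="en")))
g.add((work, BF.title, title))
g.add((work, BF.hasInstance, instance))
g.add((instance, RDF.type, BF.Instance))
g.add((instance, BF.instanceOf, work))

print(g.serialize(format="turtle"))
```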

The presentation began with a brief overview of the current landscape of title authority records. LC establishes title authorities when title variations (especially translations) exist, titles require identifying information, references to related works are recorded, or subject strings are needed for titles. Catalogers use these title authorities to ensure consistency of access points, collocate and distinguish works, or display a resource for which the library has no holdings. In the future, LC hopes to investigate if and how end users make use of title authority records.

At this stage, LC is gathering information on what information should be included in work records and asking questions about when work records should be created. Information about works can be derived from both bibliographic and authority MARC records. Neither source alone provides all the information that could potentially be useful to include in a work record. For example, bibliographic records don’t currently identify themselves as a work versus an instance. They also don’t include such information as series numbering, variant titles, and related agents. Authority records, on the other hand, lack the rich related resource linking present in some bibliographic records, genre and form classification, notes that apply to all instances of the work, and subject classification.
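As a rough illustration of deriving work-level information from bibliographic MARC data, the sketch below uses the pymarc library to pull a few candidate fields. Which fields actually belong in a work record is precisely the open question LC is investigating, so the field choices (and the file path) here are assumptions for demonstration only.

```python
from pymarc import MARCReader

def candidate_work_data(record):
    """Pull a few bibliographic fields that might inform a BIBFRAME work record."""
    def subfields(tag, code):
        # Gather every occurrence of the given subfield across repeated fields.
        return [sf for field in record.get_fields(tag) for sf in field.get_subfields(code)]

    return {
        "title": subfields("245", "a"),
        "variant_titles": subfields("246", "a"),
        "series_numbering": subfields("490", "v"),
        "related_agents": subfields("700", "a"),
    }

with open("records.mrc", "rb") as handle:  # placeholder path
    for record in MARCReader(handle):
        print(candidate_work_data(record))
```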

This investigation will be brought to the larger community for input at a later date.

Myung-Ja (MJ) K. Han (metadata librarian at the University of Illinois Urbana-Champaign) delivered a presentation titled “Linked Data for User Services.” Han presented on two linked data projects from the University of Illinois Urbana-Champaign (UIUC). The first, Emblematica Online, was completed in 2015, and the second, Special Collections Online, was completed in January 2018.

Emblematica Online is a portal to facsimiles of emblem books, a genre of book popular during the Renaissance that is of particular interest to scholars today. The collection contains approximately 1,400 books and 28,000 emblems from multiple libraries. Emblems are described individually using the SPINE metadata schema. Item pages also pull in descriptive information from linked data resources outside of Emblematica: the Virtual International Authority File (VIAF) is used to pull in biographical information on the fly about names associated with resources, and Iconclass, a classification system designed for describing the subjects of images, supports the display of multilingual hierarchical description. Metadata for each emblem includes Uniform Resource Identifiers (URIs) for Iconclass subjects, and Iconclass Web Services displays each URI’s label on the emblem’s item page, along with broader and narrower terms. These descriptions can also link out to searches in other emblem portals.
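The talk did not cover Emblematica’s code, but the general pattern of pulling in biographical information on the fly can be sketched as below. The VIAF endpoints and JSON field names used here are assumptions about VIAF’s public services, not a description of the portal’s implementation.

```python
import requests

def viaf_biographical_info(name: str) -> dict:
    """Suggest a VIAF identifier for a name, then fetch basic dates from its cluster record."""
    # Assumed endpoint: VIAF's AutoSuggest service returning candidate identities.
    suggest = requests.get(
        "https://viaf.org/viaf/AutoSuggest", params={"query": name}, timeout=10
    ).json()
    if not suggest.get("result"):
        return {}
    viaf_id = suggest["result"][0]["viafid"]

    # Assumed endpoint and field names for the VIAF cluster record in JSON.
    cluster = requests.get(f"https://viaf.org/viaf/{viaf_id}/viaf.json", timeout=10).json()
    return {
        "viaf_id": viaf_id,
        "birth_date": cluster.get("birthDate"),
        "death_date": cluster.get("deathDate"),
    }

print(viaf_biographical_info("Alciati, Andrea"))
```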

Special Collections Online was a 20-month Andrew W. Mellon grant-funded project that wrapped up in January 2018. Its enhanced searching and description of special collections resources demonstrate how libraries can become effective consumers, not just publishers, of linked open data (LOD). For this project, UIUC chose a sample of its digitized special collections to use in a web portal. The portal makes use of LOD-compliant schemas internally and links out to existing LOD sources such as DBpedia, Wikidata, and VIAF. Knowledge cards displayed on search result pages and contextual information displayed on item pages also make use of LOD functionalities; both features pull information on the fly from external sources such as Wikimedia. Users will also be able to add annotations directly to resources in the system.
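As a rough sketch of how a knowledge card might pull contextual information on the fly, the example below queries the Wikidata API for an entity’s label and description. The entity ID and the label-plus-description display are illustrative assumptions, not the UIUC project’s actual code.

```python
import requests

WIKIDATA_API = "https://www.wikidata.org/w/api.php"

def knowledge_card(qid: str, language: str = "en") -> dict:
    """Fetch a label and short description for a Wikidata entity."""
    response = requests.get(
        WIKIDATA_API,
        params={
            "action": "wbgetentities",
            "ids": qid,
            "props": "labels|descriptions",
            "languages": language,
            "format": "json",
        },
        timeout=10,
    )
    entity = response.json()["entities"][qid]
    # Fall back gracefully if the label or description is missing in this language.
    label = entity.get("labels", {}).get(language, {}).get("value")
    description = entity.get("descriptions", {}).get(language, {}).get("value")
    return {"label": label, "description": description}

print(knowledge_card("Q692"))  # Q692 is the Wikidata item for William Shakespeare
```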

Through user testing, the project team learned a great deal about how linked data can improve the user experience, both by bringing additional contextual information into the library’s own portal and by linking users to resources beyond the library’s collections. Another important takeaway is that quality metadata work, in particular reconciling names, still requires a significant amount of manual labor.

This portal is not yet fully available, but a detailed overview of the project and mockups for the future site are available on the Linked Open Data for Digitized Special Collections project website. The project team will also publish scripts used for building the project on GitHub.
