LETTER TO THE EDITORS

INFORMATION TECHNOLOGY AND LIBRARIES | DECEMBER 2023
https://doi.org/10.5860/ital.v42i4.16991

About this Section

Letters to the Editor reflect the opinions of their authors and are not necessarily those of the ITAL Editorial Board or ALA’s Core Division. Each letter’s copyright is held by its authors and is published under a Creative Commons CC-BY-NC-4.0 license.

Dear Editors,

At the risk of coming across as a defensive cataloger afraid of losing my job, I am writing to respond to the article published in ITAL Vol. 42, No. 3 by Richard Brzustowicz, “From ChatGPT to CatGPT: the Implications of Artificial Intelligence on Library Cataloging.” To Mr. Brzustowicz’s credit, he does not argue that ChatGPT can or should replace human catalogers wholesale, or that ChatGPT should be used for cataloging without human intervention. He also correctly identifies potential issues of bias inherent in relying on artificial intelligence (AI) for subject description. The potential promises and pitfalls of ChatGPT and other AI technologies are something that catalogers should be paying attention to, and I thank Mr. Brzustowicz for sparking conversation.

Nevertheless, there was no indication given that Mr. Brzustowicz (self-identified as an Instruction and Outreach Librarian) has adequate knowledge and credentials to evaluate ChatGPT-generated MARC records, or their possible implications for re-engineering cataloging work. He does not provide any information as to his cataloging background or training (if any), nor does he give any indication as to whether he discussed or shared his experiment with catalogers at his institution or within his professional circle. If he had, many of the errors and misinterpretations of his data and outcomes might have been avoided.

Brzustowicz shows his bias in the second sentence of his introduction: “Unfortunately (emphasis added), this crucial process [i.e., cataloging] can be both labor-intensive and time-consuming, often requiring significant resources.” The characterization of the labor required for a crucial library task as “unfortunate” is one that catalogers have heard from library administrators for decades, and while I don’t think the slight was intentional, it adds ammunition to the argument that cataloging is too expensive and should be made more “efficient” (never mind the many efficiencies already in place in many libraries, including cooperative cataloging, shelf-ready services, and heavy use of macros, batch processing, vendor-supplied records, and scripting technologies).

Close examination of the sample records Brzustowicz provides shows that while ChatGPT appears to be able to create basic descriptive metadata comparable to that found in OCLC WorldCat for many types of materials, it also seems to be bringing in data from other records that is inaccurate and/or inappropriate. All of the ChatGPT-derived records contain 040 fields identifying record creation or editing by different institutions, and two of the records include an OCLC control number in MARC field 035. In both cases, the OCLC record number is for a completely unrelated resource, so it appears that ChatGPT is pulling them in at random. There is no way to know which records ChatGPT pulled the 040 data from, but in any event it should reflect Brzustowicz’s institution rather than DLC (the Library of Congress) or other institutions. Also, while Brzustowicz cites the Knopf 1996 edition of Interview with the Vampire, and its OCLC record is the one used for comparison, his ChatGPT-generated record indicates a different publisher and pagination. The fact that ChatGPT “copy cataloged” the wrong edition seems to have escaped Brzustowicz’s notice.

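To make the 040 and 035 problems concrete, the following is a minimal sketch of the kind of automated check a cataloger might run before accepting a machine-generated record. It is an illustration only and does not come from Brzustowicz’s article or from any cataloging system. The record is represented as a simplified list of (tag, subfields) pairs rather than parsed MARC, and the function name `flag_suspect_fields` and the placeholder institution symbol `XXX` are assumptions made for the example.

```python
# Hypothetical, simplified stand-in for a MARC record: a list of
# (tag, {subfield_code: value}) pairs. A real workflow would parse
# actual MARC data with a dedicated library.
LOCAL_SYMBOL = "XXX"  # placeholder for the local institution's symbol (assumption)


def flag_suspect_fields(record, local_symbol=LOCAL_SYMBOL):
    """Return warnings for 040 and 035 fields that a cataloger should review."""
    warnings = []
    for tag, subfields in record:
        if tag == "040":
            # $a (original cataloging agency) and $c (transcribing agency)
            # should name the institution that actually created the record.
            for code in ("a", "c"):
                value = subfields.get(code)
                if value is not None and value != local_symbol:
                    warnings.append(
                        f"040 ${code} is '{value}', expected '{local_symbol}'"
                    )
        elif tag == "035":
            # An OCLC control number in a generated record cannot be trusted
            # until someone confirms it points at the same resource.
            value = subfields.get("a", "")
            if value.startswith("(OCoLC)"):
                warnings.append(
                    f"035 $a '{value}' must be verified against the cited record"
                )
    return warnings
```

A check like this can only flag fields for review. It cannot tell whether an OCLC number matches the resource in hand, or whether the description is for the right edition, which is why a cataloger’s examination is still required.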
The ChatGPT record for the German edition of Pedagogy of the Oppressed, in addition to the erroneous 040 and 035 fields previously mentioned, also includes an 042 field indicating it is a PCC (Program for Cooperative Cataloging) record. The last time I checked, ChatGPT had not yet become a member of the BIBCO program, so incorrectly labeling an AI-generated record as such is highly problematic, as many libraries rely on the PCC coding to identify good-quality records that can be imported without editing or significant review.

As Brzustowicz indicates, catalogers interested in using ChatGPT to assist in cataloging work should do so with a critical eye. It could potentially be used as a starting point to generate a basic description but would require catalogers to closely examine the record to identify incorrect coding, check name headings against the authority file, and assess the suggested subject headings for accuracy and bias. In the examples given in Brzustowicz’s article, ChatGPT seems to improperly introduce coding and fields that it doesn’t understand into its records. The need to review this coding, and delete it from the record, would actually make the cataloging process less efficient. Part of the catalogers’ job in training ChatGPT might be to teach it which fields to include and which to ignore.

Brzustowicz’s Discussion section quickly reveals his misunderstanding of typical cataloging work. He states: “The ability to accurately create descriptive records using ChatGPT could significantly reduce the time and resources required for copy cataloging: this could free up library workers to focus on other important tasks…”. First, the implication here is again that cataloging takes too much time and is less important than other library functions like collection development and user services.
And Brzustowicz seems unaware that significant reduction of time and staffing for copy cataloging has already been happening for decades. Copy cataloging is defined as downloading an existing catalog record to a local catalog database, so why would we need ChatGPT to generate another (imperfect) record if one already exists in WorldCat or another source? Copy cataloging is generally not the part of cataloging that is time consuming, especially since many libraries now batch load large portions of their records and use shelf-ready services, so that the majority of materials arrive at the library ready to be checked in and placed on the shelf, with no item-level cataloging needed. Even for smaller institutions (such as Brzustowicz’s) that may not have the budget to employ some of these innovations, other low-cost methods exist (including using Z39.50) to acquire copy records accurately and efficiently. It’s the original and specialized cataloging that is time consuming. While ChatGPT has some potential to be used as a starting point in generating a description where no record exists, or to assist with the assignment of subject terms, it is just one of many tools that a cataloger might employ, and it still needs to learn a lot before it can be of substantial help in making the original cataloging process more efficient.

Assuming that this article was submitted for peer review, I suggest that the editors cast a more critical eye on articles where the author’s credentials do not seem to match his subject. Had this submission been reviewed by an experienced cataloger, the inaccuracies regarding contemporary cataloging workflows, and his interpretation of his results, could have been corrected prior to publication.

Sincerely,

Christine DeZelar-Tiedman
Cataloging Policies and Practices Librarian
University of Minnesota Libraries
dezel002@umn.edu