A quick one: Use Metafacture to create a list of records with cataloging errors
David Maus, 24.04.2018
Some records in our library catalog contain an error that causes trouble when we distribute catalog data to national aggregators like the Zentrales Verzeichnis Digitalisierter Drucke (ZVDD), the central access point to printed works from the 15th century up to today, digitized in Germany. The catalogers made a typo and didn't separate the name of the publisher and the place of publication.
Metafacture is a really helpful suite of tools when working with a comparatively large set of records in a somewhat unwieldy format. The following Metamorph transformation runs over a dump of our catalog and outputs a list that contains the record's primary key (Pica production number, PPN), an indicator of the resource type, and the erroneous publication statement.
If the subfield p of the field 033A matches the specified regular expression, and both the subfield 0 of field 003@ and subfield 0 of the field holding the resource type indicator are present, combine these three fields to an unnamed output entity. Because the latter two are always present, the combine acts as a filter and generates an output entity only if the error in field 033A is detected.
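To make the filtering behaviour of the combine easier to follow, here is a minimal Python sketch of the same logic. It is not Metamorph and not the transformation used on the catalog dump. Everything specific in it is an assumption for illustration: the regular expression (the real one is not reproduced here), the placeholder key TYPE.0 for the resource type field, the flat dictionary layout, and the two invented sample records.

```python
import re

# Assumption: placeholder pattern, not the expression used in the real morph.
# It flags a place of publication that still contains a " : " separator.
SUSPICIOUS_PLACE = re.compile(r"\s:\s")

PPN = "003@.0"            # subfield 0 of field 003@
RESOURCE_TYPE = "TYPE.0"  # hypothetical key for the resource type field
PLACE = "033A.p"          # subfield p of field 033A


def combine(record):
    """Return (ppn, type, place) only if all three parts are available.

    The place only counts as available when it matches the pattern, so
    the function doubles as a filter for erroneous records.
    """
    place = record.get(PLACE)
    if place is None or not SUSPICIOUS_PLACE.search(place):
        return None
    ppn = record.get(PPN)
    resource_type = record.get(RESOURCE_TYPE)
    if ppn is None or resource_type is None:
        return None
    return (ppn, resource_type, place)


# Invented sample data: one clean record, one with the typo.
records = [
    {"003@.0": "100000001", "TYPE.0": "Aa", "033A.p": "Leipzig", "033A.n": "Reclam"},
    {"003@.0": "100000002", "TYPE.0": "Aa", "033A.p": "Leipzig : Reclam"},
]

report = [row for row in map(combine, records) if row is not None]
for row in report:
    print("\t".join(row))
```

Only the second sample record ends up in the report, because the first one never delivers a matching place of publication.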
I run this morph with a simple flux pipeline.
Turns out that only 678 of approx. 1.2 million records, or 0.06%, are affected. This speaks volumes for the dedication of our staff and makes the problem manageable.