A quick one: Use Metafacture to create a list of records with cataloging errors
Some records in our library catalog contain an error that causes trouble when we distribute catalog data to national aggregators like the Zentrales Verzeichnis Digitalisierter Drucke (ZVDD), the central access point to printed works from the 15th century up to today, digitized in Germany. The catalogers made a typo and didn't separate the name of the publisher and the place of publication.
Metafacture is a really helpful suite of tools when working with a comparatively large set of records in a somewhat unwieldy format. The following Metamorph transformation runs over a dump of our catalog and outputs a list that contains the record's primary key (Pica production number, PPN), an indicator of the resource type, and the erroneous publication statement.
If the subfield p of the field 033A matches the specified regular expression, and both the subfield 0 of field 003@ and subfield 0 of field 002@ are present, combine these three fields to an unnamed output entity. Because 002@ and 003@ are always present, the combine acts as a filter and generates an output entity only if the erroneous 033A is detected.
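A minimal sketch of a morph along these lines might look as follows. It rests on several assumptions: the regular expression is only a placeholder that flags a place of publication containing " : " (the real pattern depends on how the typo shows up in the data), the value names ppn, type and statement are made up for illustration, and an empty name attribute is my reading of "unnamed".

```xml
<?xml version="1.0" encoding="UTF-8"?>
<metamorph xmlns="http://www.culturegraph.org/metamorph" version="1">
  <rules>
    <!-- Fires only once all three named values below have been received. -->
    <combine name="" value="${ppn} ${type} ${statement}">
      <!-- 003@ $0: the PPN, present in every record -->
      <data source="003@.0" name="ppn"/>
      <!-- 002@ $0: the resource type indicator, present in every record -->
      <data source="002@.0" name="type"/>
      <!-- 033A $p: passed on only if it matches the regular expression.
           The pattern is a hypothetical placeholder, not the one run on the catalog. -->
      <data source="033A.p" name="statement">
        <regexp match=".+ : .+"/>
      </data>
    </combine>
  </rules>
</metamorph>
```

Putting the regexp inside the data element for 033A.p is what makes the combine selective: a value that does not match is dropped, so the combine never collects all three inputs for that record.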
I run this morph with a simple flux pipeline.
Turns out that only 678 of approx. 1.2 million records, or about 0.06%, are affected. This speaks volumes for the dedication of our staff and makes the problem manageable.