David Maus

DIY Atom feed for the #PolonskyGerman project blog

In October last year I was given the opportunity to work with colleagues from Digital Bodleian on #PolonskyGerman, a project that seeks to open up the medieval German manuscript collections of the Herzog August Bibliothek Wolfenbüttel and the Bodleian Libraries for research and reuse. The project is doing fine, but there is one itch to scratch: the project's web page has a blog, but the blog has no Atom or RSS feed. How do I know when there is something new on the blog?

Easy answer: I DIY my own feed. A long time ago I wrote a little Ruby program that solved this problem in a generic fashion. This time I settled for a project-specific solution.

The setup is straightforward: I fetch the front page with curl, pipe it through TagSoup to turn the HTML into well-formed XML, and feed the result into an XProc pipeline. The pipeline runs an XSLT transformation and validates the output against the Atom specification. If all goes well, I get an Atom feed I can plug into my rss2email instance.
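To give a rough idea of what the transformation step does, here is a minimal sketch in Python. It is not the pipeline described above: it swaps curl, TagSoup, XProc and XSLT for the Python standard library, and it skips the validation step entirely. The markup it expects (post links inside `<h2>` elements), the names `PostLinkParser`, `build_atom_feed` and `SAMPLE_HTML`, the author name, and the example.org URLs are all assumptions made for illustration, not the blog's actual structure.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical front page: each post is a link inside an <h2>.
SAMPLE_HTML = """
<html><body>
<h2><a href="https://example.org/blog/second-post">Second post</a></h2>
<p>Teaser with <a href="https://example.org/other">an unrelated link</a>.</p>
<h2><a href="https://example.org/blog/first-post">First post</a></h2>
</body></html>
"""


class PostLinkParser(HTMLParser):
    """Collect (href, title) pairs for links found inside <h2> elements."""

    def __init__(self):
        super().__init__()
        self.posts = []
        self._in_h2 = False
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True
        elif tag == "a" and self._in_h2:
            # Links without an href are ignored (href stays None).
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.posts.append((self._href, "".join(self._text).strip()))
            self._href = None
        elif tag == "h2":
            self._in_h2 = False


def build_atom_feed(html, feed_id, feed_title, updated, author="Unknown"):
    """Return an Atom document (as a string) with one entry per post link.

    Assumes post links are absolute URLs, so they can double as entry ids.
    The same `updated` timestamp is used everywhere, because this sketch
    does not try to extract publication dates from the page.
    """
    parser = PostLinkParser()
    parser.feed(html)
    parser.close()

    def atom(name):
        return "{%s}%s" % (ATOM_NS, name)

    ET.register_namespace("", ATOM_NS)  # serialize Atom as the default namespace
    feed = ET.Element(atom("feed"))
    ET.SubElement(feed, atom("id")).text = feed_id
    ET.SubElement(feed, atom("title")).text = feed_title
    ET.SubElement(feed, atom("updated")).text = updated
    ET.SubElement(ET.SubElement(feed, atom("author")), atom("name")).text = author

    for href, title in parser.posts:
        entry = ET.SubElement(feed, atom("entry"))
        ET.SubElement(entry, atom("id")).text = href
        ET.SubElement(entry, atom("title")).text = title
        ET.SubElement(entry, atom("updated")).text = updated
        ET.SubElement(entry, atom("link"), {"href": href})

    return ET.tostring(feed, encoding="unicode")


if __name__ == "__main__":
    print(build_atom_feed(SAMPLE_HTML, "https://example.org/blog/",
                          "Example blog", "2020-01-01T00:00:00Z"))
```

Python's `html.parser` tolerates sloppy HTML, which is roughly the job TagSoup does in my setup. The sketch produces the elements Atom requires (`id`, `title`, `updated`, an `author`) but does not check the result against a schema, so a real setup would still want a validation step before handing the feed to a reader.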

Done.