CSEL is now on GitHub!


Authored and posted by Greta Franzini.

We’re really proud to announce that EpiDoc XML versions of the monumental Corpus Scriptorum Ecclesiasticorum Latinorum (CSEL) are now being added to the Open Greek and Latin Project’s GitHub repository! We are in the process of digitising the public domain volumes of CSEL; you can find the volumes with which we are beginning at http://www.roger-pearse.com/weblog/2009/10/24/list-of-csel-volumes-at-google-books/.

The Latin text was OCR-ed, corrected (to 99% accuracy) and encoded according to our specifications by the French data entry company Jouve. CSEL is the first in a line of texts Jouve is currently helping us digitise. Each XML file is available under a Creative Commons Attribution-ShareAlike 4.0 International License and contains a link to the Archive.org scan it was taken from.

An accuracy of 99% means that there are still plenty of data entry errors to be fixed. Similarly, our basic CTS-compliant EpiDoc markup is waiting to be further enriched. The raw text was annotated by operators with no knowledge of Latin or Greek, so a lot can (and should) be done to improve the XML.
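To give a rough sense of what 99% accuracy leaves behind, here is a minimal Python sketch. It assumes the figure is measured per character, which is a common convention but is not stated above, and the volume size is a made-up number for illustration only. The function name `expected_errors` is likewise hypothetical.

```python
def expected_errors(total_chars: int, accuracy: float) -> int:
    """Estimate how many characters are still wrong at a given accuracy.

    Assumption: accuracy is a per-character rate between 0 and 1.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return round(total_chars * (1.0 - accuracy))


# Hypothetical volume size, chosen only for illustration.
chars_in_volume = 1_000_000
print(expected_errors(chars_in_volume, 0.99))
```

Under these assumptions, a volume of a million characters would still contain on the order of ten thousand characters in need of correction.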

So come and help us out! Feel free to download, modify, improve and share this work with friends and colleagues. The more, the merrier!

 

 

 


6 Comments

  1. I cannot download or view the CSEL volumes, even though I have downloaded the GitHub software.

    • To download: navigate to a CSEL volume in GitHub, click on it and click ‘View Raw’. Once the raw text opens up, right click and press ‘Save as’.
      To view: do you want to just view them or view them AND resubmit to GitHub?

    • To download: navigate to and click on the volume you’re interested in, click on ‘View Raw’, then right click and ‘Save as’. The file should now be on your Desktop or in your Downloads folder.
      Which viewer are you using? Are you interested in simply viewing/reading the text or annotating and resubmitting it to GitHub?

  2. What kind of software should I use to work with (view and maybe edit) these files? I’m unfortunately not familiar with EpiDoc XML but would be interested in contributing to this fundamental project.

    • Lukas, there are many XML viewers and editors you can use, both free and commercial, depending on your operating system. The most popular tool in the (Digital Humanities) community is oXygen, as it supports TEI and EpiDoc XML. oXygen, however, is not free; your academic institution, if you belong to one, should be able to provide you with a discounted version.
      EpiDoc developers organise annual workshops to introduce members of the community to this tagset. The next workshop will be in London in April 2015, if you happen to be there. Alternatively, you could teach yourself by reading through the official EpiDoc documentation: http://sourceforge.net/p/epidoc/wiki/Home/ If you’re already familiar with XML, it will only be a matter of understanding the source material. Hope this helps.

      • Thank you Greta, I’ll look for the software, and I have already started reading some of the documentation on the EpiDoc website. Unfortunately, from Brazil I’m not able to attend the workshop in London. Still, it looks simple enough for basic use. And it seems that there is a lot of exchange on the web about these topics, which may help me further. Thank you again.

