Summary authored and posted by Greta Franzini
Bruce Robertson and Federico Boschetti, from Mount Allison University and CNRS Pisa respectively, walked us through their work on OCR of Ancient Greek text.
Bruce described some of the issues OCR engines face when processing polytonic Greek (e.g. line segmentation), the image requirements for optimal OCR, spellchecking, and his joint effort with Federico to combine results from different OCR engines in order to obtain the best possible output. Federico elaborated on the different methodologies for correcting and editing OCR output, including crowdsourcing and data entry contracts. His introduction served to contextualise the OCR proofreading tool (a web application) he has developed to help flag up errors and suggest corrections.
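
To give a rough idea of what combining the output of several OCR engines can look like, here is a minimal Python sketch. It is not the method presented in the talk: it simply assumes that each engine returns the same line already split into the same number of words, lets the engines vote word by word, and prefers readings found in a word list. The function names `combine_ocr_tokens` and `combine_ocr_lines`, the sample readings and the tiny lexicon are all hypothetical and chosen for illustration only.

```python
import unicodedata
from collections import Counter


def nfc(text):
    # Polytonic Greek can be encoded with precomposed or combining
    # characters; normalising to NFC makes equal-looking words compare equal.
    return unicodedata.normalize("NFC", text)


def combine_ocr_tokens(candidates, lexicon):
    """Pick one reading from several engines' readings of the same word.

    Assumption (not from the talk): lexicon readings are preferred,
    then the majority reading wins, and ties go to the earliest engine.
    """
    counts = Counter(candidates)
    known = [token for token in counts if token in lexicon]
    pool = known or list(counts)
    return max(pool, key=lambda t: (counts[t], -candidates.index(t)))


def combine_ocr_lines(engine_outputs, lexicon):
    """Combine one line of text as read by several OCR engines."""
    lexicon = {nfc(word) for word in lexicon}
    rows = [nfc(line).split() for line in engine_outputs]
    if len({len(row) for row in rows}) != 1:
        # Real engine output would need a proper alignment step first.
        raise ValueError("engine outputs must contain the same number of words")
    return " ".join(combine_ocr_tokens(list(col), lexicon) for col in zip(*rows))


# Hypothetical readings of the same line from three engines.
readings = ["μῆνιν ἄειδε θεά", "μῆνιν ἄειδε θεὰ", "μηνιν ἄειδε θεά"]
word_list = {"μῆνιν", "ἄειδε", "θεά"}
combined = combine_ocr_lines(readings, word_list)
print(combined)
```

In practice the engines rarely agree on word boundaries, so an alignment step would have to come before any voting; the sketch leaves that out to keep the idea visible.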