Florilegia: Big Textual Data Workshop, July 10-11, 2017
Preliminary Programme
Monday, July 10 – Raum 402
Paulinum, Augustusplatz 10, 4th Floor
09:00-11:30 Kick-Off Talks
09:00-09:20 Thomas Koentges and Gregory R. Crane (Universität Leipzig and Tufts University): Welcome
09:30-10:30 Introduction and initial Discussion
10:30-11:30 David Smith: Exploiting Relational Structure in Large Text Corpora
11:30-12:00 Coffee Break
12:00-14:30 How-Tos
12:00-12:45 Benjamin Kiessling: OCR of Different Languages
12:45-13:30 Alicia Gonzalez: Pushing Annotations of Different Languages to Annis
13:30-14:30 Lunch
14:30-16:30 Deep Learning and Topic Modelling
14:30-15:30 Oliver Hellwig: A Deep Learning approach to Tokenization of Sanskrit Texts
15:30-16:30 Paul Dilley (and Thomas Koentges): Iowa Corpus and Topic Modelling
16:30-17:00 Coffee Break
17:00-18:00 End-of-Day Discussion
Tuesday, July 11 – Raum 402 (Paulinum, Augustusplatz 10, 4th Floor)
09:00-11:00 Corpus Infrastructure and Resources
09:00-10:00 Thomas Koentges: Let’s Talk About .cex
10:00-10:30 Frederik Baumgardt: Perseid’s Plokamos: Of changing Corpora and Annotations
10:30-11:00 Patrick J. Burns: External Resources for Corpus Approaches
11:00-11:30 Coffee Break
11:30-13:30 Corpus Building and Presentation
11:30-12:00 Cliff Wulfman: Blue Mountain
12:00-12:30 Matt Munson: CapiTainS, the CHS, and First1KGreek
12:30-13:00 Neven Jovanović: Croatiae Auctores Latini (CroaLa) – a Neo-Latin Corpus for Fun and Profit
13:00-13:30 Tyler Neill: Sanskrit Text Corpora and the Nyāyabhāṣya Digital Critical Edition
13:30-14:30 Lunch
14:30-16:30 Big Textual Data and Text Reuse
14:30-15:30 Donald Sturgeon: Text Tools for ctext.org
15:30-16:30 Paul Vierthaler: Working with Imperial Chinese Corpora: Studying Document Similarity and Text Reuse
16:30-17:00 Coffee Break
17:00-18:00 Final Discussion and Future Plans