Linguistic Annotation and Philology Workshop
July 6-7, 2017

P 702 (7th floor), Paulinum, main building
Universität Leipzig
Augustusplatz 10-11 – 04109 Leipzig

 

The Linguistic Annotation and Philology Workshop aims to bring together experts to assess the state of the art in linguistic annotation for historical languages and to promote the sharing of methods, resources, and best practices for building linguistically annotated corpora. Linguistic annotation can be broadly defined as any metadata that make the grammar of a text explicit and therefore easily queryable in a digital environment. Linguistic annotation layers typically include morphology, syntax, and semantics (the last broadly construed to include pragmatics). Contributions dealing with any aspect of these layers are welcome, including (but not limited to):

  • Annotation schemes, especially with reference to linguistic typology, Indo-European studies, and linguistics more generally
  • Annotation markup and query languages
  • Unicode encoding and its interface with philology
  • Tokenization, POS tagging, lemmatization, and parsing
  • Semantic annotation
  • Universal Dependencies compliance

While many advanced resources have already been developed for modern languages, linguistic annotation for most historical languages is still in its infancy. Discussion and cooperation are therefore needed to address theoretical questions, identify existing resources, and understand how to advance them and make them compliant with the best practices and standards available in both (Digital) Humanities and (computational) Linguistics.

 

July 6 – Room 702

09:00-09:30
Giuseppe G. A. Celano & Gregory R. Crane, Universität Leipzig (Leipzig University) & Tufts University
Introduction

09:30-10:00
Carlotta Viti, Universität Zürich (University of Zurich)
Linguistic annotation for a contrastive syntax of Ancient Greek and Biblical Hebrew

10:00-10:30
Camilla Di Biase Dyson & Simon Schweitzer, Georg-August-Universität Göttingen (University of Göttingen) & Berlin-Brandenburgische Akademie der Wissenschaften (BBAW)
Stand-off Annotation to Ancient Egyptian text corpora based on BTS

10:30-11:00
Coffee break

11:00-11:30
Jonathan Robie, biblicalhumanities.org
XML, Treedown, CSS, and XQuery: Markup and Markdown for creating, visualizing and querying syntax trees

11:30-12:00
Marco Passarotti, Università Cattolica di Milano (Catholic University of Milan)
A Practical introduction to resources and tools for Latin at the CIRCSE Research Centre

12:00-13:30
Lunch

13:30-14:00
Dirk Roorda, Koninklijke Nederlandse Akademie van Wetenschappen (Royal Netherlands Academy of Arts and Sciences)
Programming theologians. What they want, do and need

14:00-14:30
Lydia Müller, Universität Leipzig (Leipzig University)
WebAnno and ASV Toolbox: language independent NLP tools

14:30-15:00
Stefan Schnell, University of Melbourne
GRAID and RefIND: Corpus annotation for cross-linguistic research at the discourse-grammar interface

15:00-15:30
Coffee break

15:30-16:00
Timo Korkiakangas, Universitetet i Oslo (University of Oslo)
Quantifying spelling variation: Scribes’ command of Early Medieval documentary Latin

16:00-16:30
Petr Zemánek, Univerzita Karlova (Charles University)
Complex linguistic corpus or a virtual collection: Digital representation of a collection of Assyrian cuneiform tablets

16:30-17:30
Discussion

July 7 – Room 702

09:00-09:30
Anke Lüdeling, Carolin Odebrecht, Laura Perlitz, Gohar Schnelle, & Zarah Weiß, Freie Universität Berlin (Free University of Berlin)
A Digital infrastructure to support the study of historical German: The RIDGES Herbology Corpus

09:30-10:00
John Lee, 香港城市大學 (City University of Hong Kong)
Syntactic patterns in classical Chinese poems

10:00-10:30
Martin Haspelmath, Max-Planck-Institut für Menschheitsgeschichte, Jena (MPI-SHH) & Universität Leipzig (Leipzig University)
Comparative concepts in cross-linguistic grammatical databases and in glossing

10:30-11:00
Coffee break

11:00-11:30
Mattis List, Max-Planck-Institut für Menschheitsgeschichte, Jena (MPI-SHH)
Annotation and analysis of cross-linguistic lexical data in historical linguistics: Towards the establishment of standards and best practices

11:30-12:00
Dan Zeman, Univerzita Karlova (Charles University)
Universality in space and time – Modern treebanking for ancient languages

12:00-13:30
Lunch

13:30-14:00
Volker Gast, Friedrich-Schiller-Universität Jena (Friedrich Schiller University Jena)
From temporal annotations to temporal structure: Some explorations

14:00-14:30
Justin Cale Johnson, Universiteit Leiden (Leiden University)
Annotating the Babylonian Medical Corpus: Progress and prospects

14:30-15:00
Francesco Mambrini, Deutsches Archäologisches Institut (DAI) & Universität Leipzig (Leipzig University)
Trees and Idiolects. Treebank annotation and the study of direct speeches in Ancient Greek literature

15:00-15:30
Coffee break

15:30-16:00
Anton Karl Ingason, Háskóli Íslands (University of Iceland)
Annotating and querying the Icelandic Parsed Historical Corpus and closely related cross-linguistic counterparts

16:00-16:30
Neven Jovanović, Sveučilište u Zagrebu (University of Zagreb)
From annotation to learners’ corpora

16:30-17:30
Final discussion