SP03: Potentials of Language Documentation: Methods, Analyses, and Utilization

ISBN 978-0-9856211-0-0


Frank Seifart, Geoffrey Haig, Nikolaus P. Himmelmann, Dagmar Jung, Anna Margetts, Paul Trilsbeek


In the past 10 or so years, intensive documentation activities, i.e. compilations of large multimedia corpora of spoken endangered languages, have contributed to the documentation of important linguistic and cultural aspects of dozens of languages. As laid out in Himmelmann (1998), language documentations include as their central components a collection of spoken texts from a variety of genres, recorded on video and/or audio, with time-aligned annotations consisting of transcription, translation, and, for some data, morphological segmentation and glossing. Text collections are often complemented by elicited data, e.g. word lists, and structural descriptions such as a grammar sketch. All data are provided with metadata, which serve as cataloguing devices that make the data accessible in online archives.


Front matter

Table of contents



1. The threefold potential of language documentation
Frank Seifart, pp. 1-6

Part 1. Methods

2. Prospects for e-grammars and endangered languages corpora
Sebastian Drude, pp. 7-16

3. Language-specific encoding in endangered language corpora
Jost Gippert, pp. 17-24

4. Unsupervised morphological analysis of small corpora: First experiments with Kilivila
Amit Kirschenbaum, Peter Wittenburg, and Gerhard Heyer, pp. 25-31

5. A corpus linguistics perspective on language documentation, data, and the challenge of small corpora
Anke Lüdeling, pp. 32-38

6. Supporting linguistic research using generic automatic audio/video analysis
Oliver Schreer and Daniel Schneider, pp. 39-45

Part 2. Analyses

7. Bilingual multimodality in language documentation data
Marianne Gullberg, pp. 46-53

8. Tours of the past through the present of eastern Indonesia
Marian Klamer, pp. 54-63

9. Data from language documentations in research on referential hierarchies
Stefan Schnell, pp. 64-72

10. Information structure, variation and the Referential Hierarchy
Jane Simpson, pp. 73-82

11. How to measure frequency? Different ways of counting ergatives in Chintang (Tibeto-Burman, Nepal) and their implications
Sabine Stoll and Balthasar Bickel, pp. 83-89

12. On the sociolinguistic typology of linguistic complexity loss
Peter Trudgill, pp. 90-95

Part 3. Utilization

13. Visualization and online presentation of linguistic data
Hans-Jörg Bibiko, pp. 96-104

14. Language archives: They’re not just for linguists any more
Gary Holton, pp. 105-110

15. Creating educational materials in language documentation projects – creating innovative resources for linguistic research
Ulrike Mosel, pp. 111-117

16. From language documentation to language planning: Not necessarily a direct route
Julia Sallabank, pp. 118-125

17. Online presentation and accessibility of endangered languages data: The General Portal to the DoBeS Archive
Gabriele Schwiertz, pp. 126-128

18. Using language documentation data in a broader context
Nick Thieberger, pp. 129-134