Special Publications

LD&C publishes occasional Special Publications, usually on specific themes. The contents of each are listed below via RSS feed.

SP24: Phonetic fieldwork in southern New Guinea.
SP22: Reflexiones teóricas en torno a la función del trabajo de campo en lingüística-antropológica: Contribuciones de investigadores indígenas del sur de México.
SP21: Interdisciplinary Approaches to Language Documentation
SP20: Collaborative Approaches to the Challenges of Language Documentation and Conservation
SP19: Documentation and Maintenance of Contact Languages from South Asia to East Asia
SP18: Archival Returns: Central Australia and Beyond
SP17: Language and Toponymy in Alaska and Beyond
SP16: Methodological Tools for Linguistic Description and Typology
SP15: Reflections on Language Documentation on the 20 year anniversary of Himmelmann 1998

SP14: A Grammar of Shilluk
SP13: Documenting Variation in Endangered Languages
SP12: The Social Cognition Parallax Interview Corpus (SCOPIC)
SP11: Mutsun-English English-Mutsun Dictionary, mutsun-inkiS inkiS-mutsun riica pappel
SP10: African language documentation: new data, methods and approaches
SP09: Language Documentation and Conservation in Europe
SP08: The Art and Practice of Grammar Writing
SP07: Language Endangerment and Preservation in South Asia
SP06: Microphone in the mud
SP05: Melanesian languages on the edge of Asia: Challenges for the 21st Century
SP04: Electronic Grammaticography
SP03: Potentials of Language Documentation: Methods, Analyses, and Utilization
SP02: Fieldwork and Linguistic Analysis in Indigenous Languages of the Americas
SP01: Documenting and Revitalizing Austronesian Languages


SP24: Phonetic fieldwork in southern New Guinea.
  • 1. The phonetics of Southern New Guinea languages: an overview
    Abstract: This article provides an overview of the phonologies of Southern New Guinea languages, based on the six languages in this special issue plus two others for which JIPA illustrations have recently been published – Yelmek (Yelmek-Maklew family), Ngkolmpu, Nmbo and Nen (Yam family), Idi and Ende (Pahoturi River), Bitur (Marind-Anim branch of Trans-New Guinea) and Urama (Kiwaian branch of Trans-New Guinea). It surveys overall inventory sizes (maximal 28 consonants and 8 vowels, in Nmbo, minimal 13 consonants plus 5 vowels in Urama), and the most important segment types characteristic of the region, including retroflexion in Idi and Ende, labial-velar stops (Nen, Nmbo), rounded stops (Nmbo), relatively large liquid inventories (Pahoturi River) and prenasalised stop phonemes (Ngkolmpu, Nen, Nmbo).
  • Introduction: Phonetic fieldwork in southern New Guinea
  • 3. Phonetics and Phonology of Ngkolmpu
    Abstract: This paper describes the phonetics and phonology of segments in Ngkolmpu, a language spoken in the Merauke region of Indonesian Papua. The language is a member of the Tonda-Kanum branch of the Yam family and displays a fairly typical segmental inventory for a Yam language with some notable exceptions. There are sixteen phonemic consonantal segments. As commonly found in Papuan languages, the primary manner distinction of stops is between voiceless oral stops and prenasalised stops. Rather unusually, both the plain oral stops and the prenasalised stops are voiceless for the oral period of the articulation. There are seven phonemic vowels and one epenthetic vowel whose distribution is phonotactically determined.
  • 2. A phonetic description of Yelmek
    Abstract: This paper provides a first description of the phonetics and phonology of a language from the Yelmek-Maklew family, a language family without a genealogical link to any other language family in New Guinea or elsewhere. The variety under consideration in this paper is used by people from the village of Wanam, located in the Papuan Province on the Indonesian side of New Guinea. Wanam is the northernmost of the four villages attributed to the Yelmek branch of the family (ISO 639-3: jel, glottocode: yelm1242). The variety in question has 13 consonant phonemes and 7 vowel phonemes. The vowel inventory includes a phonemic schwa, which is distinct from the epenthetic schwa that is used to split illicit consonant clusters. Noteworthy suprasegmental features include the absence of word-level stress and the fact that interrogative and declarative utterances have the same basic pitch contour.
  • 5. Phonetics and phonology of Idi
    Abstract: This paper provides a first description of the phonetics and phonology of Idi (Pahoturi River; ISO 639-3: idi, glottocode: idii1243) as spoken by about 1,000 people in the villages of Dimsisi and Sibidiri, located in the Morehead District of Western Province, Papua New Guinea. Idi has a fairly large inventory of 21 consonant phonemes and 8 vowel phonemes. As with other languages spoken in the region, the two central vowels show a hybrid status and could be analysed as sometimes phonemic and sometimes epenthetic. Other noteworthy characteristics are the presence of vowel harmony, voiced and voiceless retroflex plosives/affricates, nasality as a “floating” feature, and coarticulated labial-velar plosives, although the latter most likely originated as loan phonemes from Nen.
  • 4. The phonetics of Nmbo (Nɐmbo) with some comments on its phonology (Yam family; Morehead district)
    Abstract: This paper presents aspects of the phonetics and phonology of the Nmbo language as spoken by the Kerake tribe peoples of southern Western Province, Papua New Guinea. The paper is primarily concerned with the phonetics of consonants and vowels, but also presents description and audio examples of stress and clausal intonation patterns.
  • 6. The phonetics of Bitur
    Abstract: This paper offers a description of the phonetics of Bitur, a language spoken by less than a thousand people in Western Province, Papua New Guinea. With just thirteen consonants and five vowels, the phoneme inventory of Bitur is fairly typical of a Papuan language and yet relatively small in its more immediate geographic and genealogical contexts. The consonants of Bitur represent five manners of articulation and span four places of articulation. Prenasalized stops are noticeably absent, despite their prevalence in the region and among related languages. The low central vowel /a/ assimilates in height to nearby mid and high vowels, and it provides a means to distinguish high vowels from approximants. The Bitur syllable consists minimally of a vowel nucleus with simple onsets and codas allowed. Vowel length is not contrastive, but it seems to be the most salient prosodic feature of the Bitur word. As the first substantial phonetic description of a Lower Fly language—the least-known language group in Southern New Guinea—this paper represents an important contribution to our understanding of Papuan languages.
  • 7. A phonetic sketch of Urama
    Abstract: This paper provides a phonetic sketch of Urama (Glottocode: uram1241), one of the varieties of the Northeast Kiwai group (iso code: kiw). Urama’s consonant and vowel inventories, with 12 and 5 members respectively, are characteristic of Papuan languages generally. Vowel length is contrastive, but may be in the process of being lost. Urama exhibits a pitch accent system, but only a few words are found in which tone alone distinguishes meaning.
  • SP24 Front Matter
  • SP24 Full volume
  • SP24 Cover

SP22: Reflexiones teóricas en torno a la función del trabajo de campo en lingüística-antropológica: Contribuciones de investigadores indígenas del sur de México.

  • Autores
  • Prólogo
  • Introducción
  • Sk’an jtsatsubtastik ko’ontontik: Diálogos, retos y complejidades de ser una investigadora tsotsil
    Abstract: El artículo reflexiona acerca de las experiencias de una investigadora de origen tsotsil formada en los campos de la lingüística y la lingüística antropológica, quien habita en el mismo espacio territorial de la comunidad de estudio. Específicamente, examina cuáles son las implicaciones y los retos de ser mujer, tsotsil e investigadora, que, por un lado, se adentra en los espacios sociales y comunitarios que son exclusivos de los hombres, y por otro, incursiona en un campo que había sido privilegio de investigadores externos. La autora parte de una epistemología que propone el estudio del “nosotros” en contraste con las investigaciones establecidas en el estudio de los “otros”. Aporta una serie de metodologías para la investigación “desde dentro” que parten de la necesidad del conocimiento de los recursos culturales, el diálogo desde el mismo código lingüístico, la ecología de formulación de preguntas, las prácticas de reciprocidad, la anteposición de los intereses colectivos, la empatía, así como la conciencia de vivir en el escrutinio comunitario. Finalmente, propone un acercamiento a las comunidades de estudio de forma humanamente significativa y no sólo metodológicamente correcta.; The article reflects on the experiences of a researcher of Tsotsil origin trained in the fields of linguistics and anthropological linguistics, who lives in the same territorial space of the study community. Specifically, it examines the implications and challenges of being a woman, a Tsotsil and a researcher, who, on the one hand, delves into the social and community spaces that are exclusive to men, and on the other, enters a field in which external researchers have been privileged. The author starts from an epistemology that proposes the study of “we” in contrast to the research established in the study of “others”. It provides a series of methodologies for research “from within” that start from the need for knowledge of cultural resources, dialogue from the same linguistic code, the ecology of formulating questions, the practices of reciprocity, the preponderance of collective interests, empathy, as well as the awareness of living in community scrutiny. Finally, it proposes an approach to the study communities that is fundamentally humane, and not simply methodologically correct.
  • Entre propios y extraños: Cuando una investigadora indígena realiza estudios en su propia comunidad
    Abstract: Este trabajo presenta las experiencias de una investigadora indígena que realiza estudios lingüísticos y etnográficos en su propia comunidad. Actualmente hay un creciente número de académicos integrantes de comunidades indígenas que incursionan en la documentación, descripción y promoción de sus lenguas de origen en un campo que fue creado por y para miembros de instituciones académicas históricamente alejadas del trabajo colaborativo con los hablantes de lenguas indígenas. El artículo describe algunos de los aspectos que la pertenencia cultural y comunitaria de la autora le permiten profundizar en la investigación lingüística local, así como sus limitantes y dificultades. Responde de esta forma a la necesidad de que haya materiales y literatura que hable de la complejidad de experiencias de las investigadoras desde los diversos papeles que juegan, como mujeres, como parte de familias complejas e intergeneracionales, como integrantes de una comunidad y como miembros de instituciones educativas.; This work presents the experiences of an indigenous researcher who carries out linguistic and ethnographic studies in her own community. There is a growing number of scholars who are members of indigenous communities and who venture into the documentation, description and promotion of their languages of origin, a field that was created by and for members of academic institutions historically distant from collaborative work with speakers of indigenous languages. The author’s belonging to the community and thus the culture allows her to have a profound insight into local linguistic research, as well as its limitations and difficulties. There is a need for materials and literature that address the complexities of native researchers’ experiences. These complexities include the different roles they play, as women, as part of complex and intergenerational families, as community members and as members of educational institutions.
  • Activismo e investigación para la promoción de la lectoescritura del chatino: Experiencias y reflexiones de trabajo de campo
    Abstract: En este artículo exploro mis experiencias como investigadora local durante el trabajo de campo. Discuto las emociones personales relacionadas a la discriminación lingüística, la internalización de las actitudes negativas que los hablantes tienen sobre su propia lengua. Asimismo, discuto las relaciones de poder local en el tejido comunitario al cual mi familia pertenece y la violencia que experimento como mujer. Como investigadora local expongo las realidades y adversidades a las que me enfrento haciendo investigaciones y trabajando en proyectos colaborativos en mi propia comunidad con los hablantes, educadores y autoridades municipales. A pesar de las dificultades y limitaciones en mi rol como investigadora y promotora de la lectoescritura, expongo actividades que promueven la lectoescritura y la revitalización de la lengua chatina.; In this article, as an inside researcher I explore my experiences during fieldwork. I discuss personal emotions related to linguistic discrimination and internalization of speakers’ negative attitudes toward their language. Also, I discuss the local power relations in the community social network to which my family belongs, and the violence I experience as a woman. As a local researcher I unveil the realities and adversities that I face doing research and working on collaborative projects in my own community with speakers, educators, and local government. Despite drawbacks and limitations in my role as researcher and literacy advocate in the community, I showcase ongoing activities that foster literacy and revitalization of the Chatino language.
  • “¿Y ganas algo de esto?”. La experiencia de trabajo de campo en la comunidad de origen: de la reflexión a la sanidad emocional.
    Abstract: En este ensayo comparto mi experiencia de trabajo de campo como investigadora zapoteca formada dentro de marcos y metodologías de la investigación antropológica y lingüística; sugiero que se experimenta de manera distinta a como lo hacen los colegas de origen no indígena. Expongo el entramado de relaciones éticas, políticas y de polarización interna que atraviesan mis proyectos académicos y mi vida personal.; In this essay I share my fieldwork experience as a Zapotec researcher trained in anthropological and linguistic methodologies. I suggest fieldwork is experienced differently by colleagues of non-indigenous origin based on ethical, political and internal relationships that cross my academic projects and my personal life.
  • Entre la academia y la comunidad: La diabla alegre que baila en la fiesta y muestra su lengua
    Abstract: Se exploran aquí las complejidades y desafíos que surgen desde mi experiencia como investigadora indígena. Mi labor como lingüista y antropóloga se ha ubicado en instituciones académicas y comunitarias. Durante las últimas dos décadas, una buena parte de mi investigación se ha enfocado en la documentación y revitalización de las lenguas indígenas, en particular de las lenguas chatinas. Este artículo aborda mi propia experiencia como investigadora indígena que navega en dos espacios con el objetivo de compartir la realidad de mi posición en la academia y en los pueblos chatinos.; Explored here are the complexities and challenges that arise from my experience as an indigenous researcher. As a linguist and anthropologist I move between both academic and community spaces. During the last two decades, a good part of my research has focused on the documentation and revitalization of indigenous languages, in particular the Chatino languages. This article addresses my experience as an indigenous researcher navigating these two spaces with the aim of sharing the reality of my position with academia and Chatino communities.
  • Los principios éticos de las metodologías en el trabajo de campo lingüístico según quién
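    Abstract: Este artículo busca establecer un diálogo entre las propuestas metodológicas que se han planteado para el trabajo de campo lingüístico y las crecientes experiencias de lingüistas indígenas. Es bien sabido que la teorización de las metodologías que rigen el actuar de los lingüistas en las comunidades de estudio se lleva a cabo desde una perspectiva foránea tanto de la lengua como de la comunidad. Dichas metodologías están diseñadas y orientadas por y para académicos no indígenas que son, en su mayoría, académicos provenientes de un país diferente al que pertenece la lengua y sus hablantes. Aquí se expone que los desafíos a los que nos enfrentamos los lingüistas locales y semi-locales no son los mismos a los que se enfrentan los lingüistas foráneos. De este modo, se contribuye a repensar la universalidad de los principios éticos metodológicos de comportamiento en el trabajo de campo en la lingüística contemporánea y se promueve una perspectiva del lingüista local indígena que implica la descolonización de las metodologías del trabajo de campo diseñadas por extranjeros y adoptadas de manera acrítica por lingüistas locales y semi-locales.; This article seeks to establish a dialogue between the methodological proposals that have been put forward for linguistic fieldwork and the growing experiences of indigenous linguists. It is well known that the theorization of the methodologies that govern the conduct of linguists in the communities they study is carried out from a perspective foreign to both the language and the community. These methodologies are designed and oriented by and for non-indigenous academics, most of whom come from a country other than that of the language and its speakers. The article argues that the challenges we face as local and semi-local linguists are not the same as those faced by outside linguists. It thereby contributes to rethinking the universality of the ethical and methodological principles of conduct in fieldwork in contemporary linguistics, and promotes a perspective of the local indigenous linguist that entails the decolonization of fieldwork methodologies designed by foreigners and adopted uncritically by local and semi-local linguists.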
  • Portada
  • Portadilla
  • Volumen completo

SP21: Interdisciplinary Approaches to Language Documentation

  • Introduction: Interdisciplinary Research in Language Documentation
  • Child language documentation: A pilot project in Papua New Guinea
    Abstract: The central aim of language documentation is to comprehensively document the characteristic speech practices of a community. Such practices necessarily also include child language and child-directed speech, and yet there are only very few documentation projects that focus on language from and with children. This paper argues for studying first language acquisition and socialization within a language documentation context, focusing on the types of data needed for such a study and drawing on the insights from a pilot project among the Qaqet of Papua New Guinea. The aim of this pilot project was to investigate the feasibility of a comprehensive child language documentation project, and this paper discusses the central challenges to such an endeavour and shows how they were addressed in the project.
  • Interdisciplinarity in areal documentation: Experiences from Lower Fungom, Cameroon
    Abstract: The Lower Fungom region of Northwest Cameroon is noteworthy for its exceptional linguistic diversity: Seven languages, or small language clusters, are spoken in its thirteen recognized villages. This situation prompts consideration of not only standard documentary concerns, such as how to collect sufficient information to grammatically describe each of the region's languages, but also raises the question: What factors have allowed Lower Fungom to develop and maintain its extreme linguistic diversity? Answering this question would not only be of relevance to linguistic scholarship but also has potential applications for addressing language endangerment in other parts of the world to the extent that the maintenance of linguistic diversity in Lower Fungom provides an obvious counterexample to dominant worldwide trends. This paper considers the ways in which the standard documentary toolkit has been augmented by an interdisciplinary approach to studying the region, allowing for the creation of a documentary record which covers both the synchronic features of the target languages and offers sufficient ethnographic and historical context to allow us to begin to understand what has allowed it to maintain its surprising level of diversity. In addition to outlining key results of this interdisciplinary research, concrete recommendations are provided for linguists interested in engaging in similar kinds of work.
  • Domain-driven documentation: The case of landscape
    Abstract: It is becoming increasingly evident that the field of language documentation and the documentary multimedia resources it produces rely on expanding their relevance and usability to disciplines beyond linguistics in order to increase their chances of being sustainable in the long term. This paper argues that more attention should be paid to the needs and interests of such disciplines in language documentation schemes. One way of doing so is to set out from fundamental domains of human experience in designing documentation programs, domains which are of immediate concern to disciplines such as geography, biology, history, anthropology, and so on. Particular focus is placed on the domain of landscape, explored in two documentation programs coordinated by the author. In addition to providing clear interdisciplinary arenas of inquiry, such domain-driven approaches also offer excellent opportunities for efficient collection and construction of the comprehensive records of linguistic practices stipulated by current documentation initiatives.
  • Endangered Language Documentation: The challenges of interdisciplinary research in ethnobiology
    Abstract: In 2004, three national institutes jointly published Facilitating interdisciplinary research, a report that set standards for evaluating the interdisciplinarity of cross-disciplinary collaborations. Although endangered language documentation (ELD) projects often assemble multidisciplinary teams, the 2004 criteria, today followed by the NSF, create such a high bar for interdisciplinarity that it is probably better to evaluate the cross-disciplinary impact of ELD projects through a different criterion: that of service vs. science. According to this perspective, the cross-disciplinary goal of ELD projects should be to decrease reliance on outside provisioning of services while increasing their contribution to the research goals of external disciplines. This article first suggests that ELD projects should actively promote and evaluate the use of project results across disciplines, beginning with greater attention to the archiving process and issues of discoverability and transparency of data. It then explores the potential for the cross-disciplinary impact of ELD ethnobiological research, which has often simply asked taxonomists to identify collected material to species, a service that only marginally benefits biological research agendas. To promote scientific collaboration across disciplines, ELD ethnobiological projects are best designed if they contribute methodologically, substantially, and theoretically to biological research. This article concludes with a description of such an effort.
  • Front Matter
  • Cover
  • Whole Volume

SP20: Collaborative Approaches to the Challenges of Language Documentation and Conservation

  • SP20 Whole Volume
  • SP20 Front Matter
  • SP20 Cover
  • A language vitality survey of Macuxi, Wapichana, and English in Serra da Lua, Roraima (Brazil)
    Abstract: Serra da Lua is a multilingual region in the state of Roraima (Brazil) where Macuxi (Carib), Wapichana (Arawak), Brazilian Portuguese and Guyanese English are all spoken. Based on a self-reported language survey we present an assessment of the vitality of the languages spoken in this region and the attitudes of the speakers towards these languages. While previous literature has reported the existence of English speakers in this region, the literature does not provide more details about domains of use and the attitudes towards the English language in contrast with Portuguese and the Indigenous languages. This paper helps to address this gap. In sum, the goals of this paper are twofold: first, in light of the results of the survey, to discuss the vitality of the Macuxi and Wapichana languages in the Serra da Lua communities according to the criteria set out by UNESCO’s “Nine Factors” for assessing language vitality; and second, to provide insight about the use of English in this region.
  • Keeping Haida alive through film and drama
    Abstract: The Haida language, of the northwest coast of Canada and Southern Alaska, has been endangered for most of the 20th century. Historically, orthography has been a difficult issue for anyone studying the language, since no standardized orthography existed. In spite of the orthographical issues, current efforts in Canada at revitalizing Haida language and culture have culminated in the theatrical production of Sinxii’gangu, a traditional Haida story dramatized and performed completely in Haida. The most recent effort is Edge of the Knife, a film about a Haida man transforming into a gaagiid (wild man) as a result of losing a child. The story line addresses his restoration back into the community, and as a result, affords not just a resource for two Haida dialects, but also for history and culture. With regards to language, actors participated in two weeks of immersion to prepare and struggled through issues with Haida pronunciation during filming. Using the Haida language exclusively, not just in oral narratives (though there are some in the drama and the film) but in actual dialogue, provides learners with great context for developing strategies for pronunciation and conversation rather than only learning and hearing lexical items and short phrases. Capturing the storyline on film not only supports efforts at revitalization, but provides tangible documentation of both Canadian dialects of the Haida language.
  • The Online Terminology Forum for East Cree and Innu: A collaborative approach to multi-format terminology development
    Abstract: For Indigenous languages to thrive, it is essential for speakers to be able to talk about their present reality in relevant and meaningful ways. In this paper, we report on our work in terminology development through workshops and the creation and use of modern digital tools including online dictionaries and terminology forums, and by working with speakers in the creation and ongoing discussion of new words. We describe the technology required to make this possible and the necessity of producing various formats, such as interactive images, booklets, and multimedia apps. We discuss the tools we have developed with and for East Cree and Innu speakers, translators, and linguists and the challenges of quality terminology creation, including context, clarity, dialectal variation, multiple submissions, and the specificity of the structure of Algonquian languages. We explain how videos can complement and support terminology development and diffusion and the importance of providing searchable, translated texts for models and context. We stress the importance of allowing oral, visual, and written submissions to interactive terminology databases. We also report on two Online Terminology Forum training workshops with Innu translators. We demonstrate the advantages of building a pan-Algonquian terminology database to combine, strengthen, and expand communities’ (re)vitalization efforts across thematic domains such as health, justice, environment, education, and technology.
  • Supporting rich and meaningful interaction in language teaching for revitalization: Lessons from Macuiltianguis Zapotec
    Abstract: Many language revitalization programs aimed at teaching Indigenous languages are small, informal efforts with limited time and resources. Even in communities that still have proficient speakers, students in revitalization programs often struggle to gain proficiency in the language. This paper offers an illustration of how one language revitalization program has tried to make teaching more effective by adapting communicative language teaching strategies to be more useful and appropriate for their particular context. Having gained empirical support in the field of second language acquisition (SLA), communicative language teaching emphasizes the importance of rich and meaningful interaction for language learning to take place. “Rich” refers to the availability of target-like input that is not oversimplified. “Meaningful” refers to the type of interaction that takes place in real-life situations that necessitate communication. However, existing research on these topics has largely ignored language revitalization contexts, where providing learners with rich and meaningful interaction can be particularly challenging. This paper presents strategies for promoting rich and meaningful interaction in instructed language revitalization settings, as demonstrated through teacher practices at a Zapotec revitalization program in San Pablo Macuiltianguis, Oaxaca, Mexico. The focus is on shifting from Spanish language use to Zapotec language use in specific, everyday social spaces, then supporting interaction within these spaces.
  • The Kawaiwete pedagogical grammar: Linguistic theory, collaborative language documentation, and the production of pedagogical materials
    Abstract: This paper describes the intersection between linguistic theory and collaborative language documentation as a fundamental step in developing pedagogical materials for Indigenous communities. More specifically, we discuss the process of writing a monolingual pedagogical grammar of the Kawaiwete language (a Brazilian Indigenous language). This material was intended to motivate L1 speakers of Kawaiwete to think about language as researchers: by exploring linguistic datasets through the production and revision of hypotheses, testing predictions empirically and assessing the consistency of hypotheses through logical reasoning. By means of linguistic workshops in Kawaiwete communities, linguistic training of Indigenous researchers and production of pedagogical materials, we intended to motivate younger generations of Kawaiwete speakers to become researchers of their own language.
  • “Data is Nice:” Theoretical and pedagogical implications of an Eastern Cherokee corpus
    Abstract: This paper serves as a proof of concept for the usefulness of corpus creation in Cherokee language revitalization. It details the initial collection of a digital corpus of Cherokee/English texts and enumerates how corpus material can augment contemporary language revitalization efforts rather than simply preserving language for future analysis. By collecting and analyzing corpus material, we can quickly create new classroom materials and media products, and answer deeper theoretical linguistic questions. With a large enough corpus, we can even implement machine translation systems to facilitate the production of new texts. Although the vast majority of print material in Cherokee is in the Western dialect, this corpus has focused on Eastern texts. Expanding the dataset to include both dialects, however, will allow for comparison and facilitate generalizations about the Cherokee language as a whole. A corpus of Cherokee data can answer second language learners’ questions about the structure of the language and provide patterns for more effective, targeted learning of Cherokee. It can also provide teachers with ready access to accurate representations of the language produced by native speakers. By combining documentation and technology, we can leverage the power of databases to expedite and facilitate language revitalization.
  • Indigenous universities and language reclamation: Lessons in balancing Linguistics, L2 teaching, and language frameworks from Blue Quills University
    Abstract: This article describes Dene and Cree language programs at University nuhelot'įne thaiyots'į nistameyimâkanak Blue Quills, a First Nations-owned university in Canada created in a former residential school building in the decades following a 1969 sit-in by concerned parents. The history of UnBQ and its role in language and cultural revitalization are situated in the context of the North American tribal college and university movement, as is the author's integration of the challenge by Leonard (2017, 2018) to explore Indigenous frameworks for language in his teaching of introductory linguistics. Follow-up interviews with students, a department head and the UnBQ president include their ideas for a possible Cree-based framework for linguistic analysis. Translation and co-creation of linguistic terminology into Plains Cree and Denesųłiné support language use in the classroom and students' understanding. Practical challenges facing UnBQ are discussed.
  • Integrating collaboration into the classroom: Connecting community service learning to language documentation training
    Abstract: As training in language documentation becomes part of the regular course offerings at many universities, there is a growing need to ensure that classroom discussions of documentary linguistic theory and best practices are balanced with the practical application of these skills and concepts. In this article, we consider Community Service Learning (CSL) in partnership with community-based organizations as one means of grounding language documentation training in realistic and collaborative practice. As a case study, we discuss one recent CSL project undertaken as a collaboration between the Yukon Native Language Centre and graduate students in a semester-long introductory course on language documentation at Carleton University. This collaboration focused on annotating recently digitized legacy language lessons for several Indigenous languages spoken in the Yukon Territory, Canada, using documentary linguistic software tools to create a text-searchable, multimedia database for future pedagogical applications. Drawing on the reflections of both community- and university-based collaborators, we discuss the design of this project and some of the challenges that needed to be addressed as it progressed, and we offer several recommendations for future initiatives to integrate CSL into language documentation training.
  • Introduction
    Abstract: This chapter introduces the volume, Collaborative Approaches to the Challenges of Language Documentation and Conservation, providing a short justification for the volume, summarizing each of the eight chapters, and identifying major themes that emerge in the chapters.

SP19: Documentation and Maintenance of Contact Languages from South Asia to East Asia

  • Foreword
    Abstract: This foreword introduces the special volume "Documentation and maintenance of contact languages from South Asia to East Asia", presenting the nature and aim of the volume, as well as a summary of each of the research articles included, and highlighting its contribution to the field.
  • Kodrah Kristang: The Initiative to Revitalize the Kristang Language in Singapore
    Abstract: Kristang is the critically endangered heritage language of the Portuguese-Eurasian community in Singapore and the wider Malayan region, and is spoken by fewer than an estimated 100 fluent speakers in Singapore. In Singapore, especially, up to 2015, there was almost no known documentation of Kristang, and a declining awareness of its existence, even among the Portuguese-Eurasian community. However, efforts to revitalize Kristang in Singapore under the auspices of the community-based non-profit, multiracial and intergenerational Kodrah Kristang (‘Awaken, Kristang’) initiative since March 2016 appear to have successfully reinvigorated community and public interest in the language; more than 400 individuals, including heritage speakers, children and many people outside the Portuguese-Eurasian community, have joined ongoing free Kodrah Kristang classes, while another 1,400 participated in the inaugural Kristang Language Festival in May 2017, including Singapore’s Deputy Prime Minister and the Portuguese Ambassador to Singapore. Unique features of the initiative include its location, together with its associated Portuguese-Eurasian community, in the highly urbanized setting of Singapore; a relatively low reliance on financial support; visible, if cautious, positive interest from the Singapore state; a multiracial orientation and set of aims that embrace and move beyond the language’s original community of mainly Portuguese-Eurasian speakers; and, by design, a multiracial youth-led core team.
  • Documenting modern Sri Lanka Portuguese
    Abstract: Sri Lanka Portuguese (SLP) is a Portuguese-lexified creole formed during Sri Lanka’s Portuguese colonial period, which lasted from the early 16th century to the mid-17th century. The language withstood several political changes and became an important medium of communication for a portion of the island’s population, but reached the late 20th century much reduced in its distribution and vitality, having essentially contracted to the Portuguese Burgher community of Eastern Sri Lanka. In the 1970s and 1980s, the language was the object of considerable research and documentation efforts, which were, however, curtailed by the Sri Lankan civil war. This chapter reports on the activities, challenges, and results of a recent documentation project developed in the post-war period and designed to create an appropriate and diverse record of modern SLP. The project is characterised by a highly multidisciplinary approach that combines linguistics and ethnomusicology, a strong focus on video recordings and open-access dissemination of materials through an online digital platform (Endangered Languages Archive), archival prospection to collect diachronic sources, a sociolinguistic component aimed at determining ethnolinguistic vitality with a view to delineating revitalisation strategies, and a strongly collaborative nature. This chapter describes the principal outputs of the documentation project, which, in addition to a digital corpus of transcribed and annotated materials representing modern manifestations of SLP and the oral/musical traditions of the Burghers, also include the findings of the sociolinguistic survey, an orthographic proposal for the language, as well as the copies and transcriptions of hard-to-obtain historical sources on SLP (grammars, dictionaries, biblical translations, liturgical texts, collections of songs).
  • Peranakans in Singapore: Responses to language endangerment and documentation
    Abstract: Baba Malay is a critically endangered contact language that is the home language of the Peranakans in Singapore and Malacca. This paper provides a diachronic perspective on the ways in which the Peranakan community in Singapore has responded to the issues of language endangerment and documentation. It reports qualitative observations of the community’s responses made by researchers of Baba Malay and community members in the 80s, when they first problematized the endangerment of Baba Malay. It also reports the qualitative and quantitative responses of community members towards language endangerment during and after the process of an ongoing language documentation project. Taken together, these observations show that Peranakans recognize how critically endangered Baba Malay is, and that the community is highly concerned about the potential loss of the language. The community’s general reactions towards language documentation, as well as bottom-up steps taken towards safeguarding the language, are discussed as well. These include community-led initiatives such as the implementation of language classes, as well as individual-led initiatives, including the development of podcasts and a textbook for language learners.
  • Documenting online writing practices: The case of nominal plural marking in Zamboanga Chabacano
    Abstract: The emergence of computer-mediated communication has brought about new opportunities for both speakers and researchers of minority or under-described languages. This paper shows how the analysis of spontaneous contemporary language samples from online social networks can make a contribution to the documentation and description of languages like Chabacano, a Spanish-derived creole spoken in the Philippines. More specifically, we focus on nominal plural marking in the Zamboangueño variety, a still imperfectly understood feature, by examining a corpus composed of texts from online sources. The attested combination of innovative and vestigial features requires a close look at the high contact environment, different levels of metalinguistic awareness or even some language ideologies. The findings shed light on the wide variety of plural formation strategies which resulted from the contact of Spanish with Philippine languages. Possible triggers, such as animacy, definiteness or specificity, are also examined and some future research areas suggested.
  • SP19 Cover
  • SP19 Front Matter
  • SP19 Whole Volume

SP18: Archival Returns: Central Australia and Beyond

  • "For the children...": Aboriginal Australia, cultural access, and archival obligation
    Abstract: For whom are archival documents created and conserved? Who is obliged to care for them and provide access to their content, and for how long? The state, libraries, museums and galleries, researchers, interlocutors, genealogists, family heritage organisations? Or does material collected long ago and then archived belong personally, socially, emotionally, culturally, and intellectually to the people from whom the original material was collected and, eventually, to their descendants? In a colonised nation, additional ethical and epistemological questions arise: Are archives protected and accessed for the colonised or the colonisers, or both? How are differences regarding archival creation, protection, and access distinguished, and in whose interest? Is it for future generations? What happens when archives are accessed and read by family members and/or researchers, and what happens when they are not? A focus on two interrelated stories – firstly an experiential account narrated by Brenda L Croft about constructive archival management and access, and secondly a contrasting example relating how the Berndt Field Note Archive continues to be restricted from entitled claimants – facilitates a return to three interrelated questions: for whom are archives created and conserved, who is obliged to care for, and authorise access to, them, and to whom do they belong?
  • "The songline is alive in Mukurtu": Return, reuse, and respect
    Abstract: This chapter examines the return, reuse, and repositioning of archival materials within Indigenous communities and specifically within the Warumungu Aboriginal community in Central Australia. Over the last 20 years there has been an uptake in collecting institutions and scholars returning cultural, linguistic, and historical material to Indigenous communities in digital formats. These practices of digital return have been spurred by decolonisation and reconciliation movements globally, and at the same time catalysed by new technologies that allow for surrogates to be returned and concurrently reinvented, reused, and reimagined in community, kin-based, and place-based social and cultural networks. Examining the creation, use, and ongoing development of Mukurtu CMS, this article focuses on the implications for digital return as a type of repatriation that promotes decolonising strategies and reparative frameworks for engagement.
  • Contributors
  • Working at the interface: The Daly Languages Project
    Abstract: In this paper we present the Daly Languages Project (www.dalylanguages.org), funded by the ARC Centre of Excellence for the Dynamics of Language, and in collaboration with the Pacific and Regional Archive for Digital Sources in Endangered Cultures (PARADISEC), which has developed website landing pages for all of the languages of the Daly region of northern Australia. These landing pages provide a useful and usable interface by which a range of users can access primary recordings, fieldnotes, and other resources about the Daly languages; they are powered by a relational database which allows for easy updating, ensuring consistency across the website and allowing for an immediate response to community requests. Moreover, since the website is built with a commitment to open source, it is available for other researchers to adapt to their own projects and language groups. In this paper we discuss the goals and outcomes of the project, the design and functionality of the website landing pages, and advise readers on how they can access and adapt the open-source framework for their own purposes.
  • Reflections on the preparation and delivery of Carl Strehlow's heritage dictionary (1909) to the Western Aranda people
    Abstract: This chapter reflects on the predicaments encountered while bringing ethnographic and linguistic archival materials, and in particular an Aranda, German, Loritja [Luritja], and Dieri dictionary manuscript compiled by Carl Strehlow and with more than 7,600 entries, into the public domain. This manuscript, as well as other unique documents held at the Strehlow Research Centre in Alice Springs and elsewhere in Australia, is surrounded by competing views about ownership and control. In this case study I discuss my research and work with Western Aranda people concerning the transcription and translation into English of the dictionary manuscript. I also discuss the immense difficulties I faced in seeing the dictionary through to final publication. I encountered vested interests in this ethno-linguistic treasure that I had not been aware of and ownership claims that I had not taken into account. They arose from diverse quarters – from academia, from individuals in the Lutheran church, from Indigenous organisations, and from the Northern Territory Government. One such intervention almost derailed the dictionary work by actions that forced the suspension of the project for over 12 months. In this chapter I track the complex history of this manuscript, canvass the views of various stakeholders, and detail interpretations and reactions of Aranda people to the issues involved.
  • Conundrums and consequences: Doing digital archival returns in Australia
    Abstract: The practices of archival return may provide some measure of social equity to Indigenous Australians. Yet priceless cultural collections, amassed over many decades, are in danger of languishing without ever finding reconnection to the individuals and communities of their origin. The extensive documentary heritage of Australian Indigenous peoples is dispersed, and in many cases participants in the creation of archival records, or their descendants, have little idea of where to find these records. These processes of casting memories of the past into the future bring various conundrums of a social, political, and technical nature. They raise questions about the nature and dynamics of ongoing cultural transmission, the role of institutional and community archives in both protecting records of languages, song and social history and disseminating them, and the responsibilities of researchers, organisations and end users in this complex intercultural space. These questions are perforce framed by ethical and legal questions about access, competing ideas of ownership, and shifting community protocols surrounding rights of access to and the dissemination of cultural information. This paper arises from a project designed to reintegrate such research collections of Central Australian cultural knowledge with the places and communities from which they originally emanated. While we show that the issues raised are seldom neutral and often complex, we also argue for the power that culturally appropriate mobilisation of archival materials has for those that inherit the knowledge they embody.
  • i-Tjuma: The journey of a collection – from documentation to delivery
    Abstract: In 2018, a collection of some 60 edited and subtitled films, resulting from a documentation project (2012–2018) in the Ngaanyatjarra Lands on verbal arts of the Western Desert, was ready to be returned to the Ngaanyatjarra community. In this case study, we describe the journey of this return and the cultural, ethical, and technological issues that we negotiated in the process. From the archived collection lodged with PARADISEC (Pacific and Regional Archive for Digital Sources in Endangered Cultures), we developed a workflow that harvested selected media and their associated metadata and transferred them to LibraryBox, a portable digital file distribution tool designed to enable local delivery of media via the LibraryBox Wi-Fi hotspot. We detail here the return of the curated collection in a series of community film festivals in the Ngaanyatjarra communities and via the delivery of media from LibraryBox to individual mobile phones. We also discuss the return of a digital collection of historical photographs of Ngaanyatjarra people and strategies to re-inscribe such old records for new purposes. These endeavours are motivated by the imperative to ‘mobilise’ our collection of Western Desert Verbal Arts by making the recordings available to the Ngaanyatjarra community. We anticipate that the lessons we learnt in the process will contribute to better design for local solutions in the iterative cycle of documentation, archiving, and return.
  • Incorporating archival cultural heritage materials into contemporary Warlpiri women's yawulyu spaces
    Abstract: National archives house a rich legacy of materials that document many intangible aspects of Indigenous cultural heritage. It is the moral right of Indigenous people to have access to these materials, but their reintroduction back into present-day worlds is not without impact. Here, I analyse contemporary spaces in which Warlpiri women have engaged with archival cultural heritage materials and incorporated them into present-day contexts for the performance of yawulyu. These include the production of song books, dance camps at bush locations, and broader community arts performances. These cases illustrate that for proper engagement with these legacy materials knowledgeable Indigenous people must lead activities which are supported as part of the repatriation process.
  • Cover
  • Ever-widening circles: Consolidating and enhancing Wirlomin Noongar archival material in the community
    Abstract: Returning archival documentation of endangered Indigenous languages to their community of origin can provide empowering opportunities for Indigenous people to control, consolidate, enhance and share their cultural heritage with ever-widening, concentric circles of people, while also allowing time and space for communities to recover from disempowerment and dislocation. This process aligns with an affirming narrative of Indigenous persistence that, despite the context of colonial dispossession, can lead to a positive, self-determined future. In 2007, senior Noongar of the Wirlomin clan in the south coast region of Western Australia initiated Wirlomin Noongar Language and Stories Inc., an organisation set up to facilitate cultural and linguistic revitalisation by combining community-held knowledge with documentation and recordings repatriated from the archives. Fieldnotes created in 1931 from discussions with local Aboriginal people at Albany, Western Australia have inspired the collaborative production of six illustrated bilingual books. Working with archival research material has presented challenges due to issues of orthography and legibility in written records, the poor quality of audio recordings, and the incomplete documentation of elicitation sessions. As the archive is so fragmentary, community knowledge is vital in making sense of its contents.
  • Front matter
  • Contents
  • Foreword
  • (Re)turning research into pedagogical practice: A case study of translational language research in Warlpiri
    Abstract: Speech corpora created primarily for linguistic research are not often easily repurposed for practical use by the communities who participated in the research. This chapter describes a process where methods and materials collected for language documentation research have been returned to speakers in communities; this involves the implementation of professional development activities for Warlpiri educators in bilingual education programs. Documentation of children’s speech took place in four Warlpiri communities in 2010. To make the research results available to educators in Warlpiri communities in an easily accessible way, the researcher produced short videos showing analyses of the children’s speech. These online videos, along with audio recordings and written transcripts of the children’s speech, were utilised by a team of linguists and educators at professional development workshops in the Northern Territory Department of Education. Educators actively worked with the materials, discussed issues relating to children’s oral language development, and identified potential pedagogical practices. Through this process the materials were returned to the Warlpiri community and utilised in an active cycle of locally focused professional learning activities.
  • Nura's vision: Nura's voice
    Abstract: For Nura Nungalka Ward (1942–2013) the art of teaching was a lifelong passion, culminating in Ninu grandmothers’ law, published by Magabala Books (2018). This autobiography is an extensive ethnography of daily life for Pitjantjatjara and Yankunytjatjara families still living on their traditional lands amid the profound changes brought by the arrival of white settlers, doggers, missionaries and atomic bomb tests. Nura’s achievement – compiling her life history illustrated with striking photographs into an English language autobiography – seems like a natural progression. Until you consider that Nura spoke and taught in Pitjantjatjara, her Aṉangu (Aboriginal) language from the remote northwest corner of South Australia, and the fact that she possessed no family photograph albums. How did she make that leap, way beyond her life experience in an oral storytelling tradition, to embrace the idea of a book? How did the return of archival records to Nura’s kin via a digital repository in the early 2000s help shape Nura’s memories? This chapter details Nura’s process: her compelling drive to teach and her willingness to embrace new technologies, such as the digital archive Aṟa Irititja, which she first used to record her knowledge and then drew on to achieve her ambitions. We discuss the complexities that occur when accessing the digital content and Nura’s vigilance in ensuring that she broke no cultural rules in the process. We also share Nura’s decade-long journey as she collaborated with three non-Aboriginal friends to move her spoken word story through the digital archive and into the printed form, in what is the most significant publication to date to be sourced through the Aṟa Irititja Project.
  • Enlivening people and country: The Lander Warlpiri cultural mapping project
    Abstract: This chapter discusses a cultural mapping project funded and directed by Lander Warlpiri Anmatyerr people in Central Australia with the collaboration of the authors and the support of the Central Land Council. The project arose from the concerns of elders over the changing lifeworld of Warlpiri people today and the reduced opportunities for younger people to acquire the embodied place-based knowledge and experiences regarded as foundational to local identity, social interrelationships, and cultural continuity. It aimed to revitalise cultural knowledge through engaging family groups in activities such as country visits and mapping, during which the teaching and recording of place names, Dreaming tracks, and countries occurred along with the performance of associated stories, song, and rituals. This process involved the sharing and negotiation of the knowledge of country elders hold, augmented by ethnographic information derived from archival and other sources; for example, land claim maps and digitised material, including photographs, audio and visual recordings of narratives, places, song, and geo-referenced data. Attending to the ways in which local Indigenous practices of representing and inscribing people’s relations with space and place may differ from and interlace with dominant western spatial regimes, cartographic practices, and technologies, we explore outcomes and issues that have arisen during the process of re-animation and evocation of place-based knowledge and memories.
  • Never giving up: Negotiating, culture-making, and the infinity of the archive
    Abstract: Archival returns are a significant issue of concern for Indigenous peoples in many settler-colonial contexts. This chapter focuses on one example from Central Australia, Aṟa Irititja, to reflect on how an archive might simultaneously preserve ‘culture’ and also reflect, accommodate, and inspire cultural change. We feature the words of an Aṉangu ‘senior law woman’, Janet Inyika (affectionately known as Mrs Never-Give-Up), and our co-authorship is consistent with this community archive’s commitment to co-production, yet also extends Inyika’s social justice work into the future. Together, we argue that a collaborative, intercultural approach to archiving, in conjunction with the affordances of digital media, facilitates negotiations that are culturally appropriate, and not threatening. Aṟa Irititja is inspiring the production of a new genre of archival metadata: advance directives on what to do with representations of a person upon his/her death. These words are urging a shift in protocols for the correct treatment of photographs, asserting new domains of individual authority, and establishing the archive as the proper medium through which these should occur. The archive is also a site through which culture-making is never complete, always ongoing – indeed, infinite.
  • Returning recordings of songs that persist: The Anmatyerr traditions of akiw and anmanty
    Abstract: Digitisation has made the return of recordings made by researchers in the past far more achievable than ever before. This technological advance, combined with the ethical and political imperative towards decolonising methodologies in Indigenous research, has resulted in considerable interest in ensuring that recordings of cultural value be returned to Indigenous communities. In this chapter, I reflect upon the fieldwork experience of returning archival song recordings concerning public aspects of male initiation ceremonies, known as akiw and anmanty, to Anmatyerr-speaking communities in the Northern Territory of Australia. Despite attenuation of song knowledge across the region, these songs continue to be sung at annual ritual events. Once these recordings were returned to these communities, Anmatyerr people quickly received them as important reiterations of their present-day socio-cultural expression. Evidently imbricated in a complex, ritually based form of complementary filiation and knowledge dissemination, these songs are shared and taught in a fragile and changing context of ceremonial practice. The account provided here offers insights into songs associated with arguably the most persistent and significant form of ceremonial practice in Central Australia, although sparsely documented in the Anmatyerr region. I also highlight the relational properties of song via their connections to place, Anengkerr ‘Dreaming’ and people and provide important insights into how these communities perceive the archiving and preservation of this material.
  • Return of a travelling song: Wanji-wanji in the Pintupi region of Central Australia
    Abstract: This chapter discusses responses to the return of legacy recordings of Pintupi singing made in 1976 and the collection of further metadata about the song Wanji-wanji featured on the recordings. Wanji-wanji was once a popular entertainment song that was performed across the western half of Australia, as can be seen by the many recordings of it held in archives. Custodianship of the song is unknown; the earliest reference to its performance dates back to the 1850s, where it is described as a ‘travelling dance’ (Bates 1913–1914) and so in terms of copyright its status may be comparable to ‘public domain’, i.e. outside of copyright. Responses to hearing the recording were emotional. Those who knew the song recalled the place and time in which they had heard it long ago. There was great interest in how widely it was known though little interest in the meanings of the lyrics. On the whole, responses to access and proposed uses of the recordings, as well as the future possible uses of the song, reflected its public domain status. Nevertheless, the confidence in people’s responses varied depending on whether the individual knew the song, had experience in using archival recordings, and whether they perceived community interest and support for classical Aboriginal singing practices.
  • Deciphering Arrernte archives: The intermingling of textual and living knowledge
    Abstract: Arrernte people are arguably the most documented Aboriginal group in Australia. Their language was studiously documented by Lutheran scholars, their ceremonies were subject to some of the most intensive ethnographic documentation and many of their songs were meticulously recorded. In addition, genealogical and historical archives are full of Arrernte social histories, and museum stores contain thousands of Arrernte-made artefacts. This chapter contains a condensed and edited transcript of interviews with two Arrernte men, Shaun Angeles and Joel Liddle, who discuss their deep and varied interests in these records and the archives that contain them. Both Joel and Shaun are of a younger cohort of Arrernte men living in the Alice Springs region who are increasingly interested in utilising the potential of archival material as a means of assisting Arrernte language and cultural transmission. These interviews explore some of the issues Arrernte peoples confront as they work through archives. We discuss the challenges of variant orthographies in the 19th and 20th century records, the limitations of conventional cataloguing requirements and the importance of reading archival texts in a way that sees them emplaced and tested against the knowledge of elders. Archival records are explained as being necessarily embedded within Arrernte social memory and orality and framed by local socio-cultural practices. Reflecting upon their own experiences, Joel and Shaun are able to provide advice to future generations in their dealings with collecting institutions and make recommendations to current and future researchers (ethnographic and linguistic) who are documenting Arandic material. The chapter concludes with a discussion about the role of digital technologies in the future dissemination of cultural materials.
  • "We never had any photos of my family": Archival return, film, and a personal history
    Abstract: The film Remembering Yayayi emerged from a project to return raw 16mm film footage shot in 1974 at the early Pintupi outstation of Yayayi, near Papunya, by filmmaker Ian Dunlop, with Fred Myers as translator and consultant. Two subsequent remote Pintupi communities, Kintore and Kiwirrkura, were involved in the footage’s return. The material had not been available for research (or other) purposes until 2005, when VHS copies were made from the workprint deposited in the National Archives of Australia. In 2006, Myers and Stefanoff took this rare historical visual material in Pintupi language to Kintore and Kiwirrkura, showing it to individuals and family groups and holding community screenings. Responses were overwhelmingly positive. The tapes quickly became regular entertainment for patients undergoing lengthy renal dialysis sessions and Myers received multiple requests for copies. Over several years, one of Myers’ long-term Pintupi friends, Marlene Spencer Nampitjinpa, came to provide a moving personal commentary on the footage, enabling a feature documentary to be produced from it. This chapter draws on a conversation with Stefanoff and Myers to reflect on how the repatriation project became a catalyst for memory and produced new Pintupi community historical knowledge, particularly about outstation life, early efforts at developing local forms of self-determination and the transformation of lives and wellbeing over a 40-year period.
  • Editors' preface
  • Abbreviations

SP17: Language and Toponymy in Alaska and Beyond

SP16: Methodological Tools for Linguistic Description and Typology

  • Linguistic diversity, language documentation and psycholinguistics: The role of stimuli
    Abstract: Our psycholinguistic theories tend to be based on empirical data from a biased sample of well-described languages, not doing justice to the enormous linguistic diversity in the world. As Evans and Levinson (2009: 447) put it, a major challenge of our discipline is to harness this linguistic diversity and “to show how the child’s mind can learn and the adult’s mind can use, with approximately equal ease, any one of this vast range of alternative systems.” This paper explores some of the possibilities and limits of how language documentation and description can contribute to taking up this challenge, focusing on the role of both natural data and stimuli in this enterprise.
  • Introduction: Methodological tools for linguistic description and typology
  • Automatic construction of lexical typological Questionnaires
    Abstract: Questionnaires constitute a crucial tool in linguistic typology and language description. By nature, a Questionnaire is both an instrument and a result of typological work: its purpose is to help the study of a particular phenomenon cross-linguistically or in a particular language, but the creation of a Questionnaire is in turn based on the analysis of cross-linguistic data. We attempt to alleviate linguists’ work by constructing lexical Questionnaires automatically prior to any manual analysis. A convenient Questionnaire format for revealing fine-grained semantic distinctions includes pairings of words with diagnostic contexts that trigger different lexicalizations across languages. Our method to construct this type of Questionnaire relies on distributional vector representations of words and phrases which serve as input to a clustering algorithm. As an output, our system produces a compact prototype Questionnaire for cross-linguistic exploration of contextual equivalents of lexical items, with groups of three homogeneous contexts illustrating each usage. We provide examples of automatically generated Questionnaires based on 100 frequent adjectives of Russian, including veselyj ‘funny’, ploxoj ‘bad’, dobryj ‘kind’, bystryj ‘quick’, ogromnyj ‘huge’, krasnyj ‘red’, byvšij ‘former’ etc. Quantitative and qualitative evaluation of the Questionnaires confirms the viability of our method.
  • The TULQuest linguistic questionnaire archive
    Abstract: This article describes the development and structure of an online interactive archive for linguistic questionnaires developed by the Fédération de Typologie et Universaux Linguistiques (CNRS) program on Questionnaires. The archive allows users to both retrieve and deposit material, with questionnaires categorized according to a taxonomy of features. Questionnaires, defined by our project as any methodological tool designed to collect linguistic data, and written with a capital to highlight this special use of the term, are accompanied by additional materials beyond basic metadata, including a summary of usage protocol, development context, reviews and user tips, as well as the possibility of linking together questionnaires that have been adapted from an original, reflecting the dynamic nature of questionnaire use.
  • Trajectoire: a methodological tool for eliciting Path of motion
    Abstract: This paper presents a methodological tool called Trajectoire that was created to elicit the expression of Path of motion in typologically and genetically varied languages. Designed within the research program TRAJECTOIRE ‘Path (of motion)’, supported by the Fédération de Typologie et Universaux Linguistiques, the Trajectoire elicitation tool aims to systematically explore the morpho-syntactic resources used for the expression of Path and the distribution of spatial information across the sentence, with a specific focus on the (a)symmetry in the expression of Source (the initial point) and Goal (the final point). Its main aim is to facilitate typologically-informed language descriptions, which in their turn can contribute new data to typologically-oriented research. Inspired by the research methods developed at the Max Planck Institute for Psycholinguistics (Nijmegen, NL), the Trajectoire material comprises 76 video-clips consisting of 2 training clips, 55 target clips and 19 fillers, and it includes 3 distinct versions ordering the clips differently to minimize possible routine effects. The 55 target clips vary for several parameters, namely Figure, Ground, the different portions of Path, Deixis, and less systematically, Manner. The scenes filmed in an outdoor natural environment ensure accessibility to non-Western populations. The paper first presents the structure and the use of the elicitation material. On the basis of the data obtained in about 20 different languages and reports by users, both researchers and speakers, it then discusses the advantages and some drawbacks of the Trajectoire elicitation tool, and considers the issue of the tool's dissemination and online open access.
  • Using questionnaires as a tool for comparative linguistic field research: Two case studies on Javanese
    Abstract: In this paper, we discuss how written questionnaires for targeted constructions can be a beneficial tool for comparative linguistic field research through two case studies on Javanese (Austronesian; Indonesia). The first case study is based on a questionnaire designed to elicit how a language or a dialect expresses the semantic meaning of modality (Vander Klok 2014); we show how it can be implemented in three different ways for comparative linguistic field research. The second case study is based on a questionnaire which investigates the morphosyntax of polar questions across four Javanese dialects; we show how items can be designed to maximize direct comparison of features while still allowing for possible lexical, phonological, or morphosyntactic variation. Based on these two studies, we also address methodological challenges that arise in using questionnaires in comparative linguistic field research and offer best practices to overcome these challenges.
  • Video elicitation of negative directives in Alaskan Dene languages: reflections on methodology
    Abstract: In this paper, we describe the use of video stimuli for the targeted elicitation of negative directives in Denaakk’e (Koyukon) and Nee’andeegn’ (Upper Tanana), two severely endangered Alaskan Dene languages. Negative directives are extremely rare in our previously collected data, yet they exhibit a great variety of forms. Forms further seem to depend on several factors, particularly on whether the prohibited act violates social norms known as hʉtlaanee/įįjih. To better understand the variety of on-record and off-record forms, we created video clips showing activities violating hʉtlaanee/įįjih and activities that are merely foolish or mildly dangerous. After viewing the clips, our consultants were asked to advise the actors as if they were their grandchildren. Their responses were discussed at length with the speakers. The speakers greatly enjoyed this task and produced a great variety of on-record and off-record responses including some unusual linguistic structures. In both languages, off-record expressions were preferred over direct ones, particularly in situations where hʉtlaanee/įįjih was involved. We also identified several conventionalized off-record strategies. The emphasis on hʉtlaanee/įįjih made the task interesting and relevant for speakers. While our stimuli are designed for work with Alaskan Dene, the method can be adapted for cultural contexts around the world.
  • A proposal for conversational questionnaires
    Abstract: This paper proposes a new approach for collecting lexical and grammatical data: one that meets the need to control the features to be elicited, while ensuring a fair level of idiomaticity. The method, called conversational questionnaires, consists in eliciting speech not at the level of words or of isolated sentences, but in the form of a chunk of dialogue. Ahead of fieldwork, a number of scripted conversations are written in the area’s lingua franca, each anchored in a plausible real-world situation – whether universal or culture-specific. Native speakers are then asked to come up with the most naturalistic utterances that would occur in each context, resulting in a plausible conversation in the target language. Experience shows that conversational questionnaires provide a number of advantages in linguistic fieldwork, compared to traditional elicitation methods. The anchoring in real-life situations lightens the cognitive burden on consultants, making the fieldwork experience easier for all. The method enables efficient coverage of various linguistic structures at once, from phonetic to pragmatic dimensions, from morphosyntax to phraseology. The tight-knit structure of each dialogue makes it an effective tool for cross-linguistic comparison, whether areal, historical or typological. Conversational questionnaires help the linguist make quick progress in language proficiency, which in turn facilitates further stages of data collection. Finally, these stories can serve as learning resources for language teaching and revitalization. Five dialogue samples are provided here as examples of such questionnaires. Every linguist is encouraged to write their own dialogues, adapted to a region’s linguistic and cultural profile. Ideally, a set of such texts could be developed and made standard among linguists, so as to create comparable or parallel corpora across languages – a mine of data for typological comparison.
  • Front cover
  • Front matter
  • Whole volume

SP15: Reflections on Language Documentation on the 20 year anniversary of Himmelmann 1998

  • Reflections on linguistic fieldwork and language documentation in eastern Indonesia
    Abstract: In this paper, we reflect on linguistic fieldwork and language documentation activities in Eastern Indonesia. We first present the rich linguistic and biological diversity of this region, which is of significant interest in typological and theoretical linguistics and language documentation. We then discuss certain central educational issues in relation to human resources, infrastructures, and institutional support, critical for high quality research and documentation. We argue that the issues are multidimensional and complex across all levels, posing sociocultural challenges in capacity-building programs. Finally, we reflect on the significance of the participation of local fieldworkers and communities and their contextual training.
  • In search of island treasures: Language documentation in the Pacific
    Abstract: The Pacific region is home to about 1,500 languages, with a strong concentration of linguistic diversity in Melanesia. The turn towards documentary linguistics, initiated in the 1980s and theorized by N. Himmelmann, has encouraged linguists to prepare, archive and distribute large corpora of audio and video recordings in a broad array of Pacific languages, many of which are endangered. The strength of language documentation is to entail the mutual exchange of skills and knowledge between linguists and speaker communities. Their members can access archived resources, or create their own. Importantly, they can also appropriate the outcome of these documentary efforts to promote literacy within their school systems, and to consolidate or revitalize their heritage languages against the increasing pressure of dominant tongues. While providing an overview of the general progress made in the documentation of Pacific languages in the last twenty years, this paper also reports on my own experience with documenting and promoting languages in Island Melanesia since 1997.
  • The state of documentation of Kalahari Basin languages
    Abstract: The Kalahari Basin is a linguistic macro-area in the south of the African continent. It has been in a protracted process of disintegration that started with the arrival of Bantu peoples from the north and accelerated dramatically with the European colonization emanating from the southwest. Before these major changes, the area hosted, and still hosts, three independent linguistic lineages, Tuu, Kx'a, and Khoe-Kwadi, that were traditionally subsumed under the spurious linguistic concept "Khoisan" but are better viewed as forming a "Sprachbund". The languages have been known for their quirky and complex sound systems, notably involving click phonemes, but they also display many other rare linguistic features – a profile that until recently was documented and described very insufficiently. At the same time, spoken predominantly by relatively small and socially marginalized forager groups, known under the term "San", most languages are today, if not on the verge of extinction, at least latently endangered. This contribution gives an overview of their current state of documentation, which has improved considerably within the last 20 years.
  • Caucasus – the mountain of languages
    Abstract: The widespread picture of linguistic diversity in the Caucasus as ‘the mountain of languages’ will be immediately confirmed if a closer look is taken at the region: multiethnic, multilingual, multireligious is the adequate description of this melting pot. What is responsible for the present-day ethnic, linguistic and sociocultural diversity is the historical coexistence of different ethnic groups in a geographically delimited region on the one hand, and the geopolitical situation at the border between the Orient and the Occident on the other. At the same time, this diversity leads to mutual influence of different kinds, ranging from linguistic and religious to ethnic assimilation. In this article, we will outline the results of relevant international projects in the field of ‘language documentation’ that we conducted over the past 15 years and what we have learned from these projects.
  • Reflections on the role of language documentations in linguistic research
    Abstract: I reflect on the role of language documentations in linguistic research beyond their most common linguistic use as a high-quality database for descriptive work. I show that the original Himmelmann-ian conception of documentations, as multi-varied and multi-purpose, and to some extent community-driven, enables a range of research outcomes that would not have been foreseeable within the traditional descriptive, typological and theoretical agendas. I argue that it is overall more fruitful for innovative linguistic research to invest in the processing of haphazard language documentation data rather than attempting to collect precisely the kind of data demanded by specific analytic goals.
  • Reflections on linguistic analysis in documentary linguistics
    Abstract: This article reflects on the role of analysis in language documentation since Himmelmann (1998). It presents some of the criticism that Himmelmann's notion of analysis faced and how he responded (Himmelmann 2012). However, analysis in this context rarely refers to analysis alone, but the term includes the larger research goals and research questions. This study, then, situates the research goals, research questions and analyses that I have employed in my research on Besemah on a cline from facilitative to restrictive in terms of the diversity and spontaneity of the (archival) record that is produced, building upon Himmelmann's (2012) conceptual basis for distinguishing documentation and description. It does so through two case studies in Besemah, one with a highly facilitative research goal, question, and analysis and another with a highly restrictive research goal, question, and analysis.
  • Why cultural meanings matter in endangered language research
    Abstract: In this paper we illustrate why it is important for linguists engaged in endangered language documentation to develop an analytical understanding of the cultural meanings that language, language loss, and language documentation have for the communities they work with. Acknowledging the centrality of cultural meanings has implications for the kinds of questions linguists ask about the languages they are studying. For example: How is age interpreted? What reactions are provoked by accented speech or multilingualism? Is language shift experienced as a painful loss, or a source of newfound freedom, or both? It affects the standards we set for what counts as a satisfying explanation for language endangerment, with prediction necessarily limited in sociogeographic scope. It has implications for the research methods employed, calling for serious engagement with the particular histories and interpretive practices of local linguistic communities. Analyzing cultural meanings can help us see how language use and changes in language use are experienced and therefore acted on by people whose communicative behavior we are concerned with. It can help us interpret why language shift is taking place in a particular community, guide the practices of language documentation and preservation that linguists engage in with that community, and contribute to effective revitalization.
  • SP15 Cover
  • Reflections on reproducible research
    Abstract: Reproducibility in language documentation and description means that the analysis given in a descriptive publication is presented in a way that allows the reader to access the data on which the claims are based and to verify the analysis for themself. Linguists, including Himmelmann, have long pointed to the centrality of documentation data to linguistic description. Over the twenty years since Himmelmann’s 1998 paper we have seen a growth in digital archiving, and the rise of the Open Access movement. Although there is good infrastructure in place to make reproducible research possible, few descriptive publications clearly link to underlying data, and very little documentation data is publicly accessible. We discuss some of the institutional roadblocks to reproducibility, including a lack of support for the development of published primary data. We also look at what work on language documentation and description can learn from the recent replication crisis in psychology.
  • Meeting the transcription challenge
    Abstract: The major challenge for language documentation in the next decade or two is what could be called the transcription challenge. This is a multilayered challenge that goes far beyond the practical challenge of speeding up the transcription process. Transcription, as practiced in language documentation, involves language making and changes the language ecology. Despite its centrality to language documentation, transcription remains critically undertheorized and understudied. Further progress in language documentation, and ultimately also its overall success, crucially depends on further investigating and understanding the transcription process, broadly conceived.
  • Reflections on the scope of language documentation
    Abstract: Language documentation is understood as the creation, annotation, preservation, and dissemination of transparent records of a language. This leads to questions as to what precisely is meant by terms such as annotation, preservation, and dissemination, as well as what patterns of linguistic behavior fall within the scope of the term language. Current approaches to language documentation tend to focus on a relatively narrow understanding of a language as a lexicogrammatical code. While this dimension of a language may be the most salient one for linguists, languages are also embedded in larger social structures, and the interaction between these structures and the deployment of lexicogrammatical codes within a community is an important dimension of a language which also merits documentation. Work on language documentation highlights the significance of developing theoretical models that underpin the notion of language, and this can have an impact not only for the practices of documentary linguists but also for the larger field of linguistics. It further suggests that documentary linguistics should not merely be seen as a subfield that is oriented around the collection of data but as one that is in a position to make substantive contributions to linguistic theory.
  • Introduction
    Abstract: This chapter introduces the volume, Reflections on Language Documentation 20 Years after Himmelmann 1998, providing a short justification for the volume, summarizing each of the four major parts of the volume, and identifying major themes that emerge in the 31 chapters. It concludes by noting some of the volume's limitations.
  • SP15 Front Matter
  • Reflections on language documentation in the Chaco
    Abstract: This chapter focuses on field research aimed at documenting Chaco languages with varying degrees of vitality, specifically those spoken in Argentina and in the vicinity of the Argentinian/Paraguayan, Argentinian/Bolivian, and Paraguayan/Bolivian borders. The case studies selected here provide an overview of recent experiences conducted in the Chaco within the framework of Himmelmann’s (1998) foundational program on documentary linguistics and subsequent publications along these lines. We emphasize the results of collaborative research on equal grounds and a discourse-oriented approach to language documentation. Our reflections also highlight the current threatening situation of indigenous peoples and their languages and discuss the function of language documentation, preservation, and archiving in this fragile scenario, with a view to supporting community language use and transmission as well as ongoing and future research in South America.
  • Reflections on (de)colonialism in language documentation
    Abstract: With origins in colonial logics and institutions, language documentation practices can reinforce colonial power hierarchies and norms in ways that work against the needs and values of Indigenous language communities. This paper highlights major patterns through which this occurs, along with their effects, and models how language documentation can be structured in ways that are more grounded in the experiences and perspectives of the communities that use it. I propose decolonial interventions that emerge from Indigenous research principles and perspectives, and illustrate how these practices can better support language community needs while also improving the scientific value of language documentation.
  • Reflections on linguistic fieldwork in Mexico and Central America
    Abstract: In this chapter, I endeavor to contribute towards a collective effort to reflect on the evolution and state-of-the-art of language documentation. I reflect on Himmelmann (1998) from the perspective of language endangerment and revitalization in Mexico and Central America today. I identify a number of topics that are critical to the practice of language documentation in the region and that in my view were only marginally mentioned in Himmelmann's seminal paper. These topics revolve around the participation, consent, interests and needs of speakers of the very languages that are documented. Notably, I argue (i) that language documentation is critical for language revitalization, (ii) that language documentation should be collaborative, echoing current calls in the community-based research literature, (iii) that to this end, training opportunities for language community members need to increase and (iv) that a concerted effort is needed to develop appropriate ways to ensure informed consent in language documentation.
  • Reflections on public awareness
    Abstract: In this reflection, I repeat Michael Krauss’s 1992 call for linguists of all kinds to be active in creating public awareness of language endangerment, and more importantly at this stage, in motivating global attitudinal changes in support of language diversity. I purposely do not distinguish between academic and non-academic, community and non-community linguists, requiring that we all participate in this call. I distinguish different target publics, namely the endangered or minoritized language community public and the majority language public in terms of message and response. I then briefly outline past and present efforts in varying media that are part of creating awareness and action on a global scale. I focus on integration of media and message, stressing that we must be able to provide a positive vision of a linguistically diverse world and a means for the general public, especially youth, to participate in its creation.
  • Reflections on documentary corpora
    Abstract: For decades, language documentation proponents have argued for the separability of LD as its own sub-discipline. Many corpus linguists have made this same claim; thus, corpus linguistics shares the ethos of data over theorizing, whereby primary data represent authentic, connected discourse that is natural (not elicited), broadly sampled (across speakers, generations, dialects), and balanced (reflecting different usage contexts and genres). Nevertheless, many misconceptions remain about what a language corpus is, how it is formatted, how big or balanced it needs to be, and most importantly, how it is queried. In this reflection, I dispel some of these misconceptions, while reassuring community members and field linguists alike that a corpus is an exceedingly powerful tool for guiding the expansion of the documentary record, keeping precious language data in circulation, and helping to produce the classic descriptive by-products of LD such as dictionaries, phrasebooks, and grammars. Above all, the less-familiar but more direct by-products of corpus interrogation, such as word lists, frequency counts, concordance lines, N-grams, collocations, distribution, and dispersion plots, are so immediately interpretable and useful by speakers, learners, and linguists, that LD should give corpus linguistic training the same attention as project planning, ethics, recording, transcription, annotation, metadata, and archiving.
  • Interdisciplinary research in language documentation
    Abstract: This paper explores the parameters of interdisciplinary work in language documentation. Citing the strong call for the involvement of disciplines other than linguistics, from Himmelmann through to the present trajectories for language documentation research, the author claims that more attention is needed to the enactment of interdisciplinary work, from project conception to the follow-through in terms of where to disseminate outcomes.
  • Reflections on documenting the lexicon
    Abstract: The lexicon presents unique challenges in language documentation. This reflection reviews some of those challenges, focusing on two major areas: what I have learned over time about what is important to document, and the creation of dictionaries. Throughout I stress the value of considering the lexicon broadly, and, in situations where linguists are involved, of working closely with speakers and community members in all stages of decision making, from what to document to how to spell, to how to represent meanings. N. Scott Momaday writes of words as medicine, and this is important to keep in mind in lexical documentation: one is engaging with worldview. The responsibility then of documenting the lexicon is large, and the stakes are high, given how words give deep insight into ways of being.
  • Reflections on language community training
    Abstract: I reflect upon four decades of language community training, treating Watahomigie & Yamamoto (1992) and England (1992) as the starting point. Because the training activities these papers report began in the 1970s, there is a convincing and growing literature on training, including work published in the years since Himmelmann’s (1998) article. The upshot of my reflections is this central point: Language documentation is better when it occurs alongside an active training component. Underlying this point is an acknowledgement that linguists and communities are engaged in mutual training, and in fact, that a binary distinction between linguist and community member is a false dichotomy. The Chickasaw Model, a model that formalizes training, linguistic analysis, documentation, and revitalization as a feedback loop (cf. Fitzgerald & Hinson 2013; 2016), offers a way to capture a fully integrated approach to training. I conclude with nine significant contributions growing out of the training literature.
  • Reflections on linguistic fieldwork
    Abstract: In this reflections piece, I draw upon my experience as a fieldworker in Australia, a linguist who also works with archival materials spanning 150 years, and a linguist whose work includes both documentary and descriptive aspects. I center this piece around three questions about aspects of fieldwork that have changed since the publication of Himmelmann (1998). The first is what we collect – that is, have our field methods changed? The second question concerns the documentation we produce – is it different? Thirdly, are there features of Himmelmann’s manifesto which were the products of its time, and has academia changed? I argue that in all cases there has been change for the better, but that we still have some way to go, and that some of the original formulations of a dichotomy between documentation and description are counterproductive.
  • Reflections on funding to support documentary linguistics
    Abstract: Funding for documentary linguistics has changed dramatically over the past two decades, largely due to the emergence of dedicated funding regimes focused on endangered languages. These new regimes have helped to shape and reify the field of documentary linguistics by facilitating and enforcing best practices and integrating archiving into the documentation process. As a result both the pace and quality of documentation have improved dramatically. However, several challenges remain, and additional efforts are needed to ensure the sustainability of funding for language documentation efforts. In particular, more funding needs to be allocated toward training and capacity building in under-resourced regions.
  • From comparative descriptive linguistic fieldwork to documentary linguistic fieldwork in Ghana
    Abstract: This paper surveys linguistic fieldwork practices in Ghana from the earliest times to the documentary linguistic era. It demonstrates that the most profound effect of the documentary linguistic turn in the language sciences on fieldwork in Ghana is in the rise of "insider" and "insider-outsider" field-working linguists. This goes against the definition of prototypical fieldwork as something done by remote outsiders. The challenges and opportunities of this development are reflected upon. It is argued that relevant fieldwork methodologies should be further developed taking the emerging features of different "insider" practices into account. Moreover, it is hoped that characterizations of documentary linguistic fieldwork would move beyond the outsider and accommodate the different types of "insider" fieldworkers.
  • Reflections on ethics: Re-humanizing linguistics, building relationships across difference
    Abstract: Himmelmann (1998) uses the word 'ethics' only once, but his arguments for proposing a field of documentary linguistics reflect assumptions about ethical stances that have been addressed in linguistics publications since 1998. This paper begins by outlining some of these ethical assumptions, and then focuses on considerations closely connected to what Dobrin & Berson (2011: 207) refer to as "re-humanizing linguistics" and "building relationships across difference". The paper suggests that ethical language documentation work must be grounded in considerations of the human nature of research relationships, the histories of interactions between peoples which inform those research relationships, and varying conceptions of knowledge. Since language documentation work inevitably has social consequences for human beings, aligning language documentation practice with Indigenous research paradigms which emphasize relational accountability (Wilson 2008: 99) allows for a practice based on respect, reciprocity and responsibility and ultimately leads to good documentation.
  • Reflections on language documentation in India
    Abstract: The last twenty years have seen efforts to support the study of minority and lesser-studied languages of India from varied stakeholders: these include the Indian government, international and Indian nonprofit organizations, indigenous and state-level cultural and language committees and institutes, and individuals with a passion to preserve and document their cultures and languages. Their efforts have led to mixed success due to conflicting ideologies, history, and resource availability (Annamalai 2003). Basing my observations on my research, personal experience and engagement with language documentation activities in the country, I provide an overview of the current state of language study and my hopes and efforts for the future of language documentation and description in India.
  • Reflections on diversity linguistics: Language inventories and atlases
    Abstract: This contribution gives a short overview of “language inventorying”: research aiming at creating comprehensive catalogues and atlases of all the languages in the world, which has seen a boost with the renewed interest in linguistic diversity triggered by the awareness of language endangerment in the 1990s. By focusing on the development of the ISO standard 639 and SIL’s Ethnologue, the main advances and issues in this area are discussed. The overview concludes by presenting the major alternative resources, in particular Glottolog.
  • Reflections on linguistic fieldwork in Australia
    Abstract: Shifts in White-Indigenous relations started to re-shape relations between field linguists and Australian Indigenous communities from the 1970s. So well before Himmelmann (1998) appeared, linguists working on Australian Indigenous languages had been discussing topics such as ethical engagement with Indigenous communities, accessibility of recordings and the best use of technology in archiving and recording. After Himmelmann (1998) appeared, these topics emerged as key topics in language documentation, which led to more of these kinds of discussions not only among Australian linguists but also with linguists around the world. The development of language documentation as a field of research fostered greater collaboration between Indigenous communities, linguists, researchers from other disciplines and technology specialists in Australia. New funding initiatives followed the publication of Himmelmann (1998), providing additional support for documentation projects on Australian Indigenous languages. Since the 2000s, government support for Indigenous-led initiatives around language has declined in Australia. But growing support for Indigenous researchers within universities is enabling Indigenous communities to become more equal partners in research on their languages.
  • Reflections on the diversity of participation in language documentation
    Abstract: In this paper, I reflect on the diversity of participation in language documentation in the Indonesian context over the past two decades. I show that progress has been made in documentation research on the minority languages, with the concerted efforts of different stakeholders (community/non-community—among the latter, affiliations with universities, non-governmental organizations, the government, and other types of organizations of local speech communities). However, challenging issues remain in relation to the local communities' capacity, motivation, and leadership for helpful and long-term active participation in language documentation.
  • Reflections on language documentation in the Southern Cone
    Abstract: Although many indigenous languages of Chile and Argentina have been documented only in the second half of the 20th century by academic anthropologists and linguists, some languages have a comparatively long tradition of descriptive and documentary scholarship conducted by Catholic missionaries. From a present-day perspective, early descriptions and documentations show some shortcomings (viz., they are often fragmentary and biased in several respects), but they nonetheless constitute a trove of valuable resources for later work and ongoing revitalization endeavors. Current documentary work is now more balanced in terms of Himmelmann's (1998) three-parameter typology (i.e., it pays close attention to communicative events of different kinds of modality, spontaneity, and naturalness), employs audio and video recordings, and takes copyright, access, and sustainability issues seriously. It is also more collaborative and empowering vis-à-vis the role played by indigenous collaborators than in the past and tends to be reasonably multi-disciplinary.
  • Reflections on software and technology for language documentation
    Abstract: Technological developments in the last decades enabled an unprecedented growth in volumes and quality of collected language data. Emerging challenges include ensuring the longevity of the records, making them accessible and reusable for fellow researchers as well as for the speech communities. These records are robust research data on which verifiable claims can be based and on which future research can be built, and are the basis for revitalization of cultural practices, including language and music performance. Recording, storage and analysis technologies become more lightweight and portable, allowing language speakers to actively participate in documentation activities. This also results in growing needs for training and support, and thus more interaction and collaboration between linguists, developers and speakers. Both cutting-edge speech technologies and crowdsourcing methods can be effectively used to overcome bottlenecks between different stages of analysis. While the endeavour to develop a single all-purpose integrated workbench for documentary linguists may not be achievable, investing in robust open interchange formats that can be accessed and enriched by independent pieces of software seems more promising for the near future.
  • Reflections on fieldwork: A view from Amazonia
    Abstract: Amazonia is both a place of exceptional linguistic, sociocultural, and ecological diversity and a place where the documentation of this diversity is limited and ever-increasingly urgent. While recent decades have shown considerable progress in this area, our understanding of Amazonian languages is still challenged by a low proportion of researchers relative to its many distinct language contexts. In light of Himmelmann's framing of language documentation as a 'fairly independent field of linguistic inquiry and practice', we discuss key facets of what we consider the single most important unifying question that underlies language documentation work in Amazonia: Just how much description and analysis is necessary for Amazonian language documentation to be coherent, useful, and interpretable by others? We argue that the social and cultural diversity of this vast region calls into question the actual separability of 'documentation' from 'description and analysis' of Amazonian language data; and we advocate for taking Himmelmann's proposals as an invitation to finer-grained, broader-minded thinking about the kinds of research questions, methods, and focused training that best serve linguists working in Amazonian speech communities, rather than as a guide to defining an appropriate scope for fieldwork with an Amazonian language.
  • Reflections on descriptive and documentary adequacy
    Abstract: One of Himmelmann's primary goals in his 1998 paper was to argue for a strict division of documentation and description. Language documentation has since successfully developed to become a discipline in its own right. Nevertheless, the question concerning the interrelation of description (and thus analysis) and documentation remains a matter of controversy. This paper reflects on descriptive and documentary adequacy, focusing on two major issues. First, it addresses the question of how much analysis should enter into an adequate documentation of a language and, second, it discusses the role of language documentation and primary data in the replicability of linguistic analyses.
  • Reflections on language documentation in North America
    Abstract: In this paper we reflect on the state of language documentation in North America, especially Canada and Alaska. Using our own early experiences with the archival record on languages of North America as a launching point, we discuss changes that have come to this field over the past twenty years. These include especially the increasing recognition of long traditions of community-based language research within North America, and of members of language communities as primary stakeholders in efforts to preserve and properly share records of linguistic knowledge.
  • SP15 Whole Volume

SP14: A Grammar of Shilluk

  • Chapter 4: A descriptive analysis of adjectives in Shilluk
    Abstract: We argue that Shilluk has adjectives as a lexical category distinct from both nouns and verbs, and present a descriptive analysis of their morphological and syntactic properties. Aside from the base form, the inflectional paradigms of adjectives present two other forms, neither of which are productive. One is the contingent form, which has not been postulated in earlier work. This inflection is used when the attribute is referenced non-permanently, to a limited degree, or subjectively. The other is the plural form, which is available for seven adjectives only. Derivational morphology includes an essence nominalization and an intransitive verb derivation. When adjectives are used as predicates, there is no copula, nor any morphological marking of the syntactic juncture. In contrast, when adjectives are used as modifiers, their status as such is signposted by three different morphosyntactic structures. The choice between these three structures is determined by definiteness and semantic specificity.
  • Chapter 3: Forms and functions of the associated-motion derivations of Shilluk transitive verbs
    Abstract: The base paradigm of Shilluk transitive verbs includes inflectional marking for voice, subject, and tense-aspect-modality. This paradigm is described in Chapter 1. In addition to this base paradigm, however, transitive verbs present up to six derived paradigms: iterative, benefactive, ambitransitive, antipassive, and, depending on the ATR value of the root vowel, either one or two derivations that mark associated motion. Each of these six derivations presents its own set of forms marked for voice, subject, and tense-aspect-modality. The current chapter presents a descriptive analysis of the two derivations that mark associated motion. Apart from the patterns of morphological exponence, we also describe the morphosyntactic characteristics that are associated with these derivations. As in the earlier chapters, sound examples are embedded to make the phenomena accessible and the analysis accountable.
  • Chapter 2: Inflectional morphology and number marking in Shilluk nouns
    Abstract: This chapter offers a descriptive analysis of two topics in the morphology of Shilluk nouns: the inflectional paradigm, and number marking. Aside from the base form, the inflectional paradigm includes the following four forms: a) pertensive with singular possessor; b) pertensive with plural possessor; c) construct state; and d) proximal demonstrative. All of these can be interpreted as instances of head marking, which is characteristic of Shilluk morphosyntax in general (cf. Chapter 1). Following a description of the morphosyntactic functions of the base form and the four inflections, we describe in detail the patterns of morphophonological exponence through which the inflections are expressed. This pattern of exponence includes vowel length, tone, nasalisation, floating quantity, and suffixation. Floating quantity is of particular note: this marker has not been postulated in earlier work. Overall, we find that the inflectional paradigm is largely productive and regular. In contrast, the morphological marking for number is neither regular nor productive, and this is why we do not consider it to be part of the inflectional paradigm. The newly discovered marker of floating quantity supports Gilley’s (1992) tripartite analysis of number marking for Shilluk. For the sake of clarity and accountability, sound examples are embedded in relation to each of the numbered illustrations.
  • Cover
  • Front matter
  • Chapter 1: Forms and functions of the base paradigm of Shilluk transitive verbs
    Abstract: This chapter offers a descriptive analysis of the morphological forms that make up the base paradigm of Shilluk transitive verbs, and also of the functions that are expressed through them. With respect to morphological exponence, tone and vowel length play a central role, both in marking the functions and in distinguishing a total of seven different verb classes. As for the functions, they are syntactic voice, subject marking, and tense-aspect-modality (TAM). These functions interact with one another and with other aspects of the syntax of the clause. For example, Imperfective aspect is only available in Object voice, and certain TAM forms interact with focus marking. We pay special attention to syntactic alignment, a topic on which earlier analyses diverge. Older studies distinguish between active and passive voices (Westermann 1912, Tucker 1955). More recently, the passive has been reinterpreted as an ergative construction (Miller & Gilley 2001). We find that the construction at the center of the controversy has all the morphosyntactic properties of a passive, but not the information-structural characteristics. The scope of this chapter is restricted to the base inflectional paradigm. This means that it does not cover the many derivations which present inflectional paradigms that are largely parallel to the base paradigm. For the sake of clarity and accountability, sound examples are embedded in relation to each of the numbered illustrations.

SP13: Documenting Variation in Endangered Languages

  • Perspectives on linguistic documentation from sociolinguistic research on dialects
    Abstract: The goal of the paper is to demonstrate how sociolinguistic research can be applied to endangered language documentation field linguistics. It first provides an overview of the techniques and practices of sociolinguistic fieldwork and the ensuing corpus compilation methods. The discussion is framed with examples from research projects focused on European-heritage English-speaking communities in the UK and Canada that have documented and analyzed English dialects from the far reaches of Scotland to the wilds of Northern Ontario, Canada. The main focus lies on morpho-syntactic and discourse-pragmatic variation; however, the same techniques could be applied to other types of variation. The discussion includes examples from a broad range of research studies in order to illustrate how sociolinguistic analyses are conducted and what they offer for understanding language variation and change.
  • Introduction: Documenting variation in endangered languages
  • He nui nā ala e hiki aku ai: Factors influencing phonetic variation in the Hawaiian word kēia
    Abstract: Apart from a handful of studies (e.g., Kinney 1956), linguists know little about what variation exists in Hawaiian and what factors constrain the variation. In this paper, we present an analysis of phonetic variation in the word kēia, meaning ‘this’, examining the social, linguistic, and probabilistic factors that constrain the variation. The word kēia can be pronounced with a constricted glottis (e.g., as creak or a glottal stop) or without one (Pukui & Elbert 1986: 142) and, like many words in Hawaiian, it can undergo phonetic reduction. The analysis was conducted on interviews with eight native-speaking kūpuna (elders) who were recorded in the 1970s. We find that the likelihood of the word being realized with a constricted glottis decreases if the word immediately following kēia begins with an oral stop or if the speaker is a man. Additionally, we observe a higher likelihood of phonetic reduction as word sequences (kēia + the following word(s)) are repeated during the interaction. The results contribute to current models of speech production and planning, and they inform work aimed at supporting the ongoing efforts to conserve and revitalize the Hawaiian language.
  • Documenting variation in (endangered) heritage languages: how and why?
    Abstract: This paper contributes to recently expanded interest in documenting variable as well as categorical patterns of endangered languages. It describes approaches, tools and curricular developments that have benefitted from involving students who are heritage language community members, key to expanding variationist focus to a wider range of languages. I describe aspects of the Heritage Language Variation and Change Project in Toronto, contrasting a “truly” endangered language to a less clearly endangered language. Faetar, with
  • Documenting sociolinguistic variation in lesser-studied indigenous communities: Challenges and practical solutions
    Abstract: Documenting sociolinguistic variation in lesser-studied languages presents methodological challenges, but also offers important research opportunities. In this paper we examine three key methodological challenges commonly faced by researchers who are outsiders to the community. We then present practical solutions for successful variationist research on indigenous languages and meaningful partnerships with local communities. In particular, we draw insights from our research with Australian languages and indigenous languages of rural China. We also highlight reasons why such lesser-studied languages are crucial to the further advancement of sociolinguistic theory, arguing that the value of the research justifies the effort needed to overcome the methodological difficulty. We find that the challenges of sociolinguistics in these communities sometimes make standard variationist methods untenable, but the methodological solutions we propose can lead to valuable results and community relationships.
  • Three speakers, four dialects: Documenting variation in an endangered Amazonian language
    Abstract: This paper offers a case study on dialect contact in Máíhɨ̃ki (Tukanoan, Peru), with the goal of illustrating how documentation of variation can contribute to a general language documentation project. I begin by describing the facts of variation in one dialectally diverse Máíhɨ̃ki-speaking community. I then argue that the outcomes of dialect mixing in this speech community can be understood only through a fine-grained analysis centering the dialectal composition of the communities of practice to which speakers belonged in early life. The coarse-grained identity categories used in most variationist analyses, such as age and gender, are less informative. After proposing a network theory interpretation of this finding, I discuss its implications for the role of (a) ethnography and (b) the European dialect mixing literature in research on variation in endangered languages. Second, I describe some surprising similarities between this speech community and those described in classic variationist literature. Like urban English speakers, Máíhɨ̃ki speakers attach less indexical value to morphosyntactic than to phonological variation, and – although their language lacks a standard – engage in indexically motivated style-shifting. I discuss ways to adapt variationist methods to endangered language settings to capture these phenomena, then close with comments on the importance of documenting variation for conservation.
  • Language shift and linguistic insecurity
    Abstract: Variation in language is constant and inevitable. In a vital speech community some variation disappears as speakers age, and some results in long-term change, but all change will be preceded by a period of variation. Speakers of endangered languages may perceive variation in an especially negative light when it is thought to be due to contact with the dominant language. This contributes to negative evaluations of young people’s speech by older speakers, and in turn contributes to the linguistic insecurity of young speakers, which may result in even further shift toward the dominant language. In this paper we discuss language variation in the context of shift with respect to the notion of linguistic insecurity and what we identify as three distinct types of linguistic insecurity, particularly in cases of indigenous language loss in the Americas. We conclude with some observations on the positive results of directly addressing linguistic insecurity in language maintenance/revitalization programs.
  • Areal analysis of language attitudes and practices: A case study from Nepal
    Abstract: This paper has two aims. One aim is to consider non-structural (language attitude and use) variables as valid in the field of dialect and linguistic geography in an inner Himalayan valley of Nepal, where four languages have traditionally co-existed asymmetrically and which demonstrate different degrees of vitality vs. endangerment. The other aim is an application of modified spatiality as it aligns with speaker attitudes and practices amidst recent and ongoing socio-economic and population changes. We demonstrate that variation in self-reported attitudes and practices across languages in this region can be explained as much with adjusted spatial factors (labeled ‘social space’) as with traditional social factors (e.g. gender, age, formal education, occupation, etc.). As such, our study contributes to a discourse on the role and potential of spatiality in sociolinguistic analyses of smaller language communities.
SP12: The Social Cognition Parallax Interview Corpus (SCOPIC)

SP11: Mutsun-English English-Mutsun Dictionary, mutsun-inkiS inkiS-mutsun riica pappel

  • Mutsun-English English-Mutsun Dictionary
    Abstract: Mutsun is a Costanoan language (part of the Utian language family) from California in the area around the modern towns of San Juan Bautista, Hollister, and Gilroy. The last fluent speaker of Mutsun, Mrs. Ascension Solarsano, died in 1930. Because of her work and the work of earlier native Mutsun speakers with early linguists, there is a large written corpus of Mutsun. This dictionary was compiled by analyzing that documentation. The dictionary is written to be useful both for language revitalization and for linguistic research.

SP10: African language documentation: new data, methods and approaches

  • African language documentation: new data, methods and approaches
  • Pure fiction – the interplay of indexical and essentialist language ideologies and heterogeneous practices. A view from Agnack
    Abstract: This paper investigates the complex interplay between different sets of language ideologies and multilingual practice in a village in Lower Casamance (Senegal). In this heterogeneous linguistic environment, which is typical of many African settings, individuals have large and adaptive linguistic repertoires. The local language ideologies focus on different aspects of identity which languages serve to index, but enable individuals to focus on different facets of identity according to context. National language ideologies are essentialist and have as their goal to put constructed homogeneous communities on the polyglossic map of Senegalese languages. In contrast to similarly essential Western ideologies, however, these national ideologies operating in Senegal are not linked to actual standard language practices. Using the example of individuals in two households and by presenting rich ethnographic information on them, the paper explores the relationship between language use and language ideologies before describing a sampling method for documenting language use in these contexts. It is argued that the documentation of these contexts cannot be achieved independently of an understanding of the language ideologies at work, as they influence what is presented as linguistic practice, and that arriving at a holistic description and documentation of the multilingual settings of Africa and beyond is central for advancing linguistic theory in sociolinguistics, psycholinguistics and contact linguistics.
  • Multilingualism, affiliation and spiritual insecurity. From phenomena to process in language documentation
    Abstract: Documentary linguists have often been urged to integrate language ideologies and other topics more closely related to ethnography than to linguistics in their research, but these recommendations have seldom coincided, in the literature, with practical directions for their implementation. This paper aims to contribute to filling this gap. After re-considering current documentary approaches, a case study from a documentation project in NW Cameroon is presented to show how an ethnographically-informed sociolinguistic survey on multilingualism can lead to progressively deeper insights into the local language ideology. The methodological implications that this research perspective brings to both documentary linguistics and language support and revitalization projects are discussed. A number of practical suggestions are finally proposed, illustrating the importance of language documentation projects being carried out by multidisciplinary teams.
  • Linguistic variation and the dynamics of language documentation: Editing in ‘pure’ Kagulu
    Abstract: The Tanzanian ethnic community language Kagulu is in extended language contact with the national language Swahili and other neighbouring community languages. The effects of contact are seen in vocabulary and structure, leading to a high degree of linguistic variation and to the development of distinct varieties of ‘pure’ and ‘mixed’ Kagulu. A comprehensive documentation of the language needs to take this variation into account and to provide a description of the different varieties and their interaction. The paper illustrates this point by charting the development of a specific text within a language documentation project. A comparison of three versions of the text – a recorded oral story, a transcribed version of it and a further, edited version in which features of pure Kagulu are edited in – shows the dynamics of how the different versions of the text interact and provides a detailed picture of linguistic variation and of speakers’ use and exploitation of it. We show that all versions of the text are valid, ‘authentic’ representations of their own linguistic reality, and how all three of them, and the processes of their genesis, are an integral part of a comprehensive documentation of Kagulu and its linguistic ecology.
  • Why are they named after death? Name giving, name changing and death prevention names in Gújjolaay Eegimaa (Banjal)
    Abstract: This paper advocates the integration of ethnographic information such as anthroponymy in language documentation, by discussing the results of the documentation of personal names among speakers of Gújjolaay Eegimaa. Our study shows that Eegimaa proper names include names that may be termed ‘meaningless names’, because their meanings are virtually impossible to identify, and meaningful names, i.e. names whose meanings are semantically transparent. Two main types of meaningful proper names are identified: those that describe aspects of an individual’s physique or character, and ritual names which are termed death prevention names. Death prevention names include names given to women who undergo the Gaññalen ‘birth ritual’ to help them with pregnancy and birthgiving, and those given to children to fight infant mortality. We provide an analysis of the morphological structures and the meanings of proper names and investigate name changing practices among Eegimaa speakers. Our study shows that, in addition to revealing aspects of individuals’ lives, proper names also reveal important aspects of speakers’ social organisation. As a result, anthroponymy is an area of possible collaborative research with other disciplines including anthropology and philosophy.
  • Language documentation in Africa: turning tables
SP09: Language Documentation and Conservation in Europe

  • New speakers of Minderico: Dynamics and tensions in the revitalization process
    Abstract: From the sixteenth century on, the blankets of Minde, a small village in the center of Portugal, became famous all over the country. The wool combers, blanket producers, and traders of Minde began to use Minderico in order to protect their business from “intruders”. Later, this secret language extended to all social and professional groups and became the main means of communication in the village. During this process, Minderico turned into a full-fledged language with a very characteristic intonation and a complex morphosyntax, differentiating itself from Portuguese. However, the number of speakers declined drastically during the last 50 years. Minderico is now actively spoken by 150 speakers, but only 23 of them are fluent speakers. More than half of the fluent speakers are new speakers of the language. New speakerness is a relatively new phenomenon in the Minderico speaking community and a direct result of the revitalization process which was initiated in 2009. This paper examines the role of the new speakers in the revitalization of Minderico, considering issues of authenticity and socio-linguistic legitimacy.
  • Lemko linguistic identity: Contested pluralities
    Abstract: In their efforts to organize as a recognized minority within the Polish state, the Lemkos have faced a number of obstacles, both internal and external to the community. This article explores three aspects of self-representation of the Lemko community – group membership, victimhood and “speakerhood” – and examines how these representations are contested on a number of levels.
  • Identity and language shift among Vlashki/Zheyanski speakers in Croatia
    Abstract: The language Vlashki/Zheyanski, spoken in two areas – the Šušnjevica area and Žejane – of the multilingual, multiethnic Istrian peninsula of Croatia, evinces strong loyalty on the part of its elderly speakers, yet in both areas a language shift to Croatian is well underway. Vlashki/Zheyanski is a severely endangered Eastern Romance language known in the linguistic literature as Istro-Romanian. In order to study the domains and frequency of use of the language and equally to examine speaker attitudes about language and identity, we administered a questionnaire to speakers in both locations. Our sample included responses from individuals in four age groups. Our discussion here focuses on 16 men and women from the two older groups, 51–70 and 71-and-older. In Žejane, speakers saw knowledge of the language and family lineage as defining components of being a “real” member of the community. The name for the language, Zheyanski, comes from the village name. Hence, someone who speaks the language asserts that village belonging and village affiliation are at the core of speakers’ identity. In terms of national identification, whether Croatian, Italian, and/or Istrian, Zheyanski speakers by and large showed little enthusiasm for any of the three choices. In terms of language use, all respondents continue to use the language on a daily basis but report that they speak mostly Croatian to their grandchildren. In the Šušnjevica area, people used the same criteria, language knowledge and family lineage, to define group membership and feel close affiliation to their home village. Unlike in Žejane, the name of the language, “Vlashki”, does not correspond to a unitary group name accepted and liked by all. In terms of larger identity, villagers embraced identities that they share with their Croatian-speaking neighbors: Most felt “extremely Istrian”, and at least “fairly Croatian”. The language shift to Croatian is also more advanced here: All the speakers report speaking mostly Croatian to their children. While speakers in both Žejane and the Šušnjevica area endued their language with a critical role in their identity, this attitude toward Vlashki/Zheyanski does not manifest itself in their communication with younger generations where other social forces have caused the shift to the use of Croatian.
  • Kormakiti Arabic: A study of language decay and language death
    Abstract: Kormakiti Arabic (also called Cypriot Maronite Arabic) is a language with approximately 150–200 speakers in Kormakitis, a village in north-western Cyprus. Kormakiti Arabic is highly endangered, not only due to its low number of speakers but more importantly because younger Maronites with their roots in Kormakitis no longer acquire Kormakiti Arabic naturally. Kormakitis itself is inhabited almost exclusively by elderly Maronites who lived there before the separation of Cyprus in 1974. This paper addresses the language death and language decay of Kormakiti Arabic. Several historical sources are used to illustrate the historical and socio-linguistic environment in which this language has survived until today. The linguistic evidence is then compared with the Gaelic-Arvanitika Model of Sasse (1992a) in order to show the parallels, as well as the differences, between Arvanitika and Kormakiti Arabic.
  • Brief considerations about language policy: An European assessment
    Abstract: The rise of language policy worldwide is a consequence of a globalized world and the openness of borders. Even countries with relative cultural homogeneity now face new challenges arising from massive migration flows and from a growing awareness of endangered languages and cultures, notably in Europe. This is being noticed around the Old Continent, where diversity has long proved to be a distinct value. In this paper we reflect on the scope of cultural identity and multilingualism to shed new light on language policy and consequently refresh our understanding of a key policy, which is already a decisive public policy for the European peoples.
  • Multilingualism and structural borrowing in Arbanasi Albanian
    Abstract: In this paper we present a brief overview of the history of linguistic contacts of Arbanasi Albanian, a Gheg Albanian dialect spoken in Croatia, with Croatian and Italian. Then we discuss a number of contact-induced changes in that language. We show that Arbanasi Albanian was subject to strong influences from Croatian (and, to a lesser extent, from Italian) on all levels of linguistic structure. Using the data from our own fieldwork, we were able to show that there were also influences on the level of syntax, including the borrowing of certain constructions, such as analytic causative and imperative constructions, as well as the extension of the use of infinitive in subordinate clauses.
  • Language Landscape: Supporting community-led language documentation
    Abstract: Different groups have differing motivations for participating in language documentation projects. Linguists want to increase our knowledge of languages and linguistic theory, but constraints on their work may lead to issues with their documentation projects, including their representations of the languages they study. Native speakers participate to maintain and develop their language, and may choose to represent it in a way which showcases their culture and attitudes. In order to encourage more native speakers to take part in documentation projects, a simple integrated system is required which will enable them to record, annotate and publish recordings. Language Landscape, our web-based application, enables native speakers to publish their recordings, and Aikuma, a mobile application for documentation, enables them to record and orally translate recordings, in both cases with minimal cost and training required. Language Landscape benefits communities by allowing them to document their language as they see fit, as demonstrated by our outreach program, through which some London school children created their own projects to document their own languages and those spoken around them.
  • Reflections of an observant linguist regarding the orthography of A Fala de Us Tres Lugaris
    Abstract: A Fala has never had a standardized orthography as it is a language of oral tradition and almost all written documents have always been produced only in Spanish. The few documents which exist in A Fala use orthographies that vary considerably, especially when indicating the phonemes which are absent in standard Spanish. However, in the past decades there have been signs of an increasing interest regarding the language and cultural identity in the three villages and there have also been attempts to establish organizations to promote the language, such as A Fala y Cultura, U Lagartu Verdi, and A Nosa Fala. This increase in language awareness leads inevitably to situations in which the speakers want to express their linguistic identity in written form, and the lack of a written standard makes this task rather difficult. The objective of this paper is to analyze the public inscriptions, direction signs and street names written in A Fala. The appearance of these signs expresses the willingness of the speakers of A Fala to claim their linguistic identity. At the same time, their inconsistent orthography reveals the problems that arise in the course of writing their language. There are two main causes of these difficulties: The influence of Spanish, as all the speakers are bilingual in Spanish, and variation within the language itself. Regarding the first cause, the main issues include the uncertainty about how to write the phonemes that do not exist in standard Spanish, and also whether the phonemes that do exist in Spanish should be written in the same way or not. With respect to the second cause, the signposts and street names reflect the three main varieties: Valverdeñu, Lagarteiru and Mañegu. They also partially reflect the ideas of those who created them and testify to a certain evolution in time. In general, the linguistic data in the form of street names and direction signs provide relevant information about the options for writing those phonemes which do not have an equivalent in Spanish, as well as geographical (diatopic) variation, and the changes of ideas regarding the orthography. This paper will use this valuable linguistic material to reflect on the issues that are involved in the establishment of an orthographical standard.
  • BaTelÒc: A text base for the Occitan language
    Abstract: Language Documentation, as defined by Himmelmann (2006), aims at compiling and preserving linguistic data for studies in linguistics, literature, history, ethnology, sociology. This initiative is vital for endangered languages such as Occitan, a Romance language spoken in southern France and in several valleys of Spain and Italy. The documentation of a language concerns all its modalities, covering spoken and written language, various registers and so on. Nowadays, Occitan documentation mostly consists of data from linguistic atlases, virtual libraries from the modern to the contemporary period, and text bases for the Middle Ages. BaTelÒc is a text base for modern and contemporary periods. With the aim of creating a wide coverage of text collections, BaTelÒc gathers not only written literary texts (prose, drama and poetry) but also other genres such as technical texts and newspapers. Enough material is already available to foresee a text base of hundreds of millions of words. BaTelÒc not only aims at documenting Occitan, it is also designed to provide tools to explore texts (different criteria for corpus selection, concordance tools and more complex enquiries with regular expressions). As for linguistic analysis, the second step is to enrich the corpora with annotations. Natural Language Processing of endangered languages such as Occitan is very challenging. It is not possible to transpose existing models for resource-rich languages directly, partly because of the spelling, dialectal variations, and lack of standardization. With BaTelÒc we aim at providing corpora and lexicons for the development of basic natural language processing tools, namely OCR and a Part-of-Speech tagger based on tools initially designed for machine translation and which take variation into account.
  • The first Mirandese text-to-speech system 
    Abstract: This paper describes the creation of base NLP resources and tools for an under-resourced minority language spoken in Portugal, Mirandese, in the context of the generation of a text-to-speech system, a collaborative citizenship project between Microsoft, ILTEC, and ALM – Associaçon de la Lhéngua Mirandesa. Development efforts encompassed the compilation of a large textual corpus, definition of a complete phone-set, development of a tokenizer, inflector, TN and GTP modules, and creation of a large phonetic lexicon with syllable segmentation, stress mark-up, and POS. The TTS system will provide an open access web interface freely available to the community, along with the other resources. We took advantage of mature tools, resources, and processes already available for phylogenetically-close languages, allowing us to cut development time and resources to a great extent, a solution that can be viable for other lesser-spoken languages which enjoy a similar situation.
  • Bridging divides: A proposal for integrating the teaching, research and revitalization of Nahuatl
    Abstract: This paper discusses major historical, cultural, linguistic, social and institutional factors contributing to the shift and endangerment of the Nahuatl language in Mexico. As a practical proposal, we discuss our strategy for its revitalization, as well as a series of projects and activities we have been carrying out for the last several years. Crucial to this approach are several complementary elements: interdisciplinary research, including documentary work, as well as investigation of both the historical and the present state of Nahuatl language and culture; integration of both Western and native-speaking indigenous researchers as equal partners and the provision of space for indigenous methodologies; creation of teaching programs for native and non-native speakers oriented toward the preparation of language materials; and close collaboration with indigenous communities in developing community-based programs. The operability of this strategy will depend greatly on our ability to foster collaboration across academic, social, and ideological boundaries, to integrate theory, methodology and program implementation, and to efficiently combine grassroots and top-down approaches. An important aim is to restore the culture of literacy in Nahuatl through our monolingual Totlahtol series, publishing works from all variants of the language and encompassing all genres of writing. We also strive to strengthen the historical and cultural identity of native speakers by facilitating their access to the alphabetical texts written by their ancestors during the colonial era.
  • Authenticity and linguistic variety among new speakers of Basque
    Abstract: This paper argues that the type of variety learned and used by Basque language learners is a key element in their self-perception as “true” or authentic speakers of Basque. Drawing on focus groups and individual interviews, we find that new speakers are for the most part strongly oriented towards the value of authenticity epitomized by local varieties. While new speakers report the utility of their mastery over the new standard Basque variety, they are not inclined to view this mastery as granting themselves greater authority or ownership over Basque. Rather they strongly valorize the informal and vernacular speech forms indexing colloquial speech and local dialect most identified with native speakers. The new speaker’s sociolinguistic context and motivations for learning Basque seem to be predictive of the strength of this orientation. The findings of this study point to the necessity of further study and documentation of local vernacular as well as the urgency for language educators to find ways of incorporating the acquisition of local and dialectal features into language instruction.
  • El árabe ceutí, una lengua minorizada. Propuestas para su enseñanza en la escuela
    Abstract: The Arabic of Ceuta is the native language of 40% of the Spanish population of Ceuta, which also speaks Spanish. The remaining 60% is mostly monolingual and their native language is Spanish. There is also 1% of bilingual citizens whose native tongue is Sindhi. The Arabic of Ceuta is Moroccan Arabic, the native language of 60% of the population of the neighboring country and, specifically, it shares common features with the northern dialect area (Yebala region and the Atlantic coast down to the city of Larache). But its use in Spanish territory since the second half of the 19th century gave rise to two phenomena: Spanish borrowings and code-switching in the case of bilingual speakers. The Arabic of Ceuta is an oral language, like Moroccan Arabic, which has never been standardized from the political sphere, in contrast with literal Arabic (also called cultivated, standard, modern or classic), which is not the native language of any Arab in the world and has emerged as the only means of educational, political, and cultural expression due to political and religious power. Despite this, there is a whole literary tradition, oral and written, in Moroccan Arabic, especially from the 20th century. Currently, there is a group of Moroccan professors and intellectuals working on its coding in order to generalize a writing system in Arabic script. Ceuta is the Spanish region with the highest school dropout rate in Spain, and this is particularly acute in schools where the majority of students are bilingual. Many experts recommend that teachers and professors teach in the native language of their pupils, at least at the beginning of their education. In this paper we will put forward some proposals for the recognition of Ceuta Arabic as coded by the movement of Moroccan intellectuals who are already working on the development of a dictionary, a grammar, text collections, and translations of works from European literature into Moroccan Arabic. The ultimate goal should be its inclusion in the educational and administrative services of the city as well as to achieve an official status in the future, rightly recognized by the Spanish Constitution.
  • Language Revitalization: The case of Judeo-Spanish varieties in Macedonia
    Abstract: Judeo-Spanish is a secondary dialect of the Spanish language having evolved from the ancient standard Spanish in the course of its expansion southwards. Although the language enjoys a heritage and presence in the Balkans of over five centuries, it is now facing language death – its acuteness depending on the region. In Macedonia, the two varieties of Bitola and Skopje last documented by Kolonomos (1962) need to be labelled “moribund” or “nearly extinct”. This paper aims to point out some of the aspects relevant to the author’s doctoral research study, in which a documentation of the current language status of Judeo-Spanish in Macedonia is envisaged. The deliberations look at the reasons for language endangerment and at the same time evaluate possibilities and opportunities for language revitalization – what priorities are to be set, what role linguists and especially the community play, what the approach is, and what skills, methods, and steps are to be taken into consideration to ensure not only a documentation of the language, but also and foremost its conservation and revitalization.
  • The sociolinguistic evaluation and recording of the dying Kursenieku language
    Abstract: Since the times of the Teutonic order until 1923, the Curonian Peninsula was a part of Prussia, and later – a part of Germany. Baltic tribes’ migration processes of different intensity occurred here. In the 16th century the newcomers from Latvian speaking Courland started to dominate, moving to the spit in several waves up to the 18th century; at the same time, people from the continental part (the majority of them were Germanized Prussians), colonizers from other German lands, and Lithuanians from the Klaipeda area settled in the region. The Kursenieku language, also known as New Curonian (German Nehrungskurisch) can be categorized as a mixture of Latvian Curonian dialects with Lithuanian, German, and elements of the now extinct Old Prussian. Since it had no written form, Kursenieku was roofed by Lithuanian and later by German, which had functioned as languages of religion and education for a long time. The community disintegrated at the end of World War II. After the Kursenieki community left their homeland and settled in different towns and villages of Germany, there was no practical use for the maintenance of Kursenieku. The chronological reconstruction of Kursenieku is possible and useful for Baltic studies; however, there is no motive for revitalization: nowadays, there is no community willing to use this language. This article briefly presents the development of the Kursenieku language in its ethnocultural context. Moreover, it raises the discussion around its status (variety or language), provides its sociolinguistic characteristics, describes the work that has been done with the language, and presents urgent goals and research perspectives.
  • Language Documentation and Conservation in Europe (whole volume)
  • Foreword

SP08: The Art and Practice of Grammar Writing

  • The data and the examples: Comprehensiveness, accuracy, and sensitivity
    Abstract: Good grammars are read by diverse audiences with a wide variety of interests. One might not write a reference grammar in exactly the same way for all potential users, but particularly in the case of under-documented and endangered languages, it is likely that whatever is produced now will be consulted for answers to questions beyond those originally anticipated. A good grammar can provide more than descriptions of patterns the grammarian has noted at the time of writing; the examples it contains can provide a basis for future discoveries and new uses. It thus makes sense to consider the types of data that might best meet the needs of current and future readers, some of which we cannot even imagine at present. For some purposes, sensitive, typologically-informed elicitation is necessary, while for others, material drawn from unscripted connected speech is crucial. Here the potential contributions of examples of each type are considered for descriptions of phonetics, phonology, morphology, syntax, discourse, prosody, language change, and language contact.
  • Grammar writing from a dissertation advisor’s perspective
    Abstract: Anyone who intends to produce a grammar of a previously little-described language needs to (1) plan the scope, methods and timetable of the data gathering process, (2) think about the conceptual framework that will shape data-gathering and analysis, (3) gather and organize the data, (4) analyse the data, (5) plan the structure of the written account, and (6) write the grammar. The steps are not simply sequential but are to some extent cyclical. This chapter will look at an advisor’s role in guiding a PhD student through these steps. It will focus on the following questions: What kinds of data, and how much, are sufficient to base a grammar on? What is a realistic size for a PhD dissertation grammar? What are the main alternative ways of organizing a grammatical description, e.g. in terms of topic divisions and sequencing? What are the dos and don’ts to be followed in order to make the grammar as descriptively adequate and user friendly as possible? What are the main reasons why some students take forever to complete the analysis and writing process?
  • Sounds in grammar writing
    Abstract: While there has been much written on writing grammars in recent years, relatively little has been written on the place of sounds and their patterning in grammar writing. In this chapter I provide an overview of some of the challenges of writing about sounds, and discuss the kinds of information on sounds that are generally included in grammars. I then address what a grammar might ideally include on the sounds of a language, advocating the inclusion of sound files to augment the usual topics, increasing both the scientific merit and the human value of the grammar.
  • On the role and utility of grammars in language documentation and conservation
    Abstract: The National Science Foundation warns that at least half of the world’s approximately seven thousand languages are soon to be lost. In response to this impending crisis, a new subfield of linguistics has emerged, called language documentation or, alternatively, documentary linguistics. The goal of this discipline is to create lasting, multipurpose records of endangered languages before they are lost forever. However, while there is widespread agreement among linguists concerning the methods of language documentation, there are considerable differences of opinion concerning what its products should be. Some documentary linguists argue that the outcome of language documentation should be a large corpus of extensively annotated data. Reference grammars and dictionaries, they contend, are the products of language description and are not essential products of language documentation. I argue, however, that grammars (and dictionaries) should normally be included in the documentary record, if our goal is to produce products that are maximally useful to both linguists and speakers, now and in the future. I also show that an appropriately planned reference grammar can serve as a foundation for a variety of community grammars, the purposes of which are to support and conserve threatened languages.
  • Toward a balanced grammatical description
    Abstract: The writer of a grammatical description attempts to accomplish many goals in one complex document. Some of these goals seem to conflict with one another, thus causing tension, discouragement and paralysis for many descriptive linguists. For example, all grammar writers want their work to speak clearly to general linguists and to specialists in their language area tradition. Yet a grammar that addresses universal issues may not be detailed enough for specialists, while a highly detailed description written in a specialized areal framework may be incomprehensible to those outside of a particular tradition. In the present chapter, I describe four tensions that grammar writers often face, and provide concrete suggestions on how to balance these tensions effectively and creatively. These tensions are: comprehensiveness vs. usefulness; technical accuracy vs. understandability; universality vs. specificity; and a ‘form-driven’ vs. a ‘function-driven’ approach. By drawing attention to these potential conflicts, I hope to help free junior linguists from the unrealistic expectation that their work must fully accomplish all of the ideals that motivate the complex task of describing the grammar of a language. The goal of a descriptive grammar is to produce an esthetically pleasing, intellectually stimulating, and genuinely informative piece of work.
  • Endangered domains, thematic documentation and grammaticography
    Abstract: When setting out to document a language with the intended goal of describing it (typically through a grammar and dictionary), fieldworkers prefer to collect an array of linguistic data, ranging from elicited words and paradigms to an assortment of texts based on conversations, narratives, procedures and so forth. Capturing a wide variety of speech acts provides a clearer record of the language and its use, and thus offers the potential for a richer description of the language at hand. However, without controlling for content, one may collect linguistic data based on an open-ended number of topics or themes. The purpose of this chapter is to introduce the notion of endangered linguistic domains and themes in language documentation and description. Even in thriving minority languages, domains such as indigenous music or knowledge of flora and fauna come under pressure from the same forces that eventually lead to language endangerment. Gathering linguistic data based on a particular domain or specialized knowledge can generate a corpus applicable to a wider audience without sacrificing the needs of linguists. Similar to thematic dictionaries in lexicography, this introduces thematic grammars to grammaticography.
  • Walking the line: Balancing description, argumentation and theory in academic grammar writing
    Abstract: This chapter explores how to incorporate linguistic typology, argumentation, and theoretical innovation into a reference grammar. It provides recommendations on how to produce a balanced grammar that is firmly grounded in theory, responsible to the unique structures of the language, and comprehensible now and over time. Linguistic typology provides a set of widely recognized linguistic categories used in the classification of grammatical patterns. These can be taken as starting points from which the structures of the language can be compared, contrasted, explored, and explained, profiling the unique shapes of language-particular categories. Argumentation for particular analyses provides clarification and explanation, although excessive argumentation can obscure descriptive facts. Simply asserting facts is appropriate for lower-level linguistic features, simple canonical structures, or uncontroversial elements or their functions. Argumentation is appropriate when structures differ from typologically-expected patterns, when the analysis counters descriptions in the literature, and in cases of multiple interpretations of a structure. Grammar writing immerses researchers in the structure of a language, revealing new vistas of understanding and novel ways of interpreting structure. Theoretically innovative analyses that reflect these insights can be incorporated as long as they are motivated, well-explained, and balanced by a typologically-informed descriptive base.
  • Corpus linguistic and documentary approaches in writing a grammar of a previously undescribed language
    Abstract: Drawing on her experiences with writing a grammar in the course of the Teop language documentation project, the author explores how corpus linguistic methods can be employed for the analysis and description of a previously undescribed language. After giving a short introduction into the creation of a digital corpus and complex corpus search methods, the chapter focuses on the importance of creating a diversified corpus. It demonstrates that different text varieties such as spoken and written legends, procedural texts and descriptions of objects show different preferences for certain ways of expression and thus represent valuable resources for various grammatical phenomena. Accordingly, a grammar which is based on texts should account for this variation by incorporating a detailed description of the corpus, giving references and metadata for each example and providing information on the kind of contexts particular grammatical features are usually associated with.

SP07: Language Endangerment and Preservation in South Asia

  • 5. The lifecycle of Sri Lanka Malay
    Abstract: The aim of this paper is to document the forces that led first to the decay and then the revival of the ancestral language of the Malay diaspora of Sri Lanka. We first sketch the background of the origins of the language in terms of intense contact and multilingual transfer; then analyze the forces that led to a significant language shift and consequent loss, as well as the factors responsible for the recent survival of the language. In doing so we focus in particular on the ideologies of language upheld within the community, as well as on the role of external agents in the lifecycle of the community.
  • 4. Script as a potential demarcator and stabilizer of languages in South Asia
    Abstract: South Asia is rich not only in languages, but also in scripts. However, the various roles script can play in this region have been only marginally explored. Besides an overview of the most important examples from South Asia in which script has contributed to the strengthening or weakening of a language, or to the classification of a tongue as a language or dialect, this paper offers first inputs for a discussion on the role of script today in smaller speech communities which lack a long literary tradition. Especially in cases of script invention, script is not only allocated the role of an identity marker for the speech community, but seems to be expected to strengthen the language itself, and finally to act as a preserver of the minority language.
  • Language Endangerment and Preservation in South Asia
  • 3. Ahom and Tangsa: Case studies of language maintenance and loss in North East India
    Abstract: North East India is probably the most linguistically diverse area on the Indian subcontinent, with long established communities speaking languages of four different families – Austroasiatic, Indo-European, Tai-Kadai and Tibeto-Burman. Comparing Tai Ahom, language of the rulers of a kingdom that consisted of what is now Assam, with the very diverse Tangsa varieties spoken on the India-Myanmar border, we will discuss factors of language decline and language maintenance. Tai Ahom has not been spoken as a mother tongue for 200 years, but survives in the large body of manuscripts, and in the language used in religious rituals. While both of these features have been necessary foundations of the ongoing revival of the language, neither was able to maintain the language in its spoken form. At least 35 different Tangsa sub-tribes are found in India, with more in Myanmar. Each has a distinct linguistic variety, many of which are mutually intelligible while others are not. Despite having no writing until very recently, each variety is still healthy. Since many Tangsas are now Christians, Bible translations are underway, and many Tangsa of all religions are interested in orthography and literacy development. This may lead to standardisation, which would represent a significant loss of diversity.
  • 2. Majority language death
    Abstract: The notion of ‘language death’ is usually associated with one of the ‘endangered languages’, i.e. languages that are at risk of falling out of use as their speakers die out or shift to some other language. This paper describes another kind of language death: the situation in which a language remains a powerful identity marker and the mother tongue of a country’s privileged and numerically dominant group with all the features that are treated as constituting ethnicity, and yet ceases to be used as a means of expressing its speakers’ intellectual demands and preserving the community’s cultural traditions. This process may be defined as the ‘intellectual death’ of a language. The focal point of the analysis undertaken is the sociolinguistic status of Punjabi in Pakistan. The aim of the paper is to explore the historical, economic, political, cultural and psychological reasons for the gradual removal of a majority language from the repertoires of native speakers.
  • 1. Death by other means: Neo-vernacularization of South Asian languages
    Abstract: Endangerment of a language is assessed by the shrinking number of its speakers and the failure to pass it on to the next generation. This approach views multilingualism in statistical terms. When multilingualism is defined by the functional relationship between languages, the meaning of endangerment expands to include functional reduction in languages. This takes place when the economic, political and cultural value of a language comes to near zero. The language may still be spoken inter-generationally, but only for limited in-group communication. Such a language survives, but does not live. This situation can be found even in a language with a large population and official status. This paper illustrates such a situation with Tamil, a South Asian language. Tamil has a long literary history, is the official language of an Indian state and has political and cultural value. But its lack of economic value makes its speakers consider it a liability in education and for material progress and this restricts it from functioning substantively. Such a language will not die but will become a vernacular. Most Indian regional languages, which were vernaculars in the first millennium when Sanskrit was the dominant language, may become vernaculars again in the third millennium when English is the dominant language.

SP06: Microphone in the mud

  • Microphone in the mud
    Abstract: A young woman battles armed terrorists, a kidnapper, malaria, a tsunami, and dial-up Internet as she documents the endangered languages of hunter-gatherers in the jungles of the Philippines.

SP05: Melanesian languages on the edge of Asia: Challenges for the 21st Century

  • Systematic typological comparison as a tool for investigating language history
    Abstract: Similarities between languages can be due to 1) homoplasies because of a limited design space, 2) common ancestry, and 3) contact-induced convergence. Typological or structural features cannot prove genealogy, but they can provide historical signals that are due to common ancestry or contact (or both). Following a brief summary of results obtained from the comparison of 160 structural features from 121 languages (Reesink, Singer & Dunn 2009), we discuss some issues related to the relative dependencies of such features: logical entailment, chance resemblance, typological dependency, phylogeny and contact. This discussion focusses on the clustering of languages found in a small sample of 11 Austronesian and 8 Papuan languages of eastern Indonesia, an area known for its high degree of admixture.
  • The languages of Melanesia: Quantifying the level of coverage
    Abstract: The present paper assesses the state of grammatical description of the languages of the Melanesian region based on a database of semi-automatically annotated aggregated bibliographical references. 150 years of language description in Melanesia has produced at least some grammatical information for almost half of the languages of Melanesia, almost evenly spread among coastal/non-coastal, Austronesian/non-Austronesian and isolates/large families. Nevertheless, only 15.4% of these languages have a grammar and another 18.7% have a grammar sketch. Compared to Eurasia, Africa and the Americas, the Papua-Austronesian region is the region with the largest number of poorly documented languages and the largest proportion of poorly documented languages. We conclude with some discussion and remarks on the documentational challenge and its future prospects.
  • Even more diverse than we had thought: The multiplicity of Trans-Fly languages
    Abstract: Linguistically, the Trans Fly region of Southern New Guinea is one of the least known parts of New Guinea. Yet the glimpses we already have are enough to see that it is a zone with among the highest levels of linguistic diversity in New Guinea, arguably only exceeded by those found in the Sepik and the north coast. After surveying the sociocultural setting, in particular the widespread practice of direct sister-exchange which promotes egalitarian multilingualism in the region, I give an initial taste of what its languages are like. I focus on two languages which are neighbours, and whose speakers regularly intermarry, but which belong to two unrelated and typologically distinct families: Nen (Yam Family) and Idi (Pahoturi River Family). I then zoom out to look at some typological features of the whole Trans-Fly region, exemplifying with the dual number category, and close by stressing the need for documentation of the languages of this fascinating region.
  • Papuan-Austronesian language contact: Alorese from an areal perspective
    Abstract: This paper compares the grammar and lexicon of Alorese, an Austronesian language spoken in eastern Indonesia, with its closest genealogical relative, Lamaholot, spoken on east Flores, as well as with its geographical neighbours, the Papuan languages of Pantar. It focusses on the question of how Alorese came to have the grammar and lexicon it has today. It is shown that Alorese and Lamaholot share a number of syntactic features which signal Papuan influences that must have been part of Proto-Lamaholot, suggesting (prehistoric) Papuan presence in the Lamaholot homeland in east Flores/Solor/Adonara/Lembata. The data indicate that Proto-Lamaholot had a rich morphology, which was completely shed by Alorese after it split from Lamaholot. At the same time, lexical congruence between Alorese and its current Papuan neighbours is limited, and syntactic congruence virtually absent. Combining the comparative linguistic data with what little is known about the history of the Alorese, I propose a scenario whereby Lamaholot was acquired as a non-native language by spouses from different Papuan clans who were brought into the Lamaholot communities that settled on the coast of Pantar at least 600 years ago. Their morphologically simplified language was transferred to their children. The history of Alorese as reconstructed here suggests that at different time depths, different language contact situations had different outcomes: prehistoric contact between Papuan and Proto-Lamaholot in the Flores area resulted in a complexification of Proto-Lamaholot, while post-migration contact resulted in simplification. In both cases, the contact was intense, but the prehistoric contact with Papuan in the Flores area must have been long-term and involved pre-adolescents, while the post-migration contact was probably of shorter duration and involved post-adolescent learners.
  • 'Realis' and 'irrealis' in Wogeo: A valid category?
    Abstract: Finite verb forms in Wogeo, an Austronesian language of New Guinea, are obligatorily marked with a portmanteau prefix denoting person and number of the subject on the one hand, and a grammatical category that is conventionally glossed in the literature as realis–irrealis, on the other. In similar languages, the latter category is usually described as modal, with a certain range of meanings which is, in many cases, only vaguely defined. A more in-depth investigation of the verbal system of Wogeo and the functional distribution of the respective categories shows, however, that the language is quite different from a postulated prototypical realis–irrealis language. Central attributes of the supposed realis–irrealis semantics are not realized by the obligatory prefixes but by other morphosyntactic means, while the prefixes are restricted to only a small part of the assumed realis–irrealis domain.
  • Projecting morphology and agreement in Marori, an isolate of Southern New Guinea
    Abstract: This paper is the first detailed investigation on agreement in Marori (Isolate, Papuan, Merauke-Indonesia), highlighting its significance in the cross-linguistic understanding of NUM(BER) expression and in the unification-based theory of agreement. Marori shows PERS and NUM agreement with distributed exponence in DUAL. The paper proposes that DUAL is formed by two basic NUM features (SG, PL) each with its binary values and that DUAL is [-SG,-PL] (unmarked). The novel aspect of the analysis is the idea that the NUM feature is mapped onto a language-specific structured semantic space of NUM. A morpheme is analysed as carrying a feature bundle, with the semantic spaces referred to by the individual features possibly overlapping with each other. The proposed analysis can provide a natural explanation for NUMBER agreement in Marori and can be extended to account for unusual cases of NUM agreement and expression in other languages.
  • From mountain talk to hidden talk: Continuity and change in Awiakay registers
    Abstract: When the Awiakay of East Sepik Province in Papua New Guinea left their village or bush camps and went to the mountains, they used a different linguistic register, ‘mountain talk’, in which several lexical items are replaced by their avoidance terms. In this way the Awiakay would prevent mountain spirits from sending sickness or dense fog in which they would get lost on their journeys. Over the last decade people’s trips to the mountain have become more frequent due to the eaglewood business. However, Christianity caused a decline in the use of ‘mountain talk’. Yet a linguistic register similar in its form and function has sprung up in a different setting: kay menda, ‘different talk’, or what people sometimes call ‘hidden talk’, is used when the Awiakay go to the town to sell eaglewood and buy goods. Like other cultural phenomena, linguistic registers are historical formations, which change in form and value over time. This paper aims to show how although in a different social setting, with an expanded repertoire and a slightly different function, kay menda is in a way a continuity of the ‘mountain talk’.
  • Cross-cultural differences in representations and routines for exact number
    Abstract: The relationship between language and thought has been a focus of persistent interest and controversy in cognitive science. Although debates about this issue have occurred in many domains, number is an ideal case study of this relationship because the details (and even the existence) of exact numeral systems vary widely across languages and cultures. In this article I describe how cross-linguistic and cross-cultural diversity—in Amazonia, Melanesia, and around the world—gives us insight into how systems for representing exact quantities affect speakers’ numerical cognition. This body of evidence supports the perspective that numerals provide representations for storing and manipulating quantity information. In addition, the differing structure of quantity representations across cultures can lead to the invention of widely varied routines for numerical tasks like enumeration and arithmetic.
  • Keeping records of language diversity in Melanesia: The Pacific and Regional Archive for Digital Sources in Endangered Cultures (PARADISEC)
    Abstract: At the turn of this century, a group of Australian linguistic and musicological researchers recognised that a number of small collections of unique and often irreplaceable field recordings mainly from the Melanesian and broader Pacific regions were not being properly housed and that there was no institution in the region with the capacity to take responsibility for them. The recordings were not held in appropriate conditions and so were deteriorating and in need of digitisation. Further, there was no catalog of their contents or their location so their existence was only known to a few people, typically colleagues of the collector. These practitioners designed the Pacific and Regional Archive for Digital Sources in Endangered Cultures (PARADISEC), a digital archive based on internationally accepted standards (Dublin Core/Open Archives Initiative metadata, International Association of Sound Archives audio standards and so on) and obtained funding to build an audio digitisation suite in 2003. This is a new conception of a data repository, built into workflows and research methods of particular disciplines, respecting domain-specific ethical concerns and research priorities, but recognising the need to adhere to broader international standards. This paper outlines the way in which researchers involved in documenting languages of Melanesia can use PARADISEC to make valuable recordings available both to the research community and to the source communities.
  • Introduction: Linguistic challenges of the Papuan region

SP04: Electronic Grammaticography

  • Reference grammars for speakers of minority languages
    Abstract: Most of the work done in grammaticography focuses on the writing of grammars for an audience of linguists, and more specifically, typologists. In this paper, we present a grammaticographic model designed mainly to take into account the needs of minority language speakers, because they play a central role in the preservation of their language. However, since in minority language situations it is not possible to generate as many grammars as there are different potential end users, we propose a multilevel grammar, based on our experience as grammarian of Innu, a First Nation language spoken in Quebec (Canada). In this type of grammatical description, the first (main) level is addressed to non-specialist users, the speakers of the language being described, whereas grammatical material aimed at other users (such as linguists) is presented in secondary levels and is limited to core information. Our grammaticographic model was initially conceived for paper (printed) grammars, but we believe that electronic publication offers interesting solutions for multilevel grammars, while paper (printed) grammatical descriptions have greater limitations.
  • Grammars for the people, by the people, made easier using PAWS and XlingPaper
    Abstract: The task of documenting the minority languages of the world, many of them endangered, is daunting. Further, it is most likely impossible to expect that linguists can go to every language and write a reference grammar for it. At the same time, the indigenous people are becoming more educated and more interested in working on their own languages. This paper describes a computational tool that teaches native speakers about various linguistic constructions, has them enter data from their language and answer simple questions about it, and then produces a draft of a practical grammar of the language. This grammar can be edited for publishing electronically and/or on paper and is useful for the people themselves as well as for linguists. The underlying XML technology allows much of the complexity to be hidden from the user, while providing multiple views and outputs possible from the same data. The marked-up XML files are archivable and usable by many XML editors. Localization and customization are also possible.
  • From Database to Treebank: On Enhancing Hypertext Grammars with Grammar Engineering and Treebank Search
    Abstract: This paper describes how electronic grammars can be further enhanced by adding machine-readable grammars and treebanks. We explore the potential benefits of implemented grammars and treebanks for descriptive linguistics, following the discursive methodology of Bird & Simons (2003) and the values and maxims identified by Nordhoff (2008). We describe the resources which we believe make implemented grammars and treebanks feasible additions to electronic descriptive grammars, with a particular focus on the Grammar Matrix grammar customization system (Bender et al. 2010) and the Fangorn treebank search application (Ghodke & Bird 2010). By presenting an example of an implemented grammar based on a descriptive prose grammar, we show one productive method of collaboration between grammar engineer and field linguist, and propose that a tighter integration could be beneficial to both, creating a virtuous cycle that could lead to more effective and informative resources.
  • Digital Grammars -- Integrating the Wiki/CMS approach with Language Archiving Technology and TEI
    Abstract: Although intrinsically closely related to the new field of language documentation, grammaticography is still mostly oriented to the book model, usually falling short of making use of related digital resources and hypertext functionalities. In this contribution, we show and discuss possible or easily achievable advances that can be built on top of existing technology such as Language Archiving Technology as developed at The Language Archive at the MPI-PL: Exemplars and examples can be found in multimedia corpora of natural speech events annotated with ELAN and visualized with ANNEX, words and word forms can be linked to lexical entries in LEXUS online-databases, and the precise meaning of theoretical concepts can be given in ISOcat entries or related terminological databases. Independently from LAT, Wiki-technology provides online collaboration and version control and opens even the possibility to address different audiences in related sets of pages, but also poses challenges for the overall didactic structure of a descriptive work. As one of the formats, at least for export and exchange, the XML-based TEI may provide a suitable framework, although many specialized tags would still have to be introduced and formatting and functionalities for these tags still have to be implemented. Generally, synchronization between different versions (e.g., on-line and off-line) poses the most intriguing difficulties, but the advantages (also in terms of Nordhoff's maxims) of hypertext grammars as proposed here are overwhelming.
  • From corpus to grammar: how DOBES corpora can be exploited for descriptive linguistics
    Abstract: The principles and techniques of language documentation developed during the last one and a half decades and the sheer amount of corpora which have been compiled for endangered languages up to now will have an impact on grammar writing, in particular with respect to the data base of grammars. On the other hand, advances in computer technology allow a closer link between corpus data which are the basis for generalizations and the grammatical description itself. In the future, the grammatical description of a language will not only present selected illustrative examples, but will also be linked to the entire set of corpus data that are the empirical basis for it. This makes generalizations transparent to the reader and open to falsification by the scientific community. The article critically examines the relations between the DOBES corpus, the analysis and the grammatical description itself. Special attention will be paid to the ways in which the two fundamental perspectives of a semasiological and an onomasiological grammar can be translated into the various kinds of search and concordancing routines to be executed in the corpus analysis. We present a typology of searches descriptive linguists need to apply. This typology defines requirements with regard to the functionality of specific software to be developed. In the second part, the article presents a technical solution, a preliminary version of a database/concordancing software specifically designed to fulfill the functions and principles outlined in the preceding sections.
  • Electronic Grammars and Reproducible Research
    Abstract: It is time for grammatical descriptions to become reproducible research. In order for this to happen, grammar descriptions must be testable, not only by the original author, but also by other linguists. Given the complexity of natural language grammars, and the ambiguity of prose descriptions, that testing is best done using computational tools to verify a computationally implementable grammar. At the same time, grammars need to be useful (and testable) for the foreseeable future; that is, they must be archivable. Yet if a computational grammar is tied to particular computational tools, it will inevitably become obsolescent. This paper describes a means of creating computationally interpretable grammars which are not tied to particular computational tools, nor (to the extent possible) to any particular linguistic theory, and which can therefore be expected to remain useful into the future. In order to make such formal grammars simultaneously understandable to humans, they are embedded into descriptive grammars of a more traditional sort, using the technique of Literate Programming. The implementation of this technology for morphology and phonology is described. It has been used to create morphological grammars for Bangla, Urdu and Pashto which are both human-readable and computationally testable.
  • Deconstructing descriptive grammars
    Abstract: Much work within digital linguistics has focused on the problem of developing concrete methods and general principles for encoding data structures designed for non-digital media into digital formats. This work has been successful enough that the field is now in a position to move past "retrofitting" digital solutions onto analog structures and to consider how new technologies should actually change linguistic practice. The domain of grammaticography is looked at from this perspective, and a traditional descriptive grammar is reconceptualized as a database of linked data, in principle curated from distinct sources. Among the consequences of such a reconceptualization is the potential loss of two valued features of traditional descriptive grammars, here termed coverage and coherence. The nature of these features is examined in order to determine how they can be integrated into a linked data model of digital descriptive grammars, thereby allowing us to benefit from new technology without losing important features intrinsic to the structure of the traditional version of the resource.
  • The grammatical description as a collection of form-meaning-pairs
    Abstract: This paper analyzes the structure of books containing grammatical descriptions and builds on work by Good (2004). It argues that the discussion of morphology, syntax, semantics, and intonation found in grammatical descriptions can be seen as a collection of interdependent form-meaning-pairs. These form-meaning-pairs form part of the larger structure of frontmatter, mainmatter and backmatter (Mosel 2006) and have themselves an internal structure which includes, among other things, linguistic examples as formalized by Bow et al. (2003).
  • Advances in the accountability of grammatical analysis and description by using regular expressions
    Abstract: This paper discusses the representativeness, coextensitivity and scientific accountability of corpus-based grammatical descriptions of previously unresearched languages. While a grammatical description of a previously unresearched language can hardly be representative for any kind of its varieties, it can be adequate in coextensitivity if it covers the linguistic phenomena presented in the corpus. In order to allow other researchers to retrieve the examples in their context and check the analysis, the corpus should not only contain text collections, but also the elicited data, provide metadata and be accessible to other researchers. Scientific accountability, however, can only be achieved if the description facilitates the replicability of the analysis, which presupposes that the authors’ corpus linguistic search methods are documented, so that the readers can find other, if not all, examples for the described phenomena, and scrutinize the search methods, the analysis and the description. As is illustrated in this paper, a suitable query language for this kind of scientific grammatical analysis and description is provided by the so-called regular expressions, which are implemented in the annotation tool ELAN.
  • Language description and hypertext: Nunggubuyu as a case study
    Abstract: Any reasonably complete description of a language is a complex object, typically composed of a grammar, a dictionary, and a text collection with internal relationships that can be represented as hyperlinks. The information would be fully searchable, links between text and media could be implemented, and the presentation would be based on a well-defined data structure with advantages for archiving and reusability. We present a small fragment from Heath's Nunggubuyu text collection with links to parts of the other elements of the description to demonstrate the benefit which this approach can bring. This initial step involves a certain amount of hand-coding but establishes a basis for the necessary data structure which will then be used in a second phase where we develop techniques for the automatic processing of scanned versions of Heath's work. Grammatical descriptions written with the kinds of structure we are developing, or capable of being converted to that structure (while being 'born digital') are likely to be in short supply. Presentations of old materials in new formats will inform new electronic grammars, and help gain the acceptance of the linguistic community for preferred formats.

SP03: Potentials of Language Documentation: Methods, Analyses, and Utilization

  • Data from language documentations in research on referential hierarchies
    Abstract: This paper outlines potentials of documentary linguistics for typological research in referential hierarchies. Specifically, I will demonstrate how the analysis of original text data from the Oceanic language Vera’a enhances knowledge about referential hierarchy effects in the domains of number marking and morphosyntactic properties of objects. With this language-specific research as a background, I will outline ways in which original text data from language documentation projects can be used in cross-corpus investigations of aspects of referential hierarchies across languages.
  • How to measure frequency? Different ways of counting ergatives in Chintang (Tibeto-Burman, Nepal) and their implications
    Abstract: The frequency of linguistic phenomena is standardly measured relative to some structurally defined unit (e.g. per 1,000 words or per clause). Drawing on a case study on the acquisition of ergativity by children in Chintang, an endangered Tibeto-Burman language of Nepal, we propose that from a psycholinguistic point of view, it is sometimes necessary to measure frequencies relative to the length of the time windows within which speakers and hearers use the language, rather than relative to structurally defined units. This approach requires that corpus design control for recording length and that transcripts be systematically linked to timestamps in the audiovisual signal.
  • Information structure, variation and the Referential Hierarchy
    Abstract: Silverstein’s (1976) hierarchy of features and ergativity (Referential Hierarchy) was proposed to capture apparent systematic variation with respect to word-class (pronouns versus nouns) in the expression of the grammatical functions Subject and Object and the semantic roles Agent and Undergoer linked to these functions. An assumption of the original hierarchy was obligatoriness of marking, rather than optionality (i.e. choice of marker or its absence). Optionality is often associated with a semantic/pragmatic force additional to straight expression of grammatical function. This additional meaning may determine reanalysis and subsequent change in the morphosyntactic expression of Subject/Object/Agent/Undergoer. Along the way, apparent counter-examples to the Referential Hierarchy may be created. To understand the counter-examples, and test the descriptive adequacy of the Referential Hierarchy, better language documentation is needed.
  • Prospects for e-grammars and endangered languages corpora
    Abstract: This contribution explores the potentials of combining corpora of language use data with language description in e-grammars (or digital grammars). We present three directions of ongoing research and discuss the advantages of combining these and similar approaches, arguing that the technological possibilities have barely begun to be explored.
  • Supporting linguistic research using generic automatic audio/video analysis
    Abstract: Automatic analysis can speed up the annotation process and free up human resources, which can then be spent on theorizing instead of tedious annotation tasks. We will describe selected automatic tools that support the most time-consuming steps in annotation, such as speech and speaker segmentation, time alignment of existing transcripts, automatic scene analysis with respect to camera motion, face/person detection, and the tracking of head and hands as well as the resulting gesture analysis.
  • Unsupervised morphological analysis of small corpora: First experiments with Kilivila
    Abstract: Language documentation involves linguistic analysis of the collected material, which is typically done manually. Automatic methods for language processing usually require large corpora. The method presented in this paper uses techniques from bioinformatics and contextual information to morphologically analyze raw text corpora. This paper presents initial results of the method when applied to a small Kilivila corpus.
  • Online presentation and accessibility of endangered languages data: The General Portal to the DoBeS Archive
    Abstract: Data depositories containing language documentation corpora are generally well structured, well maintained, and include large collections of many under-researched languages. However, they are not yet conceived of as resources that can be easily consulted on scientific or non-scientific questions pertaining to one of those languages. A general portal to the DoBeS archive has been created to facilitate access to the data, to attract more users to the archive, and to lower the threshold for users outside the linguistic community to access the data. The structure and the main features of this portal will be presented in this paper.
  • From language documentation to language planning: Not necessarily a direct route
    Abstract: In this paper I will consider how documentary linguists can provide support for community language planning initiatives, and I will discuss some issues. These relate partly to the process of language documentation: what and who we choose to document, how we define ‘a language’, and how we deal with language variation and change; and partly to community attitudes and dynamics.
  • Using language documentation data in a broader context
    Abstract: On the one hand we have never seen as much fieldwork and recording of small and endangered languages as we have over the past decade. On the other hand linguists are now also much more aware of the need to create records that can be reused by the people we record and that will still be available for their descendants. Our own descendants, the future researchers who will use our records, will also need to be able to find and make use of our research. The fragility of digital records means we need to pay attention to their curation over time and create suitable repositories if they do not already exist. In order for these aims to be achieved, we need to establish work practices now that allow the data to move easily from creation to the archive and to community use.
  • Tours of the past through the present of eastern Indonesia
    Abstract: The past twenty years have seen a variety of data being collected from largely undocumented languages in eastern Indonesia, an area hitherto almost unknown. Such data are valuable in reconstructing the history of this area at a macro-level. In addition, as research in particular areas becomes more fine-grained, it is possible to combine linguistic data with data from oral history and ethnographic observation in order to reconstruct the migration histories of specific speaker groups. A case study of such a micro-level reconstruction is presented here.
  • Creating educational materials in language documentation projects – creating innovative resources for linguistic research
    Abstract: In its first two sections this paper briefly discusses two models of language documentation projects: the hierarchical model, in which the language documentation corpus (LDC) serves as a resource for the development of educational materials (EMs), and the integrative model, which integrates the production of EMs into the LDC and makes them a resource for linguistic research. The third and the fourth section describe how the integrative model was applied in the Teop Language Documentation Project and what kind of linguistic research topics it provides.
  • Bilingual multimodality in language documentation data
    Abstract: Most people in the world speak more than one language, making bilingualism the norm rather than the exception. Furthermore, speakers generally also move their hands – they gesture – in coordination with speech and language in nontrivial ways. Bilingualism and multimodality should thus be on research agendas focused on the nature of linguistic systems and language use in context, yet they are often overlooked. Conversely, research and theorizing on bilingualism and multimodality is often based on Western-European, standardized languages, and little is known about other linguistic contexts. This paper makes the point that language documentation data has the potential to inform theoretical and empirical studies of linguistics, bilingualism and multimodality in entirely new ways, and, conversely, that documentation work would benefit from taking the bilingual and multimodal nature of its data into account.
  • Language archives: They’re not just for linguists any more
    Abstract: While many language archives were originally conceived for the purpose of preserving linguistic data, these data have the potential to inform knowledge beyond the narrow field of linguistics. Today language archives are being used by people without formal linguistic training for purposes not necessarily envisioned by the original creators of the language documentation. The DoBeS Archive is particularly well-placed to become an important resource for cultural documentation, since many of the DoBeS projects have been interdisciplinary in nature, documenting language within its broader social and cultural context. In this paper I present a perspective from a legacy archive created well before the modern era of digital language documentation exemplified by the DoBeS program. In particular, I describe two types of non-linguistic uses which are becoming increasingly important at the Alaska Native Language Archive.
  • A corpus linguistics perspective on language documentation, data, and the challenge of small corpora
    Abstract: This paper deals with issues of corpus design that might prove problematic for the study of under-resourced languages, e.g. in language documentation. It argues that it is not yet well understood which linguistic and extra-linguistic (predictor) variables cause linguistic variation (i.e. the response variable), which means that the scope of a linguistic finding cannot always be assessed. In order to deal with this problem, it is argued that we need a flexible corpus architecture with the option of adding meta-data to corpora/sub-corpora at any point in time.
  • Visualization and online presentation of linguistic data
    Abstract: This contribution gives an introduction to state-of-the-art techniques for the visualization and online presentation of linguistic data and world-wide linguistic diversity, such as linguistic maps and online dictionaries, using a software environment called R. The aim is to draw linguists’ attention to the possibilities offered by these techniques and to give some practical hints as to how they can be used specifically for linguistic and language documentation data.
  • Language-specific encoding in endangered language corpora
    Abstract: The paper addresses problems of corpus building and retrieval resulting from code-switching, which is a characteristic feature of endangered language recordings. The typical appearance of code-switching phenomena is first outlined on the basis of data collected in the DoBeS ‘ECLinG’ project, which dealt with three endangered Caucasian languages spoken in Georgia: Tsova-Tush (Batsbi), Udi, and Svan. The problem of language-specific retrieval is illustrated with examples showing the usage of the word da in Tsova-Tush contexts, which represents, as a homonym, either a native copula form (‘it is’) or the Georgian conjunction ‘and’. The subsequent section discusses the annotation requirements that are necessary to automatically distinguish the languages involved in code-switching, with a focus on the emerging ISO standard 639-6. It is argued that the fine-grained distinction of varieties and subvarieties and their interrelationship – as aimed at in this standard – requires a thorough reconsideration if it is to be applied in the markup of corpus data.
  • On the sociolinguistic typology of linguistic complexity loss
    Abstract: The nature of the human language faculty is the same the world over, and has been so ever since humans became human. This paper, however, considers the possibility that, because of the influence which social structure can have on language structure, this common faculty may produce structurally different types of language under different sociolinguistic conditions. Changing sociolinguistic conditions in the modern world are likely to have the consequence that, in time, the only languages remaining in the world will be severely atypical of how languages have been throughout most of human history.
  • The threefold potential of language documentation
    Abstract: In the past 10 or so years, intensive documentation activities, i.e. compilations of large, multimedia corpora of spoken endangered languages, have contributed to the documentation of important linguistic and cultural aspects of dozens of languages. As laid out in Himmelmann (1998), language documentations include as their central components a collection of spoken texts from a variety of genres, recorded on video and/or audio, with time-aligned annotations consisting of transcription, translation, and also, for some data, morphological segmentation and glossing. Text collections are often complemented by elicited data, e.g. word lists, and structural descriptions such as a grammar sketch. All data are provided with metadata which serve as cataloguing devices for their accessibility in online archives. These newly available language documentation data have enormous potential.
SP02: Fieldwork and Linguistic Analysis in Indigenous Languages of the Americas

  • Chapter 4. Noun class and number in Kiowa-Tanoan: Comparative-historical research and respecting speakers' rights in fieldwork
    Abstract: The Kiowa-Tanoan family is known to linguists by two characteristic features: a) a package of complex morphosyntactic structures that includes a typologically marked noun class and number marking system and b) the paucity of information available on the Tanoan languages due to cultural ideologies of secrecy. This paper explores both of these issues. It attempts to reconstruct the historical noun class-number system based on the diverging, yet obviously related, morphosemantic patterns found in each of the modern languages, a study that would be greatly benefited by fieldwork and the input of native speakers. At the same time, it reviews the language situation among the Kiowa-Tanoan-speaking communities and what some of the difficulties are in doing this kind of fieldwork in the Pueblo Southwest, touching on the myriad complex issues involving the control of information and the speech communities’ rights over their own languages as well as the outside linguist’s role in such a situation. The paper underscores these points by using only language data examples from previous field research that are already available to the public so as not to compromise native speakers’ sensitivity to new research on their languages.
  • Chapter 3. Classifying clitics in Sm'algyax: Approaching theory from the field
    Abstract: Sm’algyax (British Columbia and Alaska) is a highly ergative VAO/VS language with an uncommonly wide range of clitics. This chapter has the two-fold function of demonstrating how Anderson’s (2005) constraint-based analysis of clitics gives insight into the complex behavior of Sm’algyax clitics, and how the clitics themselves afford empirical means of testing such a theory. The Sm’algyax data are drawn from both field research and published texts, reflecting a community-based approach to language documentation that has evolved through a long-term, collaborative relationship with the Tsimshian (Sm’algyax) communities. Building on Stebbin’s (2003) definitions of intermediate word classes in Sm’algyax and Anderson’s Optimality Theoretical approach, we determine that in terms of their varying phonological dependence, Sm’algyax clitics include internal, phonological word, and affixal clitics. The existence of affixal clitics in Sm’algyax, however, calls into question the viability of the Strict Layer Hypothesis (Selkirk 1984) as inviolable rules when describing clitics. Furthermore, Sm’algyax provides strong evidence that the direction of clitic attachment is more clitic specific than language specific. In characterising the behaviour of Sm’algyax clitics, we find that not only does linguistic theory help sharpen our understanding of the fieldwork data, but also that field linguistics has consequences for linguistic theory.
  • Chapter 6. Multiple Functions, multiple techniques: The role of methodology in a study of Zapotec determiners
    Abstract: Field linguists use a combination of techniques to compile a grammatical description, starting with various types of targeted elicitation and followed by the study of more natural speech in the form of recorded texts. These usual techniques were employed in my work on Teotitlán del Valle Zapotec, an Oto-Manguean language spoken in Mexico, but in an unusual order, with texts, mainly folk tales and life histories told by community elders, being collected and analyzed first due to the priorities of the documentation project I was a part of. This paper examines the role that methodology played in the investigation into one small area of the grammar, a set of noun phrase-final determiner clitics. These clitics make both spatial and temporal distinctions, raising theoretical questions regarding the role of a temporal marker in the NP. At the same time, it brought to light some interesting issues surrounding methodology in fieldwork: how does the method of collection affect the type of data gathered, and does the order in which different methodologies are employed affect the final outcome?
  • Chapter 5. The story of *ô in the Cariban Family
    Abstract: This paper argues for the reconstruction of an unrounded mid central/back vowel *ô to Proto-Cariban. Recent comparative studies of the Cariban family encounter a consistent correspondence of ə : o : ɨ : e, tentatively reconstructed as *o2 (considering only pronouns; Meira 2002) and *ô (considering only seven languages; Meira & Franchetto 2005). The first empirical contribution of this paper is to expand the comparative database to twenty-one modern and two extinct Cariban languages, where the robustness of the correspondence is confirmed. In ten languages, *ô merges with another vowel, either *o or *ɨ. The second empirical contribution of this paper is to more closely analyze one apparent case of attested change from *ô > o, as seen in cognate forms from Island Carib and dialectal variation in Kari’nja (Carib of Surinam). Kari’nja words borrowed into Island Carib/Garífuna show a split between rounded and unrounded back vowels: rounded back vowels are reflexes of *o and *u, unrounded back vowels reflexes of *ô and *ɨ. Our analysis of Island Carib phonology was originally developed by Douglas Taylor in the 1960s, supplemented with unpublished Garifuna data collected by Taylor in the 1950s.
  • Chapter 8. Studying Dena'ina discourse markers: Evidence from elicitation and narrative
    Abstract: This paper is concerned with discourse markers in Dena’ina Athabascan. One problem for transcribers and translators of Dena’ina texts is the great number of particles (i.e., words that cannot be inflected) that, according to speaker judgments “have no meaning” or “mean something else in every sentence.” This suggests that these particles are discourse markers, whose function is to relate discourse units to each other and to the discourse as a whole. The paper contrasts two different forms of linguistic inquiry: direct inquiry in the field, by elicitation of meaning and function of the discourse markers, and indirect inquiry, by study of a corpus of Dena’ina narratives. While elicitation is helpful in obtaining an initial gloss for the discourse markers, it is shown that only the study of texts will give us insight into the function of such particles and allows us to understand the important differences between particles that, on first sight, appear to be synonymous.
  • Chapter 7. Middles and reflexives in Yucatec Maya: Trusting speaker intuition
    Abstract: In this paper we provide a characterization of the middle construction in Yucatec Maya (YM), and show that the apparently unpredictable distribution of middle voice in YM corresponds to a neatly identified, and quite limited, system of absolute events, i.e., events in which no energy is expended (Langacker 1987). This strategy is not exploited by other related Mayan languages, which tend to encode all absolute events as simple intransitive verbs. The semantic coherence of middle voice in YM is only discernible by combining analysis of narrative texts and direct elicitation with attention to speaker intuition in a variety of situational contrastive contexts guided by cognitive principles which are known to determine the behavior of middle voice systems in other languages.
  • Chapter 10. Revisiting the source: Dependent verbs in Sierra Popoluca (Mixe-Zoquean)
    Abstract: Sierra Popoluca (SP) is a Mixe-Zoquean language, spoken by about 28,000 individuals in southern Veracruz, Mexico. The objectives of this paper are (1) to explore the structures of dependent verb constructions in SP and the contexts in which they occur and (2) to highlight the stages in which data is gathered and the interplay between text collection, elicitation, and analysis. SP is an ergative, polysynthetic, head-marking language. It has five dependent verb construction types. Early analyses suggested that dependent verbs were non-finite, nominalized forms. Further research indicated that the verbs are components in complex predicates that share inflection for aspect/mood, person, and number. Implicated in the analysis of these constructions are: the prosodic system; the alignment system, which is hierarchically driven with split ergativity; and the number system, also hierarchically driven. The teasing apart of the various grammatical features led to a multi-step process of analyzing and collecting data. By looking at a complex grammatical structure, this paper highlights the interdependency of corpus building, text analysis, and elicitation and the strategies used to negotiate between naturally occurring speech, in which data may be obscured by phonology, and elicited data, which frequently produces periphrastic constructions or alternative utterance types.
  • Fieldwork and Linguistic Analysis in Indigenous Languages of the Americas
  • Chapter 2. Sociopragmatic influences on the development and use of the discourse marker vet in Ixil Maya
    Abstract: In this paper we explore the functions of the particle vet in Ixil Mayan and argue that it is a discourse marker used to perform both structural and pragmatic functions. Vet serves as a structural marker indicating temporally or causally interdependent items; it also has sociopragmatic functions, allowing speakers to present an evaluation of a discourse that invites interlocutors to also take a stance both on the information presented and on their roles in particular sociocultural activities. These functions of managing negotiations among interlocutors range from agreements on descriptive terms to calls for social action among entire groups, in all cases highlighting the social nature both of discourse and of group activity. The overlapping of the structural and pragmatic functions of vet demonstrates the grammaticalization cline ranging from adverb to discourse marker proposed by Traugott (1997). Our examination of vet in a range of genres produced by the Mujeres por la Paz of Nebaj, El Quiché, Guatemala, a cooperative formed in 1997 by Ixil Maya women who were widowed or left fatherless during the Guatemalan civil war, suggests that the effects of the individual and group identities and motivations of participants outweigh anticipated genre effects.
  • Chapter 9. Be careful what you throw out: Gemination and tonal feet in Weledeh Dogrib
    Abstract: The Weledeh dialect of Dogrib (Tłįchǫ Yatiì) is spoken by people of the Yellowknives Dene First Nation, in and around Yellowknife, Northwest Territories. Within the formal framework of Lexical Phonology (Kiparsky 1982), this paper argues for an over-arching generalization in the phonology of Weledeh Dogrib: the constraint NoContour-Ft, which prefers (High-High) and (Low-Low) feet, but militates against (High-Low) and (Low-High) feet. NoContour-Ft is satisfied differently in different morphophonological domains: vowel deletion at the Stem Level, gemination at the Word Level, and High to Mid tone lowering at the Postlexical Level. This analysis requires that consonant length be treated as phonological in Dogrib—that is, consonant length contributes to syllable weight and mora count—even though there are no minimal pairs based on consonant length. Similarly, the distinction between High and Middle tone does not distinguish any lexical items, but is nevertheless important for the prosody of the language. Thus the paper makes a methodological point about the importance of allophonic alternations for phonological theory. Our view of what counts as contrastive or allophonic, however, is to a large extent theory-dependent; therefore, the paper also emphasizes the importance of phonetic measurements when doing fieldwork.
  • Chapter 1. Introduction: The Boasian tradition and contemporary practice in linguistic fieldwork in the Americas
SP01: Documenting and Revitalizing Austronesian Languages

  • Chapter 4. SIL International and Endangered Austronesian Languages
    Abstract: SIL International has been partnering with Austronesian language communities in language development for over fifty years. This chapter briefly reviews that history, situates it in the current environment of international concern for the documentation and revitalization of endangered languages, and looks at ways in which SIL might assist endangered Austronesian language communities of today. Two aspects of language development are considered—one more “academic” in nature, focusing on products primarily of interest to linguists and other researchers; the other more “development” in nature, focusing on language resources and competencies of greater interest and relevance to language communities. The chapter summarizes some recent studies related to language endangerment/vitality, and considers how language development relates to language revitalization and documentary linguistics. SIL can continue to learn from and link with others in describing and documenting endangered Austronesian languages, in providing consulting and training at the request of language communities and others, and in designing and developing affordable language software to help accomplish related tasks.
  • Chapter 6. Documenting and Revitalizing Kavalan
    Abstract: The purpose of this chapter is to provide a two-dimensional approach to language documentation (Himmelmann 1998). In addition to building a database, we also conducted a sociolinguistic survey designed to document the state of health of a language in a particular spatio-temporal frame. Our goal is to share our fieldwork experience of documenting Kavalan, a seriously endangered language in southeastern Taiwan now spoken by fewer than a few dozen speakers. We first discuss our field experiences in working with speakers of Kavalan in Sinshe village, the only significant Kavalan settlement left in Taiwan, and the state of the Kavalan language, based in part on Huang and Chang’s (1995) earlier sociolinguistic survey, and in part on a recent more in-depth village-wide survey of language use in the community. Next, we introduce the NTU Corpus of Formosan Languages, part of which incorporates our corpus data in Kavalan. The NTU Corpus of Formosan Languages aims to establish a standard for the creation of linguistic corpus databases through the application of information technology to linguistic research. The creation of this linguistic database enables us both to preserve valuable linguistic data and to provide a systematic recording of these languages, for the benefit of future linguistic research.
  • Chapter 5. Local Autonomy, Local Capacity Building and Support for Minority Languages: Field Experiences from Indonesia
    Abstract: This chapter discusses the complexity of language/cultural maintenance and revival, highlighting the significance of building and supporting long-term local capacity. These complex issues are discussed in the current context of rapid political change towards greater local autonomy in Indonesia. After some background on aims and regulations of decentralization, the Balinese in Bali and Rongga in Flores are compared and discussed based on the author’s field experiences. It is argued that capacity building and support must include more than simply developing human resources. Strengthening, reforming, and/or restoring relevant institutions, particularly in relation to customary adat systems, are equally important. While a macro perspective must be adopted, priority must be given to a community-based approach and to long term capacity building and support at the most local level. The comparison of the Rongga and Balinese helps clarify how a range of inter-related socio-political and economic variables at the local and regional levels play a significant role in providing and/or inducing good conditions for bottom-up community-based initiatives in language/cultural maintenance and revival.
  • Chapter 9. Teaching and Learning an Endangered Austronesian Language in Taiwan
    Abstract: This chapter provides a case study of the process of endangered language acquisition, which has not been well studied from the viewpoint of applied linguistics. It describes the context of teaching Chinese adult learners in Taiwan an endangered indigenous language, the teachers’ pedagogical approaches, the phonological and syntactic acquisition processes the learners were undergoing, and applications to other language documentation and revitalization programs. Both qualitative and quantitative methods were used to address the research questions. This study demonstrates cogently that language is a complex adaptive system. In phonological acquisition, the trill was the most difficult phoneme to learn. Systematic variations for the variables (ŋ) and (s) were found to be constrained by both markedness and interference. Furthermore, learners also tended to interpret Yami orthography based on their knowledge of English. In word order acquisition, learners performed much better than expected, partially because the present tense, coded by the SV word order, is the norm in Yami conversations. However, students still inaccurately associated word order with sentence type rather than with tense distinction. The Yami case provides an integrated model for endangered language documentation, revitalization and pedagogical research, which would be of interest to people working with other languages and the language documentation field in general.
  • Chapter 7. E-learning in Endangered Language Documentation and Revitalization
    Abstract: This chapter analyses the application of e-learning in the revitalization of endangered languages. It outlines the areas in which e-learning is efficacious, the attitudes of the indigenous language teachers to e-learning, the feelings of the Yami community toward this kind of pedagogy, and the reactions of the users, mostly young and adolescent learners of Yami. The findings are based on the results of surveys and in-depth studies in the Yami community and also on surveys made in a nation-wide seminar that enrolled teachers of the majority of the still-spoken aboriginal languages in Taiwan. Both qualitative and quantitative methods were used to gather empirical data to address questions in the following three areas: (1) the contexts of developing e-Learning materials for endangered indigenous languages in Taiwan, (2) the indigenous language teachers’ perceptions of e-Learning in Taiwan, and (3) the attitudes of the Yami community on Orchid Island toward e-Learning. This chapter provides a model for the many language revitalization projects underway in Taiwan and worldwide to take advantage of e-Learning. It also provides guidelines that enable each project to better understand the kinds of e-Learning that work to make e-Learning acceptable and efficacious.
  • Chapter 3. Training for Language Documentation: Experiences at the School of Oriental and African Studies
    Abstract: Since 2003 the Endangered Languages Project at SOAS has been involved in various types of training for documentation of endangered languages, ranging from one-day workshops through to MA and PhD post-graduate degree programmes. The training events have been attended by specialists, research grantees, students, and members of the general public, and have covered a wide range of topics and involved delivery in a range of contexts and delivery modes, including hands-on practical sessions and e-learning in the Blackboard framework. We have covered both theory and practice of language documentation and endangered language support, including the development of multimedia and curriculum materials for language teaching, some of it experimental and, we think, quite innovative. In this chapter I discuss some of our experiences in developing and running these training workshops and courses, reporting on the models, and successes (and failures) over the past three and a half years. My goal is to share our accumulated knowledge and experience with others with similar interests, and in doing so to advance our understanding of the possibilities for language documentation training.
  • Chapter 11. On Designing the Formosan Multimedia Word Dictionaries by a Participatory Process
    Abstract: Digital archiving is important work for an endangered language, because if an endangered language disappears, associated cultural assets will disappear altogether. Several digital archiving projects are being conducted in Taiwan. Many tribal teachers are now involved in these projects. Based on the needs of these tribal teachers, this chapter presents an easy-to-use system for digitally archiving Formosan Languages. The proposed approach takes advantage of the Internet and the newly launched Web 2.0 sharing platform. This chapter gives details of the development and structure of the online dictionary system. Currently, several archiving projects in Taiwan are using this system to teach tribal teachers how to develop their own language resources and online dictionaries.
  • Chapter 8. Indigenous Language–informed Participatory Policy in Taiwan: A Socio-political Perspective
    Abstract: This chapter highlights the importance of incorporating indigenous language and its daily practice in the local context of newly transformed indigenous policy in Taiwan. Currently, the official indigenous people’s language policy is relatively confined to curriculum development and certification of indigenous peoples’ language abilities with little consideration of language practices in real socio-political situations. This chapter questions whether the revitalization of endangered indigenous languages can rely only on language policy per se. The participatory action research (PAR) methodology is employed as a main research method in inhabited Atayal communities. This chapter is divided into three main parts: firstly, a brief socio-political history of indigenous people in Taiwan is provided; secondly, two socio-political official projects related to traditional territory sovereignty are analyzed: their failure is revealed due to the neglect of indigenous language and local participation; thirdly, a case from an Atayal village, Smangus, is provided to show how indigenous languages can be revitalized through combining the villagers’ daily practices and participation. In conclusion, this chapter argues for a combining of language policy with other socio-political policies so as to create environments in which indigenous peoples can speak their own languages.
  • Chapter 10. WeSay, a Tool for Engaging Communities in Dictionary Building
    Abstract: This chapter introduces WeSay, an open source software application designed to involve language community members in the description and documentation of their language. Intended for rugged, low-power hardware, WeSay's simplified user interface removes many barriers that typically prevent the direct involvement of community members. In this chapter, we describe the dictionary-building features of WeSay that allow a linguist to tailor a sequence of language documentation tasks to engage community members. These tasks reduce a production step to its simplest form, enabling focused training and division of labor. Word gathering tasks use semantic domains, word lists, or patterns of likely words to build up the dictionary. Successive tasks add specific content, such as glosses and example sentences, to the entries. In addition, the program can prepare simple paper publications designed to promote community support for the effort and can transfer the raw data to the linguist for further processing with tools that are more powerful.
  • Chapter 2. The Language Documentation and Conservation Initiative at the University of Hawai‘i at Mānoa
    Abstract: Since its inception in 1963, the Department of Linguistics at the University of Hawai‘i at Mānoa (UHM) has had a special focus on Austronesian and Asian languages. It has supported and encouraged fieldwork on these languages, and it has played a major role in the development of vernacular language education programs in Micronesia and elsewhere. In 2003, the department renewed and intensified its commitment to such work through what I shall refer to in this chapter as the Language Documentation and Conservation Initiative (LDCI). The LDCI has three major objectives. The first is to provide high-quality training to graduate students who wish to undertake the essential task of documenting the many underdocumented and endangered languages of Asia and the Pacific. The second is to promote collaborative research efforts among linguists, native speakers of endangered and underdocumented languages, and other interested parties. The third is to facilitate the free and open exchange of ideas among all those working in this field. In this chapter, I discuss each of these three objectives and the activities being conducted at UHM in support of them.
  • Chapter 12. Annotating Texts for Language Documentation with Discourse Profiler’s Metatagging System
    Abstract: This chapter introduces a systematic and robust way to annotate (or ‘tag’) texts with discourse information. To date there has not been a method for annotating texts for language documentation with discourse-text information. This is the first paper to systematically describe the capabilities and the annotating methodology of the Discourse Profiler’s metatagging system as a means of annotating endangered languages’ texts in a Toolbox database. Since there is a division of labor between Toolbox and Discourse Profiler, the Toolbox database can be the basis for the archival tasks, whereas the Discourse Profiler software is a computer assisted discourse-text analytical tool that mines the Toolbox discourse-text annotated database in order to produce two primary capabilities: 1) to create a representative interactive compressed representation or ‘map’ of the structure and elements of a text, and 2) to quantify texts based on this special metatagging system with an array of sixteen different possible statistical outputs (including both referential distance and topic persistence statistics). Although the main focus of this chapter is on the multipurpose annotation system, I will introduce the basics of the Discourse Profiler software in order to illustrate the range of analytical possibilities that this annotation system incorporates.
  • Chapter 1. Introduction: Documenting and Revitalizing Austronesian Languages
    Abstract: This chapter provides an overview of the issues and themes which emerge throughout this book. It begins with a brief description of language revitalization activities which are taking place in the Pazeh, Kahabu and Thao aboriginal communities in the mountains and plains of Taiwan. The activities of elders in these communities exemplify the growth of language activism. These case studies lead to a discussion of changes in the field of linguistics and the alliances which are being built between linguists and community language activists. The 11 chapters in the book are then reviewed within the key themes of international capacity building initiatives, documentation and revitalization activities, and computational methods and tools for language documentation.
  • Documenting and Revitalizing Austronesian Languages