Notes from the Field: Wisconsin Walloon Documentation and Orthography
Kelly Biers & Ellen Osterhaus, pp. 1-29
Wisconsin Walloon is a heritage dialect of a threatened language in the langue d’oïl family that originated in southern Belgium and expanded to northeastern Wisconsin, USA in the mid-1850s. Walloon-speaking immigrants formed an isolated agricultural community, passing on and using the language for the next two generations until English became the dominant functional language. Although younger generations today have not learned the language, there remain enough Walloon speakers as well as Belgian descendants interested in their linguistic heritage to have generated community support for a Walloon documentation and conservation project. In this paper, we report on the results of over three years of collaboration between university researchers, students, and community members to document, study, and promote the language for the benefit of both scholars and community. We provide a description of the language, collaborative documentation efforts, and the development of community resources, including a phonetically-accessible Walloon orthography. We conclude with an outlook on future work with an eye toward increased community-led efforts.
What’s your sign for TORTILLA? Documenting lexical variation in Yucatec Maya Sign Languages
Josefina Safar, pp. 30-74
In this paper, I discuss methodological and ethical issues that arose in the process of documenting lexical variation in Yucatec Maya Sign Languages (YMSLs). YMSLs are indigenous sign languages used by deaf and hearing people in Yucatec Maya villages with a high incidence of deafness in the peninsula of Yucatán, Mexico. The documentation of rural sign languages such as YMSLs shares many characteristics with research on urban sign languages as well as spoken minority languages, but it also comes with a range of specific challenges. Elicitation materials, research procedures, and ethical decisions need to be adapted to specific local and cultural requirements while trying to maintain a level of comparability with previous studies. I will illustrate this process of negotiation by providing a detailed account of how I developed stimulus materials for lexical elicitation, obtained informed consent from the participants, and established ways of collaboration with community members in the Yucatec Maya Sign Language Documentation Project. Furthermore, I will present first results about lexical variation in YMSLs.
Living Language, Resurgent Radio: A Survey of Indigenous Language Broadcasting Initiatives
David Danos & Mark Turin, pp. 75-152
For a demise that has been predicted for over 60 years, radio is a remarkably resilient communications medium, and one that warrants deeper examination as a vehicle for the revitalization of historically marginalized and Indigenous languages.
Radio has not been eroded by the rise of new media, whether that be television, video, or newer multimodal technologies associated with the internet. To the contrary, communities are leveraging the formerly analogue medium of radio in transformative ways, breathing new life into old transistors, and using radio for the transmission of stories, song, and conversation. In this contribution, we highlight effective and imaginative uses of radio for Indigenous language reclamation through a series of case studies, and we offer a preliminary analysis of the structural conditions that can both support and impede developments in Indigenous-language radio programming.
The success of radio for Indigenous language programming stems from its comparatively low cost of operations, its asynchronous nature, which allows programs to be consumed at any time (through repeats, podcasts, downloads, and streaming services), and the unusual, even unique, quality of being engaging yet not all-consuming, meaning that a listener can be actively involved in another activity at the same time.
Ticuna (tca) language documentation: A guide to materials in the California Language Archive
Amalia Skilton, pp. 153-189
Ticuna (ISO: tca) is a language isolate spoken in the northwestern Amazon Basin (Brazil, Colombia, Peru). Ticuna has more speakers than almost all other Indigenous Amazonian languages and – unlike most languages of the area – is still learned by children. Yet academic linguists have given it relatively little research attention. Therefore, to raise the profile of this areally important language, I offer a guide to three collections of Ticuna language materials held in the California Language Archive. These materials are extensive, including over 1,396 hours of recordings – primarily of child language and everyday conversations between adults – and 33 hours of transcriptions. To contextualize the materials, I provide background on the Ticuna language and people; the research projects which produced the materials; the participants who appear in them; and the ethical and permissions issues involved in collecting them. I then discuss the nature and scope of the materials, showing how the content of each collection motivated collection-specific choices about recording, transcription, organization in the archive, and metadata. Last, I outline how other researchers could draw on the collections for comparative analysis.
Language use and attitudes as indicators of subjective vitality: The Iban of Sarawak, Malaysia
Su-Hie Ting, Andyson Tinggang, & Lilly Metom, pp. 190-218
The study examined the subjective ethnolinguistic vitality of an Iban community in Sarawak, Malaysia based on their language use and attitudes. A survey of 200 respondents in the Song district was conducted. To determine the objective ethnolinguistic vitality, a structural analysis was performed on their sociolinguistic backgrounds. The results show the Iban language dominates in family, friendship, transactions, religious, employment, and education domains. The language use patterns show functional differentiation into the Iban language as the “low language” and Malay as the “high language”. The respondents have positive attitudes towards the Iban language. The dimensions of language attitudes that are strongly positive are use of the Iban language, Iban identity, and intergenerational transmission of the Iban language. The marginally positive dimensions are instrumental use of the Iban language, social status of Iban speakers, and prestige value of the Iban language. Inferential statistical tests show that language attitudes are influenced by education level. However, language attitudes and use of the Iban language are not significantly correlated. By viewing language use and attitudes from the perspective of ethnolinguistic vitality, this study has revealed that a numerically dominant group assumed to be safe from language shift has only medium vitality, based on both objective and subjective evaluation.
Playing with Language: Three Language Games in the Gulf of Guinea
Ana Lívia Agostinho & Gabriel Antunes de Araujo, pp. 219-238
We present a description and an analysis of three related language games in Africa’s Gulf of Guinea: Fa d’Ambô’s Fa do Vesu, Lung’Ie’s Faa di Vesu, and São Tomé and Príncipe Portuguese’s P-language. We show how these language games can be used to investigate the linguistic features of their main languages and as learning resources for second language learners. First, we argue that these language games share a common origin and that they emerged from contact with varieties of the Portuguese settlers’ Língua do Pê. Second, we discuss phonological issues, such as syllable structure, focusing on the loci of onglides, offglides, syllabic nasals, and word prosody. Finally, we discuss how these ludlings can help speakers, learners, and linguists perceive phonological properties, and how describing and analyzing language games contributes to language documentation.
#KeepOurLanguagesStrong: Indigenous Language Revitalization on Social Media during the Early COVID-19 Pandemic
Kari A. B. Chew, pp. 239-266
Indigenous communities, organizations, and individuals work tirelessly to #KeepOurLanguagesStrong. The COVID-19 pandemic was potentially detrimental to Indigenous language revitalization (ILR) as this mostly in-person work shifted online. This article shares findings from an analysis of public social media posts, dated March through July 2020 and primarily from Canada and the US, about ILR and the COVID-19 pandemic. The research team, affiliated with the NEȾOLṈEW̱ “one mind, one people” Indigenous language research partnership at the University of Victoria, identified six key themes of social media posts concerning ILR and the pandemic, including: 1. language promotion, 2. using Indigenous languages to talk about COVID-19, 3. trainings to support ILR, 4. language education, 5. creating and sharing language resources, and 6. information about ILR and COVID-19. Enacting the principle of reciprocity in Indigenous research, part of the research process was to create a short video to share research findings back to social media. This article presents a selection of slides from the video accompanied by an in-depth analysis of the themes. Written about the pandemic, during the pandemic, this article seeks to offer some insights and understandings of a time during which much is uncertain. Therefore, this article does not have a formal conclusion; rather, it closes with ideas about long-term implications and future research directions that can benefit ILR.
Community Archiving of Ethnic Groups in Thailand
Siripen Ungsitipoonporn, Buachut Watyam, Vera Ferreira, & Mandana Seyfeddinipur, pp. 267-284
This article presents the research process of the project “The Ethnic Group Digital Archive Project: Promoting the protection and preservation of language and culture diversity in Thailand”. This project involved the development of a local digital archive website for the ethnic groups of Thailand to archive, preserve, and transmit their knowledge of languages and cultures to their younger generations and those interested. The core objective of this digital archive development was to implement an archive website that is easy to access, with a simple and engaging design that serves the purposes of language documentation. The digital archive output includes collections from 18 ethnic groups in Thailand, containing 385 bundles of legacy and fieldwork data in the form of video, audio, text, image, and ELAN files. Despite the low number of researchers working on language documentation and archiving, the research team managed to expand both national and international networks working in this particular field of study. This serves as an opportunity for scholars and speaker communities in Thailand to recognize the importance of local knowledge preservation and transmission, and the availability of the digital archive is a practical way to support sustainable data preservation and accessibility in the future.
Virtual Frisian: A comparison of language use in North and West Frisian virtual communities
Guillem Belmar & Hauke Heyen, pp. 285-315
Social networking sites have become ubiquitous in our daily communicative exchanges, which has brought about new platforms of identification and opened possibilities that were out of reach for many minoritized communities. As they represent an increasing percentage of the media we consume, these sites have been considered crucial for revitalization processes. However, the growing importance of social media may also pose a problem for minoritized languages, as the need for communication with a wider audience seems to require the use of a language of wider communication. One way in which this apparent need for a global language can be avoided is by creating virtual communities where the minoritized languages can be used without competition, a virtual breathing space.
This study analyzes language practices of eight communities: four North Frisian and four West Frisian virtual communities. The analysis focuses on the languages used in each community, the topics discussed, as well as the status of the minoritized language in the community. A total of 1,127 posts are analyzed to determine whether these communities function as breathing spaces, the factors that may foster or prevent the emergence of these spaces, and the similarities and differences between these two sociolinguistic contexts.
Collecting and annotating corpora for three under-resourced languages of France: Methodological issues
Delphine Bernhard, Anne-Laure Ligozat, Myriam Bras, Fanny Martin, Marianne Vergez-Couret, Pascale Erhart, Jean Sibille, Amalia Todirascu, Philippe Boula de Mareüil, & Dominique Huck, pp. 316-357
In contrast to French, the vast majority of regional languages of France can be considered as under-resourced. In this article, we present the results of a research project aiming to produce annotated resources for three regional languages of France: Alsatian, Occitan, and Picard. These languages cover three different language families (Germanic and two subfamilies of Romance, Oïl and Oc languages) and different sociolinguistic situations. Yet, they all face issues common to many under-resourced languages: lack of human and financial resources and presence of geolinguistic variation. The originality of this project is that it brought together researchers from different fields (sociolinguistics, descriptive linguistics, dialectology, natural language processing, digital humanities) to work together towards the common goal of developing annotated corpora for Alsatian, Occitan, and Picard. This created a favorable and stimulating working environment which could not have been achieved had different research groups worked independently, each on a single language. This article details the annotation process, with a special focus on the delimitation of the tokens and the definition of the part-of-speech tags.
The Utility of Orthographic Design for Different Users: The Case of the Approved Dagbani Orthography
Fusheini Angulu Hudu, pp. 358-374
This paper presents a critical assessment of the utility of the orthography of Dagbani (a Gur language of Ghana) in the documentation of, linguistic research on, and literacy acquisition in Dagbani. While written literature on Dagbani dates back over a century, it was only in 1997 that the only known documented set of orthographic rules for the language, the Approved Dagbani Orthography (ADO), was put together. Its stated goal was to address inconsistencies that existed in the orthographic rules at the time. It has since largely served this goal and has remained a resource for linguists engaged in language documentation and linguistic research as well as adult and young learners acquiring literacy in Dagbani in formal and informal settings. The paper discusses the influence of the orthography on the understanding of aspects of Dagbani linguistics and the challenges that remain with its use in modern-day multimodal communication. It shows that while the ADO has impacted literacy, documentation, and research on Dagbani linguistics, aspects of the design of the orthography have limited its potential impact and have given room for the emergence or maintenance of co-orthographic practices used for electronic communication and in the documentation of names in non-native official circles.
The Conundrum of Friulian Language Vitality
Simone De Cia, pp. 375-410
Italy is characterized by a considerable amount of language variation. Only a few spoken vernaculars enjoy institutional support and are officially recognized as minority languages. Among these, Friulian is one of the largest in terms of number of speakers. In the past decade, the assessment of Friulian language vitality has yielded discordant conclusions. The aim of the present paper is to shed light on Friulian’s vitality by providing an informed discussion of the findings of the three most recent studies on the topic, namely De Cia (2013), Coluzzi (2015), and Melchior (2015). As a framework for discussion and means of synthesis among the different claims put forward on Friulian’s vitality, I will make reference to the nine factors of language vitality proposed by UNESCO (2003): each factor describes six possible sociolinguistic scenarios, which reflect six different levels of language vitality. Despite its official status and institutional support, Friulian lacks young native speakers and is used more and more infrequently in a limited number of social settings. The overall picture suggests that a marked process of language shift from Friulian to Italian is taking place. National and regional authorities should take immediate action to ensure the future survival of the minority language.
Collaborative Fieldwork with Custom Mobile Apps
Mat Bettinson & Steven Bird, pp. 411-432
Mobile apps have the potential to support collaborative fieldwork even where web connectivity is unreliable or unavailable. To explore this potential, we developed portable network infrastructure and custom-made field tool apps. We deployed this solution in remote communities in the far north of Australia, in connection with co-located cooperative language work. Throughout a series of visits, we worked with community members to iterate the designs, optimising their suitability for the tasks and the context. We found that custom toolmaking provides the benefits of digital collaboration tailored for the specific needs of the environment and community. However, we argue that it is activity design – not the technology itself – that must be foregrounded, placing fieldworkers in the driving seat of innovation in digital fieldwork practice.
The Role of Input in Language Revitalization: The Case of Lexical Development
William O’Grady, Raina Heaton, Sharon Bulalang, & Jeanette King, pp. 433-457
Immersion programs have long been considered the gold standard for school-based language revitalization, but surprisingly little attention has been paid to the quantity and quality of the input that they provide to young language learners. Drawing on new data from three such programs (Kaqchikel, Western Subanon, and Māori), each with its own particular motivation, objectives, and pedagogical practices, we examine a key component of this revitalization strategy, namely the amount and type of lexical input that children receive. Our findings include previously unknown facts about the number of words that children in these programs hear per hour, the ratio of word tokens to word types, and the skewed frequency distribution of the particular words that make up the input. We discuss our findings with reference both to comparable measures for first language acquisition in a home setting and to their relevance for pedagogical strategies in the classroom.
Mapping Urban Linguistic Diversity in New York City: Motives, Methods, Tools, and Outcomes
Ross Perlin, Daniel Kaufman, Mark Turin, Maya Daurio, Sienna Craig, & Jason Lampel, pp. 458-490
Communities around the world have distinctive ways of representing language use across space and territory. The approach to and method of mapping languages that began with nineteenth-century European dialectology and colonial boundary making is one such way. Though practiced by relatively few linguists today, language mapping has developed considerably from its roots yet remains stymied by problems of ideology, representation, and data quality. In this paper, we argue that digital language mapping in hyperdiverse cities can both contribute to overcoming these problems and bring visibility and resources to communities using Indigenous, minority, and primarily oral languages. For these communities, official surveys like the census are often inadequate, leaving a gap that communities, linguists, and mapping experts working in partnership can address. Urban language mapping as a field should make space for Indigenous, minority, and primarily oral languages through geospatial visualization – in terms that the communities themselves recognize and with a public policy agenda. As a case study, we present our ongoing efforts with LANGUAGEMAP.NYC to map the most linguistically diverse urban center in the world: New York City.
Automatic Speech Recognition for Supporting Endangered Language Documentation
Emily Prud’hommeaux, Robbie Jimerson, Richard Hatcher, & Karin Michelson, pp. 491-513
Generating accurate word-level transcripts of recorded speech for language documentation is difficult and time-consuming, even for skilled speakers of the target language. Automatic speech recognition (ASR) has the potential to streamline transcription efforts for endangered language documentation, but the practical utility of ASR for this purpose has not been fully explored. In this paper, we present results of a study in which both linguists and community members, with varying levels of language proficiency, transcribe audio recordings of an endangered language under timed conditions with and without the assistance of ASR. We find that both time-to-transcribe and transcription error rates are significantly reduced when correcting ASR for language learners of all levels. Despite these improvements, most community members in our study express a preference for unassisted transcription, highlighting the need for developers to directly engage with stakeholders when designing and deploying technologies for supporting language documentation.
Using YouTube as the Primary Transcription and Translation Platform for Remote Corpus Work
Alexander Rice, pp. 514-550
This paper presents a remote corpus work model that was developed between an outside researcher and community collaborator to continue transcription/translation work at a distance with previously collected material in response to the travel restrictions imposed by the coronavirus pandemic. The paper describes, in detail, the corpus work model, which is based on Ryan Pennington’s (2014) SayMore-FLEx-ELAN workflow and uses YouTube as the primary transcription/translation platform. The paper also describes the pros, cons, and specific situational context in which this model has proven useful so that other documentation teams in similar contexts might benefit. In addition to simply providing a method of doing corpus work remotely, the model also provides a way to maintain community capacity building at a distance.
Between Stress and Tone: Acoustic Evidence of Word Prominence in Kurtöp
Gwendolyn Hyslop, pp. 551-575
Classic typologies within prosody tend to treat ‘tone’ languages as being diametrically opposed to ‘stress’ languages. However, Hyman (2006) highlights several languages that can have both, including Seneca, Fasu, and Copala Trique. As language documentation advances and our acoustic methodologies in the field are further refined, we have seen this list continue to expand. The aim in this article is to further this research trajectory by presenting the correlates of stress in Kurtöp, a tonal Tibeto-Burman language. Kurtöp has a word-level tone system, in which high versus low tone is required on the first syllable of every word. Stress, or prosodic word-level prominence, is realised on the first syllable of a root. Thus, stress and tone usually occur on the same syllable; they are only separated from each other when the negative prefix triggers movement of the tone to the initial syllable, leaving a stressed but toneless second syllable. Based on data collected in the field from three speakers, this article shows that the primary correlate of stress is duration, not pitch, intensity, or expansion of vowel space.