Language Documentation & Description

Organizations/Institutions

Databases & Websites

  • Glottolog, comprehensive reference information for the world's languages, especially the lesser-known languages.

  • Ethnologue, an annual reference publication in print and online that provides statistics and other information on the living languages of the world.

  • Grammar Watch, which helps linguists find, read, and cite open-access grammars.

  • Databases by the Association for Linguistic Typology

  • The E-MELD School of Best Practices in Language Documentation, a website covering a wide range of topics in language documentation, with a focus on its technological aspects, including discussion of hardware and software tools for language documentation and encoding standards for digital linguistic resources.

  • Journal of Language Survey Reports, online access to sociolinguistic surveys conducted by SIL teams all over the world. Many of the surveys include the questionnaires used to conduct them.

  • MPI EVA Tools for Language Description, a website containing tools for use in field linguistics and language description. Most of the items on the website are questionnaires designed to assist in eliciting data in such a fashion that the data will be comparable across languages.

  • Archiving for the Future, an open online educational resource for the archiving component in language documentation.

Fellowships for Language Documentation

Journals

Language Archives

International Archives

Endangered Languages Archive (ELAR) of the Hans Rausing Endangered Languages Project is a digital repository for preserving multimedia collections of endangered languages from all over the world, making them available for future generations. It primarily accepts data collected by researchers with support from the Endangered Languages Documentation Programme (ELDP). It charges a fee to all other depositors.

Documentation of Endangered Languages (DoBeS), Max Planck Institute, contains language documentation data from a wide variety of languages around the world that are in danger of becoming extinct.

Pangloss, at Langues et Civilisations à Tradition Orale (LACITO), CNRS, Paris, France, is an open archive that contributes to the preservation of the world's linguistic heritage.

Rosetta Project, Long Now Foundation, San Francisco, USA.

Areal Archives

North America

Alaska Native Language Archive, at the University of Alaska Fairbanks, houses materials relating to Alaska's 20 Native languages, including varieties spoken outside Alaska, and in some cases, languages related to those spoken in Alaska.

California Language Archive, at the University of California, Berkeley, is an archive for the languages of California and the Americas.

Latin America

Archive of the Indigenous Languages of Latin America (AILLA) at the University of Texas at Austin accepts any materials relevant to any indigenous language of Latin America and the Caribbean. Its website contains a page of links covering a number of areas of potential interest to documentary and descriptive linguists, including links to information about intellectual property rights, linguistic archives, and funding organizations.

Pacific and Australia

Pacific and Regional Archive for Digital Sources in Endangered Cultures (PARADISEC)

Kaipuleohone Language Archive at the University of Hawai'i at Mānoa accepts materials from University of Hawai'i affiliates and from anyone else with materials on languages from the Pacific or Asia. It accepts born-digital items and can digitize analog materials such as reel-to-reel and cassette recordings, images, and fieldnotes.

Archive of Maori and Pacific Music (AMPM) at the University of Auckland, New Zealand.

AUSTLANG, an Australian Indigenous languages database, Canberra, Australia.

Russia

LangueDOC, an archive in Moscow covering a few languages spoken in the Caucasus and Siberia.

Africa

African Language Materials Archive (ALMA)

South Asia

The Computational Resource for South Asian Languages (CoRSAL), hosted at the University of North Texas Digital Library, College of Information, UNT, Texas, USA, supports archiving of audio, video, and text on the under-resourced languages of South Asia. The CoRSAL team engages in research at the intersection of language documentation, description, and information science.

Sikkim-Darjeeling Himalayas Endangered Language Archive (SiDHELA) is a regional archive maintained by the Centre for Endangered Languages, Sikkim University, Sikkim, India.

Software and Applications

  • FLEx (FieldWorks Language Explorer) by SIL International enables linguists to be highly productive when building a lexicon and interlinearizing texts. Powerful bulk editing tools can save hours of work. FLEx allows control of which fields and entries show up in a dictionary publication.

  • ELAN by the Max Planck Institute for Psycholinguistics, The Language Archive, Nijmegen, The Netherlands, is a professional software tool for manually and semi-automatically annotating and transcribing audio or video recordings. It has a tier-based data model that supports multi-level, multi-participant annotation of time-based media.

  • SayMore by SIL International is an easy-to-learn software program designed to build well-annotated corpora of language documentation resources. SayMore combines basic tools for the tasks required to create a language documentation corpus, such as creating time-aligned transcription, multi-tiered annotations, translation into a language of wider communication, and gathering relevant metadata. It is less complex and less powerful than programs such as ELAN or FLEx, and it is meant to complement these programs rather than replace them.

  • Lameta is a new metadata tool to help with organising collections of files. It is mainly aimed at collections made in the course of documenting language, music, and other cultural expressions. It serves as an alternative to SayMore, which isn't compatible with macOS. However, its functions are limited to metadata and its management; it does not offer other SayMore features such as time-aligned transcription and translation.

  • Field Linguist's Toolbox by SIL International is a data management and analysis tool for field linguists. It is especially useful for maintaining lexical data and for parsing and interlinearizing text, but it can be used to manage virtually any kind of data. Note, however, that Toolbox is outdated and no longer fully supported by SIL, so FLEx remains the better option.

  • Phonology Assistant by SIL International is a discovery tool. Provided with a corpus of phonetic data, it automatically charts the sounds and, through its search capabilities, helps a user discover and test the sound rules of a language.

  • Webonary by SIL International gives language groups the ability to publish bilingual or multilingual dictionaries on the web with a minimum of technical help.

  • Praat

  • Lexique Pro by SIL International.

  • IPA Keyboard by SIL International.

  • IPA Help, a phonetics learning tool by SIL International.

  • Comparalex, a database of language word list data with audio samples for analysis and historical and comparative linguistic reconstruction.

  • Transcriber, a tool for segmenting, labeling, and transcribing speech.

  • Other SIL Products.

  • Audacity, free, open-source software for recording and editing sounds.

  • CuPED, a free program for transforming time-aligned transcripts into a variety of presentation formats.