Phonemic transcription of low-resource tonal languages

Abstract: Transcription of speech is an important part of language documentation, and yet speech recognition technology has not been widely harnessed to aid linguists. We explore the use of a neural network architecture with the connectionist temporal classification loss function for phonemic and tonal transcription in a language documentation setting. In this framework, we explore jointly modelling phonemes and tones versus modelling them separately, and assess the importance of pitch information versus phonemic context for tonal prediction. Experiments on two tonal languages, Yongning Na and Eastern Chatino, show the changes in recognition performance as training data is scaled from 10 minutes to 150 minutes. We discuss the findings from incorporating this technology into the linguistic workflow for documenting Yongning Na, which show the method's promise in improving efficiency, minimizing typographical errors, and maintaining the transcription's faithfulness to the acoustic signal, while highlighting phonetic and phonemic facts for linguistic consideration.
Document type:
Conference paper
Wong, Sze-Meng Jojo ; Haffari, Gholamreza. Australasian Language Technology Association Workshop 2017, Dec 2017, Brisbane, Australia. ISSN: 1834-7037, pp.53-60, 2017, Australasian Language Technology Association Workshop 2017: Proceedings of the workshop. 〈http://alta2017.alta.asn.au/alta2017-draft-proceedings.pdf〉

https://halshs.archives-ouvertes.fr/halshs-01656683
Contributor: Alexis Michaud
Submitted on: Tuesday, 5 December 2017 - 21:33:39
Last modified on: Friday, 8 December 2017 - 01:17:53

File

Adams_et_al2017_PhonemicTransc...
Publisher files allowed on an open archive

License

Distributed under a Creative Commons Attribution - NonCommercial - ShareAlike 4.0 International License

Identifiers

  • HAL Id : halshs-01656683, version 1

Citation

Oliver Adams, Trevor Cohn, Graham Neubig, Alexis Michaud. Phonemic transcription of low-resource tonal languages. Wong, Sze-Meng Jojo ; Haffari, Gholamreza. Australasian Language Technology Association Workshop 2017, Dec 2017, Brisbane, Australia. ISSN: 1834-7037, pp.53-60, 2017, Australasian Language Technology Association Workshop 2017: Proceedings of the workshop. 〈http://alta2017.alta.asn.au/alta2017-draft-proceedings.pdf〉. 〈halshs-01656683〉
