Abdel-Hamid, O., Mohamed, A. R., Jiang, H., Deng, L., Penn, G., & Yu, D. (2014). Convolutional neural networks for speech recognition. IEEE Transactions on Audio, Speech and Language Processing, 22(10), 1533–1545. https://doi.org/10.1109/taslp.2014.2339736
Adami, A., & Hermansky, H. (2003). Segmentation of speech for speaker and language recognition. EUROSPEECH-2003 (pp. 841–844). Geneva. Retrieved from https://www.academia.edu/32317887/Segmentation_of_speech_for_speaker_and_language_recognition. https://doi.org/10.21437/Eurospeech.2003-189
Amodei, D., Anubhai, R., Battenberg, E., Case, C., Casper, J., Catanzaro, B., ... Narang, S. (2015). Deep Speech 2: End-to-End Speech Recognition in English and Mandarin. CoRR, abs/1512.02595. Retrieved from https://arxiv.org/abs/1512.02595v1
Ashby, M., & Maidment, J. (2005). Introducing phonetic science. Cambridge University Press. https://doi.org/10.1017/CBO9780511808852
Bartz, C., Herold, T., Yang, H., & Meinel, C. (2017). Language Identification Using Deep Convolutional Recurrent Neural Networks. In D. Liu, S. Xie, Y. Li, D. Zhao, & E. El-Alfy (Eds.), Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science (vol. 10639). Springer. https://doi.org/10.1007/978-3-319-70136-3_93
Boussard, J., Deveau, A., & Pyron, J. (2017). Methods for Spoken Language Identification. Retrieved from http://cs229.stanford.edu/proj2017/final-reports/5239784.pdf
Eberhard, D. M., Simons, G. F., & Fennig, C. D. (Eds.). (2020). Ethnologue: Languages of the World. Retrieved from http://www.ethnologue.com
Kirchhoff, K. (2006). Language characteristics. In T. Schultz, & K. Kirchhoff (Eds.), Multilingual Speech Processing (pp. 5–33). Elsevier. https://doi.org/10.1016/B978-012088501-5/50005-6
Li, H., Ma, B., & Lee, K. A. (2013). Spoken Language Recognition: From Fundamentals to Practice. Proceedings of the IEEE, 101(5), 1136–1159. https://doi.org/10.1109/JPROC.2012.2237151
Muthusamy, Y. K., Cole, R., & Oshika, B. (1992). The OGI multi-language telephone speech corpus. Int. Conf. Spoken Lang. Process, 895–898. Retrieved from https://pdfs.semanticscholar.org/aad7/274fdd57191e89f9df2880a50ec14581d671.pdf. https://doi.org/10.21437/ICSLP.1992-276
Navratil, J. (2001). Spoken language recognition: A step toward multilinguality in speech processing. IEEE Transactions on Speech and Audio Processing, 9(6), 678–685. https://doi.org/10.1109/89.943345
Park, D. S., Chan, W., Zhang, Y., Chiu, C.-C., Zoph, B., Cubuk, E. D., & Le, Q. V. (2019). SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition. Proc. Interspeech 2019 (pp. 2613–2617). https://doi.org/10.21437/interspeech.2019-2680
Ramus, F., & Mehler, J. (1999). Language identification with suprasegmental cues: A study based on speech re-synthesis. Journal of the Acoustical Society of America, 105(1), 512–521. https://doi.org/10.1121/1.424522
Safitri, N. E., Zahra, A., & Adriani, M. (2016). Spoken Language Identification with Phonotactics Methods on Minangkabau, Sundanese, and Javanese Languages. Procedia Computer Science, 81, 182–187. https://doi.org/10.1016/j.procs.2016.04.047
Sugiyama, M. (1991). Automatic language recognition using acoustic features. International Conference on Acoustics, Speech, and Signal Processing (pp. 813–816). Toronto. https://doi.org/10.1109/icassp.1991.150461
Torres-Carrasquillo, P., Singer, E., Kohler, M., Greene, R., Reynolds, D., & Deller, J. (2002). Approaches to language identification using Gaussian mixture models and shifted delta cepstral features. In ICSLP-2002 (pp. 89–92). Denver. https://doi.org/10.21437/ICSLP.2002-74
Zhao, J., Shu, H., Zhang, L., Wang, X., Gong, Q., & Li, P. (2008). Cortical competition during language discrimination. NeuroImage, 43(3), 624–633. https://doi.org/10.1016/j.neuroimage.2008.07.025
Zissman, M. A. (1993). Automatic language identification using Gaussian mixture and hidden Markov models. IEEE International Conference on Acoustics, Speech and Signal Processing (Vol. 2, pp. 399–402). IEEE. https://doi.org/10.1109/ICASSP.1993.319323
Zissman, M. A. (1996). Comparison of four approaches to automatic language identification of telephone speech. IEEE Transactions on Speech and Audio Processing, 4(1), 31–44. https://doi.org/10.1109/TSA.1996.481450