Acoustic Comparison of Malaysian and Nigerian English Accents

A. F. Abdulwahab, S. A. Mohd Yusof, H. Husni

Abstract


This study examines differences in spectral and cepstral acoustic features between Malaysian English (ME) and Nigerian English (NE) accents, with the aim of determining the effect of accent on these features of speech. Accent has received considerable attention from automatic speech recognition (ASR) researchers because it is a major source of ASR performance degradation. Most ASR applications were developed with speech samples from native English speakers, disregarding the fact that the majority of their potential users speak English as a second language with a marked accent. Malaysia and Nigeria were both colonized by Britain, and both use English as an official or second language despite being multi-ethnic nations. The results show that formant values, especially F1 and F2, can be used to differentiate between ME and NE accents. Cepstral features (mel-frequency cepstral coefficients, MFCC) perform better in accent recognition than formant features, while combining formants and MFCC yields better classification performance. However, the effect of the formants is non-uniform and depends on the vowels and accents under consideration, as each formant has a different predictive value. Classification rates show that the Multi-Layer Perceptron (MLP) performs better than K-nearest neighbors (KNN).
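As a rough illustration of how formant values might separate two accent classes, the following is a minimal sketch of a K-nearest-neighbors vote over (F1, F2) pairs. It is not the authors' pipeline: the function name `knn_predict`, the labels `accent_A` and `accent_B`, and every formant value are hypothetical, synthetic placeholders chosen only to make the example run.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train: list of ((f1_hz, f2_hz), label) pairs
    query: (f1_hz, f2_hz) tuple for the vowel token to classify
    """
    # Sort training points by Euclidean distance to the query and keep k of them.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Majority vote over the labels of the k nearest neighbors.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Synthetic toy (F1, F2) values in Hz. These are NOT measurements from the study.
train = [
    ((300.0, 2300.0), "accent_A"),
    ((320.0, 2250.0), "accent_A"),
    ((310.0, 2350.0), "accent_A"),
    ((400.0, 2000.0), "accent_B"),
    ((420.0, 1950.0), "accent_B"),
    ((410.0, 2050.0), "accent_B"),
]

print(knn_predict(train, (315.0, 2280.0)))  # nearest neighbors are all accent_A
print(knn_predict(train, (405.0, 1990.0)))  # nearest neighbors are all accent_B
```

With raw values in Hz, F2 tends to dominate the distance because of its larger numeric range, so a practical setup would probably standardize each feature first. Combining feature sets, as the abstract describes for formants and MFCC, could be sketched by concatenating the two vectors per token before the distance computation.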

Keywords


Accent Recognition; Automatic Speech Recognition; Formant Analysis; KNN


Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.

ISSN: 2180-1843

eISSN: 2289-8131