Abstract
Time and space constraints in foreign/second language (L2) instruction often restrict learners’ exposure to phonetic variability, a key factor in pronunciation development. High-Variability Phonetic Training (HVPT) offers a promising solution by exposing learners to phonetic variation; however, its implementation in instructional settings remains underexplored. This study investigates the integration of Text-To-Speech (TTS) technology with HVPT to provide varied L2 input in a semi-autonomous (beyond-the-classroom) environment. A mixed-methods pretest-posttest design examined discrete aspects of English pronunciation development, focusing on learners’ phonological awareness of past -ed allomorphy. Thirty Arabic-speaking adult ESL learners in Kuwait were divided into a Treatment Group (exposed to varied TTS voices) and a Control Group (exposed to a single TTS voice); both groups engaged in self-paced listening, categorization, and form-focused activities over four weeks. Results revealed significant improvements in phonological awareness for both groups, with no statistically significant difference between them. These findings contribute to ongoing debates about HVPT’s added value in semi-autonomous settings and suggest that TTS technology alone, whether implemented with HVPT or not, can effectively support phonological awareness, offering a flexible and accessible tool for L2 pronunciation practice.

This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright (c) 2025 Author and CALL-EJ
