[CANCELLED] Accuracy Evaluation of Google and Microsoft’s Computer-assisted Segmental Feedback Features (75527)

Session Information: General CALL
Session Chair: Emerita

Monday, 13 November 2023 10:15
Session: Session 1
Room: Room A (Live Stream)
Presentation Type: Paper Presentation

All presentation times are UTC + 7 (Asia/Bangkok)

As technology for second language (L2) pronunciation training continues to develop, new opportunities arise for creating new ways of providing feedback and enhancing existing ones. The influence of segmental features on L2 pronunciation has been extensively investigated (Munro & Derwing, 2006; Suzukida & Saito, 2021), and the effect of segmental feedback on learners’ pronunciation improvement is well established (Neri et al., 2008; Olson, 2014). However, the technology that provides this type of feedback remains limited, and the freely available tools have not been validated or evaluated for the accuracy of their mispronunciation detection.
This study examines Google’s pronunciation practice and Microsoft Speech and the accuracy of the feedback they provide. Google’s pronunciation practice is completely free and accessible to users, while Microsoft Speech provides free feedback for up to 500 hours of recordings. The main goal of this study is to investigate feedback accuracy, operationalized as the extent to which the tools’ feedback is consistent with that of human raters. We used individual content words (N=150) from 50 utterances produced by non-native speakers (from the L2-Arctic database; see Zhao et al., 2018), and we used Google’s pronunciation practice and Microsoft Speech to provide segmental feedback on each word. Each word was also annotated for segmental errors by an expert phonetician, and these annotations served as the baseline for comparison with the two tools. To assess the tools’ reliability, we used mixed effects modeling and intraclass correlation (ICC) coefficients.
Results showed that Google’s pronunciation practice and Microsoft Speech are equivalently reliable, i.e., neither tool is more accurate than the other. Furthermore, although human annotations were significantly more reliable than Google’s pronunciation practice, we concluded that this tool is adequate as a complement to L2 pronunciation teaching and learning. In this presentation, we discuss the details of the results and what they could mean for future research and further technological developments. Finally, we assess the significance of our results for pedagogical implementations and the feasibility of the tools for both individual learner and classroom use.
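The agreement analysis described above can be illustrated with a small sketch. The abstract does not state which ICC form was used, so the example below assumes one common choice, ICC(2,1) (two-way random effects, absolute agreement, single rater), and the function name `icc2_1` and the toy error counts are hypothetical, not the study's data or code.

```python
from typing import Sequence


def icc2_1(scores: Sequence[Sequence[float]]) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `scores` has one row per rated item (e.g. a word) and one column per
    rater (e.g. the human annotator and one tool). This ICC form is an
    assumption; the abstract does not specify the variant used.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two rated items")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a rectangular table with at least two raters")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: items (rows), raters (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    denom = msr + (k - 1) * mse + k * (msc - mse) / n
    if denom == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (msr - mse) / denom


# Hypothetical per-word segmental error counts: [human annotator, tool].
toy_scores = [[0, 0], [1, 1], [2, 1], [0, 1], [3, 3], [1, 0]]
print(f"ICC(2,1) on toy data: {icc2_1(toy_scores):.3f}")
```

Values near 1 indicate that a tool's per-word judgments closely track the human annotations, while values near 0 indicate little agreement beyond chance variation between items.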


Abstract Summary
The increasing development of technology for second language pronunciation training presents opportunities for improving feedback methods as well. The impact of segmental features on L2 pronunciation improvement is well established, but the technology for providing such feedback is limited in availability and lacks transparent accuracy evaluation. This study investigates the accuracy of the segmental feedback of Google's pronunciation practice and Microsoft Speech (both accessible for free) on 150 recorded content words from 50 utterances (L2-Arctic database). The results show that neither tool is more accurate than the other, and although human annotations are more reliable, both tools are adequate for complementing L2 pronunciation teaching and learning. The study's findings inform future research and technological developments and help assess the feasibility of these tools for individual learner and classroom use.

Authors:
Ivana Rehman, Iowa State University, United States


About the Presenter(s)
Dr Ivana Rehman is a University Assistant Professor/Lecturer at Iowa State University in the United States.

Posted by Amina Batbold

