Lian Arzbecker

Postdoctoral Researcher


Curriculum vitae


lian (at) arzbecker (dot) com


Speech Imaging Laboratory

Division of Communication Sciences, University of Wyoming



Validating automated speech timing methods across sentence, paragraph, and monologue tasks (submitted)


Conference presentation


Lian J. Arzbecker, Nathaniel Cline, Kris Tjaden
Motor Speech Conference, Tempe, AZ, February 2026

APA
Arzbecker, L. J., Cline, N., & Tjaden, K. (2026). Validating automated speech timing methods across sentence, paragraph, and monologue tasks (submitted). Motor Speech Conference.


Chicago/Turabian
Arzbecker, Lian J., Nathaniel Cline, and Kris Tjaden. “Validating Automated Speech Timing Methods across Sentence, Paragraph, and Monologue Tasks (Submitted).” Motor Speech Conference (February 2026).


MLA
Arzbecker, Lian J., et al. “Validating Automated Speech Timing Methods across Sentence, Paragraph, and Monologue Tasks (Submitted).” Motor Speech Conference, Feb. 2026.


BibTeX

@inproceedings{lian2026a,
  title = {Validating automated speech timing methods across sentence, paragraph, and monologue tasks (submitted)},
  year = {2026},
  month = feb,
  address = {Tempe, AZ},
  booktitle = {Motor Speech Conference},
  author = {Arzbecker, Lian J. and Cline, Nathaniel and Tjaden, Kris}
}

Abstract

Precise measurement of speech timing is critical for assessing motor speech disorders such as dysarthria. Speaking and articulation rates are common metrics, but manual analysis is time-consuming and impractical clinically [1,2]. Timing estimates vary by neurological diagnosis [3] and speech task [4], making task selection critical. This study evaluates a Praat-based algorithm [5] for automated estimation of speech timing, comparing it to manual measurements across 60 speakers with multiple sclerosis (MS), Parkinson's disease (PD), or no neurological diagnosis (controls), using sentence, paragraph, and monologue tasks (N = 180 recordings).

Speaking rate was operationalized as syllable count divided by total duration (including pauses); articulation rate used the same calculation, excluding pauses [6,7]. Manual measures used syllable counts and acoustic segmentation. Automated measures used a Praat script detecting syllable nuclei via amplitude dips. Default script parameters underestimated timing, prompting iterative peak dip threshold optimization. Analyses (separate for default and optimized) included descriptive metrics, a generalizability study, and linear mixed-effects models.
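The two rate definitions above reduce to simple arithmetic. A minimal Python sketch with made-up numbers (not the study's analysis code, which uses a Praat script) illustrates the difference between the two metrics:

```python
def speaking_rate(n_syllables: int, total_dur_s: float) -> float:
    """Syllables per second; pauses are included in the denominator."""
    return n_syllables / total_dur_s

def articulation_rate(n_syllables: int, total_dur_s: float,
                      pause_dur_s: float) -> float:
    """Syllables per second over speech time only; pauses are excluded."""
    return n_syllables / (total_dur_s - pause_dur_s)

# Hypothetical sample: 50 syllables in a 20 s recording with 5 s of pauses
print(speaking_rate(50, 20.0))           # 2.5 syllables/s
print(articulation_rate(50, 20.0, 5.0))  # ~3.33 syllables/s
```

Because pauses shrink only the articulation-rate denominator, articulation rate is always at least as high as speaking rate for the same recording, and the gap between the two grows with total pause time.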

Speaking and articulation rates differed by group and task but were consistent across measurement methods: PD speakers had the fastest rates, MS speakers the slowest, and reading tasks yielded faster rates than monologues (Figure 1). Default automated estimates correlated with manual measures in 13 of 18 group-task combinations (r = .681–.998). A generalizability study revealed that speech task explained the most variance in speaking rate (38.3%), while measurement method accounted for the majority of articulation rate variance (47.5%). Optimizing the Praat script reduced mean proportional error in default automated rates by ~60%, though improvements were not observed in all group-task combinations (Figure 2).
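The ~60% error reduction above is stated in terms of mean proportional error. Assuming the standard definition, the average of |automated − manual| / manual over paired recordings, it can be computed as follows (hypothetical rate values, not the study's data):

```python
def mean_proportional_error(automated, manual):
    """Average of |automated - manual| / manual across paired recordings.

    Assumes the conventional definition of proportional error; the
    abstract does not spell out the exact formula used.
    """
    return sum(abs(a - m) / m for a, m in zip(automated, manual)) / len(manual)

# Hypothetical paired rates (syllables/s) for three recordings
auto_rates   = [3.0, 4.5, 2.0]
manual_rates = [4.0, 5.0, 2.5]
print(mean_proportional_error(auto_rates, manual_rates))  # ~0.183, i.e. ~18.3%
```

Normalizing by the manual value makes errors comparable across speakers and tasks with very different baseline rates, which matters when pooling MS, PD, and control recordings.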

Optimizing automated speech timing methods significantly improved alignment with manual measurements, especially for structured tasks and neurotypical speech. However, accuracy varied across speaker group and task. Articulation rate showed greater sensitivity to the measurement method, likely due to Praat’s difficulty detecting syllable boundaries, particularly in MS and PD speakers. The labor-intensive nature of optimization limits clinical scalability, highlighting the need for adaptive algorithms. Despite limitations, automated tools show promise for low-burden monitoring in dysarthria. Future work should enhance generalizability through validation across populations.

References

  1. Dagenais, P. A., Southwood, M. H., & Lee, T. L. (1998). Rate reduction methods for improving speech intelligibility of dysarthric speakers with Parkinson’s disease. Journal of Medical Speech-Language Pathology, 6, 143–157.
  2. Green, J. R., Beukelman, D. R., & Ball, L. J. (2004). Algorithmic estimation of pauses in extended speech samples of dysarthric and typical speech. Journal of Medical Speech-Language Pathology, 12(4), 149–154.
  3. Rowe, H. P., Shellikeri, S., Yunusova, Y., Chenausky, K. V., & Green, J. R. (2023). Quantifying articulatory impairments in neurodegenerative motor diseases: A scoping review and meta-analysis of interpretable acoustic features. International Journal of Speech-Language Pathology, 25(4), 486–499.
  4. Duchin, S. W., & Mysak, E. D. (1987). Disfluency and rate characteristics of young adult, middle-aged, and older males. Journal of Communication Disorders, 20(3), 245–257.
  5. de Jong, N. H., & Wempe, T. (2009). Praat script to detect syllable nuclei and measure speech rate automatically. Behavior Research Methods, 41(2), 385–390.
  6. Tsao, Y.-C., Weismer, G., & Iqbal, K. (2006). Interspeaker variation in habitual speaking rate: Additional evidence. Journal of Speech, Language, and Hearing Research, 49(5), 1156–1164.
  7. Turner, G. S., & Weismer, G. (1993). Characteristics of speaking rate in the dysarthria associated with Amyotrophic Lateral Sclerosis. Journal of Speech, Language, and Hearing Research, 36(6), 1134–1144.

