Error correction in text-to-speech systems
September 17, 2005
Arons, B. SpeechSkimmer: A system for interactively skimming recorded speech. ACM Transactions on Computer-Human Interaction 4, 1 (1997).
Bacchiani, M., Hirschberg, J., Rosenberg, A., Whittaker, S., Hindle, D., Isenhour, P., Jones, M., Stark, L. and Zamchick, G. SCANMail: Audio navigation in the voicemail domain. Proc. Conference on Human Language Technology Research 2001, ACM Press (2001), 1-3.
Boreczky, J., Girgensohn, A., Golovchinsky, G., and Uchihashi, S. An Interactive Comic Book Presentation for Exploring Video. Proc. CHI 2000, ACM Press (2000), 185-192.
Chase, L. Word and acoustic confidence annotation for large vocabulary speech recognition. Proc. Eurospeech 1997, (1997), 815-818.
Degen, L., Mander, R., and Salomon, G. Working with Audio. Proc. CHI 1992, ACM Press (1992), 413-418.
Feng, J. and Sears, A. Using confidence scores to improve hands-free speech based navigation in continuous dictation systems. ACM Transactions on Computer-Human Interaction 11, 4 (2004), 329-356.
Hakkani-Tür, D., Béchet, F., Riccardi, G. and Tür, G. Beyond ASR 1-Best: Using word confusion networks in spoken language understanding. Journal of Computer Speech and Language, Elsevier, (To appear).
Hauptmann, A. and Witbrock, M. Informedia: News-on-demand multimedia information acquisition and retrieval. Intelligent Multimedia Information Retrieval, AAAI Press (1997), 213-239.
Hazen, T., Polifroni, J., and Seneff, S. Recognition confidence scoring for use in speech understanding systems. Computer Speech and Language 16, (2002), 49-67.
Hindus, D., Schmandt, C., and Horner, C. Capturing, structuring, and representing ubiquitous audio. ACM Transactions on Information Systems 11, 4 (1993), 376-400.
Karat, C., Halverson, C., Karat, J., and Horn, D. Patterns of entry and correction in large vocabulary continuous speech recognition systems. Proc. CHI 1999, ACM Press (1999), 568-575.
Kazman, R., Al-Halimi, R., Hunt, W., and Mantei, M. Four paradigms for indexing videoconferences. IEEE Multimedia 3, 1 (1996), 63-73.
Moran, T., Palen, L., Harrison, S., Chiu, P., Kimber, D., Minneman, S., van Melle, W., and Zellweger, P. “I’ll get that off the audio”: Salvaging in a multimedia meeting. Proc. CHI 1997, ACM Press (1997), 202-209.
Oviatt, S. Taming Recognition Errors with a Multimodal Interface. Communications of the ACM 43, ACM Press (2000), 45-51.
Stark, L., Whittaker, S., and Hirschberg, J. ASR satisficing: The effects of ASR accuracy on speech retrieval. Proc. International Conference on Spoken Language Processing, (2000).
Stifelman, L., Arons, B., and Schmandt, C. The audio notebook: Paper and pen interaction with structured speech. Proc. CHI 2001, ACM Press (2001), 182-189.
Suhm, B., Myers, B. and Waibel, A. Multimodal error correction for speech user interfaces. ACM Transactions on Computer-Human Interaction 8, 1 (2001), 60-98.
Vemuri, S., DeCamp, P., Bender, W., and Schmandt, C. Improving speech playback using time-compression and speech recognition. Proc. CHI 2004, ACM Press (2004), 295-302.
Whittaker, S. and Amento, B. Semantic Speech Editing. Proc. CHI 2004, ACM Press (2004), 527-534.
Whittaker, S., Davis, R., Hirschberg, J., and Muller, U. Jotmail: A voicemail interface that enables you to see what was said. Proc. CHI 2000, ACM Press (2000), 89-96.
Whittaker, S., Hirschberg, J., Amento, B., Stark, L., Bacchiani, M., Isenhour, P., Stead, L., Zamchick, G. and Rosenberg, A. SCANMail: A voicemail interface that makes speech browsable, readable, and searchable. Proc. CHI 2002, ACM Press (2002), 275-282.
Whittaker, S., Hyland, P., and Wiley, M. Filochat: Handwritten notes provide access to recorded conversations. Proc. CHI 1994, ACM Press (1994), 271-277.
Wilcox, L., Chen, F., Kimber, D., and Balasubramanian, V. Segmentation of speech using speaker identification. Proc. International Conference on Acoustics, Speech, and Signal Processing (1994), 161-164.