It is well known that phonemes have different acoustic realizations depending on the context. Thus, for example, the phoneme /t/ is typically realized with a heavily aspirated strong burst at the beginning of a syllable, as in the word Tom, but without a burst at the end of a syllable in a word like cat. Variation such as this is often considered to be problematic for speech recognition:

(1) In most systems for sentence recognition, such modifications must be viewed as a kind of noise that makes it more difficult to hypothesize lexical candidates given an input phonetic transcription. To see that this must be the case, we note that each phonological rule [in a certain example] results in irreversible ambiguity -- the phonological rule does not have a unique inverse that could be used to recover the underlying phonemic representation for a lexical item. For example, . . . schwa vowels could be the first vowel in a word like about or the surface realization of almost any English vowel appearing in a sufficiently destressed word. The tongue flap [ɾ] could have come from a /t/ or a /d/. [65, pp. 548-549]

This view of allophonic variation is representative of much of the speech recognition literature, especially during the late 1970s. One can find similar statements by Cole and Jakimik [22] and by Jelinek [50].
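
The claim in (1) that a phonological rule "does not have a unique inverse" can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the table `SURFACE_TO_UNDERLYING`, the function `underlying_candidates`, the small vowel set standing in for "almost any English vowel", and the writer/rider example do not come from the quoted source.

```python
from itertools import product

# Hypothetical toy table (an assumption, not from the quoted source):
# each ambiguous surface phone maps to the underlying phonemes that
# could have produced it.
SURFACE_TO_UNDERLYING = {
    "ɾ": ["t", "d"],                      # a flap may come from /t/ or /d/
    "ə": ["ə", "a", "e", "i", "o", "u"],  # schwa may be a reduced form of many vowels
}

def underlying_candidates(surface):
    """Return every underlying phoneme string consistent with a surface phone list."""
    # Phones missing from the table are treated as unambiguous.
    options = [SURFACE_TO_UNDERLYING.get(phone, [phone]) for phone in surface]
    return ["".join(combo) for combo in product(*options)]

# One flapped surface form is consistent with two underlying forms
# (illustrative example: "writer" versus "rider").
flapped = underlying_candidates(["r", "aɪ", "ɾ", "ɚ"])

# Ambiguities multiply: one flap and one schwa give 2 * 6 candidates.
combined = underlying_candidates(["b", "ɾ", "ə"])
```

Under these assumptions, `flapped` holds two candidate strings and `combined` holds twelve. The number of hypotheses grows as the product of the per-phone ambiguities, which is the sense in which the quoted passage treats such rules as a source of noise for lexical lookup.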

