From Sound to Sense and back again: The integration of lexical and speech processes

David Gow, Massachusetts General Hospital
Bob McMurray, Dept. of Brain and Cognitive Sciences, University of Rochester

The Speech Chain
Complex computations from sound to sense must be broken up for study.
Assume intermediate representations: Phonemes… Words… Syntactic Phrases…
[Diagram: a chain running from Sound to Sense]

The Standard Paradigm
[Diagram: levels stacked from Sound at the bottom through Phonemes, Words and Phonology to Sense at the top]
Delimited fields of study:
• Speech Perception
• Spoken Word Recognition
• Phonology
Phonemes* essential.
* or other sublexical category
Why?

Categorical Perception (CP)
Continuous Acoustic Detail (CAD) => Discrete Categories
Does CAD affect speech categorization?
[Figure: identification (% /pa/) and discrimination (%), each on a 0 to 100 scale, plotted against VOT from B to P]
• Sharp identification of tokens on a continuum.
• Discrimination poor within a phonetic category.

Categorical Perception (CP)
Defined fundamental computational problems.
CP is output of
• Speech perception
Input to
• Phonology
• Word recognition.
[Diagram: Sound, Phonemes, Words, Phonology, Sense]

But…
• Not all speech contrasts are categorical.
• Lots of tasks show non-categorical perception.
Fry, Abramson, Eimas & Liberman (1962); Pisoni & Tash (1974); Pisoni & Lazarus (1974); Carney, Widin & Viemeister (1977); Hary & Massaro (1982); Pisoni, Aslin, Perey & Hennessy (1982); Healy & Repp (1982); Massaro & Cohen (1983); Miller (1997); Samuel (1997)…

Why has the Standard Paradigm persisted?
Categorical Perception is about phonetic classification.
The minimal computational problem: compute meaning from sound.
CP tasks don't necessarily tap a stage of this problem.
Lexical activation… seems a good bet.
[Diagram: Sound, CP (?), Words, Sense]

Why has the Standard Paradigm persisted?
• Vowel Length
• Stress/Meter
• Coarticulation
Cue extra-segmental process.
Example: Word Segmentation
[Diagram: Words, Phonemes, CAD, Segmentation, Word Recognition]
Even when continuous acoustic detail affects word recognition, it is seen as outside of core word recognition.

Does continuous acoustic detail affect interpretation via core word-recognition processes?
• No. Standard Paradigm is fine…
• Yes. Hmm…
[Diagram label: Sublexical Filter (phonemes)]
Need to use stimuli with:
• Precise control over CAD
Need to use tasks that:
• reflect only minimal computational problem: meaning.
• are sensitive to acoustic detail.

Visual World Paradigm
• Subjects hear spoken language and manipulate objects in a visual world.
• Visual world includes set of objects with interesting linguistic properties (names).
• Eye-movements to each object are monitored throughout the task.
Tanenhaus, Spivey-Knowlton, Eberhard & Sedivy (1995); Allopenna, Magnuson & Tanenhaus (1998)

Visual World Paradigm
• Meaning based, natural task: Subjects must interpret speech to perform task.
• Fixation probability maps onto dynamics of lexical activation.
• Context is controlled: meaning lexical activation.
• Eye-movements fast and time-locked to speech.

Does continuous acoustic detail affect interpretation?
Is lexical activation sensitive to continuous acoustic detail?
McMurray, Tanenhaus & Aslin (2003)
Combine tools of
• speech perception: 9-step VOT continuum.
• spoken word recognition: visual world paradigm

Methods
• A moment to view the items.
• 500 ms later: "Bear".
• Repeat 1080 times…
[Figure: fixations on trials 1 to 5 laid out over time (200 ms scale marker). Target = Bear, Competitor = Pear, Unrelated = Lamp, Ship.]
[Figure: fixation proportion (0 to 0.9) over time (0 to 1600 ms) at VOT = 0]

Predictions
What would lexical sensitivity to CAD look like?
Systematic effect on competitor dynamics.
Fixations to the competitor.
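
The fixation-proportion measure plotted in the Methods slide can be sketched roughly as follows. This is a minimal illustration, not the analysis code used in the study: the function name, the (start, end, object) interval format and the 200 ms bin width are all assumptions made for the example.

```python
from collections import Counter


def fixation_proportions(trials, objects, bin_ms=200, end_ms=1600):
    """Proportion of trials fixating each object at the start of each time bin.

    trials: list of trials; each trial is a list of (start_ms, end_ms, object)
        fixation intervals timed from word onset (hypothetical format).
    objects: names of the objects on the screen.
    """
    if not trials:
        raise ValueError("need at least one trial")
    bin_starts = list(range(0, end_ms, bin_ms))
    counts = {t: Counter() for t in bin_starts}
    for trial in trials:
        for t in bin_starts:
            # Find the object being fixated at time t on this trial, if any.
            for start, end, obj in trial:
                if start <= t < end:
                    counts[t][obj] += 1
                    break
    n = len(trials)
    return {t: {obj: counts[t][obj] / n for obj in objects} for t in bin_starts}


# Two made-up trials with the display from the Methods slide.
example_trials = [
    [(0, 400, "lamp"), (400, 1600, "bear")],
    [(200, 600, "pear"), (600, 1600, "bear")],
]
props = fixation_proportions(example_trials, ["bear", "pear", "lamp", "ship"])
for t in sorted(props):
    print(t, props[t])
```

Plotting the "pear" column of the result separately for each VOT step would give one competitor curve per step, which is the kind of display the prediction above refers to.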
[Figure: predicted fixation proportion over time for target and competitor, one panel for a categorical effect and one for a gradient effect]

Results
[Figure: competitor fixations (0 to 0.16) by time since word onset (0 to 2000 ms), one curve per VOT step (0, 5, 10, 15, 20, 25, 30, 35, 40 ms), in separate panels by response]

Task?
Phoneme ID (P, L, B, Sh): Not part of minimal computational problem.
Same stimuli in metalinguistic task…
…more categorical pattern of fixations
Continuous acoustic detail is not helpful in metalinguistic tasks…

Summary
Word recognition shows gradient sensitivity to continuous acoustic detail.
Not extra-segmental: VOT
CAD affects higher-level processes.
Consistent with other studies:
Andruski, Blumstein & Burton (1994); Marslen-Wilson & Warren (1994); Utman, Blumstein & Burton (2000); Dahan, Magnuson, Tanenhaus & Hogan (2001); McMurray, Clayards, Aslin & Tanenhaus (2004); McMurray, Aslin, Tanenhaus, Spivey & Subik (in prep)

The Standard Paradigm?
[Diagram: Continuous Acoustic Detail, Phonemes, Words, Phonology, Sense]
CAD affects higher-level processes.
From other work:
• Lexical activation influences sublexical representations.
  Samuel & Pitt (2003); Magnuson, McMurray, Tanenhaus & Aslin (2003); Samuel (1997); Elman & McClelland (1988)
• Phonological regularity affects signal interpretation.
  Massaro & Cohen (1983); Halle, Segui, Frauenfelder & Meunier (1998); Pitt (1998); Dupoux, Kakehi, Hirose, Pallier & Mehler (1999)

Perhaps interaction and integration make sense.
Do they help solve sticky problems? YES

The Emerging Paradigm
Integration of work in:
• spoken word recognition
• speech perception
• phonology
New computations simplify old problems and solve new ones.
• Cognitive processes: Lexical activation & competition.
• Perceptual processes: sensitivity to CAD & perceptual grouping.
CAD is helpful in language comprehension.
• Word segmentation
• Coping with lawful variability due to assimilation
Combination of approaches helps solve both problems.

Lexical Segmentation
Some lexical processes can't work in the Standard Paradigm

The SWR Solution
[Figure: a phonetically transcribed stretch of speech shown with the words that match some part of it: active, act of a, department, depart, dip, art, part, par, are, in, mint]
Standard Paradigm: Template matching overgenerates
Frauenfelder & Peeters (1990)
[Figure: "succeed", "suck" and "seed" plotted by processing cycle]
• Overgeneration resolved through competition in TRACE (McClelland & Elman 1986)
Problem: What if the speaker is trying to say "suck seeds"?

The Speech Solution
Cues shown to affect segmentation:
• Initial strong syllable
• Initial lengthening
• Increased aspiration
• Increased glottalization
Lehiste, 1960; Garding, 1967; Lehiste, 1972; Umeda, 1975; Nakatani & Dukes, 1977; Nakatani & Schaffer, 1978; Cutler & Norris, 1988…
Implied processing model requires separate segmentation process
[Diagram: CAD, Phonemes, Words, Segmentation, Recognition]
Problem: cues are subtle and varied, extra-segmental processes are inelegant
Is there a better mechanism?

Gow & Gordon (1995)
"The proposal had a strange syntax that nobody liked."
• Syntax: GRAMMAR primed
• Tax: INCOME inhibited
"The proposal had a strange sin tax that nobody liked."
• Syntax: GRAMMAR primed
• Tax: INCOME primed
CAD
• affects interpretation.
• does not trigger segmentation.

Good Start Model
• Observation: All segmentation cues happen to enhance word-initial features
• Strengthened cues facilitate activation, making intended words stronger competitors
Incorporating CAD:
• Solves overgeneration problem.
• No extra-segmental segmentation process.
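
A toy sketch of the idea behind the Good Start Model may help: if word-initial cues are strengthened, candidates aligned with those cues receive more support and compete more effectively, with no separate segmentation step. The sketch is not the authors' model and not TRACE. The scoring rule (matched segments times onset strength, then normalized), the function name and the cue-strength values are all assumptions chosen only to illustrate the direction of the effect.

```python
def lexical_activations(candidates, onset_strength):
    """Return normalized activation for each candidate word.

    candidates: {word: (start_index, n_segments)}, a hypothetical alignment
        of each word with the input segments.
    onset_strength: per-segment values in (0, 1]; a stand-in for how strongly
        word-initial cues (aspiration, lengthening, ...) are realized there.
    """
    # Bottom-up support: matched segments, scaled by the strength of the
    # cues at the position where the word would have to begin.
    support = {
        word: n_segments * onset_strength[start]
        for word, (start, n_segments) in candidates.items()
    }
    # Competition: candidates share a fixed amount of activation.
    total = sum(support.values())
    return {word: s / total for word, s in support.items()}


# Input segments, indices 0..6: s I n t ae k s
CANDIDATES = {"syntax": (0, 7), "sin": (0, 3), "tax": (3, 4)}

# Assumed cue strengths: the /t/ at index 3 is weak when word-medial
# ("syntax") and strengthened when word-initial ("sin tax").
SYNTAX_INPUT = [1.0, 0.5, 0.5, 0.3, 0.5, 0.5, 0.5]
SIN_TAX_INPUT = [1.0, 0.5, 0.5, 1.0, 0.5, 0.5, 0.5]

for label, strengths in [("syntax", SYNTAX_INPUT), ("sin tax", SIN_TAX_INPUT)]:
    print(label, lexical_activations(CANDIDATES, strengths))
```

With these made-up numbers, "tax" is a weak competitor when the /t/ cues are weak and a much stronger one when they are strengthened, while "syntax" stays active in both cases. That is the qualitative pattern of the priming results above, obtained here only through activation and competition.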
Source for the Good Start Model: Gow & Gordon (1995)

Summary
When continuous acoustic detail affects lexical activation, speech and SWR models can be integrated and simplified

Assimilation
The emerging paradigm reframes computational problems

Redefining Computational Problems
English coronal place assimilation
/coronal # labial/ → [labial # labial]
/coronal # velar/ → [velar # velar]
Standard Paradigm: Change is
• discrete
• phonemically neutralizing
[ G I m ] # berries: nonword?
[ a I p ] # berries: ripe berries? right berries?

Standard Paradigm solution: Phonological inference (Gaskell & Marslen-Wilson, 1996; 1998; 2001)
Knowledge driven inference: If [labial # labial] infer /coronal # labial/
• "greem beans": green (Gaskell & Marslen-Wilson, 1996; Gow, 2001)
• "ripe berries": right (Gaskell & Marslen-Wilson, 2001; Gow, 2002); ripe
Moreover: Assimilation effects dissociated from linguistic knowledge (Gow & Im, in press)

Assimilation Produces CAD
Assimilatory modification is acoustically continuous
[Figure: F2 transitions in /æC/ contexts. Frequency (Hz, 1550 to 1850) by pitch period for coronal, assimilated and labial tokens.]
[Figure: F3 transitions in /æC/ contexts. Frequency (Hz, 2550 to 2800) by pitch period for coronal, assimilated and labial tokens.]
This is not discrete feature change!

Regressive Context Effects
Select the Sma t ca p box
[Figure: subject hears assimilated non-coronal (cat/p box). Fixation proportion (0 to 0.6) over time (0 to 1600 ms) for Coronal (cat) and Non-Coronal (cap).]
[Figure: subject hears assimilated non-coronal (cat/p drawing). Fixation proportion (0 to 0.6) over time (0 to 1600 ms) for Coronal (cat) and Non-Coronal (cap).]

Progressive Context Effects
[Figure: looks to final non-coronal (box). Fixation proportion (0 to 0.7) over time (0 to 1600 ms); legend: Assim Non-Coronal, Coronal, Non-Coronal.]
Progressive effect in the same experiment

Assimilation: Use of CAD
Assimilation is resolved through phonological context.
Partially-assimilated items show
• regressive context effects (Gow, 2002; 2003)
• progressive context effects (Gow, 2001; 2003)
Fully assimilated items show neither* (Gaskell & Marslen-Wilson, 2001; Gow, 2002; 2003)
[Diagram: assimilation # context]
Infinite regress (eternal ambiguity)… or something more interesting?
Continuous acoustic detail is subject to basic perceptual processes

A Perceptual Account
Feature cue parsing (Gow, 2003)
[Figure: spectrogram, 0 to 3000 Hz over 0 to 0.760454 s, with segment labels k, t, b, p, l]
• Features encoded by multiple cues that are integrated
• Assimilation creates cues consistent with multiple places
• Extract feature cues
• Group feature cues by similarity and resolve ambiguity

Feature cue parsing (Gow, 2003)
example: eight….
catp # box: [cor] [lab] [LAB]
catp # drawing: [cor] [COR] [lab]
catp #: [cor] [lab]
Progressive and regressive effects fall out of grouping

Summary
• SWR problem (eternal ambiguity) replaced by simpler perceptual problem
• CAD important in solution: processing obstacle facilitates perception.
• Integration of continuous perceptual features facilitates higher-level processes.
• Facilitation via core word-recognition mechanisms; no extra-segmental routines required.

The Standard Paradigm
Standard paradigm
• Created artificial boundaries that misframed issues.
• Continuous acoustic detail is variability to be conquered.
The basis of the standard paradigm is undercut.
• Meaning-based processes are affected by CAD.
• CAD is an essential component of word recognition.

The Emerging Paradigm
The emerging paradigm
• Emphasis on methodologies that tap the minimal computational problem: meaning.
• Stresses integration of speech and spoken word recognition, questions methods and theory.
• Continuous acoustic detail is useful signal, not noise.
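
As a closing illustration, the grouping step described under "A Perceptual Account" (extract feature cues, group them by similarity with the context, interpret what is left) can be sketched as a toy. This is an assumption-laden simplification and not the model in Gow (2003): cues are reduced to bare place labels, grouping is exact label matching, and the two-word lexicon and function names are hypothetical.

```python
def parse_feature_cues(cues, context_place=None):
    """Split place cues between the word-final segment and the context.

    cues: place cues recovered from the assimilated word-final segment,
        e.g. ["cor", "lab"] for "catp".
    context_place: place of the following word's onset, or None if there
        is no following context.
    Returns (cues left with the word-final segment, cues grouped with context).
    """
    grouped = [c for c in cues if c == context_place]
    remaining = [c for c in cues if c != context_place]
    return remaining, grouped


# Toy lexicon for the frame "ca_": which word each final place supports.
WORD_FOR_PLACE = {"cor": "cat", "lab": "cap"}


def interpret(cues, context_place=None):
    """Return (candidate words for the first item, cues given to the context)."""
    remaining, grouped = parse_feature_cues(cues, context_place)
    return [WORD_FOR_PLACE[c] for c in remaining], grouped


print("catp # box     ->", interpret(["cor", "lab"], "lab"))
print("catp # drawing ->", interpret(["cor", "lab"], "cor"))
print("catp #         ->", interpret(["cor", "lab"]))
```

In this toy, the labial cue groups with "box" and leaves "cat", the coronal cue groups with "drawing" and leaves "cap", and without context both candidates remain. The cue handed to the context is what a progressive effect would draw on, and the cue left behind is what a regressive effect would draw on.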