Maximizing cochlear implant patients’ performance with advanced speech training procedures
Introduction
The cochlear implant (CI) is an electronic device that provides hearing sensation to patients with profound hearing loss. As the science and technology of the cochlear implant have developed over the past 50 years, CI users’ overall speech recognition performance has steadily improved. With the most advanced implant technology and speech processing strategies, many patients receive great benefit, and are capable of conversing with friends and family over the telephone. However, considerable variability remains in individual patient outcomes. Some patients receive little benefit from the latest implant technology, even after many years of daily use of the device. Much research has been devoted to exploring the sources of this variability in CI patient outcomes. Some studies have shown that patient-related factors, such as duration of deafness, are correlated with speech performance (Eggermont and Ponton, 2003, Kelly et al., 2005). Several psychophysical measures, including electrode discrimination (Donaldson and Nelson, 1999), temporal modulation detection (Cazals et al., 1994, Fu, 2002), and gap detection (Busby and Clark, 1999, Cazals et al., 1991, Muchnik et al., 1994), have also been correlated with speech performance.
Besides the high variability in CI patient outcomes, individual patients also differ in terms of the time course of adaptation to electric hearing. During the initial period of use, post-lingually deafened CI patients must adapt to differences between their previous experience with normal acoustic hearing and the pattern of activation produced by electrical stimulation. Many studies have tracked changes in performance over time in “naïve” or newly implanted CI users. These longitudinal studies showed that most gains in performance occur in the first 3 months of use (George et al., 1995, Gray et al., 1995, Loeb and Kessler, 1995, Spivak and Waltzman, 1990, Waltzman et al., 1986). However, continued improvement has been observed over longer periods for some CI patients (Tyler et al., 1997). Experienced CI users must also adapt to new electrical stimulation patterns provided by updated speech processors, speech processing strategies and/or changes to speech processor parameters. For these patients, the greatest gains in performance also occurred during the first 3–6 months, with little or no improvement beyond 6 months (e.g., Dorman and Loizou, 1997, Pelizzone et al., 1999).
These previous studies suggest that considerable auditory plasticity exists in CI patients. Because of the spectrally-degraded speech patterns provided by the implant, “passive” learning via long-term use of the device may not fully engage patients’ capacity to learn novel stimulation patterns. Instead, “active” auditory training may be needed to more fully exploit CI patients’ auditory plasticity and facilitate learning of electrically stimulated speech patterns. Some early studies assessed the benefits of auditory training in poor-performing CI patients. Busby et al. (1991) observed only minimal changes in speech performance after ten 1-h training sessions; note that the subject with the greatest improvement was implanted at an earlier age, and therefore had a shorter period of deafness. Dawson and Clark (1997) reported more encouraging results for ten 50-min vowel recognition training sessions in CI users; four out of the five CI subjects showed a modest but significant improvement in vowel recognition, and improvements were retained three weeks after training was stopped.
While there have been relatively few CI training studies, auditory training has been shown to be effective in the rehabilitation of children with central auditory processing disorders (Hesse et al., 2001, Musiek et al., 1999), children with language-learning impairment (Merzenich et al., 1996, Tallal et al., 1996), and hearing aid (HA) users (Sweetow and Palmer, 2005). Studies with normal-hearing (NH) listeners also show positive effects for auditory training. As there are many factors to consider in designing efficient and effective training protocols for CI users (e.g., generalization of training benefits, training method, training stimuli, duration of training, frequency of training, etc.), these studies can provide valuable guidance.
In terms of positive training outcomes, it is desirable that some degree of generalized learning occurs beyond the explicitly trained stimuli and tasks, i.e., some improvement in overall perceptual acuity. The degree of generalization may be influenced by the perceptual task, and the relevant perceptual markers. For example, Fitzgerald and Wright (2005) found that auditory training significantly improved sinusoidally-amplitude-modulated (SAM) frequency discrimination with the trained stimuli (150 Hz SAM). However, the improvement did not generalize to untrained stimuli (i.e., 30 Hz SAM, pure tones, spectral ripples, etc.), suggesting that the learning centered on the acoustic properties of the signal, rather than the general perceptual processes involved in temporal frequency discrimination. In an earlier study, Wright et al. (1997) found that for temporal interval discrimination, improvements for the trained interval (100 ms) generalized to untrained contexts (i.e., different border tones); however, the improved temporal interval perception did not generalize to other untrained intervals (50, 200, or 500 ms). While these NH studies employed relatively simple stimuli, the results imply that the degree and/or type of generalization may interact with the perceptual task and the training/test stimuli. Some acoustic CI simulation studies with NH listeners have also reported generalized learning effects. For example, Nogaki et al. (2007) found that spectrally-shifted consonant and sentence recognition significantly improved after five training sessions that targeted medial vowel contrasts (i.e., consonant and sentence recognition were not explicitly trained). In contrast, some acoustic CI simulation studies show that auditory training may not fully substitute for preserving or restoring the normal acoustic input, especially in terms of frequency-to-cochlear place mapping (e.g., Faulkner, 2006, Smith and Faulkner, 2006). Thus, training benefits may be limited for actual CI users. It is unclear whether benefits of training with simple stimuli (e.g., tones, bursts, speech segments, etc.) and simple tasks (e.g., electrode discrimination, modulation detection, phoneme identification, etc.) will extend to complex stimuli and tasks (e.g., open set sentence recognition).
It is also important to consider the most efficient and effective time course of training. How long and/or how often must training be performed to significantly improve performance? Extending their previous psychophysical studies, Wright and Sabin (2007) found that frequency discrimination required a greater amount of training than did temporal interval discrimination, suggesting that some perceptual tasks may require longer periods of training. Nogaki et al. (2007) reported that, for NH subjects listening to severely-shifted acoustic CI simulations, the total amount of training may matter more than the frequency of training. In the Nogaki et al. (2007) study, all subjects completed five one-hour training sessions at the rate of 1, 3, or 5 sessions per week; while more frequent training seemed to provide a small advantage, there was no significant difference between the three training rates.
Different training methods have been shown to affect training outcomes in acoustic CI simulation studies with NH listeners. For example, Fu et al. (2005a) compared training outcomes for severely-shifted speech for four different training protocols: test-only (repeated testing equal to the amount of training), preview (direct preview of the test stimuli immediately before testing), vowel contrast (targeted medial vowel training with novel monosyllable words, spoken by novel talkers), and sentence training (modified connected discourse). Medial vowel recognition (tested with phonemes presented in a h/V/d context) did not significantly change during the 5-day training period with the test-only and sentence training protocols. However, vowel recognition significantly improved after training with the preview or vowel contrast protocols. Different protocols may be appropriate for different listening conditions, and may depend on the degree of listening difficulty. For example, Li and Fu (2007) found that, for NH subjects listening to acoustic CI simulations, the degree of spectral mismatch interacted with the different protocols used to train NH subjects’ vowel recognition. In that study, subjects were repeatedly tested over the five-day study period while listening to either moderately- or severely-shifted speech. During testing, the 12 response boxes were either lexically or non-lexically labeled. With moderately-shifted speech, vowel recognition improved with either the lexical or non-lexical labels, i.e., evidence of “automatic” learning. With severely-shifted speech, vowel recognition improved with the non-lexical labels (i.e., discrimination improved), but not with the lexical labels (i.e., no improvement in identification), suggesting that difficult listening conditions may require explicit training with meaningful feedback. For CI users, these studies imply that different training protocols may be needed to address individual deficits or difficult listening conditions.
Finally, different training materials may influence training outcomes. Is it better to train with a well-known or well-defined group of stimuli, or does variability in the stimulus set provide better adaptation? Will training with relatively difficult stimuli improve recognition of both easy and difficult stimuli, or will enhanced or simplified training stimuli provide better outcomes? There is positive evidence for both approaches. For example, Tallal et al. (1996) found that, for language-learning-impaired children, modified speech signals (i.e., prolonged duration and enhanced envelope modulation) improved recognition of both the modified and unprocessed speech signals during a 4-week training period. The modest benefits of auditory training in CI users observed by Busby et al. (1991) and Dawson and Clark (1997) may have been limited by the small stimulus sets used for training. Intuitively, it seems that more varied training stimuli may provide greater adaptation.
In the following sections, we describe some of our recent auditory training studies in CI users. Some sections report preliminary data obtained with one or two subjects, while others summarize previously published work with 10 or more subjects. Unlike many auditory training studies in NH listeners that use separate experimental control groups, we adopted a “within-subject” control procedure for most of our CI auditory training studies. Inter-subject variability is a well-known phenomenon in CI research, and for these sorts of intensive training studies it is difficult to establish comparable performance groups to provide appropriate experimental controls. Overcoming the high variability of speech performance across CI patients with a between-group design would require large numbers of CI subjects. A within-subject control procedure provides an alternative method to evaluate the effectiveness of auditory training using a relatively small number of subjects. In any training study, it is important to determine whether performance improvements are due to perceptual learning or to procedural learning (i.e., improvements that result from learning the response demands of the task). To minimize procedural learning effects, we repeatedly measured baseline performance in each subject (in some cases for 1 month or longer) until achieving asymptotic performance. For each subject, mean asymptotic performance was compared to post-training results to determine the effect of training.
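The within-subject comparison described above can be illustrated with a minimal sketch. The text does not state how asymptotic performance was defined, so the criterion used here (the last `window` baseline scores spanning no more than `tolerance` percentage points) is purely an assumption for illustration, as are the function names, the default values, and the example scores.

```python
from statistics import mean


def has_reached_asymptote(scores, window=3, tolerance=3.0):
    """Assumed criterion: the last `window` baseline scores (percent correct)
    differ by no more than `tolerance` percentage points."""
    if len(scores) < window:
        return False
    recent = scores[-window:]
    return max(recent) - min(recent) <= tolerance


def training_effect(baseline_scores, post_scores, window=3):
    """Difference between mean post-training performance and mean
    asymptotic baseline performance for a single subject."""
    asymptotic_mean = mean(baseline_scores[-window:])
    return mean(post_scores) - asymptotic_mean


# Hypothetical scores for one subject (percent correct), not data from the studies.
baseline = [40.0, 46.0, 50.0, 51.0, 52.0, 51.0]
post = [60.0, 62.0]

if has_reached_asymptote(baseline):
    print(f"Training effect: {training_effect(baseline, post):.1f} points")
```

Because each subject serves as their own control, the early baseline sessions, where procedural learning is most likely, are excluded from the comparison in this sketch.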
Electrode discrimination training
Previous studies have shown a significant correlation between electrode discrimination and speech performance (Donaldson and Nelson, 1999). Training with basic psychophysical contrasts may improve CI users’ speech performance, as improved electrode and/or pitch discrimination may extend to improved sensitivity to the spectral cues contained in speech signals. We conducted a pilot experiment to see whether CI patients’ electrode discrimination abilities could be improved with psychoacoustic
Remaining challenges in auditory training for CI users
The results from these studies demonstrate that auditory training can significantly improve CI users’ speech and music perception. The benefits of training extended not only to poor-performing patients, but also to good performers listening to difficult conditions (e.g., speech in noise, telephone speech, music, etc.). While auditory training generally improved performance in the targeted listening task, the improvement often generalized to auditory tasks that were not explicitly trained (e.g.,
Acknowledgment
Research was partially supported by NIDCD Grant R01-DC004792.
References (76)
- et al. (1999). Hearing ability by telephone of patients with cochlear implants. Otolaryngol. Head Neck Surg.
- et al. (2005). Electrophysiological and speech perception measures of auditory processing in experienced adult cochlear implant users. Clin. Neurophysiol.
- et al. (1999). Gap detection by early-deafened cochlear-implant subjects. J. Acoust. Soc. Am.
- et al. (1991). Results of speech perception and speech production training for three prelingually deaf patients using a multiple-electrode cochlear implant. Br. J. Audiol.
- et al. (1991). Indication of a relation between speech perception and temporal resolution for cochlear implantees. Ann. Otol. Rhinol. Laryngol.
- et al. (1994). Low-pass filtering in amplitude modulation detection associated with vowel and consonant identification in subjects with cochlear implants. J. Acoust. Soc. Am.
- et al. (1989). Telephone speech comprehension with use of the nucleus cochlear implant. Ann. Otol. Rhinol. Laryngol.
- et al. (1997). Changes in synthetic and natural vowel perception after specific training for congenitally deafened patients using a multichannel cochlear implant. Ear Hear.
- et al. (1999). Place-pitch sensitivity and its relation to consonant recognition by cochlear implant listeners using the MPEAK and SPEAK speech processing strategies. J. Acoust. Soc. Am.
- et al. (1997). Changes in speech intelligibility as a function of time and signal processing strategy for an Ineraid patient fitted with continuous interleaved sampling (CIS) processors. Ear Hear.