Samrat Sarkar: Speech enhancement for hearing aids

THE CHALLENGE:
In a single-channel system, the clean speech cannot be processed before it is affected by the
noise. This is one of the most difficult situations in speech enhancement, since no reference
signal of the noise is available. The major thrust in hearing aid development today is shifting
from further miniaturization to improved forms of signal processing for speech
enhancement. This change in emphasis comes at a fortuitous time, because recent advances in
digital technology provide the means for implementing substantially more advanced forms of
signal processing in modern hearing aids.
THE SOLUTION:
Develop an efficient and reliable single-channel speech enhancement algorithm (without a
reference input) for modern digital hearing aids, using the Advanced Signal Processing Toolkit, DSP
Module and Adaptive Filter Toolkit with LabVIEW 8.6 real-time software.
INTRODUCTION:
Hearing-aid users have great difficulty understanding speech in a noisy and/or
reverberant acoustic environment. Speech enhancement means using signal processing tools to
improve the intelligibility and/or quality of a degraded speech signal, which reduces the
listener's fatigue. Speech enhancement is a very difficult problem for two reasons. First, the
nature and characteristics of the noise signal can change dramatically over time and from
application to application. It is therefore laborious to find versatile algorithms that really
work in different practical environments. Second, the performance measure can also be defined
differently for each application. The speech signal is highly redundant and therefore robust, but
hearing loss already removes some of its cues, so the loss of further cues due to noise or other
distortions reduces speech intelligibility. As a consequence, people with hearing loss are
particularly susceptible to the damaging effects of background noise on speech intelligibility.
COMMON SOURCES OF NOISE:
Many factors affect the speech signal at various stages of transmission.
The figure below shows the effect of the various noise sources. On the transmission side,
background noise is added to the desired signal, and the signals from other speakers
are treated as noise for the desired speaker. The signal with background noise is then transmitted
through the channel, where transmission noise is added to the desired signal.
[Figure: block diagram of the noise sources. Labels: Desired Speaker, Background Noise, Other Speakers, Commn. Channel, Transmission Noise, Enhancement Process, Listener]
PROBLEMS IN THE EXISTING TECHNIQUES:
The noise spectrum is estimated during pauses in the speech and then subtracted from the
speech-plus-noise spectrum when speech is present (the spectral subtraction approach). Although
this technique is effective in reducing the background noise level, speech intelligibility remains
essentially unchanged, or is reduced to some extent as a result of audible signal-processing
distortions. Even with a small decrement in intelligibility, listeners with hearing loss who are
especially sensitive to background noise have indicated a preference for the processed, slightly
distorted signals over the noisy unprocessed signals.
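To make the estimate-and-subtract idea above concrete, here is a minimal sketch of magnitude spectral subtraction on a single frame. It is an illustration under stated assumptions, not code from the original work: the function names (dft, idft, spectral_subtract_frame) are hypothetical, the noise magnitude spectrum is assumed to have been measured already during a speech pause, and a direct DFT from the standard library stands in for the FFT a practical system would use.

```python
import cmath


def dft(frame):
    """Direct discrete Fourier transform (O(N^2), fine for short frames)."""
    n = len(frame)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(frame))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT, returning the real part of each output sample."""
    n = len(spectrum)
    return [sum(c * cmath.exp(2j * cmath.pi * k * t / n)
                for k, c in enumerate(spectrum)).real / n
            for t in range(n)]


def spectral_subtract_frame(noisy_frame, noise_magnitude):
    """Subtract a noise magnitude estimate from one frame of noisy speech.

    noise_magnitude is assumed to come from frames recorded during a pause.
    Magnitudes are floored at zero and the noisy phase is kept unchanged.
    """
    cleaned = []
    for bin_value, noise_level in zip(dft(noisy_frame), noise_magnitude):
        magnitude = max(abs(bin_value) - noise_level, 0.0)
        cleaned.append(cmath.rect(magnitude, cmath.phase(bin_value)))
    return idft(cleaned)
```

The hard floor at zero is one plausible source of the audible processing distortions mentioned above: bins that fall just above or below the noise estimate switch on and off from frame to frame.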
PROPOSED METHOD:
In the proposed system, speech enhancement is done using an adaptive LMS filter without
a noise reference. It also reduces the residual effect that usually remains with the most commonly
used algorithms, such as spectral subtraction and the subspace method.
SYSTEM IMPLEMENTATION:
The system is divided into three modules: LMS filter, delay loop and SNR calculation.
LMS FILTER:
The LMS filter is given two inputs. One is the noisy signal input, which is taken from the
CSLU (Center for Spoken Language Understanding) database. The other is the delay
loop output. The filter length is fixed. The step size of the LMS filter is varied
automatically by calculating it from the given input.
DELAY LOOP:
This module provides the second input to the LMS filter. The input to this module is
the noisy speech; the sample delay and the maximum delay are fixed for all the speech samples.
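The two modules above (an LMS filter fed by the noisy signal and by a delayed copy of it) can be sketched as follows. This is a hedged illustration, not the LabVIEW implementation: the function name adaptive_line_enhancer and the parameter values (num_taps, delay, mu) are assumptions, and the automatic step size is modelled here as normalized LMS, which is one common way to derive the step from the input but is not confirmed by the original.

```python
import math
import random


def adaptive_line_enhancer(noisy, num_taps=16, delay=1, mu=0.1, eps=1e-8):
    """Enhance a single-channel signal with a normalized-LMS adaptive filter.

    The reference input is the noisy signal itself, delayed by `delay`
    samples, so no separate noise recording is needed.
    """
    weights = [0.0] * num_taps
    enhanced = []
    for n in range(len(noisy)):
        # Delay loop: reference vector built from past noisy samples only.
        ref = [noisy[n - delay - k] if n - delay - k >= 0 else 0.0
               for k in range(num_taps)]
        # Filter output: the part of noisy[n] that is predictable from the past.
        estimate = sum(w * x for w, x in zip(weights, ref))
        error = noisy[n] - estimate
        # Step size recomputed from the reference power at every sample (NLMS).
        step = mu / (sum(x * x for x in ref) + eps)
        weights = [w + step * error * x for w, x in zip(weights, ref)]
        enhanced.append(estimate)
    return enhanced


if __name__ == "__main__":
    # Synthetic stand-in for speech: a sine wave buried in white noise.
    random.seed(0)
    clean = [math.sin(2 * math.pi * 0.05 * n) for n in range(2000)]
    noisy = [s + random.gauss(0.0, 0.5) for s in clean]
    print(len(adaptive_line_enhancer(noisy)))
```

The delay is what makes a reference-free setup work: correlated components such as voiced speech remain predictable from delayed samples, while broadband noise does not, so the filter output keeps the former and the error signal carries most of the latter.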
SIGNAL TO NOISE RATIO (SNR):
The global SNR values are determined by the following equation:

$$\mathrm{SNR}_{dB} = 10 \log_{10} \frac{\sum_{n} s^{2}(n)}{\sum_{n} \left[ s(n) - \hat{s}(n) \right]^{2}} \qquad (4.1)$$

where s(n) is the clean speech and ŝ(n) is the enhanced speech.
If the summation is performed over the whole signal length, the result is called the
global SNR. As the SNR decreases, the observed signal becomes noisier.
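As a worked form of equation (4.1), the sketch below computes the global SNR as the ratio of clean-speech energy to the energy of the difference between clean and enhanced speech, summed over the whole signal. The function name global_snr_db is an assumption for illustration, and both signals are assumed to be time-aligned lists of equal length.

```python
import math


def global_snr_db(clean, enhanced):
    """Global SNR in dB per equation (4.1), summed over the whole signal."""
    if len(clean) != len(enhanced):
        raise ValueError("clean and enhanced must have the same length")
    signal_energy = sum(s * s for s in clean)
    if signal_energy == 0.0:
        raise ValueError("clean signal has zero energy")
    error_energy = sum((s - e) ** 2 for s, e in zip(clean, enhanced))
    if error_energy == 0.0:
        return math.inf  # identical signals: no residual error
    return 10.0 * math.log10(signal_energy / error_energy)
```

For example, an enhanced signal that differs from a unit-amplitude clean signal by 0.1 at every sample has an energy ratio of 100, which is 20 dB.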
POWER SPECTRAL DENSITY (PSD) AND SPECTROGRAM PLOT:
In addition to the above validation parameter, frequency-domain analysis is needed where
speech is concerned. In this work, the PSD and spectrogram (time-frequency) plots of the
enhanced speech are closely comparable with those of the clean speech.
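For readers who want to reproduce this kind of frequency-domain check outside LabVIEW, a minimal periodogram sketch is given below. It is an assumption-laden illustration rather than the method used in this work: the function name periodogram is hypothetical, no windowing or averaging is applied, and a direct DFT is used for clarity where a real tool would use an FFT. A spectrogram amounts to applying the same estimate to successive short frames.

```python
import cmath
import math


def periodogram(signal, sample_rate):
    """One-sided periodogram PSD estimate using a direct DFT (O(N^2))."""
    n = len(signal)
    freqs, psd = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal))
        power = abs(coeff) ** 2 / (sample_rate * n)
        # Double every bin except DC and Nyquist to fold in negative frequencies.
        if 0 < k < n / 2:
            power *= 2.0
        freqs.append(k * sample_rate / n)
        psd.append(power)
    return freqs, psd


# Illustrative input: a 1 kHz tone sampled at 8 kHz, exactly on a DFT bin.
tone = [math.sin(2 * math.pi * 1000.0 * t / 8000.0) for t in range(64)]
freqs, psd = periodogram(tone, 8000.0)
peak_hz = freqs[psd.index(max(psd))]
```

Running the same estimate on the clean and the enhanced speech and overlaying the two curves gives the kind of comparison described above.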
MEAN OPINION SCORE (MOS):
The mean opinion score (MOS) provides a numerical measure of the quality of speech
as perceived by the listener. The scheme uses subjective tests
(opinion scores) that are mathematically averaged to obtain a quantitative indicator of the
system performance. To determine the MOS, a number of listeners rate the quality of test sentences
in a listening test. Based on how closely the perceived quality of the enhanced speech
compares with the clean speech, the listener gives a rating for each sentence as follows: (1) Bad (2) Poor (3) Fair
(4) Good (5) Excellent. The MOS is the arithmetic mean of all the individual scores, and can
range from 1 (worst) to 5 (best). In this work, opinion scores were collected from 10 listeners. The
average MOS obtained for the proposed method is between 3.5 and 4.
[Figure: bar chart of average MOS (vertical axis 0 to 4) for EMF, Wiener filtering, BWT, WPT, SS and the proposed method]
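The MOS arithmetic described in this section is simple enough to sketch directly. The function name mean_opinion_score is an assumption, and the ratings in the example are hypothetical values chosen only to show the calculation; they are not the scores collected in this work.

```python
def mean_opinion_score(ratings):
    """Arithmetic mean of ratings on the 1 (Bad) to 5 (Excellent) scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("each rating must be an integer from 1 to 5")
    return sum(ratings) / len(ratings)


# Hypothetical ratings from 10 listeners for one sentence (illustrative only).
example_ratings = [4, 3, 4, 4, 3, 4, 4, 3, 4, 4]
example_mos = mean_opinion_score(example_ratings)
```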
VARIOUS NOISY SAMPLES TAKEN FOR COMPARISON:
The clean reference input and the noisy test samples have been taken from the SPEAR (Speech
Enhancement Assessment Resource) database of CSLU, at various input SNR values in dB.
Table 1 compares the SNR improvement obtained with the proposed method with the reported
results of other techniques.
TABLE 1: SNR VALUES (dB) OF VARIOUS METHODS COMPARED WITH THE PROPOSED METHOD

Noisy input speech             PSS    Wiener      EMF    WPT    BWT    DEKF   NRAF   Proposed
                                      filtering                                      method
Pink noise (0 dB)              0.5    2.5         7      3.2    7.3    5.5    -      9.87
Pink noise (6 dB)              -      -           -      -      -      3.99   -      10.5722
Cell noise (0 dB)              -      -           8.95   -      -      -      -      11.05
White stationary noise (0 dB)  1.5    2.5         4.99   6.5    8      7.60   -      11.55
White stationary noise (7 dB)  3.7    3.8         13     10     12.5   4.76   -      16.57
White bursting noise (0 dB)    -      -           -      -      -      9.95   -      11.9
White bursting noise (3 dB)    -      -           -      -      -      -      -      15.16

Figures 1 and 3 show the PSD, time-domain and spectrogram plots of the clean reference from the
database, the noisy speech and the enhanced speech obtained using the proposed method. The
results of the proposed method are closely comparable with the clean reference signal.
Figure 2 shows the results of the same approach in the MATLAB environment, where a large part
of the speech activity is lost even in the time domain.
Figure 1: PSD, Time domain and spectrogram plot of the enhanced speech from white stationary
noise (0dB)
Figure 2: Enhanced speech obtained in the MATLAB environment
Figure 3: PSD, Time domain and spectrogram plot of the enhanced speech from white
bursting noise (3dB)
MATLAB VS LABVIEW:
In the MATLAB environment, execution of the proposed method takes about two minutes, whereas
in the LabVIEW environment the execution time is only in the range of milliseconds.
SOCIAL IMPACT:
This method has a fast processing speed, low computation time and high efficiency, which
results in efficient hearing even in noisy environments and reduces the listener's fatigue.
CONCLUSION:
The system developed is highly reliable and efficient for the above-mentioned problems in
digital hearing aids. We were able to develop the algorithm for speech enhancement and
evaluate its performance by calculating the SNR improvement and studying the power spectral
density and spectrogram plots of the enhanced speech.

Samrat Sarkar

Tuesday, February 3, 2009
