The Acoustic Features of Speech Sounds
EEN 540 - Computer Project I
                by Richard Juszkiewicz
 

This site contains monophonic recordings of the forty phonemes of the English language, along with an analysis of each one in both the time and frequency domains. The recordings were made with a Shure Beta 57A microphone connected to a Sound Blaster Live Audigy soundcard on a PC. Adobe's Audition software package was used to capture the audio at a sampling rate of 16 kHz and a bit depth of 16 bits. All plots were generated using the MATLAB software suite. The code is available here.
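As a quick illustration of what the stated recording format implies, the following Python sketch derives a few quantities from the 16 kHz sampling rate and 16-bit depth. It is not part of the original MATLAB code; the helper name `samples_in` and the example durations are assumptions chosen for illustration.

```python
# Recording format stated in the text: mono, 16 kHz, 16 bits per sample.
SAMPLE_RATE_HZ = 16_000
BIT_DEPTH = 16
CHANNELS = 1

# Highest frequency that can be represented without aliasing.
nyquist_hz = SAMPLE_RATE_HZ / 2

# Number of distinct amplitude levels available to the quantizer.
quant_levels = 2 ** BIT_DEPTH

# Uncompressed PCM data rate in bytes per second.
bytes_per_second = SAMPLE_RATE_HZ * (BIT_DEPTH // 8) * CHANNELS


def samples_in(duration_s, sample_rate_hz=SAMPLE_RATE_HZ):
    """Hypothetical helper: number of samples spanned by a duration in seconds."""
    return round(duration_s * sample_rate_hz)


print(nyquist_hz, quant_levels, bytes_per_second)
print(samples_in(1.0), samples_in(0.020))
```

At this rate the spectrum plots can extend to at most 8 kHz, and one second of uncompressed audio occupies 32,000 bytes.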

In the tables below, each phoneme is listed under its respective category by its phonetic symbol and the word that was spoken to demonstrate it. Both the phoneme and the word are hyperlinks. Clicking on the word brings up the time domain plot of the word along with its narrowband and wideband spectrograms. Clicking on the phoneme displays the waveform of the phoneme as extracted from the word, a small sample of the phoneme, and the frequency domain representation (using a 1024-point FFT) of that small sample. Superimposed in red on the spectrum plot is a smooth spectral envelope obtained using linear predictor coefficients. The length of the small sample varies from 20 ms to 45 ms: for noise-like phonemes it is 20 ms, and for periodic (voiced) phonemes it is however much was needed to show approximately three full periods. The length taken is noted on the plot. The time axes of both the phoneme plot and the small-sample plot reflect where the sample was taken from in the word. For example, in the word 'sing' the /G/ phoneme of interest is located at the end of the word, at 0.388 seconds, so this is the lowest possible starting value of the time axis on the two phoneme plots. Sometimes the small sample is taken at a location other than the beginning of the phoneme, and once again the time axis reflects that.
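To make the spectral envelope step more concrete, here is a minimal Python sketch of one common way to obtain such an envelope: the autocorrelation method with the Levinson-Durbin recursion, evaluated on the bins of a 1024-point FFT. It is not the author's MATLAB code. The synthetic 500 Hz test frame, the predictor order of 2, and all function names are assumptions made for illustration, and the tapering window (such as a Hamming window) that is commonly applied before analysis is omitted for brevity.

```python
import cmath
import math

FS = 16_000   # sampling rate stated in the text
NFFT = 1024   # FFT length stated in the text


def autocorr(x, max_lag):
    """Autocorrelation of a finite frame for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns (a, err) with a[0] == 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k                  # remaining prediction error energy
    return a, err


def lpc_envelope_db(a, gain, nfft=NFFT):
    """Envelope |gain / A(e^{jw})| in dB at FFT bins 0..nfft/2."""
    env = []
    for m in range(nfft // 2 + 1):
        w = 2 * math.pi * m / nfft
        A = sum(a[k] * cmath.exp(-1j * w * k) for k in range(len(a)))
        env.append(20 * math.log10(gain / abs(A)))
    return env


# Assumed stand-in for a recorded frame: a decaying 500 Hz resonance, 20 ms long.
n_samples = round(0.020 * FS)
r_pole, theta = 0.97, 2 * math.pi * 500 / FS
frame = []
for n in range(n_samples):
    y1 = frame[n - 1] if n >= 1 else 0.0
    y2 = frame[n - 2] if n >= 2 else 0.0
    impulse = 1.0 if n == 0 else 0.0
    frame.append(impulse + 2 * r_pole * math.cos(theta) * y1 - r_pole ** 2 * y2)

order = 2  # assumed; real speech frames usually need a higher order
a, err = levinson_durbin(autocorr(frame, order), order)
env = lpc_envelope_db(a, math.sqrt(err))
peak_bin = max(range(len(env)), key=env.__getitem__)
peak_hz = peak_bin * FS / NFFT
print(n_samples, FS / NFFT, peak_hz)
```

With these numbers a 20 ms frame holds 320 samples and a 45 ms frame holds 720, so either fits inside a 1024-point FFT after zero-padding, and the bins are spaced 15.625 Hz apart. The envelope of the synthetic frame peaks near the 500 Hz resonance, which is the kind of formant-like peak the red curve highlights on the spectrum plots.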

The recorded sound file of each word is available by clicking the speaker icon next to it. A zip file of all of the recorded words is available here. The phoneme recordings are not linked directly because each is too short a sound file to be of any value on its own but, if desired, a zip file of all of the recorded phonemes is available here. Clicking on the heading of each table (vowels, consonants and transitionals) brings up a discussion of the conclusions drawn from the data gathered in that section. (Please note that each image or sound link opens in its own window, so to avoid having an abundance of browser windows open it is advised to close the image and audio windows after viewing is complete.) Without further ado, here are the highly hyped-up phoneme tables:

Vowels

Front:
Center: /R/ 'bird', /A/ 'up'
Back: /a/ 'father', /c/ 'all', /o/ 'obey', /U/ 'foot', /u/ 'boot'

 

Consonants

Nasals:
Whisper: /h/ 'he'
Affricates:
  Voiced: /J/ 'just'
  Unvoiced: /tS/ 'chew'
Fricatives:
  Voiced: /v/ 'vote', /D/ 'then', /z/ 'zoo', /Z/ 'azure'
  Unvoiced: /f/ 'for', /T/ 'thin', /s/ 'see', /S/ 'she'
Plosives:
  Voiced: /b/ 'be', /d/ 'day', /g/ 'go'
  Unvoiced: /p/ 'pay', /t/ 'to', /k/ 'key'

 

Transitional

Diphthongs: /aI/ 'hide', /aU/ 'out', /O/ 'boy', /JU/ 'new'
Semi-Vowels:
  Liquid: /r/ 'read', /l/ 'left'
  Glides: /w/ 'we', /y/ 'you'