Detection of Parkinson’s disease through speech signatures
Jinu James, Neenu George, Shrinidhi Kulkarni, Sneha Parsewar, Revati Shriram and Mrugali Bhat
Abstract— Parkinson’s disease is the most common movement disorder and second most common neurodegenerative disorder. The symptoms of Parkinson’s disease can be classified as motor symptoms and non-motor symptoms. Out of these the non-motor or dopamine non-responsive symptoms have a major impact on the patients. Some of the non-motor symptoms are cognitive impairment, depression, REM sleep disorder, speech and swallowing difficulties, loss of smell and change in the body odor. It may become difficult to walk, talk, and complete simple tasks as symptoms worsen. The symptoms and the rate at which the disease worsens vary from individual to individual. Patients suffering from this disease also have soft speech, impaired voice or voice box spasms. The objective of our project work is to explore this symptom and its detection. The voice signals will be captured using MATLAB. Comparison of the signals obtained with the corresponding signals of a healthy person will determine whether the individual is affected by the disease.
Keywords: Parkinson’s disease, soft speech, MATLAB
Jinu James is a graduate student with the Department of Instrumentation and Control, MKSSS’s Cummins College of Engineering for Women, Karvenagar, Pune-411052, INDIA
Email: [email protected]
Neenu George is a graduate student with the Department of Instrumentation and Control, MKSSS’s Cummins College of Engineering for Women, Karvenagar, Pune-411052, INDIA
Email: [email protected]
Shrinidhi Kulkarni is a graduate student with the Department of Instrumentation and Control, MKSSS’s Cummins College of Engineering for Women, Karvenagar, Pune-411052, INDIA
Email: [email protected]
Sneha Parsewar is a graduate student with the Department of Instrumentation and Control, MKSSS’s Cummins College of Engineering for Women, Karvenagar, Pune-411052, INDIA
Email: [email protected]
Revati Shriram is with the Department of Instrumentation and Control, MKSSS’s Cummins College of Engineering for Women, Karvenagar, Pune-411052, INDIA
Email: [email protected]
Mrugali Bhat is a graduate of the Department of Instrumentation and Control, MKSSS’s Cummins College of Engineering for Women, Karvenagar, Pune-411052, INDIA
Email: [email protected]
Parkinson’s disease is a progressive nervous system disorder that affects movement. Over 10 million people worldwide suffer from this disorder [1].
In this disease, nerve cells (neurons) gradually break down, which reduces the production of dopamine in the substantia nigra. Dopamine is a chemical messenger produced by neurons in the brain; a reduction in dopamine leads to abnormal brain activity, causing PD [2].
Symptoms of Parkinson’s disease include muscle stiffness, tremors, speech changes, changes in skin impedance and changes in facial features. In the proposed work, the speech signal is used for the analysis of Parkinson’s disease. The voice of a person diagnosed with PD varies: the pitch changes, and the voice becomes softer, breathy, hoarse and slurred, making it difficult for others to hear what is said [3].
The time-frequency features extracted for the detection of Parkinson’s disease are pitch, jitter, shimmer, SNR, formant frequency and energy [4].
Pitch is a perceptual property that allows sounds to be ordered on a frequency-related scale. The pitch period is the duration of one glottal cycle, and pitch is the rate at which the vibrations are produced, usually expressed in hertz. One cycle represents one complete back-and-forth vibration of the speech signal. The pitch is high when the frequency of the tone is high [5].
A formant is a concentration of acoustic energy around a particular frequency, corresponding to a resonant frequency of the vocal tract. The values of the formant frequencies depend on the shape and dimensions of the vocal tract. The major resonances of the vocal tract can be approximately characterized by the first four resonant frequencies, denoted the first (F1), second (F2), third (F3) and fourth (F4) formants. The fundamental frequency $F_0$ and the formant frequencies are correlated. The relation between the nth formant frequency $F_n$ and the fundamental frequency $F_0$ can be approximated as:
where $a_n$ and $b_n$ are vowel-dependent constants [6].
Jitter is the deviation of a periodic signal from its true periodicity. It refers to the variation of the fundamental frequency between cycles of vibration. Reported normal values of jitter lie between 0.5% and 1%. Jitter increases when control over the vibration of the vocal cords is lost, so anything that interferes with normal vocal fold vibration leads to a higher jitter level. Patients affected by PD show a higher percentage of jitter [7].
Shimmer describes amplitude variation in the voice. It changes when the glottal resistance at the vocal cords is reduced. Shimmer is the percentage variation in the amplitude of the vocal cord vibration, so it measures the fluctuation in intensity between neighbouring vibratory cycles of the vocal folds. Jitter and shimmer reflect the internal noise of the human body, and higher jitter and shimmer levels indicate neuromuscular problems [7].
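As a rough illustration of how these two measures can be computed once per-cycle pitch periods and peak amplitudes have been extracted, the following Python sketch may help. It is not the authors' MATLAB implementation; the function names are hypothetical, and the input lists are assumed to come from a separate pitch-marking step that is not shown.

```python
import math


def jitter_absolute(periods):
    """Mean absolute difference between consecutive pitch periods (seconds)."""
    if len(periods) < 2:
        raise ValueError("need at least two pitch periods")
    diffs = [abs(periods[i] - periods[i + 1]) for i in range(len(periods) - 1)]
    return sum(diffs) / len(diffs)


def jitter_percent(periods):
    """Absolute jitter divided by the mean period, expressed as a percentage."""
    mean_period = sum(periods) / len(periods)
    return 100.0 * jitter_absolute(periods) / mean_period


def shimmer_db(amplitudes):
    """Mean absolute 20*log10 ratio of consecutive cycle amplitudes (dB)."""
    if len(amplitudes) < 2:
        raise ValueError("need at least two cycle amplitudes")
    steps = [
        abs(20.0 * math.log10(amplitudes[i + 1] / amplitudes[i]))
        for i in range(len(amplitudes) - 1)
    ]
    return sum(steps) / len(steps)
```

A perfectly periodic signal gives zero for all three functions. `jitter_percent` expresses jitter relative to the mean period, which is the form in which percentage ranges are usually stated.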
The objective of this system is to detect Parkinson’s disease based on speech signal analysis.
Main components of this system are as follows:
Fig 1 displays the block diagram of the proposed system used for speech signal analysis.
Fig 1: Block Diagram
The proposed system records the speech signal using a MATLAB toolbox and then calculates the parameters.
The properties of a speech signal change over time. Some properties hold only over short intervals, others over long ones. Short-time signal processing methods such as the DFT, Hamming windowing and autocorrelation can be applied over short intervals. Speech is therefore processed in short windows called frames. A long speech signal is cut to a finite length by multiplying it by a window function of finite length [8].
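The framing step described above can be sketched in a few lines of Python. This is only an illustration under assumed settings: the function names, the frame length and the hop size are hypothetical choices, not values taken from the proposed system.

```python
import math


def hamming(length):
    """Hamming window: w[n] = 0.54 - 0.46*cos(2*pi*n/(length-1))."""
    if length == 1:
        return [1.0]
    return [
        0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
        for n in range(length)
    ]


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames and apply a Hamming window.

    Only complete frames are returned; trailing samples that do not fill
    a whole frame are dropped.
    """
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frames.append([signal[start + n] * window[n] for n in range(frame_len)])
    return frames
```

For example, `frame_signal(samples, 400, 200)` would cut a recording sampled at 8 kHz into 50 ms frames with 50% overlap; each frame can then be passed to the short-time analyses.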
A pitch detection algorithm (PDA) is used to determine the pitch, or fundamental frequency, of a speech signal. Techniques used for pitch period estimation include the autocorrelation, cepstrum and SIFT methods [9]. The main disadvantage of the autocorrelation method is that, because the speech signal is the convolution of the vocal tract response and the vocal source, the autocorrelation can contain peaks larger than the one corresponding to the pitch period, which leads to a wrong pitch estimate. The cepstrum of speech is defined as the inverse Fourier transform of the log magnitude spectrum. In the cepstrum, the slowly varying components of the log magnitude spectrum appear in the low-quefrency region and the fast varying components in the high-quefrency region. The slowly varying components correspond to the envelope of the vocal tract and the fast varying components to the excitation source. Hence, the vocal tract and excitation source components are naturally separated in the cepstrum of speech [10].
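A minimal Python sketch of the autocorrelation approach is given below for illustration. It is not the authors' implementation: the function name and the search range `f_min` to `f_max` are assumptions. Because it simply picks the largest autocorrelation peak inside the search range, it is exposed to exactly the spurious-peak problem described above.

```python
import math


def estimate_pitch_autocorr(frame, fs, f_min=60.0, f_max=500.0):
    """Estimate the fundamental frequency (Hz) of one frame.

    The lag with the largest autocorrelation value between fs/f_max and
    fs/f_min samples is taken as the pitch period. Returns None when no
    positive peak is found in that range.
    """
    mean = sum(frame) / len(frame)
    x = [s - mean for s in frame]          # remove any DC offset
    lag_min = max(1, int(fs / f_max))
    lag_max = min(int(fs / f_min), len(x) - 1)
    best_lag, best_val = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        val = sum(x[i] * x[i + lag] for i in range(len(x) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    if best_lag is None:
        return None
    return fs / best_lag


# Usage on a synthetic 200 Hz tone sampled at 8 kHz (50 ms frame).
fs = 8000
tone = [math.sin(2.0 * math.pi * 200.0 * n / fs) for n in range(400)]
pitch_hz = estimate_pitch_autocorr(tone, fs)
```

The result is quantized to whole-sample lags, so the frequency resolution becomes coarser as the pitch rises.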
Signal-to-Noise Ratio (SNR):
It is defined as the ratio of signal intensity to noise intensity, expressed in decibels:
$$\mathrm{SNR} = 20 \log_{10}\left(\frac{S_{rms}}{N_{rms}}\right)$$
where $S_{rms}$ and $N_{rms}$ are the root-mean-square amplitudes of the signal and the noise. The signal-to-noise ratio is an important feature for determining the quality of audio data, because recognition performance is strongly influenced by the SNR [11].
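The SNR definition can be evaluated directly, as in the short Python sketch below. It assumes that a noise-only segment (for example, a pause before the subject speaks) is available separately; the paper does not state how the noise was estimated, and the function names are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(signal, noise):
    """SNR in decibels: 20*log10(S_rms / N_rms).

    `noise` is assumed to be a noise-only segment of the same recording.
    """
    return 20.0 * math.log10(rms(signal) / rms(noise))
```

With this definition, a negative SNR means that the RMS amplitude of the noise estimate exceeds that of the signal.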
Linear prediction estimates the current output of a linear system from its input and its previous outputs. Linear predictive coding (LPC) is used to determine the formants: their frequency and intensity are estimated, and their effect is removed from the speech signal. This process of removing the formants is called inverse filtering. The formant frequencies depend on the size and shape of the vocal tract [12].
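To make the idea concrete, the Python sketch below fits a second-order linear predictor to a frame with the autocorrelation method (Levinson-Durbin recursion) and reads a single resonance frequency from the angle of the resulting complex pole pair. This is a deliberately simplified, assumed illustration with hypothetical function names: estimating several formants from real speech would need a higher prediction order and the roots of the full prediction polynomial, which are not shown here.

```python
import math


def autocorrelation(x, max_lag):
    """Autocorrelation values r[0..max_lag] of a frame."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return [1, a1, ..., ap] of the inverse filter A(z) = 1 + a1 z^-1 + ..."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k                  # remaining prediction error
    return a


def single_resonance_hz(frame, fs):
    """Resonance frequency (Hz) from an order-2 LPC fit, or None if poles are real."""
    _, a1, a2 = levinson_durbin(autocorrelation(frame, 2), 2)
    if a2 <= 0.0 or a1 * a1 >= 4.0 * a2:
        return None
    theta = math.acos(-a1 / (2.0 * math.sqrt(a2)))   # pole angle in radians
    return theta * fs / (2.0 * math.pi)


def resonator_impulse_response(freq_hz, fs, radius, length):
    """Impulse response of a two-pole resonator, used here as test input."""
    theta = 2.0 * math.pi * freq_hz / fs
    c1, c2 = 2.0 * radius * math.cos(theta), -radius * radius
    h = [1.0, c1]
    while len(h) < length:
        h.append(c1 * h[-1] + c2 * h[-2])
    return h[:length]


# Usage: recover a synthetic 500 Hz resonance sampled at 8 kHz.
h = resonator_impulse_response(500.0, 8000, 0.95, 400)
resonance_hz = single_resonance_hz(h, 8000)
```

Filtering the frame with the coefficients returned by `levinson_durbin` is the inverse filtering step: it flattens the resonance and leaves the prediction residual.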
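Jitter (absolute) is the cycle-to-cycle variation of the fundamental frequency, i.e. the average absolute difference between consecutive periods [13]:
$$\text{Jitter (absolute)} = \frac{1}{N-1} \sum_{i=1}^{N-1} \left| T_i - T_{i+1} \right|$$
where $T_i$ is the length of the $i$-th extracted pitch period and $N$ is the number of extracted periods.
Shimmer Determination
Shimmer (dB) is expressed as the variability of the peak-to-peak amplitude in decibels, i.e. the average absolute base-10 logarithm of the ratio between the amplitudes of consecutive periods, multiplied by 20 [13]:
$$\text{Shimmer (dB)} = \frac{1}{N-1} \sum_{i=1}^{N-1} \left| 20 \log_{10}\left(\frac{A_{i+1}}{A_i}\right) \right|$$
where $A_i$ is the peak-to-peak amplitude of the $i$-th period.
RESULT AND ANALYSIS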
Table I lists the parameters calculated for 20 subjects who are not affected by Parkinson’s disease. The speech signals were recorded with the help of a MATLAB toolbox. The readings were taken at a time when the subjects were not sick. The plots obtained for a normal subject for the parameters of the speech signal are shown in Fig 2 and Fig 3.
TABLE I: PARAMETERS CALCULATED USING MATLAB
| Name | Pitch | SNR | Formant Frequency | Jitter | Shimmer |
|---|---|---|---|---|---|
| Abraham | 905.66 | -19.46 | 382.3, 1581.2, 2405.2 | 7.67×10^-10 | 2.72×10^-6 |
| Aishwarya | 827.58 | -25.56 | 470.9, 1320.1, 2005.1 | 8.25×10^-10 | 1.69×10^-7 |
| Ajoe | 923.07 | -18.42 | 441.2, 1303.3, 2557.9 | 7.52×10^-10 | 1.24×10^-6 |
| Ameya | 923.07 | -19.15 | 495.8, 1266.3, 1826.0 | 7.21×10^-10 | -9.93×10^-7 |
| Neethu | 905.66 | -23.62 | 341.0, 717.5, 2147.6 | 7.20×10^-10 | -2.58×10^-7 |
| Elsa | 1000 | -22.61 | 519.1, 1131.6, 1721.8 | 6.73×10^-10 | -3.48×10^-6 |
| Neenu | 786.88 | -21.92 | 444.1, 783.9, 1882 | 8.61×10^-10 | -2.44×10^-6 |
| Shrinidhi | 923.07 | -23.05 | 420.2, 500.5, 1844.2 | 7.32×10^-10 | 5.18×10^-6 |
| Jinu | 738.46 | -22.73 | 415.7, 776.3, 1947.1 | 9.40×10^-10 | 3.25×10^-7 |
| Jease | 923.07 | -17.09 | 463.3, 713.1, 1632.3 | 7.60×10^-10 | -2.65×10^-6 |
| Likhita | 750 | -22.48 | 371.7, 781.8, 2102.7 | 9.10×10^-10 | 4.21×10^-6 |
| Niyanta | 685.71 | -19.7 | 426.4, 918.6, 1918.9 | 9.95×10^-10 | -1.16×10^-6 |
| Pallavi | 827.58 | -19.59 | 398.5, 964.0, 1976.8 | 8.29×10^-10 | -3.46×10^-6 |
| Roshan | 1000 | -19.89 | 532.1, 1726.8, 2575.9 | 7.56×10^-10 | 8.83×10^-7 |
| Rucha | 923.07 | -17.93 | 488.8, 1911.6, 1928.9 | 7.53×10^-10 | 1.90×10^-6 |
| Shreya | 1000 | -21.19 | 570.7, 709.1, 2000.5 | 6.94×10^-10 | -1.29×10^-6 |
| Simi | 872.72 | -24.1 | 384.8, 796.7, 1656.4 | 7.88×10^-10 | -2.16×10^-7 |
| Sneha | 786.88 | -18.94 | 465.1, 1733.2, 2790.6 | 8.78×10^-10 | -3.59×10^-6 |
| Vaidehi D | 738.46 | -19.43 | 363.9, 660.3, 2016.3 | 9.41×10^-10 | 4.30×10^-6 |
| Vincy | 872.72 | -19.72 | 390.5, 961.9, 1945.8 | 7.44×10^-10 | 2.15×10^-6 |
Fig 2: Fundamental Frequency Estimation
Fig 3: Formant Frequency Estimation using LPC
The main motivation for this study is the early diagnosis of Parkinson’s disease. For early detection of PD it is necessary to concentrate on non-motor symptoms rather than motor symptoms, because motor symptoms appear only after a significant number of years. In the proposed system we have considered speech signal analysis for PD detection. Many parameters vary when the subject is affected by Parkinson’s disease, including jitter, shimmer, signal-to-noise ratio, formant frequency, pitch, energy, intensity, phase differences, correlation dimension and harmonicity. Of these, we have worked on five parameters. Other studies have examined further symptoms of PD, such as tremor, skin impedance, change in facial expression and multiple characteristics of finger movement while typing.
This method of detecting Parkinson’s disease is reliable and non-invasive. Various parameters have been discussed in the proposed work. According to the research carried out, the values of jitter and shimmer are higher for PD patients than for normal subjects, while the values of pitch and formants are lower. This is due to changes in muscle contraction, as a result of which the vocal cords do not function properly, leading to voice disturbance.
[1] Parkinson’s Disease Foundation: www.pdf.org/en/parkinson_statistics
[2] WebMD: Leading source for trustworthy and timely health and medical news, https://www.webmd.com/parkinsons-disease/default.htm
[3] ParkinsonsDisease.net: PD Health Information & Community, https://parkinsonsdisease.net/symptoms/speech-difficulties-changes/
[4] Dixit, Vikas Mittal, Yuvraj Sharma, “Voice Parameter Analysis for the disease detection”, IOSR Journal of Electronics and Communication Engineering (IOSR-JECE)
[5] L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals, Pearson Education India
[6] A. R. Jayan, Speech and Audio Signal Processing, PHI Learning
[7] Joao Paulo Teixeira, Carla Oliveira, Carla Lopes, “Vocal Acoustic Analysis – Jitter, Shimmer and HNR Parameters”, CENTERIS 2013 – Conference on ENTERprise Information Systems / HCIST 2013 – International Conference on Health and Social Care Information Systems and Technologies
[8] P. Dhanalakshmi, S. Palanivel and V. Ramalingam, “Classification of Audio Signals Using SVM and RBFNN”, Expert Systems with Applications, Vol. 36, No. 3, 2009, pp. 6069-6075.
[9] L. R. Rabiner, “On the Use of Autocorrelation Analysis for Pitch Detection”, IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 25, No. 1, 1977, pp. 24-33.
[10] Amrita Vishwa Vidyapeetham, “Estimation Of Pitch From Speech Signals”: http://vlab.amrita.edu/?sub=3;brch=164;sim=1012;cnt=6
[11] ICSI Speech, “Chapter 4.1 Signal processing and audio”: http://www1.icsi.berkeley.edu/Speech/faq/speechSNR.html
[12] Roy C. Snell, Fausto Milinazzo, “Formant location from LPC analysis data”, IEEE Transactions on Speech and Audio Processing, Vol. 1, No. 2, 1993
[13] Mireia Farrús, Javier Hernando, Pascual Ejarque, “Jitter and Shimmer Measurements for Speaker Recognition”, TALP Research Center, Department of Signal Theory and Communications, Universitat Politècnica de Catalunya, Barcelona, Spain
[14] Mrugali Bhat, Sharvari Inamdar, Devyani Kulkarni, Gauri Kulkarni, Revati Shriram, “Parkinson’s disease Recognition”, TALP Research Center, Department of Signal Theory and Communications, Universitat Politècnica de Catalunya, Barcelona, Spain