DIAGNOSIS OF ARRHYTHMIA DISEASES USING HEART SOUNDS AND ECG SIGNALS

This paper presents a novel method for the detection of arrhythmia diseases using both heart sounds and ECG signals. The automated classification and analysis system is intended to help the cardiologist reach a diagnosis faster and more efficiently. Most heart valve disorders are reflected in the heart sounds and can be detected through phonocardiogram (PCG) signal analysis. Heart sounds carry information about the mechanical activity of the cardiovascular system. The heart sound segmentation process divides the PCG signal into four parts: S1 (first heart sound), systole, S2 (second heart sound) and diastole. It can be considered one of the most important phases in the automatic analysis of PCG signals. The systolic and diastolic time periods of the heart sound signal are used to detect abnormal heart function, and they are matched with the ECG signal. The interval between two consecutive R peaks in the ECG signal is considered as one cardiac cycle. A single cardiac cycle consists of S1, systole, S2 and diastole. Both phonocardiogram and electrocardiogram signals are analyzed for the accurate diagnosis of cardiovascular diseases. Russ J Cardiol 2014, 1 (105), Engl.: 35-41


Introduction
In recent times, the phonocardiogram has played an important role in the diagnosis of cardiovascular diseases. With the decrease in cost and the increase in computing power of personal computers, a sophisticated and cost-effective PC-based classification and analysis system can be developed to help the cardiologist reach a diagnosis faster and more efficiently. Such a system should be able to extract features from, process, classify, identify and analyze heart sounds and ECG signals efficiently and reliably. More importantly, it can detect early signs of arrhythmia diseases such as sinus node arrhythmias, atrial arrhythmias, junctional arrhythmias, ventricular arrhythmias, atrioventricular blocks and bundle branch blocks, and provide an objective diagnosis based on criteria defined by the cardiologist. The proposed system will not only extend the cardiologist's capability and productivity during examination but also provide an automatic tool for the mass screening and classification of heart diseases. It is therefore suitable for people living in rural areas and in remote places far from the city, where immediate analysis of heart sounds and ECG signals through public health care centers can help to save patients' lives.
Shivnarayan Patidar et al (2013) proposed a method for the segmentation of cardiac sound signals using the tunable-Q wavelet transform (TQWT). Murmurs were removed from the cardiac sound signals by suitably constraining the TQWT-based decomposition and reconstruction. The envelope based on the cardiac sound characteristic waveform (CSCW) was extracted after the removal of low-energy components from the reconstructed cardiac sound signals. Ali Moukadem et al (2012) developed a module for the segmentation of heart sounds that was divided into three main blocks: localization of the heart sounds, detection of the boundaries of the localized heart sounds, and a classification block to distinguish S1 from S2. The heart sound localization method was based on the S-transform and Shannon energy, and was evaluated against additive white Gaussian noise. These works lay the groundwork for the preliminary work of this paper.

Related Work
Deboleena Sadhukhan et al (2012) proposed an algorithm for the automatic detection of R-peaks from single-lead digital ECG data. The squared double difference signal of the ECG data is used to localize the QRS regions. Sabarimalai Manikandana M et al (2012) proposed a new R-peak detector based on a new preprocessing technique and an automated peak-finding logic. The peak-finding logic is based on the Hilbert transform (HT) and a moving average (MA) filter. The preprocessor, which uses a Shannon energy envelope (SEE) estimator, is better able to detect R-peaks in the case of wide and small QRS complexes, negative QRS polarities, and sudden changes in QRS amplitude. These methods are useful for extracting features from the ECG signal.
Articles in the literature focus on ECG signal processing algorithms for the classification (i.e. the identification) of diseases. Most research has addressed the identification of cardiac diseases from ECG signals under different physiological conditions, and progress has been made both in this identification and in the analysis of features extracted from ECG signals.

Processing of Heart Sounds
Cardiac auscultation is a cost-effective diagnostic technique for assessing the condition of the heart valves. The heart sound has two major components, the first heart sound S1 and the second heart sound S2. The period between S1 and S2 is called systole and the period between S2 and the next S1 is called diastole. If there is a problem with one of the heart valves, abnormal sounds can be heard in either the systolic or the diastolic phase. In severe cases, these additional sounds can completely dominate and distort S1 and S2. They are called murmurs and they are indicators of valvular cardiac disorders. The availability of newer diagnostic tools such as the echocardiogram has led to a decline in the use of auscultation, but cardiac auscultation still plays a key role in small clinics and rural medical facilities.
A phonocardiograph is used to record the heart sounds, and the ECG signal is recorded simultaneously using standard ECG equipment. In this paper, both the heart sound signal and the ECG signal are used for the detection of arrhythmia diseases.
The heart sound is preprocessed before classification. The raw heart sound is filtered by a band-pass filter with a pass band of 20-850 Hz to remove noise. The signal is then downsampled to match the sampling frequency of the ECG signal, so that the samples of the two signals line up, which is convenient during segmentation. Finally, the heart sound signal is de-noised using a wavelet de-noising technique.
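The rate-matching step can be illustrated with a minimal sketch. The paper does not state the sampling rates, so the rates in the usage note, the function name `downsample`, and the use of block averaging as a crude anti-aliasing step are all assumptions made for illustration; a practical system would more likely use a proper low-pass decimation filter.

```python
def downsample(signal, fs_in, fs_out):
    """Reduce the sampling rate by an integer factor using block averaging.

    Illustrative only: each output sample is the mean of `factor`
    consecutive input samples, which acts as a simple low-pass step.
    """
    if fs_out <= 0 or fs_in % fs_out != 0:
        raise ValueError("this sketch only handles integer rate ratios")
    factor = fs_in // fs_out
    n_blocks = len(signal) // factor  # trailing partial block is dropped
    return [
        sum(signal[k * factor:(k + 1) * factor]) / factor
        for k in range(n_blocks)
    ]
```

For example, `downsample(pcg, 2000, 500)` would reduce a hypothetical 2000 Hz heart sound recording to 500 Hz, so that sample k of the heart sound and sample k of a 500 Hz ECG cover the same time span.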
Segmentation of the heart sound detects and identifies each part of the cardiac cycle: S1, systole, S2 and diastole. It is planned to measure the amplitude and duration of the first heart sound S1 and the second heart sound S2. S1 occurs at the onset of ventricular systole, when the atrioventricular valves close and the ventricles begin to pump blood to the rest of the body; it comprises sounds resulting from the rise and release of pressure within the left ventricle along with the increase in ascending aortic pressure. After blood leaves the ventricles, the simultaneous closing of the semilunar valves, which connect the ventricles with the aorta and the pulmonary artery, causes the second heart sound S2.

Relation between Heart Sound and ECG signal
Figure 1 shows the ECG signal together with the heart sound signal so that the two can be compared. The ECG signal represents the electrical pulses that cause the heart to beat.
The spike in the ECG signal is called the QRS complex. It corresponds to the contraction of the ventricles (the lower chambers of the heart), which marks the beginning of systole. The S1 sound starts at the peak of the QRS complex, i.e. at the R positions of the ECG signal. The S2 sound occurs after the T wave but before the next QRS complex. The period between S1 and S2 is called systole and the period between S2 and the next S1 is called diastole. Therefore, a single cardiac period consists of S1, systole, S2 and diastole, and one cardiac period corresponds to the period between two consecutive R waves (the RR interval).
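The timing relation described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: the onset times of S1 and S2 are taken as already detected, the systolic interval is measured from S1 onset to S2 onset, the diastolic interval from S2 onset to the next S1 onset, and the tolerance `tol` is a hypothetical parameter.

```python
def cycle_intervals(s1_onsets, s2_onsets):
    """Return (systole, diastole) durations in seconds for each cardiac cycle.

    Assumes onsets are sorted and that s2_onsets[k] lies between
    s1_onsets[k] and s1_onsets[k + 1].
    """
    cycles = []
    for k in range(len(s1_onsets) - 1):
        systole = s2_onsets[k] - s1_onsets[k]
        diastole = s1_onsets[k + 1] - s2_onsets[k]
        cycles.append((systole, diastole))
    return cycles


def matches_rr(cycles, rr_intervals, tol=0.05):
    """Check whether systole + diastole agrees with each RR interval (seconds)."""
    return [
        abs((systole + diastole) - rr) <= tol
        for (systole, diastole), rr in zip(cycles, rr_intervals)
    ]
```

A cycle whose heart-sound duration disagrees with the corresponding RR interval by more than the tolerance would be a candidate for closer inspection.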
In this paper, S1, systole, S2 and diastole are extracted from the heart sound signal (PCG), and the RR interval is extracted from the ECG signal. The two are compared to detect abnormalities in the patient. The RR interval plays a major role in the identification of arrhythmia diseases such as bradycardia and tachycardia. For the identification of other types of arrhythmia, other parameters of the ECG signal such as the P wave, QRS complex, ST interval and T wave can be used (Fig. 2-5).
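How the RR interval separates bradycardia from tachycardia can be shown with a small sketch. The limits of 60 and 100 beats per minute are the conventional adult resting limits and are assumed here; the function name and the label strings are likewise illustrative.

```python
def classify_rate(rr_seconds, brady_bpm=60.0, tachy_bpm=100.0):
    """Return (heart rate in bpm, label) from a list of RR intervals in seconds."""
    mean_rr = sum(rr_seconds) / len(rr_seconds)
    bpm = 60.0 / mean_rr  # one beat per RR interval
    if bpm < brady_bpm:
        label = "bradycardia"
    elif bpm > tachy_bpm:
        label = "tachycardia"
    else:
        label = "normal rate"
    return bpm, label
```

With these limits, a mean RR interval above 1.0 s is reported as bradycardia and one below 0.6 s as tachycardia.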

Cardiac disease analyzing system
The raw ECG signal contains both low-frequency and high-frequency noise. Baseline wander is low-frequency noise and it mainly affects the edge function of the ECG signal. It can be removed using a median filter, which eliminates the baseline drift without changing or disturbing the characteristics of the ECG waveform. Muscle noise is high-frequency noise and it affects the structure of the waveform. It can be removed using the Daubechies wavelet transform. The wavelet de-noising process consists of three steps. Decomposition: the wavelet function (the mother wavelet) is selected and the decomposition level is defined. Thresholding: a threshold is selected and applied to the coefficients at each level. Reconstruction: the de-noised signal is rebuilt from the thresholded coefficients.
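The three de-noising steps can be sketched as follows. To keep the example self-contained it uses the Haar wavelet in place of the Daubechies wavelet named in the text, and soft thresholding with a fixed, hypothetical threshold; both are assumptions made for illustration.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_decompose(x, levels):
    """Step 1: split the signal into approximation and detail coefficients."""
    approx, details = list(x), []
    for _ in range(levels):
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        details.append([(a - b) / SQRT2 for a, b in pairs])
        approx = [(a + b) / SQRT2 for a, b in pairs]
    return approx, details


def soft_threshold(coeffs, t):
    """Step 2: shrink each coefficient toward zero by t."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def haar_reconstruct(approx, details):
    """Step 3: rebuild the signal from (thresholded) coefficients."""
    x = list(approx)
    for d in reversed(details):
        rebuilt = []
        for a_k, d_k in zip(x, d):
            rebuilt.append((a_k + d_k) / SQRT2)
            rebuilt.append((a_k - d_k) / SQRT2)
        x = rebuilt
    return x


def denoise(x, levels=2, threshold=0.1):
    """Decompose, threshold the detail coefficients, and reconstruct."""
    if len(x) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    approx, details = haar_decompose(x, levels)
    details = [soft_threshold(d, threshold) for d in details]
    return haar_reconstruct(approx, details)
```

With a threshold of zero the signal is reconstructed unchanged, which is a convenient check that the decomposition and reconstruction steps are consistent with each other.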

Feature Extraction
The real-time QRS detection algorithm extracts the R-wave amplitude and the R-wave time duration from the given signal. The algorithm has four stages. The first stage is a derivative function, which calculates the slope of the QRS complex using a five-point derivative. The second stage is a squaring function, which makes all data points positive by squaring the derivative values. These first two stages are used to calculate the R-peak amplitude. The third stage is moving-window integration, which obtains R-peak slope information over a window of samples. The final stage is fiducial marking, which determines the R-peak value and the time duration (width) of the QRS complex. The interval between two consecutive R waves is called the RR interval. The RR interval is based on the time durations of the P wave, the Q and R waves, and the ST segment, and it is calculated using the discrete wavelet transform. Based on the R-peak value, the wavelet transform can extract the following features: the P wave, the QRS complex and the ST segment. These features are used to calculate the RR interval. The calculated RR interval is matched with the systolic and diastolic intervals found from the heart sound.

R-Peak Amplitude
Here the ECG signal de-noised with the third-level db4 (Daubechies 4) filter is used because it gives a better R-peak value. A derivative function is normally used to obtain height, width and amplitude information. In this process, a five-point derivative is used to find the QRS slope information, and the QRS slope value gives the non-linear R-peak amplitude value. After differentiation, the ECG signal is squared point by point. This makes all data points positive and applies a nonlinear amplification to the output of the derivative, emphasizing the higher frequencies, which are predominantly the ECG frequencies.
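The derivative and squaring stages can be sketched as below. The paper does not list the derivative coefficients, so the centered five-point form commonly associated with the Pan-Tompkins QRS detector is assumed here, and the first and last two samples are simply left at zero.

```python
def five_point_derivative(x, fs):
    """Approximate the slope of x (sampled at fs Hz) with a five-point formula.

    y[n] = (-x[n-2] - 2*x[n-1] + 2*x[n+1] + x[n+2]) / (8*T), with T = 1/fs.
    """
    T = 1.0 / fs
    y = [0.0] * len(x)
    for n in range(2, len(x) - 2):
        y[n] = (-x[n - 2] - 2 * x[n - 1] + 2 * x[n + 1] + x[n + 2]) / (8 * T)
    return y


def square(y):
    """Square point by point: all values become non-negative, large slopes dominate."""
    return [v * v for v in y]
```

On a linear ramp the derivative returns the ramp's slope, and squaring then turns both steep upward and steep downward slopes of the QRS complex into large positive values.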

R-peak Time Duration
The R-peak time duration is calculated from the moving-window integration and is mainly based on the sample values. Generally, the window width should be approximately the same as the widest possible QRS complex. If the window is too wide, the integration waveform merges the QRS and T complexes together. If the window is too narrow, some QRS complexes produce several peaks in the integration waveform. The QRS complex corresponds to the rising edge of the integration waveform, and the time duration of the rising edge is equal to the width of the QRS complex. A fiducial mark for the temporal location of the QRS complex is determined from the rising edge. Depending on the desired waveform feature, the R-peak is marked at, for example, the maximal slope or the peak of the R wave (Table 1).
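A minimal sketch of the moving-window integration and of locating the rising edge is given below. The window length is left as a parameter because the paper does not state it, and `rising_edge` is a simplified, hypothetical helper that handles a single QRS complex in a short segment.

```python
def moving_window_integrate(x, window):
    """Average the last `window` samples at each position (running mean)."""
    out, running = [], 0.0
    for i, v in enumerate(x):
        running += v
        if i >= window:
            running -= x[i - window]  # drop the sample that left the window
        out.append(running / window)
    return out


def rising_edge(mwi):
    """Return (start index, peak index) of the rising edge leading to the maximum."""
    peak = max(range(len(mwi)), key=lambda i: mwi[i])
    start = peak
    while start > 0 and mwi[start - 1] < mwi[start]:
        start -= 1
    return start, peak
```

For a squared-derivative pulse three samples wide and a three-sample window, the rising edge spans three sample intervals, which matches the statement that the duration of the rising edge equals the QRS width.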

Heart rate variability analysis
The RR intervals are fed into the heart rate variability (HRV) feature analysis to obtain the time-domain and frequency-domain features. Heart rate variability is a non-invasive measure that reflects the variation over time of the interval between consecutive heartbeats. This interval is the distance between two successive QRS complexes, measured between consecutive R waves as
$$RR_i = R_{i+1} - R_i, \quad i = 1, \dots, N$$
where $R_i$ is the time of the $i$-th R peak and $N$ is the total number of RR intervals. This equation gives the distance between consecutive R waves.
The RR intervals for regular heartbeats are between 600 ms and 1000 ms. The RR intervals for irregular heartbeats are usually shorter ($RR_i < 300$ ms) or longer ($RR_i > 2000$ ms) than those of healthy people. The heart rate variability features are analyzed through time-domain analysis and frequency-domain analysis.

SDNN
SDNN is the standard deviation of all RR intervals. In statistics, the standard deviation is a simple measure of the variability or dispersion of a data set. A low standard deviation indicates that the data points tend to be very close to the mean, whereas a high standard deviation indicates that the data are spread out over a large range of values. The standard deviation is the root-mean-square (RMS) deviation of the values from their mean:
$$SDNN = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(RR_i - \overline{RR}\right)^2}$$
where $RR_i$ is the $i$-th RR interval and $\overline{RR}$ is the mean value of the $RR_i$. This equation gives the standard deviation of all NN intervals.

$$\overline{RR} = \frac{RR_1 + RR_2 + \cdots + RR_N}{N}$$
where $N$ is the total number of RR intervals. This equation gives the mean of all NN intervals. For the sample RR intervals above, the SDNN is calculated as follows: $\overline{RR} = 0.684$

RMSSD
RMSSD is the square root of the mean of the squared differences of successive RR intervals. It is described as
$$RMSSD = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N-1}\left(RR_{i+1} - RR_i\right)^2}$$
where $N$ is the total number of RR intervals. This equation gives the root mean square of the differences between adjacent NN intervals.
The RMSSD can be calculated as follows,
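A minimal sketch of the RMSSD calculation, assuming the mean is taken over the N - 1 successive differences; the function name is illustrative.

```python
import math


def rmssd(rr):
    """Root mean square of the differences between adjacent RR intervals."""
    diffs = [b - a for a, b in zip(rr, rr[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))
```

Because it only looks at adjacent intervals, RMSSD responds to beat-to-beat changes rather than to slow drift across the whole recording.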

NNx
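NNx is the count of adjacent RR intervals that differ by more than $x$ ms:
$$NNx = \sum_{i=1}^{N-1} \mathbf{1}\left(\left|RR_{i+1} - RR_i\right| > x\right)$$
where $N$ is the total number of RR intervals and $\mathbf{1}(\cdot)$ equals 1 when the condition holds and 0 otherwise. This equation gives the count of adjacent RR intervals whose difference is greater than $x$ ms.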
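The count, and its expression as a percentage of the total number of RR intervals N, can be sketched as follows. The RR intervals are assumed to be given in milliseconds, and the function names are illustrative.

```python
def nnx(rr_ms, x):
    """Count adjacent RR interval pairs that differ by more than x ms."""
    return sum(1 for a, b in zip(rr_ms, rr_ms[1:]) if abs(b - a) > x)


def pnnx(rr_ms, x):
    """NNx as a percentage of the total number of RR intervals N."""
    return nnx(rr_ms, x) / len(rr_ms) * 100.0
```

A common choice in HRV analysis is x = 50, which gives the NN50 and pNN50 measures.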

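$$pNNx = \frac{NNx}{N} \cdot 100\ (\%)$$
where $N$ is the total number of RR intervals. This equation gives NNx as a percentage.
The fast Fourier transform of the RR-interval series is determined using the following equation:
$$F(u) = \sum_{i=1}^{N} RR_i \, e^{-j 2\pi u (i-1)/N}, \quad u = 0, 1, \dots, N-1$$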
where $N$ is the total number of RR intervals. This equation gives the fast Fourier transform.
The power spectrum can be calculated as
$$P(u) = R^2(u) + I^2(u)$$
where $R(u)$ is the real part and $I(u)$ is the imaginary part of the Fourier transform of the digital signal. This equation gives the power spectrum.
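The frequency-domain step can be illustrated with a direct evaluation of the discrete Fourier transform, which is the quantity the fast Fourier transform computes more efficiently. This is a sketch for short series only; the function names are illustrative and the input is assumed to be an evenly spaced sequence.

```python
import cmath


def dft(x):
    """Discrete Fourier transform by direct summation (O(N^2))."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * u * k / n) for k in range(n))
        for u in range(n)
    ]


def power_spectrum(x):
    """Squared real part plus squared imaginary part at each frequency index."""
    return [c.real ** 2 + c.imag ** 2 for c in dft(x)]
```

For a constant series all of the power falls in the first (zero-frequency) bin, while an alternating pattern moves the power to higher frequency indices.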

Feature Selection
A genetic algorithm is used to select the fittest features, which are then used for the classification of arrhythmia diseases. The initial population is formed from the range of the heart rate variability features. Parents are selected according to their fitness: the better an individual is, the greater its chance of being selected. Roulette wheel selection is used to select the best-fit individuals. Crossover takes individuals from the parent population and creates new offspring. For example, consider the binary representation of two chromosomes: the second part of parent 1 is exchanged with the second part of parent 2. After crossover, mutation is carried out.
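The three genetic operators can be sketched on binary strings as follows. This is an illustration under stated assumptions, not the authors' implementation: the chromosome encoding, the crossover point and the mutation rate are hypothetical, and the fitness values are supplied by the caller.

```python
import random


def roulette_select(population, fitness, rng):
    """Pick one individual with probability proportional to its fitness."""
    pick = rng.uniform(0.0, sum(fitness))
    cumulative = 0.0
    for individual, f in zip(population, fitness):
        cumulative += f
        if f > 0 and pick <= cumulative:
            return individual
    return population[-1]  # fallback for rounding at the upper edge


def single_point_crossover(parent1, parent2, point):
    """Swap the second parts of two chromosomes at the given point."""
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2


def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    flipped = {"0": "1", "1": "0"}
    return "".join(
        flipped[bit] if rng.random() < rate else bit for bit in chromosome
    )
```

For instance, `single_point_crossover("11110000", "00001111", 4)` exchanges the last four bits of the two parents, where each bit could stand for whether one HRV feature is included in the selected subset.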

Classification using Neuro Fuzzy Classifier
Classification is performed in two stages: the computation and selection of coefficients by the DWT, and classification by ANFIS. ECG beats of arrhythmia diseases are obtained from the PhysioBank databases and are classified by ANFIS classifiers. The aim is to classify the following arrhythmia diseases using an adaptive neuro-fuzzy inference system (ANFIS): sinus node arrhythmias; atrial arrhythmias, namely premature atrial contractions (PAC), atrial tachycardia, atrial flutter and atrial fibrillation; junctional arrhythmias, namely premature junctional contractions (PJC); ventricular arrhythmias, namely premature ventricular contractions (PVC), ventricular tachycardia (VT) and ventricular fibrillation; atrioventricular blocks; and bundle branch blocks. ANFIS is used as a neuro-fuzzy classifier for the ECG analysis, and the accuracy of the combined neural network model for the classification of ECG beats is expected to be higher than that of a stand-alone classifier model. The neuro-fuzzy network is also more tolerant of noise and less sensitive to morphological changes in the ECG characteristics, and ANFIS plays an important role in dealing with uncertainty when making decisions in medical applications. The ANFIS method is compared with the K-nearest neighbour classifier and discriminant analysis (Fig. 6-9).
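As a point of reference for the comparison, a k-nearest neighbour baseline can be sketched in a few lines. The feature vectors, labels and the value of k below are hypothetical and only illustrate the mechanism; they are not data or settings from this study.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query vector by majority vote among its k nearest training vectors."""
    ranked = sorted(
        zip(train_x, train_y),
        key=lambda item: math.dist(item[0], query),  # Euclidean distance
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

With feature vectors built from HRV measures such as SDNN and RMSSD, a call like `knn_predict(features, labels, new_beat_features)` returns the label held by most of the three closest training examples.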

ECG Database
In the proposed work, it is planned to use the MIT-BIH arrhythmia database (www.physionet.org/physiobank/database/mitdb/). Some of the records from the database are taken for the preliminary work. For example, the arrhythmia database contains 70 records, each containing ECG signals of 1 min duration selected from 70 individuals. The signals were taken from 58 men aged 27 to 63 years and 12 women aged 22 to 44 years. These records are used for the identification of arrhythmia diseases for study purposes.

Conformance Testing
Testing is done using the measures of performance metrics (Fig. 10). Performance