Proceedings of the 2005 IEEE Engineering in Medicine and Biology 27th Annual Conference, Shanghai, China, September 1-4, 2005

Biometric Statistical Study of One-Lead ECG Features and Body Mass Index (BMI)

T. W. Shen, W. J. Tompkins
Department of Biomedical Engineering, University of Wisconsin, Madison WI, USA

Abstract - We have studied the electrocardiogram (ECG) as a potential biometric for human identity verification. This research investigates the relationship between ECG biometric features and body mass index (BMI) using correlation analysis and linear regression methods. Using our ECG database of 168 normal healthy people (113 females and 55 males), we studied normalized features extracted from a one-lead, resting, palm ECG. The results showed that normalized ECG biometric features explain 25.3% of the variability of the BMI. The ECG features of males correlate better with the BMI model than those of females. Furthermore, we calculated correlation coefficients and R-square changes to analyze the correlations between the extracted features and the BMI and to identify the most significant predictor of BMI among all ECG biometric features.

Keywords - Electrocardiogram (ECG), Human identity verification, Biometrics, Body Mass Index (BMI), ECG features, Linear regression, Correlation

I. INTRODUCTION

Biometrics use anatomical, physiological, or behavioral characteristics that differ significantly from person to person and are difficult to forge. Several biometrics that have been used commercially for human identity verification are facial geometry, fingerprints, and voice analysis [1-2]. Electrocardiogram (ECG) analysis is not only a very useful diagnostic tool for clinical purposes [3] but has also recently been studied as a potential biometric [4-7]. A single-lead ECG has the advantage of being a one-dimensional, low-frequency, life-essential signal that can be recorded with three electrodes (two active electrodes and a ground electrode). Biel et al.
[4] showed that it is possible to identify individuals based on a chest ECG signal. Israel et al. [7] showed the uniqueness of an individual's ECG by investigating temporal features. Their analysis achieved a 100% identification rate on a 29-person group, regardless of electrode location, by combining linear discriminant analysis (LDA) for the discriminant functions with a voting algorithm on the contingency matrix. Our previous research also demonstrated that an ECG-based biometric system could identify a group of 20 persons from the MIT/BIH database with 100% accuracy [5] by combining a template matching method with a decision-based neural network (DBNN). In addition, we investigated resting palm ECGs from a large, normal, healthy population for human identification. In predetermined groups of 10, 20, and 50 persons, we achieved a 100% identification rate by using prescreening and distance classification. The combined system model was further tested on predetermined groups of 100 and 168 people, yielding identification rates of 96% and 95.3%, respectively [6]. This article summarizes how the ECG can be used for human identification on a short-term scale.

Tsu-Wang (David) Shen was a Ph.D. student at the University of Wisconsin, Madison, WI 53705 USA. He is now an assistant professor with the Department of Medical Informatics, Tzu Chi University, Hau-Lien, Taiwan, R.O.C. (phone: 886-3-856-5301 ext. 7379). W. J. Tompkins is with the Department of Biomedical Engineering, University of Wisconsin, Madison, WI 53705 USA. 0-7803-8740-6/05/$20.00 ©2005 IEEE.

The ECG varies from person to person due to differences in the position, size, and anatomy of the heart, age, sex, relative body weight, chest configuration, and various other factors [8]. These factors make a person's ECG signal unique. From our experimental data, Fig.
1 shows an example of two persons with exactly the same age, sex, weight, and height who have completely different ECG patterns.

Figure 1. Two subjects (No. 217 and No. 225) have completely different ECG patterns, even though they share the same gender (female), age (21 years), weight (56.7 kg), and height (170 cm). The x axis is in sample points (the sampling rate of these ECG signals is 500 sps); the y axis is in millivolts.

Thaler [9] described how, with cardiac hypertrophy, ECG waves can increase in duration and amplitude in certain parts of the signal and the electrical axis shifts. However, it is unclear how factors such as weight, height, and body mass index (BMI, kg/m²) may influence the ECG of presumed healthy individuals, and which features extracted from the Lead-I ECG are related to these factors. This paper explores the relationships between the Lead-I biometric ECG features and the BMI.

II. EXPERIMENTAL SETUP

Unlike the MIT/BIH ECG arrhythmia database, which was recorded from cardiology patients, this research surveyed a normal healthy population, and all ECG signals are presumed normal. We investigated short-term, resting, Lead-I ECG signals recorded from 168 individuals (113 females and 55 males) to create our ECG biometric database. The subjects voluntarily reported their ages, weights, and heights. Ages range from 19 to 52 years, weights from 45 to 118 kg, and heights from 155 to 208 cm. The interquartile ranges (IQR) of age, weight, and height are 3 years (Q1: 20, Q3: 23), 13 kg (Q1: 57, Q3: 70), and 15.24 cm (Q1: 160, Q3: 175), respectively. Table 1 gives more detailed information about our database.

Table 1. General statistical data on the ECG biometric database

                 Females (mean ± S.D.)   Males (mean ± S.D.)
Age (years)      20.7 ± 1.6              23.2 ± 6.6
Weight (kg)      62.4 ± 8.2              77.1 ± 12.9
Height (cm)      166.9 ± 5.8             179.8 ± 7.9
BMI (kg/m²)      22.41 ± 2.66            23.80 ± 3.30

The subjects' ECG signals were measured and collected with an ECG data acquisition unit (BIOPAC Student Lab PRO system MP30 with software), electrodes (disposable silver-silver chloride electrodes from BIOPAC Systems, Inc.), and computers (IBM-compatible PCs). We recorded the Lead-I ECG from each subject using two electrodes placed on the left palm (active and ground) and one electrode on the right palm, as shown in Fig. 2. The subjects sat upright in a resting position and were asked to relax. Their palms were open and resting on their legs. We recorded the Lead-I ECG for 90 s at a sampling rate of 500 sps with an amplifier gain of 2000. In the preprocessing session, we applied digital filters to the raw ECG data to reduce interference.

Figure 2. Disposable electrodes attached to a subject's palms.

III. METHODOLOGY

Our ECG biometric database surveyed a young, normal, healthy population and compared the results with those from a clinical database. It is crucial to analyze whether age, gender, weight, height, and BMI (kg/m²) influence the selected biometric features. In the preprocessing procedure, baseline wander, dc shift, power-line noise, and high-frequency interference are removed [5-6]. In general, standard ECG machines have a bandwidth between 0.05 Hz and 150 Hz. With this bandwidth, baseline wander, muscle interference, and other noise are so severe for a palm ECG that we band-limited the ECG to the frequency range between 1 and 50 Hz.

We designed our computer software to randomly select 20 sequential normal heartbeats from each of the 168 individuals in this investigation to form a 3360-beat group as an original ECG database. Next, the signal averaging method was applied to each 20-heartbeat group to create 168 median heartbeats as our database. Then we extracted the 17 features listed in Table 2 and Fig. 3 from each heartbeat.

Figure 3. Seven features based on QRST points. Labeled in the figure: the P, Q, R, S, and T points; the RP, RQ, RS/RS2, RT, ST, and QS amplitudes; the RS and ST slopes; the QS and QT durations; and the QRS triangle area.

Table 2: Seventeen selected features used for classification

No.  Selected features       No.  Selected features    No.  Selected features
1    RQ amplitude            8    RS amp./TS amp.      15   Angle Q
2    QS duration             9    RS2 amplitude        16   Angle R
3    RS amplitude            10   PQ amplitude         17   Angle S
4    ST amplitude            11   QS amplitude
5    QT duration**           12   RP amplitude
6    RS slope                13   RT amplitude
7    QRS triangular area     14   ST slope

Note: **The definition of QT duration is different from the clinical definition of the QT interval. The QT duration is the time delay between the Q and T points. The Bazett formula was applied for QT normalization.

After we extracted these features, we divided the database into a female and a male group. Then, we normalized all features using (1) so that we could compare features with different units.

\[ \text{Normalized feature} = \frac{\text{feature} - \text{Global min}}{\text{Global max} - \text{Global min}} \tag{1} \]

where "Global min" and "Global max" represent the minimum and maximum values of a given feature over all 168 people. We applied correlation analysis and linear regression methods [10] to analyze these normalized features against the BMI using SPSS 12.

IV. RESULTS and DISCUSSION

In statistics, the R-squared value is the fraction of the variance in the data that is explained by a regression. It is defined as the ratio of the sum of squares explained by the regression model to the total sum of squares around the mean, and it can be read as the proportion of variation explained by the model. Table 3 shows that our normalized ECG biometric features explain 25.3% of the variability of BMI and 6.5% of the variability of age.

Table 3. Comparison of R square values with age or BMI as the dependent variable (Model Summary, Model 1)

Dependent variable   R      R Square   Adjusted R Square   Std. Error of the Estimate
BMI (kg/m²)          .503   .253       .185                2.66366
Age (years)          .255   .065       -.020               4.1832

As shown in Table 3, the BMI model showed a much larger R-squared value than the age model, so BMI was selected for further analysis. We calculated correlation coefficients to analyze the level of correlation between the normalized features and the BMI, as shown in Table 4.

Table 4: Correlation coefficients between the normalized features and the BMI.

Feature   Pearson Correlation with BMI   Sig. (2-tailed)
V1        .354(**)                       .000
V2        -.017                          .830
V3        .413(**)                       .000
V4        .237(**)                       .002
V5        -.178(*)                       .021
V6        .424(**)                       .000
V7        .346(**)                       .000
V8        -.197(*)                       .011
V9        .419(**)                       .000
V10       .268(**)                       .000
V11       .105                           .175
V12       .347(**)                       .000
V13       .288(**)                       .000
V14       .192(*)                        .013
V15       .229(**)                       .003
V16       .317(**)                       .000
V17       -.111                          .152

** Correlation is significant at the 0.01 level (2-tailed).
* Correlation is significant at the 0.05 level (2-tailed).

According to Table 4, the largest correlation coefficient occurs for Feature 6 (V6). However, the R-square changes for all normalized features must be calculated to determine whether this variable is a good predictor of the dependent variable. Feature 6 is the only significant predictor of the dependent variable BMI in Table 5 because it is the only feature whose significance value is less than 0.05. Features 1, 3, and 12 were automatically excluded by SPSS 12.

Table 5.
Results of R-squared changes among features - BMI

ANOVA(c)

Source   Sum of Squares   df    Mean Square   F       Sig.      R2 Change
V2       .639             1     .639          .090    .764(a)   .000
V4       .001             1     .001          .000    .991(a)   .000
V5       3.089            1     3.089         .435    .510(a)   .002
V6       31.482           1     31.482        4.437   .037(a)   .022
V7       9.341            1     9.341         1.316   .253(a)   .006
V8       7.921            1     7.921         1.116   .292(a)   .005
V9       6.149            1     6.149         .867    .353(a)   .004
V10      .964             1     .964          .136    .713(a)   .001
V11      3.600            1     3.600         .507    .477(a)   .002
V13      .867             1     .867          .122    .727(a)   .001
V14      10.278           1     10.278        1.449   .231(a)   .007
V15      7.821            1     7.821         1.102   .295(a)   .005
V16      .751             1     .751          .106    .745(a)   .001
V17      .841             1     .841          .119    .731(a)   .001
Reg.     368.423          14    26.316        3.709   .000(b)
Res.     1085.550         153   7.095
Total    1453.973         167

a Tested against the full model.
b Predictors in the Full Model: (Constant), V17, V9, V5, V11, V2, V10, V8, V15, V16, V14, V4, V6, V7, V13.
c Dependent Variable: BMI

Gender analysis for BMI: The next step was to investigate whether gender changes the proportion of variation explained by the model. For the female samples, 23.0% of the variability among the observed BMI values was explained by the linear relationship between BMI and the normalized features. In comparison, the male model explained considerably more, 42.1% of the variation; that is, 57.9% of the variation was not explained by this relationship. Overall, features extracted from male subjects explain their BMI values better than the same features extracted from female subjects. Table 6 lists the correlation coefficients that we analyzed by gender.

Table 6: Correlation coefficients between the normalized features and the BMI, separated by gender.
Feature      Female     Male
Feature 1    0.2678     0.4058
Feature 2    -0.0036    0.0238
Feature 3    0.3120     0.4653
Feature 4    0.1966     0.1727
Feature 5    -0.0443    -0.1826
Feature 6    0.3064     0.4927
Feature 7    0.2868     0.3768
Feature 8    -0.1827    -0.2392
Feature 9    0.3188     0.4777
Feature 10   0.2127     0.2523
Feature 11   0.0780     0.0626
Feature 12   0.2618     0.4082
Feature 13   0.2207     0.3452
Feature 14   0.1384     0.1610

Feature 6 provided the largest correlation coefficient for the male model. By comparison, Features 3 and 9 showed the largest correlation coefficients for the female group. The correlation coefficients were also lower overall for the female group. Features 3, 6, and 9 are all highly correlated with each other. Fig. 4 shows two scatter plots (normalized Feature 6 vs. BMI) and the linear regression line for each gender.

Figure 4. Two scatter plots (normalized Feature 6 vs. BMI) separated by gender.

V. CONCLUSIONS & FUTURE WORK

This research showed that our normalized ECG biometric features explain 25.3% of the variability of the BMI. The ECG features of males explain BMI better than those of females. The R-square changes showed that Feature 6 is the only significant predictor of overall BMI values when tested against all the other ECG features. However, the correlation coefficient between Feature 6 and the overall BMI is only 0.424. These results imply that certain ECG biometric features are somewhat correlated with the BMI. A possible explanation is that the BMI may be an indicator of abdominal volume. Hence, BMI changes may be correlated with a shift of a subject's electrical heart axis, making RS-amplitude-related features (such as Features 3, 6, and 9) more strongly correlated with BMI than the other features. We will evaluate this assumption in a future long-term ECG study. That future work will measure the above ECG biometric features several times from the same individual as the BMI changes due to significant weight variations.
This will clarify how the BMI can influence certain ECG biometric features and how those features can be normalized based on the current BMI.

VI. ACKNOWLEDGMENTS

Special appreciation goes to Dr. Kevin T. Strang and Andrew J. Lokuta for their full support of this research and for helping us with the experimental environment. Thanks also go to Profs. Ron Serlin and Daniel Bolt, who provided many valuable suggestions.

REFERENCES

[1] R. W. Frischholz and U. Dieckmann, "BioID: A multimodal biometric identification system," Computer, IEEE, pp. 64-68, Feb. 2000.
[2] S. Pankanti, R. M. Bolle, and A. Jain, "Biometrics: The future of identification," Computer, IEEE, pp. 46-55, Feb. 2000.
[3] D. Dubin, Rapid Interpretation of EKG's, V ed. Cover Publishing Company, Tampa, Florida, 2000.
[4] L. Biel, O. Pettersson, L. Philipson, and P. Wide, "ECG analysis: A new approach in human identification," IEEE Trans. on Instrumentation and Measurement, vol. 50, no. 3, June 2001.
[5] T. W. Shen, W. J. Tompkins, and Y. H. Hu, "One-lead ECG for identity verification," 2nd Joint Conf. IEEE Eng. Med. Biol. Soc. & Biomed. Eng. Soc., pp. 62-63, 2002.
[6] T. W. Shen, "Biometric Identity Verification Based on Electrocardiogram," PhD thesis, Biomedical Engineering, University of Wisconsin, Madison, WI, 2005.
[7] S. A. Israel, J. M. Irvine, A. Cheng, M. D. Wiederhold, and B. K. Wiederhold, "ECG to identify individuals," Pattern Recognition, vol. 38, pp. 133-142, 2005.
[8] B. P. Simon and C. Eswaran, "An ECG classifier designed using modified decision based neural network," Computers and Biomedical Research, vol. 30, pp. 257-72, 1997.
[9] M. S. Thaler, The only EKG book you'll ever need, 4th ed. Lippincott Williams & Wilkins, Philadelphia, PA, pp. 61-93, 2003.
[10] M. Pagano and K. Gauvreau, Principles of biostatistics, 2nd ed. Duxbury, Pacific Grove, CA, 2000.
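
The normalization in (1) and the model-summary statistics reported in Tables 3 and 5 can be sketched in a few lines of Python. This sketch is not part of the original analysis, which was carried out in SPSS 12. The function names `min_max_normalize` and `model_summary` are hypothetical, and the adjusted R-squared and standard-error formulas are the usual textbook definitions, assumed here to match what SPSS reports. The only inputs taken from the paper are the regression and residual sums of squares in Table 5, the sample size (168), and the number of predictors in the full model (14).

```python
import math


def min_max_normalize(values):
    """Scale one feature to [0, 1] using its global min and max, as in (1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # (1) divides by (Global max - Global min), so a constant feature is undefined.
        raise ValueError("feature is constant; (1) is undefined")
    return [(v - lo) / (hi - lo) for v in values]


def model_summary(ss_regression, ss_residual, n_subjects, n_predictors):
    """Recompute R^2, adjusted R^2, standard error, and F from ANOVA sums of squares.

    Uses the standard textbook definitions (an assumption about the
    statistics package, not something stated in the paper).
    """
    ss_total = ss_regression + ss_residual
    df_residual = n_subjects - n_predictors - 1
    r2 = ss_regression / ss_total
    adj_r2 = 1.0 - (1.0 - r2) * (n_subjects - 1) / df_residual
    std_error = math.sqrt(ss_residual / df_residual)
    f_stat = (ss_regression / n_predictors) / (ss_residual / df_residual)
    return {"r2": r2, "adj_r2": adj_r2, "std_error": std_error, "f": f_stat}


# Regression and residual rows of Table 5: n = 168 subjects, 14 predictors.
summary = model_summary(368.423, 1085.550, n_subjects=168, n_predictors=14)
print({name: round(value, 3) for name, value in summary.items()})
```

Rounded to three decimals, the recomputed R-squared, adjusted R-squared, and F agree with the values reported for the BMI model (.253, .185, and 3.709), and the standard error agrees with 2.66366, which suggests that the tables are internally consistent.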