Efficiency of Pearson, Spearman and Kendall’s Correlation Coefficients When Data is Non-normal Distributed

Authors

  • Chaninan Pruekpramool, Chulalongkorn University
  • Nattika Jaroentaku, Chulalongkorn University
  • Siwachoat Srisuttiyakorn, Chulalongkorn University

Keywords:

non-normal distribution, efficiency, Pearson’s correlation coefficient, Spearman’s correlation coefficient, Kendall’s correlation coefficient

Abstract

This research compared the efficiency of Pearson’s, Spearman’s, and Kendall’s correlation coefficients in estimation and hypothesis testing when the data were non-normally distributed. Monte Carlo simulation was used to generate data from a log-normal probability distribution. A total of 288 simulation conditions were studied, each replicated 1,000 times. Efficiency in estimation was assessed by relative bias (RB) and Monte Carlo standard error (MCSE). Efficiency in hypothesis testing was assessed by the probability of a type I error and the power of the test. The results revealed that (1) Pearson’s correlation coefficient tended to yield estimates with the lowest relative bias, while Kendall’s correlation coefficient tended to yield estimates with the highest relative bias; (2) Pearson’s and Spearman’s correlation coefficients tended to have similar Monte Carlo standard errors when there was no relationship between the two sets of data, and Kendall’s correlation coefficient tended to have the lowest Monte Carlo standard errors; (3) the significance tests of all three correlation coefficients tended to control the type I error at the .05 significance level; and (4) Pearson’s correlation coefficient tended to have the highest power.
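
The kind of simulation described above can be sketched for a single condition as follows. This is a minimal illustration, not the authors' code, and it rests on several assumptions the abstract does not state: the log-normal pair is produced by exponentiating a standard bivariate normal pair with correlation `rho`; relative bias is measured against that same nominal `rho` for all three coefficients; MCSE is taken as the standard deviation of the estimates across replications; and each test is two-sided at `alpha`. The function name `simulate` and its parameters are hypothetical.

```python
import numpy as np
from scipy import stats


def simulate(rho, n, reps=1000, alpha=0.05, seed=0):
    """Summarise one (rho, n) condition; illustrative sketch only."""
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]
    methods = {
        "pearson": stats.pearsonr,
        "spearman": stats.spearmanr,
        "kendall": stats.kendalltau,
    }
    estimates = {name: np.empty(reps) for name in methods}
    rejections = {name: 0 for name in methods}

    for i in range(reps):
        # Assumed generator: exponentiate a bivariate normal sample.
        z = rng.multivariate_normal([0.0, 0.0], cov, size=n)
        x, y = np.exp(z[:, 0]), np.exp(z[:, 1])
        for name, corr_fn in methods.items():
            coef, p_value = corr_fn(x, y)
            estimates[name][i] = coef
            rejections[name] += int(p_value < alpha)

    summary = {}
    for name in methods:
        mean_est = estimates[name].mean()
        summary[name] = {
            # Relative bias against the nominal rho (assumed target);
            # undefined when rho is zero.
            "relative_bias": (mean_est - rho) / rho if rho != 0 else float("nan"),
            # Spread of the estimates across replications (assumed MCSE).
            "mcse": float(estimates[name].std(ddof=1)),
            # Type I error rate when rho == 0, empirical power otherwise.
            "rejection_rate": rejections[name] / reps,
        }
    return summary


if __name__ == "__main__":
    for name, row in simulate(rho=0.5, n=30, reps=500, seed=1).items():
        print(name, row)
```

Running `simulate` with `rho=0` gives an empirical type I error rate in `rejection_rate`, while a nonzero `rho` gives empirical power. Note that under the assumed generator the three coefficients have different population values, so a relative bias measured against one shared nominal `rho` partly reflects that difference rather than estimation error alone.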

Published

2020-12-29

How to Cite

Pruekpramool, C., Jaroentaku, N., & Srisuttiyakorn, S. (2020). Efficiency of Pearson, Spearman and Kendall’s Correlation Coefficients When Data is Non-normal Distributed. An Online Journal of Education, 15(2), OJED1502040 (16 pages). Retrieved from https://so01.tci-thaijo.org/index.php/OJED/article/view/245395