Construction of the Engenius English Language Tests (EELT) for Secondary School Students and Equating of Their Scores with the English Language Test for International Students (ELTiS)

Sasa Watanapokakul; Suphat Sukamolson

Published: Dec 23, 2022

Keywords:

Test score equating; Linear regression method; Equipercentile method; Engenius English Language Test (EELT); English Language Test for International Students (ELTiS); Single Group Design with Counterbalancing

Sasa Watanapokakul

Liberal Arts, Mahidol University

Suphat Sukamolson

International College, Maejo University

Abstract

Equating the scores of newly constructed tests with those of a standardized test that exists in only a limited number of forms is very important for test security and for the credibility of test results. This study aimed to construct the Engenius English Language Test (EELT) for secondary school students in Years 1-6 (Mathayom 1-6) as an international standardized test in 6 forms that are truly parallel to the original test, the English Language Test for International Students (ELTiS), and to construct score equating tables for the two tests using 2 methods: the equipercentile method and the linear regression method. The sample consisted of 300 examinees per test form, 50 from each year level, for a total of 1,800 examinees, selected by simple random sampling from those who registered to take the test free of charge. The 6 new EELT forms were constructed from the ELTiS table of specifications, and their content validity was judged to be at a very high level by 5 experts in English language teaching and in English language testing and evaluation. Data were collected using a single group design with counterbalancing and were analyzed using the Independent Samples t-test, Levine's linear equating method, Pearson correlation, equipercentile equating, and linear regression equating. The results showed that EELT Forms 1, 2, and 4 were truly parallel to the ELTiS, whereas EELT Forms 3, 5, and 6 were pseudo-parallel, and that 2 score equating tables between the EELT and the ELTiS were obtained, one from each of the 2 equating methods. The study thus yielded 3 new test forms that can be used in place of the standardized test, while the other 3 forms should be revised before being put into actual use.
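
The two equating methods named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names, the toy scores, and the simplified, unsmoothed form of equipercentile equating (interpolating between percentile ranks at observed score points) are all assumptions made for illustration, for a single group of examinees who took both tests.

```python
from bisect import bisect_left, bisect_right
from statistics import mean


def regression_equate(x_scores, y_scores):
    """Fit y = a + b*x by least squares on paired scores; return the mapping x -> y."""
    mx, my = mean(x_scores), mean(y_scores)
    sxx = sum((x - mx) ** 2 for x in x_scores)
    sxy = sum((x - mx) * (y - my) for x, y in zip(x_scores, y_scores))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def percentile_rank(sorted_scores, x):
    """Percent of scores below x plus half the percent of scores equal to x."""
    below = bisect_left(sorted_scores, x)
    at = bisect_right(sorted_scores, x) - below
    return 100.0 * (below + 0.5 * at) / len(sorted_scores)


def equipercentile_equate(x_scores, y_scores):
    """Return a mapping from an x score to the y score with the same percentile rank."""
    xs, ys = sorted(x_scores), sorted(y_scores)
    y_points = sorted(set(ys))
    y_ranks = [percentile_rank(ys, y) for y in y_points]

    def equate(x):
        p = percentile_rank(xs, x)
        # Clamp to the lowest or highest observed y score outside the rank range.
        if p <= y_ranks[0]:
            return float(y_points[0])
        if p >= y_ranks[-1]:
            return float(y_points[-1])
        # Interpolate linearly between the two neighbouring y score points.
        i = bisect_right(y_ranks, p)
        p0, p1 = y_ranks[i - 1], y_ranks[i]
        y0, y1 = y_points[i - 1], y_points[i]
        return y0 + (y1 - y0) * (p - p0) / (p1 - p0)

    return equate


# Hypothetical toy data: each examinee has one score on each test.
eelt_scores = [10, 12, 14, 16, 18, 20]
eltis_scores = [12, 14, 16, 18, 20, 22]

by_regression = regression_equate(eelt_scores, eltis_scores)
by_equipercentile = equipercentile_equate(eelt_scores, eltis_scores)

# One row per observed EELT score: (EELT score, regression equivalent, equipercentile equivalent)
equating_table = [(x, by_regression(x), by_equipercentile(x)) for x in sorted(set(eelt_scores))]
```

Each row of `equating_table` plays the role of one line in a score conversion table. With real data the two methods generally give different equivalents, particularly near the extremes of the score range, which is one reason a study may report a separate table for each method.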

Issue

Vol. 15 No. 2 (2022): THE JOURNAL OF FACULTY OF APPLIED ARTS, KING MONGKUT'S UNIVERSITY OF TECHNOLOGY NORTH BANGKOK

Section

Research Articles

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

1. Articles published in the Journal of Faculty of Applied Arts are the copyright of the journal. Any commercial use of the contents, texts, opinions, pictures, tables, or any other parts of the articles, in any format, requires official written permission from an authorized member of the editorial team.

2. The opinions expressed in each article are the responsibility of its author. The editorial team of the Journal of Faculty of Applied Arts (FAA lecturers, staff, and any personnel) takes no responsibility for them. The author of each article bears full and sole responsibility for his or her own article.
