
Item characteristics of business combination material in the Advanced Financial Accounting I course

Vol. 5 No. 1 (2025)

Natalina Premastuti Brataningrum (1), Umi Farisiyah (2)

(1) Universitas Sanata Dharma, Indonesia
(2) Universitas Negeri Yogyakarta, Indonesia

Abstract: This study aims to examine the item characteristics of test questions on business combination material in the Advanced Financial Accounting I course. A quantitative approach with a descriptive-exploratory design was employed, involving 40 students as research subjects. The test instrument consisted of 30 multiple-choice items analyzed using Classical Test Theory (CTT) and the Rasch model. Content validity using Aiken’s V index showed that 24 items (80%) were highly valid, while 6 items (20%) had moderate validity. Qualitative review revealed 70% compliance in material aspects, 79% in construction, and 97% in language. Based on CTT, 60% of items had moderate difficulty, 67% demonstrated good discrimination, and 41.67% of distractors functioned effectively. Factor loading analysis showed 25 items (83%) were significant, with a high reliability score (Cronbach's Alpha = 0.835). One item showed bias (p < 0.05). The Rasch model analysis indicated that 28 items (97%) fit the model and had difficulty levels within the moderate range (b between -2 and +2). Most items provided accurate information for students with average ability levels. In conclusion, the test items demonstrated good quality in terms of both content and construct validity.
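For readers who want to see how the indices reported above are typically computed, the sketch below illustrates Aiken's V, the classical difficulty and discrimination indices, and Cronbach's alpha on simulated data with 40 examinees and 30 dichotomous items, generated from a Rasch-type model. It is a minimal illustration under assumed data and function names (aiken_v, ctt_item_stats are hypothetical helpers introduced here), not the authors' analysis pipeline; the study itself may have used dedicated psychometric software with different conventions.

import numpy as np

def aiken_v(ratings, lo, hi):
    # Aiken's V content-validity coefficient for one item:
    # V = sum(r - lo) / (n_raters * (hi - lo)), where r is each rater's score.
    ratings = np.asarray(ratings, dtype=float)
    return (ratings - lo).sum() / (len(ratings) * (hi - lo))

def ctt_item_stats(responses):
    # Classical test theory statistics for a 0/1-scored response matrix
    # of shape (n_examinees, n_items).
    n_items = responses.shape[1]
    total = responses.sum(axis=1)

    # Difficulty index: proportion of examinees answering each item correctly.
    difficulty = responses.mean(axis=0)

    # Discrimination: point-biserial correlation of each item with the
    # rest-of-test score (total minus the item itself).
    discrimination = np.array([
        np.corrcoef(responses[:, i], total - responses[:, i])[0, 1]
        for i in range(n_items)
    ])

    # Cronbach's alpha: (k / (k - 1)) * (1 - sum of item variances / variance of total).
    item_var = responses.var(axis=0, ddof=1)
    alpha = (n_items / (n_items - 1)) * (1 - item_var.sum() / total.var(ddof=1))
    return difficulty, discrimination, alpha

# Simulated data mirroring the sample and test length described in the abstract,
# generated from a Rasch-type model P(correct) = 1 / (1 + exp(-(theta - b))).
rng = np.random.default_rng(0)
theta = rng.normal(size=(40, 1))            # person ability
b = rng.uniform(-2.0, 2.0, size=(1, 30))    # item difficulty in the moderate range
data = (rng.random((40, 30)) < 1 / (1 + np.exp(-(theta - b)))).astype(int)

p, rpb, alpha = ctt_item_stats(data)
print(f"mean difficulty = {p.mean():.2f}, mean discrimination = {rpb.mean():.2f}, alpha = {alpha:.2f}")
print("Aiken's V for five raters scoring 4, 5, 5, 4, 5 on a 1-5 scale:", aiken_v([4, 5, 5, 4, 5], 1, 5))

In this parameterization the Rasch difficulty b corresponds to the -2 to +2 range mentioned in the abstract, while the proportion-correct index is its classical counterpart.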
