%0 Journal Article
%@ 2291-9279
%I JMIR Publications
%V 9
%N 1
%P e21620
%T Pupillary Responses for Cognitive Load Measurement to Classify Difficulty Levels in an Educational Video Game: Empirical Study
%A Mitre-Hernandez,Hugo
%A Covarrubias Carrillo,Roberto
%A Lara-Alvarez,Carlos
%+ Center for Research in Mathematics, Calle Lasec y Andador Galileo Galilei, Quantum, Ciudad del Conocimiento, Zacatecas, 98160, Mexico, 52 4929980 ext 1105, c.alberto.lara@gmail.com
%K video games
%K pupil
%K metacognitive monitoring
%K educational technology
%K machine learning
%D 2021
%7 11.1.2021
%9 Original Paper
%J JMIR Serious Games
%G English
%X Background: A learning task recurrently perceived as easy (or hard) may cause poor learning results. Gamer data such as errors, attempts, or time to finish a challenge are widely used to estimate the perceived difficulty level. In other contexts, pupillometry is widely used to measure cognitive load (mental effort); hence, this may describe the perceived task difficulty. Objective: This study aims to assess the use of task-evoked pupillary responses as a cognitive load measure for describing the difficulty levels in a video game. In addition, it proposes an image filter to better estimate baseline pupil size and to reduce the screen luminescence effect. Methods: We conducted an experiment that compares the baseline estimated from our filter against that estimated from common approaches. Then, a classifier with different pupil features was used to classify the difficulty of a data set containing information from students playing a video game for practicing math fractions. Results: We observed that the proposed filter better estimates a baseline. Mauchly’s test of sphericity indicated that the assumption of sphericity had been violated (χ²(14)=0.05; P=.001); therefore, a Greenhouse-Geisser correction was used (ε=0.47).
There was a significant difference in mean pupil diameter change (MPDC) estimated from different baseline images with the scramble filter (F(5,78)=30.965; P<.001). Moreover, according to the Wilcoxon signed rank test, pupillary response features that better describe the difficulty level were MPDC (z=−2.15; P=.03) and peak dilation (z=−3.58; P<.001). A random forest classifier for easy and hard levels of difficulty showed an accuracy of 75% when the gamer data were used, but the accuracy increased to 87.5% when pupillary measurements were included. Conclusions: The screen luminescence effect on pupil size is reduced with a scrambled filter on the background video game image. Finally, pupillary response data can improve classifier accuracy for the perceived difficulty of levels in educational video games.
%M 33427677
%R 10.2196/21620
%U http://games.jmir.org/2021/1/e21620/
%U https://doi.org/10.2196/21620
%U http://www.ncbi.nlm.nih.gov/pubmed/33427677