TY - JOUR
T1 - Exploring the use of probabilistic latent representations to encode the students' reading characteristics
AU - Lopez, Erwin D.
AU - Minematsu, Tsubasa
AU - Taniguchi, Yuta
AU - Okubo, Fumiya
AU - Shimada, Atsushi
N1 - Funding Information:
This work was supported by JST AIP Grant Number JPMJCR19U1, and JSPS KAKENHI Grant Number JP18H04125, Japan.
Publisher Copyright:
© 2022 Copyright for this paper by its authors
PY - 2022
Y1 - 2022
AB - The emergence of digital textbook reading systems such as Bookroll, and their ability to record reader interactions, has opened the possibility of analyzing students' reading behaviors and characteristics. To date, several works have characterized different types of students using clustering ML models, while others have used supervised ML models to predict academic performance. What these models share is that internally they simplify the students' data into a latent representation in order to gain an insight or make a prediction. Nevertheless, these representations are either oversimplified or difficult to interpret. Accordingly, the present work explores the use of Variational Autoencoders to build more interpretable and complex latent representations. After a brief description of these models, we present and discuss the results of four exploratory studies using the LAK22 Data Challenge Workshop datasets. Our results show that the probabilistic latent representations generated by the proposed models preserve the students' reading characteristics, allowing a better visual interpretation when using 3 dimensions. They also allow supervised regression and classification models to have a more stable and less overfitted learning process, which in turn allows some of them to make better score predictions.
UR - http://www.scopus.com/inward/record.url?scp=85128933513&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85128933513&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85128933513
VL - 3120
SP - 1
EP - 10
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
SN - 1613-0073
T2 - 4th Workshop on Predicting Performance Based on the Analysis of Reading Behavior, DC in LAK 2022
Y2 - 22 March 2022
ER -