Abstract: This study uses students' highlights in textbooks to predict their performance on quiz questions, and constructs a semantic representation using a deep-learning sentence-embedding technique (SBERT) to capture content-based similarity. We built regression models that include highlighting features and found that these features reliably boost model performance. The highlighting features improved models for questions at all levels of Bloom's taxonomy. However, the models generalized only weakly to held-out questions.
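
As a rough illustration of what a content-based highlighting feature could look like, the sketch below scores how closely a student's highlighted sentences match a quiz question using cosine similarity between embedding vectors. It is only a sketch under stated assumptions: the abstract does not specify how the features were built, so the max-over-highlights pooling, the function names `cosine` and `highlight_feature`, and the toy 3-dimensional vectors are all hypothetical. In practice the vectors would come from an SBERT model rather than being written by hand.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as having no similarity
    return dot / (norm_u * norm_v)


def highlight_feature(question_vec, highlight_vecs):
    """Highest similarity between the question and any highlighted sentence.

    Returns 0.0 when the student highlighted nothing (an assumed convention).
    """
    if not highlight_vecs:
        return 0.0
    return max(cosine(question_vec, h) for h in highlight_vecs)


# Toy stand-ins for sentence embeddings (hypothetical values, not real SBERT output).
question = [1.0, 0.0, 0.0]
highlights = [[0.0, 1.0, 0.0], [2.0, 0.0, 0.0]]

feature = highlight_feature(question, highlights)
print(feature)
```

A value like `feature` could then be supplied as one predictor in a regression model of quiz performance, alongside whatever non-highlighting predictors the model already uses.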
