Dissertation Defense: Shabnam Behzad
Candidate: Shabnam Behzad
Major: Computer Science
Advisors: Nathan Schneider, Ph.D., and Amir Zeldes, Ph.D.
Title: Language Learning Meets Generative AI: Utilizing Large Language Models for Metalinguistic Explanations
Second language learners constitute a significant and growing portion of the global population, and there is increasing demand for tools that support language learning and instruction across proficiency levels and countries. The development of large language models (LLMs) has had a major impact on natural language processing, and these advances hold considerable potential for educational technology. While LLMs show promise for education, their impact and limitations require further exploration.
This thesis explores the task of delivering explanations and feedback to learners in different formats, including essay feedback and a question-answering framework, and studies how LLMs can be leveraged for these tasks. The thesis is structured into three main parts. In the first part, we investigate the capabilities of large language models in generating feedback on students’ essays. Our findings indicate that current state-of-the-art models are unable to provide learners with specific and actionable feedback. To address this issue, we propose a new corpus tailored for this task and demonstrate that using it in an in-context learning setup allows us to deliver more effective feedback to learners.
In the second part, we study a different type of feedback (answering questions about the language, such as grammar or vocabulary questions) and introduce a new challenge dataset of metalinguistic questions and answers about the English language. These questions are posed by both English second language learners and native speakers. We intend this dataset to serve as an evaluation benchmark and to draw greater attention to the complexities of metalinguistic question answering within the field. Using this benchmark, we investigate the extent to which language models can articulate their generalizations about language.
In the third part, our focus shifts to multilingualism within the question-answering framework. We propose using data from grammatical error correction corpora to build benchmarks for assessing the multilingual capabilities of LLMs in addressing learners’ grammar questions. We ask whether LLMs can answer questions posed in a language other than the one the learner is asking about.