Building A Strong Multilingual Idiomatic Repertoire Using Natural Language Processing Models
Zehlia Babaci-Wilhite, Lecturer
UGIS
Closed. This professor is continuing with Fall 2023 apprentices on this project; no new apprentices needed for Spring 2024.
Effective acquisition of multiple languages depends largely on building a strong vocabulary repertoire. Understanding and using figurative language, specifically idioms, is considered a sign of a student's advanced language proficiency. However, acquiring idioms has always been a challenge for language learners. Given that NLP and NLU are rapidly becoming part of everyday life, the study has four aims: to use NLP and NLU models, specifically text classification, to help differentiate between the literal and figurative meanings of idiomatic expressions; to develop a user-friendly interface to assist learners of Arabic, English, French, Norwegian, Spanish, Mandarin, Cantonese, Japanese and other foreign languages; to investigate the impact of using the developed program on language learners' competency level; and to explore the learners' opinions about the program for future development. The accuracy of these models will be checked using different classifiers: Naive Bayes, Logistic Regression and SVM. The most accurate model will then be developed into a full-fledged, user-friendly program. Following the development of this program, a complementary study will be conducted. Data will be collected from a large number of foreign language learners (i.e. English, Arabic, Mandarin, Cantonese, Spanish, Norwegian, French and Japanese) through a pre-test and a post-test designed by the researcher, and analyzed quantitatively and qualitatively. The pre-test, which will assess the learners' idiomatic cognition before the treatment, will be followed by a treatment (i.e. learners will be asked to learn idioms using the program) and a post-test to check the program's effectiveness by determining its impact on their idiomatic knowledge.
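To make the text-classification step concrete, here is a minimal sketch of one of the three classifiers named above, Naive Bayes, labeling a sentence as a literal or figurative use of an idiom. It is an illustration under stated assumptions, not the project's implementation: the example sentences are invented, the function names (tokenize, train_naive_bayes, predict, accuracy) are hypothetical, and the bag-of-words features with add-one smoothing are a simplification chosen for the example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data, invented for illustration only (not from the study).
TRAIN = [
    ("he kicked the bucket last winter after a long illness", "figurative"),
    ("she spilled the beans about the surprise party", "figurative"),
    ("we broke the ice with a joke at the meeting", "figurative"),
    ("he kicked the bucket across the barn floor", "literal"),
    ("she spilled the beans on the kitchen counter", "literal"),
    ("we broke the ice on the pond with a hammer", "literal"),
]

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()

def train_naive_bayes(examples):
    """Count labels and per-label word frequencies for multinomial Naive Bayes."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab

def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Log prior for the label.
        score = math.log(label_counts[label] / total)
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the given label."""
    correct = sum(predict(model, text) == label for text, label in examples)
    return correct / len(examples)

model = train_naive_bayes(TRAIN)
print(predict(model, "she spilled the beans on the floor"))
print(predict(model, "he kicked the bucket after a long illness"))
```

A meaningful accuracy comparison would measure accuracy on held-out sentences rather than on the training data, and would run the same split through Logistic Regression and SVM as well, which this sketch does not attempt.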
Role: Collecting data throughout the research process and presenting at a mini-conference
Qualifications: Interested in language and technology
Hours: to be negotiated
Off-Campus Research Site: Via Zoom