Hanna-Mari Kupari
Doctoral Researcher, Digital Language Studies, Chinese, French, German, Italian, Spanish
Master of Arts (filosofian maisteri)
Medieval Latin with corpus linguistics methods

Contact

Arcanuminkuja 1
20500
Turku

Areas of expertise

Medieval Latin
corpus linguistics
TEI XML
automatic morpho-syntactic parsing

Biography

I am a doctoral researcher in digital language studies at the University of Turku, funded by the Emil Aaltonen Foundation. In my work I combine medieval data with the state of the art in machine learning. I have a Master's degree in Classical Philology with a major in Latin. I am particularly interested in the study of grammar, quantitative methods and aspects of local history.

I am interested in science communication and have worked as an associate editor of the online journal Hiiskuttua.

For some years now I have been an active member of the Tohtoriverkosto society.

Teaching

University of Tartu, Estonia

A practical workshop on automatic morpho-syntactic annotation of large language corpora using the Universal Dependencies framework, spring 2024. A five-session practical workshop for PhD students and staff on automatic parsing. Topics covered: theory, terminology, parsing tools, building your own treebank in practice.

Course GitHub:

https://github.com/HannaKoo/ParsersTartu

https://maailmakeeled.ut.ee/en/content/multi-day-practical-workshop-automatic-morpho-syntactic-annotation-coming

Digital resources course at the University of Tartu. Treebanks and automatic linguistic annotation for Classical Languages, spring 2024. One lecture for undergraduate students.

University of Turku, Finland

Digital Interaction Lecture Course, spring 2024. Using computer-assisted methods for parsing grammar. One lecture.

Corpus Linguistics and Language Technology for undergraduates, fall 2023. Five lectures. Topics covered: student project, ethics and large language models, named-entity recognition, sentiment analysis, automatic morpho-syntactic parsing, representing language as vectors, and supervised and unsupervised machine learning.

Linguistic landscapes course for undergraduates, spring 2023. One lecture on 2023-03-15 with Professor Marko Lamberg: "Historiallisten kirjallisten lähteiden näkökulmia kielimaisemiin Turussa" (Perspectives from historical written sources on linguistic landscapes in Turku).


Research

Modern methods for medieval texts

In my digital humanities doctoral dissertation I am researching the medieval Apostolic Penitentiary documents and the Registrum Ecclesiae Aboensis copybook with corpus linguistics methods. I explore language use and linguistic variation (i.e. register analysis) in Medieval Latin with metadata-enriched and morpho-syntactically annotated corpora. My work promotes open-access research, and I publish all my code, data and results along with my publications.

Member of the TurkuNLP and TUCEMEMS research groups.

Grants

My work is made possible by an Emil Aaltonen Foundation grant (2022 to 2024), a Turku University Foundation travel grant (2023), University of Turku research grants (2021 and 2022), a Finnish Cultural Foundation Varsinais-Suomi Regional Fund grant (2021), and an Uskelan opintorahastosäätiö grant (2020). I have also received Turku University Foundation Villa Tammekann grants (Tartu, Estonia) in 2023 and 2024. I spent January 2024 at the Finnish Institute in Rome, where I worked on my PhD and visited the penitentiary archive and libraries.

Publications


Avoin tiede ja tutkimusinfra (2024)

Hiiskuttua: Turun yliopiston humanistisen tiedekunnan verkkolehti
Kupari, Hanna-Mari; Leinonen, Päivi
(D1 Article in a trade journal)

FinGPT: Large Generative Models for a Small Language (2023)

Conference on Empirical Methods in Natural Language Processing
Luukkonen Risto, Komulainen Ville, Luoma Jouni, Eskelinen Anni, Kanerva Jenna, Kupari Hanna-Mari, Ginter Filip, Laippala Veronika, Muennighoff Niklas, Piktus Aleksandra, Wang Thomas, Tazi Nouamane, Scao Le Teven, Wolf Thomas, Suominen Osma, Sairanen Samuli, Merioksa Mikko, Heinonen Jyrki, Vahtola Aija, Antao Samuel, Pyysalo Sampo
(Peer-reviewed article in conference proceedings (A4))