
Veronika Laippala
Professor, Digital Language Studies, Chinese, French, German, Italian, Spanish
Areas of expertise
Computational linguistics
Text linguistics
Corpus linguistics
Digital discourse analysis
Biography
I am a linguist who likes computers. My main research topics include language variation across different communicative situations and the development of automatic tools for making better use of large, web-crawled corpora.
My ongoing projects include "A piece of news, an opinion or something else? Different texts and their detection from the multilingual Internet", funded by the Emil Aaltonen Foundation, and "Massively multilingual modeling of registers in web-scale data", funded by the Academy of Finland.
For more information, please have a look at our lab website at https://turkunlp.github.io/
Publications
Perspectives on Forests and Forestry in Finnish Online Discussions - A Topic Modeling Approach to Suomi24 (2025)
NLP4Ecology
(A4 Peer-reviewed article in conference proceedings)
From keywords to key embeddings – contrasting French and Swedish web registers using multilingual deep learning (2025)
Corpus Linguistics and Linguistic Theory
(A1 Peer-reviewed original article in a scientific journal)
Building the Penitentiary Document Corpus (PeDoCo) for NLP: Balancing Data Complexity and Uniform Data Structure (2025)
Digital Humanities in the Nordic and Baltic Countries Conference (DHNB 2025), Digital Humanities in the Nordic and Baltic Countries Publications
(A4 Peer-reviewed article in conference proceedings)
Building Question-Answer Data Using Web Register Identification (2024)
Language Resources and Evaluation, LREC Proceedings
(A4 Peer-reviewed article in conference proceedings)
Introduction (2024)
(A3 Peer-reviewed article in an edited volume)
Intersecting Register and Genre: Understanding the Contents of Web-Crawled Corpora (2024)
4th International Conference on Natural Language Processing for Digital Humanities
(A4 Peer-reviewed article in conference proceedings)
Linguistics across Disciplinary Borders : the March of Data (2024)
(C2 Editorial work for a scientific edited volume)
Linguistic variation beyond the Indo-European web: Analyzing Turkish web registers in TurCORE (2024)
Register Studies
(A1 Peer-reviewed original article in a scientific journal)
Automated Emotion Annotation of Finnish Parliamentary Speeches Using GPT-4 (2024)
ParlaCLARIN Workshop, LREC Proceedings
(A4 Peer-reviewed article in conference proceedings)