Deep Dive workshop
Big Scandinavian Data and LLMs
Wednesday 27 August 9.00
Organizer: Johannes Bjerva, Aalborg University
Intro including instructions for the mingling activity (5 min)
Keynote (20 min + 10 min Q&A)
Ditte Laursen, Ph.D., Royal Danish Library
Keynote (20 min + 10 min Q&A)
Danila Petrelli, Senior Data Lead, AI Sweden
Breakout groups with structured discussions (25 min)
Break (15 min)
Panel (25 min)
Ditte Laursen and Danila Petrelli. Moderator: Stella Frank
Keynote (20 min + 10 min Q&A)
Heather Lent, Ph.D., Aalborg University
Mingling activity (30 min)
Find your note-of-interest match
Wrap-up (5 min)
The mingling activity aims to encourage attendees to talk to each other and meet new people. Before the break, participants will write a brief Note of Interest (NoI) stating their main area of interest; the notes are collected during the break. In the mingling activity, each participant receives a random NoI and is tasked with finding both the person who wrote it and the person holding their own NoI.
The role of libraries in the language modelling revolution
Ditte Laursen, Ph.D., Senior Researcher, Royal Danish Library
Areas of expertise include collection management, IT governance, and research & development. Special interests include digital cultural heritage, digital humanities, and digital research infrastructures, including AI’s use of digital collections as data.
What’s in the Data? Lessons from Building Scandinavian LLM Foundations
Danila Petrelli, Senior Data Lead, AI Sweden
Danila works at AI Sweden, where, as part of the Natural Language Understanding team, she leads work on data governance and dataset development for large language models. She’s especially interested in public sector use cases, multilingual data, and the challenges of building European LLMs with open and legally sound data. Her recent work includes the TrustLLM project, national data infrastructure efforts, and initiatives to connect legal, technical, and societal perspectives on data.
Ethics in Multilingual NLP Security // How NLP Security Affects Scandinavian Languages
Heather Lent, Ph.D., Aalborg University
Heather is a postdoc at Aalborg University (AAU) working on security in LLMs. She has a Ph.D. from the University of Copenhagen. Her research interests include multilingual NLP, with a focus on small and underserved languages.
The target audience is Ph.D. students, postdocs, faculty, and industry practitioners. We expect attendance from those with an interest in NLP, LLMs, textual data, and how current developments affect society (e.g. cultural effects, security challenges). We have no preference regarding maximum audience size. However, we strongly prefer a room where tables are set up in groups for the whole workshop, to facilitate our breakout group discussions. This might restrict the maximum number of participants.
Intermediate
Organizers