Workshop
Multimodal Intelligence: Advancing Beyond Text with Generative AI
Tuesday 26 August 15.00
Organizer: Yanlan Mao, Aalborg University
The integration of multiple modalities (text, vision, audio, etc.) represents one of the most exciting frontiers in artificial intelligence research. As generative AI continues to advance rapidly, there is growing interest in creating systems that can perceive, understand, and respond to the rich multimodal nature of human communication. Despite breakthroughs in Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs), significant challenges remain in creating truly seamless multimodal interactions that approach human-like understanding and responsiveness.
These challenges include cross-modal alignment, contextual grounding, maintaining coherence across modalities, and addressing ethical considerations such as bias and hallucinations in multimodal outputs. This session brings together researchers and practitioners from across Danish institutions to discuss the latest advancements and persistent challenges in multimodal AI, fostering cross-institutional connections and potential research collaborations.
Presentations (30 minutes)
Multimodality in face-to-face communication from a language technology point of view
Costanza Navarretta, Associate Professor, Department of Nordic Studies and Linguistics, University of Copenhagen
Conversational AI for industrial robots in manufacturing settings
Chen Li, Associate Professor, Department of Materials and Production, Aalborg University
Facilitated discussion and activities (60 minutes)
Group activity
Participants will be divided into small interdisciplinary groups to share their experience and discuss specific challenges of multimodal AI in their domains. (Note: To support the hands-on activities, we kindly request that D3A provide paper, pencils, and sticky notes.)
Group presentations
Each group will present its ideas and insights and identify common themes and potential opportunities for collaboration.
Discussion round
Summarizing key insights from the session and discussing potential follow-up activities and collaborations.
Intermediate: For attendees who have a basic understanding of or some experience with the subject but are not yet advanced.