Workshop

Multimodal Intelligence: Advancing Beyond Text with Generative AI

Tuesday 26 August 15.00

Organizer: Yanlan Mao, Aalborg University

The integration of multiple modalities (text, vision, audio, etc.) represents one of the most exciting frontiers in artificial intelligence research. As generative AI continues to advance rapidly, there is growing interest in creating systems that can perceive, understand, and respond to the rich multimodal nature of human communication. Despite breakthroughs in Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs), significant challenges remain in creating truly seamless multimodal interactions that approach human-like understanding and responsiveness.

These challenges include cross-modal alignment, contextual grounding, maintaining coherence across modalities, and addressing ethical considerations such as bias and hallucinations in multimodal outputs. This session brings together researchers and practitioners from across Danish institutions to discuss the latest advancements and persistent challenges in multimodal AI, fostering cross-institutional connections and potential research collaborations.

Program

Presentations (30 minutes)

Multimodality in face-to-face communication from a language technology point of view
Costanza Navarretta, Associate Professor, Department of Nordic Studies and Linguistics, University of Copenhagen

Conversational AI for industrial robots in manufacturing settings
Chen Li, Associate Professor, Department of Materials and Production, Aalborg University

Facilitated discussion and activities (60 minutes)

Group activity
Participants will be divided into small interdisciplinary groups to share their experience and discuss specific challenges in multimodal AI in their domains. (Note: To support the hands-on activities, we kindly request that the D3A provide paper, pencils, and sticky notes.)

Group presentations
Each group will present its ideas and insights and identify common themes and potential opportunities for collaboration.

Discussion round
Summarizing key insights from the session and discussing potential follow-up activities and collaborations.

Level

Intermediate: For attendees who have a basic understanding of or some experience with the subject but are not yet advanced.