Thu 11 Jun, 2:00 pm - 2:15 pm

Emergent Trends in Large Multimodal Models

This talk highlights three emerging trends in Large Multimodal Models: encoder-free unified architectures, spatial intelligence, and video reasoning. Together, these developments are reshaping how multimodal systems perceive, understand, and reason about the world.

15 min

WEKA Stage

Speakers

Zhongang Cai

Principal Research Scientist

SenseTime

LLMs (Large Language Models)
