Yichao Cai

Towards bridging language & world representations

Adelaide, Australia

yichao.cai@adelaide.edu.au

I am a third-year Ph.D. student in Computer Science at the Australian Institute for Machine Learning (AIML), Adelaide University (formerly The University of Adelaide), advised by Prof. Javen Qinfeng Shi. Before my Ph.D., I received my M.Sc. and B.Eng. degrees from Wuhan University of Technology. During my M.Sc., I spent five months as a visiting student researcher at California PATH, UC Berkeley.

I study how language supervision shapes the semantics, geometry, and identifiability of multimodal representations. My current research interests span:

  • representation learning (learning objectives and training paradigms, identifiability theory, semantic structure in learned representations);
  • vision-language modeling (multimodal alignment, multimodal LLMs, supervision design and data curation);
  • explainable machine learning (mechanistic interpretability, representation geometry, latent-structure characterization).

News

May 01, 2026 Three of our papers on representation learning (contrastive learning theory, AI4Science, and graphical modeling) were accepted to ICML 2026.
Feb 10, 2026 I attended MLSS Melbourne 2026 and enjoyed learning from world-class speakers and connecting with the community.
Jan 28, 2026 Check out our new preprint: The Geometric Mechanics of Contrastive Representation Learning.
Oct 15, 2025 I served as a guest lecturer in Statistical Machine Learning and presented recent advances in vision-language modeling. Slides.
Sep 19, 2025 Our work On the Value of Cross-Modal Misalignment in Multimodal Representation Learning was selected as a Spotlight at NeurIPS 2025.

Selected Publications

  1. ICML’26
    The Geometric Mechanics of Contrastive Representation Learning: Alignment Potentials, Entropic Dispersion, and Cross-Modal Divergence
    Yichao Cai, Zhen Zhang, Yuhang Liu, and 1 more author
    In International Conference on Machine Learning (ICML), 2026
  2. ICLR’26
    I Predict Therefore I Am: Is Next Token Prediction Enough to Learn Human-Interpretable Concepts from Data?
    Yuhang Liu, Dong Gong, Yichao Cai, and 6 more authors
    In International Conference on Learning Representations (ICLR), 2026
  3. NeurIPS’25
    On the Value of Cross-Modal Misalignment in Multimodal Representation Learning
    Yichao Cai, Yuhang Liu, Erdun Gao, and 4 more authors
    In Advances in Neural Information Processing Systems (NeurIPS), 2025 (Spotlight)
  4. ECCV’24
    CLAP: Isolating Content from Style through Contrastive Learning with Augmented Prompts
    Yichao Cai, Yuhang Liu, Zhen Zhang, and 1 more author
    In European Conference on Computer Vision (ECCV), 2024