Principles of Large-Scale Foundation Models

Abstract: Large-scale foundation models (e.g., GPT) have reached unprecedented levels of performance by relying on two emerging learning mechanisms: massive self-supervised learning (SSL) and flexible test-time learning (TTL). Yet these paradigms remain poorly understood, which leads to considerable trial and error in practice. In this talk, I demonstrate how identifying the underlying principles of SSL and TTL can inform more effective and reliable model design. First, I introduce a unifying graph-theoretic framework that characterizes the generalization of both discriminative and generative SSL models. This framework explains how seemingly disparate approaches converge on meaningful semantic representations without labels, and it points to practical strategies for improving model efficiency, robustness, and interpretability. Second, I investigate how test-time learning works in language models, particularly the long, reflective reasoning process seen in models such as o1 and R1, and provide both theoretical insights and scalable designs to enhance model capabilities and safety. By bridging rigorous theoretical understanding with practical algorithmic solutions, this research offers a cohesive path for building more principled, interpretable, and trustworthy foundation models.

Bio: Yifei Wang is a postdoctoral associate at MIT CSAIL, where he works with Professor Stefanie Jegelka. His research focuses on the theoretical and algorithmic foundations of self-supervised learning, foundation models, and AI safety. This work has earned three best paper awards: the Best ML Paper Award at ECML-PKDD 2021, the Silver Best Paper Award at the ICML 2021 AML Workshop, and the Best Paper Award at the ICML 2024 ICL Workshop. It has also been featured by Anthropic and MIT for its contributions to self-learning and AI safety mechanisms. Before joining MIT, he earned a PhD in Applied Mathematics, a BS in Data Science, and a BA in Philosophy from Peking University.
