Qing Qu (UMich): Harnessing Low-Dimensionality for Generalizable and Scientific Generative AI
Abstract: The empirical success of modern generative AI, from diffusion models to Large Language Models (LLMs), often outpaces our classical understanding of how machine learning models generalize from finite data and under out-of-distribution (OOD) shifts. This talk introduces a unified mathematical framework that identifies intrinsic low-dimensional structure as the primary driver of generalization and a critical lever for advancing scientific AI. First, we deconstruct the generalization mechanism of diffusion models, revealing a training transition from memorization to generalization that effectively breaks the curse of dimensionality. Using a mixture of low-rank Gaussian models, we demonstrate that sample complexity scales linearly with the intrinsic dimension rather than exponentially with the ambient dimension, by establishing a formal equivalence with the canonical subspace clustering problem. Moreover, by examining nonlinearity in two-layer denoising autoencoders, we uncover how the learned weight structures differ between memorization and generalization. This distinction provides a unified understanding of how models learn representations and how they generate new data. Second, we characterize the OOD generalization of in-context learning (ICL) in transformers. For linear regression tasks whose task vectors lie in low-dimensional subspaces, we show that OOD capabilities emerge from interpolating across the training task subspaces, and we derive precise conditions under which linear attention models interpolate across distribution shifts, highlighting task diversity as a prerequisite for effective ICL. Finally, we translate these theoretical insights into practical guidelines for controlled generation, for ensuring model safety and privacy, and for solving high-dimensional inverse problems in science and engineering.
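For attendees who want a concrete feel for the memorization-to-generalization contrast described above, the short NumPy sketch below is purely illustrative and is not the speaker's code or method; the toy setup, parameter names, and dimensions are assumptions. It draws data from a mixture of low-rank Gaussian models and compares the "memorizing" denoiser (the optimal denoiser for the empirical distribution on the training set) against the "generalizing" population posterior-mean denoiser, printing how their gap shrinks as the number of training samples grows.

import numpy as np

rng = np.random.default_rng(0)

D, d, K = 64, 4, 3      # ambient dimension, intrinsic rank per component, number of components
sigma = 0.5             # noise level of the denoising task

# K low-rank Gaussian components N(mu_k, U_k U_k^T), each with a rank-d covariance.
mus = [rng.normal(size=D) for _ in range(K)]
Us = [rng.normal(size=(D, d)) / np.sqrt(d) for _ in range(K)]

def sample(n):
    """Draw n points from the equal-weight mixture of low-rank Gaussians."""
    ks = rng.integers(K, size=n)
    return np.stack([mus[k] + Us[k] @ rng.normal(size=d) for k in ks])

def population_denoiser(y):
    """Posterior mean E[x | y] under the true mixture, for y = x + sigma * noise."""
    log_w, means = [], []
    for mu, U in zip(mus, Us):
        C = U @ U.T + sigma**2 * np.eye(D)       # covariance of y given this component
        Cinv = np.linalg.inv(C)
        r = y - mu
        log_w.append(-0.5 * (r @ Cinv @ r) - 0.5 * np.linalg.slogdet(C)[1])
        means.append(mu + (U @ U.T) @ Cinv @ r)  # per-component posterior mean
    w = np.exp(np.array(log_w) - max(log_w))
    w /= w.sum()
    return w @ np.stack(means)

def empirical_denoiser(y, X):
    """Optimal denoiser if the data law were the empirical measure on training set X:
    a softmax average over training points -- the 'memorizing' solution."""
    logits = -np.sum((X - y) ** 2, axis=1) / (2 * sigma**2)
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return w @ X

# Gap between the memorizing and generalizing denoisers as the training set grows.
y_test = sample(200) + sigma * rng.normal(size=(200, D))
pop = np.stack([population_denoiser(y) for y in y_test])
for n in [10, 100, 1000, 10000]:
    X = sample(n)
    emp = np.stack([empirical_denoiser(y, X) for y in y_test])
    print(f"n = {n:6d}   mean gap to population denoiser = "
          f"{np.linalg.norm(emp - pop, axis=1).mean():.3f}")

Under the assumptions of this toy setup, one would expect the sample size needed to close the gap to track the intrinsic rank d rather than the ambient dimension D (for example, by varying d while holding D fixed), which is the scaling behavior highlighted in the abstract.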
Speakers
Qing Qu
Qing Qu is an Assistant Professor in EECS at the University of Michigan. He works at the intersection of the foundations of machine learning, numerical optimization, and signal/image processing, with a current focus on the theory of deep generative models and representation learning. Prior to joining Michigan in 2021, he was a Moore–Sloan Data Science Fellow at the Center for Data Science, New York University (2018–2020). He received his Ph.D. in Electrical Engineering from Columbia University in October 2018 and his B.Eng. in Electrical and Computer Engineering from Tsinghua University in July 2011. His work has been recognized with multiple honors, including the Best Student Paper Award at SPARS 2015, a Microsoft PhD Fellowship in Machine Learning (2016), the Best Paper Award at the NeurIPS Diffusion Models Workshop (2023), an NSF CAREER Award (2022), an Amazon Research Award (AWS AI, 2023), a UM CHS Junior Faculty Award (2025), a Google Research Scholar Award (2025), and the 1938E Award from Michigan Engineering. He has led and delivered multiple tutorials at ICASSP, CPAL, CVPR, ICCV, and ICML. He was a founding organizer and Program Chair of the Conference on Parsimony & Learning (CPAL), regularly serves as an Area Chair for NeurIPS, ICML, and ICLR and as a Senior Area Chair for ICASSP 2026, and is an Action Editor for TMLR.