Tuesday, February 21, 2023 12:00pm to 1:00pm
About this Event
201 Wilson Commons, Rochester, NY 14611
Join the Goergen Institute for Data Science for "Principled Frameworks for Designing Deep Learning Models: Efficiency, Robustness, and Expressivity," a talk by Tan Nguyen, postdoctoral scholar in the Department of Mathematics at UCLA. Lunch will be provided to attendees.
Abstract: Designing deep learning models for practical applications, including those in computer vision, natural language processing, and mathematical modeling, is an art that often involves an expensive search over candidate architectures. In this talk, I present novel frameworks to facilitate the process of designing efficient and robust deep learning models with better expressivity via three principled approaches: optimization, differential equation, and statistical modeling.
From an optimization viewpoint, I leverage the continuous limit of classical momentum-accelerated gradient descent to improve Neural ODE training and inference. The resulting Momentum Neural ODEs accelerate both the forward and backward ODE solvers and alleviate the vanishing gradient problem (Efficiency).
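The continuous limit of heavy-ball momentum turns a first-order ODE dx/dt = f(x) into a damped second-order system. A minimal sketch of that idea with a toy vector field and a forward-Euler step (the function names, the damping constant, and the vector field are illustrative, not taken from the talk):

```python
import numpy as np

def f(x):
    # Toy vector field standing in for a learned network layer
    return -x

def momentum_ode_step(x, v, dt, gamma=1.0):
    # Heavy-ball continuous limit of momentum gradient descent:
    #   dx/dt = v,  dv/dt = -gamma * v + f(x)
    # The auxiliary velocity v is what accelerates the dynamics.
    v_new = v + dt * (-gamma * v + f(x))
    x_new = x + dt * v
    return x_new, v_new

# Integrate the damped system from x(0)=1, v(0)=0
x, v = np.array([1.0]), np.array([0.0])
for _ in range(100):
    x, v = momentum_ode_step(x, v, dt=0.05)
```

With this toy field the trajectory is a damped oscillation decaying toward zero; the actual Momentum Neural ODE replaces `f` with a trained network and uses a proper ODE solver rather than plain Euler.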
From a differential equation approach, I present a random walk interpretation of graph neural networks (GNNs), revealing a potentially inevitable over-smoothing phenomenon. Based on this random walk viewpoint of GNNs, I then propose the graph neural diffusion with a source term (GRAND++) that overcomes the over-smoothing issue and achieves better accuracy in low-labeling-rate regimes (Robustness).
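Over-smoothing can be seen on a tiny graph: plain diffusion drives all node features toward a common value, while a source term anchored at the initial features prevents the collapse. A minimal numerical sketch (the 3-node graph, step size, and source weight are illustrative assumptions, not the GRAND++ formulation from the paper):

```python
import numpy as np

# Toy 3-node path graph with row-normalized adjacency
A = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
X0 = np.array([[1.0], [0.0], [-1.0]])  # initial node features, reused as the source

def diffusion_step(X, dt, beta):
    # Graph diffusion with a source term (illustrative form):
    #   dX/dt = (A - I) X + beta * X0
    # Without the source (beta = 0) the features over-smooth.
    return X + dt * ((A - np.eye(3)) @ X + beta * X0)

X_plain, X_src = X0.copy(), X0.copy()
for _ in range(500):
    X_plain = diffusion_step(X_plain, dt=0.1, beta=0.0)  # collapses toward a constant
    X_src = diffusion_step(X_src, dt=0.1, beta=0.5)      # retains node-level contrast
```

After many steps the spread of `X_plain` across nodes is essentially zero (over-smoothing), while `X_src` keeps distinct per-node features because the source term continually re-injects the initial signal.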
Using statistical modeling as a tool, I show that the attention in transformer models can be derived from solving a nonparametric kernel regression problem. I then propose the FourierFormer, a new class of transformers in which the softmax kernels are replaced by novel generalized Fourier integral kernels. The generalized Fourier integral kernels can automatically capture dependencies among the features of the data and remove the need to tune a covariance matrix (Expressivity).
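The kernel-regression view says each attention output is a Nadaraya-Watson estimator: a kernel-weighted average of the values. With an exponential kernel exp(q·k), this normalized average recovers standard softmax attention. A minimal sketch (function names and shapes are illustrative; the generalized Fourier integral kernel of FourierFormer is not reproduced here):

```python
import numpy as np

def kernel_attention(Q, K, V, kernel):
    # Nadaraya-Watson nonparametric regression view of attention:
    # each output row is a kernel-weighted average of the value rows.
    W = kernel(Q, K)                       # (n_q, n_k) unnormalized weights
    W = W / W.sum(axis=-1, keepdims=True)  # rows sum to 1
    return W @ V

def softmax_kernel(Q, K):
    # exp(q . k) -- normalizing its rows yields standard softmax attention
    return np.exp(Q @ K.T)

rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))  # 2 queries
K = rng.normal(size=(3, 4))  # 3 keys
V = rng.normal(size=(3, 4))  # 3 values
out = kernel_attention(Q, K, V, softmax_kernel)
```

Because the normalized weights are positive and sum to one, every output coordinate is a convex combination of the corresponding value coordinates; swapping `softmax_kernel` for a different kernel changes the estimator without changing this structure.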
Bio: Dr. Tan Nguyen is currently a postdoctoral scholar in the Department of Mathematics at the University of California, Los Angeles, working with Dr. Stanley J. Osher. He obtained his Ph.D. in Machine Learning from Rice University, where he was advised by Dr. Richard G. Baraniuk. Dr. Nguyen was an organizer of the 1st Workshop on Integration of Deep Neural Models and Differential Equations at ICLR 2020. He has also completed long-term internships with Amazon AI and NVIDIA Research. He is the recipient of the prestigious Computing Innovation Postdoctoral Fellowship (CIFellows) from the Computing Research Association (CRA), the NSF Graduate Research Fellowship, and the IGERT Neuroengineering Traineeship. He received his M.S. and B.S. in Electrical and Computer Engineering from Rice University in May 2018 and May 2014, respectively.
This seminar is part of the tenure-track Assistant Professor in Data Science faculty search led by the Goergen Institute for Data Science.