The International Conference on Machine Learning (ICML) 2023 is being hosted July 23rd - 29th. We’re excited to share all the work from SAIL that’s being presented, and you’ll find links to papers, videos and blogs below. Feel free to reach out to the contact authors directly to learn more about the work that’s happening at Stanford!
List of Accepted Papers
Main Conference
Towards Learning Geometry Eigen-Lengths Crucial for Fitting Tasks
Authors: Yijia Weng, Kaichun Mo, Ruoxi Shi, Yanchao Yang, Leonidas Guibas
Contact: yijiaw@stanford.edu
Keywords: geometric learning, eigen-length
Data Feedback Loops: Model-driven Amplification of Dataset Biases
Authors: Rohan Taori, Tatsunori B. Hashimoto
Contact: rtaori@stanford.edu
Award nominations: Oral
Links: Paper | Website
Keywords: feedback loops, bias amplification, deep learning, self-supervised learning, cv, nlp
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
Authors: Zichang Liu, Jue Wang, Tri Dao, Tianyi Zhou, Binhang Yuan, Zhao Song, Anshumali Shrivastava, Ce Zhang, Yuandong Tian, Christopher Re, Beidi Chen
Contact: beidic@andrew.cmu.edu
Links: Paper
Keywords: large language models, efficient inference
DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature
Authors: Eric Mitchell, Yoonho Lee, Alexander Khazatsky, Christopher D Manning, Chelsea Finn
Contact: eric.mitchell@cs.stanford.edu
Award nominations: Oral
Links: Paper | Website
Keywords: detection, zero-shot, language generation, curvature, deepfake
Discover and Cure: Concept-aware Mitigation of Spurious Correlation
Authors: Shirley Wu, Mert Yuksekgonul, Linjun Zhang, James Zou
Contact: shirwu@cs.stanford.edu
Links: Paper
Keywords: spurious correlation, generalization, interpretability, concept
Long Horizon Temperature Scaling
Authors: Andy Shih, Dorsa Sadigh, Stefano Ermon
Contact: andyshih@stanford.edu
Links: Paper | Website
Keywords: temperature scaling, long horizon, nonmyopic, autoregressive models, diffusion models, inference, tractability
Emergence of Sparse Representations from Noise
Authors: Trenton Bricken, Rylan Schaeffer, Bruno Olshausen, Gabriel Kreiman
Contact: trenton-bricken@g.harvard.edu
Links: Paper
Keywords: sparsity, neural networks, neuroscience
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
Authors: Ying Sheng, Lianmin Zheng, Binhang Yuan, Zhuohan Li, Max Ryabinin, Daniel Y. Fu, Zhiqiang Xie, Beidi Chen, Clark Barrett, Joseph E. Gonzalez, Percy Liang, Christopher Ré, Ion Stoica, Ce Zhang
Contact: ying1123@stanford.edu
Links: Paper | Website
Keywords: large language models, memory optimizations, offloading, compression, generative pre-trained transformers
Generating Language Corrections for Teaching Physical Control Tasks
Authors: Megha Srivastava, Noah Goodman, Dorsa Sadigh
Contact: meghas@stanford.edu
Links: Paper
Keywords: education, language, human-ai interaction
Hyena Hierarchy: Towards Larger Convolutional Language Models
Authors: Michael Poli, Stefano Massaroli, Eric Nguyen, Daniel Y. Fu, Tri Dao, Stephen Baccus, Yoshua Bengio, Stefano Ermon, Christopher Ré
Contact: poli@stanford.edu
Award nominations: Oral
Links: Paper | Blog Post
Keywords: long context, long convolution, large language models
Learning in POMDPs is Sample-Efficient with Hindsight Observability
Authors: Jonathan N. Lee, Alekh Agarwal, Christoph Dann, Tong Zhang
Contact: jnl@stanford.edu
Links: Paper
Keywords: reinforcement learning, partial observability, hindsight observability, pomdp, homdp, regret, learning theory, theory, sample complexity
Causal Proxy Models For Concept-Based Model Explanations
Authors: Zhengxuan Wu, Karel D’Oosterlinck, Atticus Geiger, Amir Zur, Christopher Potts
Contact: kldooste@stanford.edu
Links: Paper | Website
Keywords: explainability, causality, concept-based explanations, causal explanations
Modeling Dynamic Environments with Scene Graph Memory
Authors: Andrey Kurenkov, Michael Lingelbach, Tanmay Agarwal, Emily Jin, Chengshu Li, Ruohan Zhang, Li Fei-Fei, Jiajun Wu, Silvio Savarese, Roberto Martín-Martín
Contact: andreyk@stanford.edu
Links: Paper
Keywords: graph neural network, embodied ai, link prediction
Motion Question Answering via Modular Motion Programs
Authors: Mark Endo*, Joy Hsu*, Jiaman Li, Jiajun Wu
Contact: markendo@stanford.edu
Links: Paper | Website
Keywords: question answering, human motion understanding, neuro-symbolic learning
One-sided matrix completion from two observations per row
Authors: Steven Cao, Percy Liang, Gregory Valiant
Contact: shcao@stanford.edu
Links: Paper
Keywords: machine learning, icml, matrix completion, subspace estimation, high-dimensional statistics, random matrix theory
Optimal Sets and Solution Paths of ReLU Networks
Authors: Aaron Mishkin, Mert Pilanci
Contact: mishkin@stanford.edu
Links: Paper
Keywords: convex optimization, relu networks, regularization path
Out-of-Domain Robustness via Targeted Augmentations
Authors: Irena Gao*, Shiori Sagawa*, Pang Wei Koh, Tatsunori Hashimoto, Percy Liang
Contact: irena@cs.stanford.edu, ssagawa@cs.stanford.edu
Links: Paper
Keywords: robustness, data augmentation
Geometric Latent Diffusion Models for 3D Molecule Generation
Authors: Minkai Xu, Alexander Powers, Ron Dror, Stefano Ermon, Jure Leskovec
Contact: minkai@cs.stanford.edu
Links: Paper | Website
Keywords: generative models, geometric representation learning, drug discovery
Reflected Diffusion Models
Authors: Aaron Lou, Stefano Ermon
Contact: aaronlou@stanford.edu
Links: Paper | Blog Post
Keywords: diffusion models
Sequence Modeling with Multiresolution Convolutional Memory
Authors: Jiaxin Shi, Ke Alexander Wang, Emily B. Fox
Contact: ishijiaxin@gmail.com
Links: Paper | Website
Keywords: long-range, sequence modeling, convolution, multiresolution, wavelets, parameter-efficient
Simple Embodied Language Learning as a Byproduct of Meta-Reinforcement Learning
Authors: Evan Zheran Liu, Sahaana Suri, Tong Mu, Allan Zhou, Chelsea Finn
Contact: evanliu@cs.stanford.edu
Links: Paper
Keywords: meta-reinforcement learning, language learning, reinforcement learning
Simple Hardware-Efficient Long Convolutions for Sequence Modeling
Authors: Daniel Y. Fu, Elliot L. Epstein, Eric Nguyen, Armin W. Thomas, Michael Zhang, Tri Dao, Atri Rudra, Christopher Ré
Contact: danfu@cs.stanford.edu
Links: Paper | Blog Post
Keywords: convolutions, sequence modeling
VIMA: General Robot Manipulation with Multimodal Prompts
Authors: Yunfan Jiang, Agrim Gupta, Zichen Zhang, Guanzhi Wang, Yongqiang Dou, Yanjun Chen, Li Fei-Fei, Anima Anandkumar, Yuke Zhu, Linxi Fan
Contact: yunfanj@cs.stanford.edu
Links: Paper | Website
Keywords: robot learning, foundation model, transformer, multi-task learning
Workshops
MultiLegalPile: A 689GB Multilingual Legal Corpus
Authors: Joel Niklaus, Veton Matoshi, Matthias Stürmer, Ilias Chalkidis, Daniel E. Ho
Contact: jniklaus@stanford.edu
Workshop: DMLR
Links: Paper
Keywords: legal, law, corpus, language models, pretraining, large
Leveraging Side Information for Communication-Efficient Federated Learning
Authors: Berivan Isik, Francesco Pase, Deniz Gunduz, Sanmi Koyejo, Tsachy Weissman, Michele Zorzi
Contact: berivan0@stanford.edu
Workshop: Workshop on Federated Learning and Analytics
Links: Paper
Keywords: federated learning, compression, importance sampling
Lexinvariant Language Models
Authors: Qian Huang, Eric Zelikman, Sarah Li Chen, Yuhuai Wu, Gregory Valiant, Percy Liang
Contact: qhwang@stanford.edu
Workshop: ICML 2023 Workshop on Structured Probabilistic Inference & Generative Modeling
Links: Paper
Keywords: large language model, in-context learning, pretraining
Is Pre-training Truly Better Than Meta-Learning?
Authors: Brando Miranda, Patrick Yu, Saumya Goyal, Yu-Xiong Wang, Sanmi Koyejo
Contact: brando9@stanford.edu
Workshop: ICML data centric workshop
Links: Paper
Keywords: meta-learning, general intelligence, machine learning, llm, pre-training, few-shot learning
Layer-Wise Feedback Alignment is Conserved in Deep Neural Networks
Authors: Zachary Robertson, Oluwasanmi Koyejo
Contact: zroberts@stanford.edu
Workshop: Localized Learning Workshop
Links: Paper
Keywords: deep learning, bio-plausible, implicit bias
H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
Authors: Zhenyu Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, Ruisi Cai, Zhao Song, Yuandong Tian, Christopher Ré, Clark Barrett, Zhangyang Wang, Beidi Chen
Contact: zhenyu.zhang@utexas.edu
Workshop: Efficient Systems for Foundation Models (ES-FoMo) Workshop
Links: Paper | Website
Keywords: large language models, efficient generative inference
GPT-Zip: Deep Compression of Finetuned Large Language Models
Authors: Berivan Isik, Hermann Kumbong, Wanyi Ning, Xiaozhe Yao, Sanmi Koyejo, Ce Zhang
Contact: berivan0@stanford.edu
Workshop: Workshop on Efficient Systems for Foundation Models
Links: Paper
Keywords: large language models, model compression, finetuning, scalable machine learning
Exact Optimality in Communication-Privacy-Utility Tradeoffs
Authors: Berivan Isik, Wei-Ning Chen, Ayfer Ozgur, Tsachy Weissman, Albert No
Contact: berivan0@stanford.edu
Workshop: Workshop on Federated Learning and Analytics in Practice
Links: Paper
Keywords: distributed mean estimation, differential privacy, compression
Do Users Write More Insecure Code with AI Assistants?
Authors: Neil Perry*, Megha Srivastava*, Deepak Kumar, Dan Boneh
Contact: meghas@stanford.edu
Workshop: Challenges in Deployable Generative AI Workshop
Links: Paper
Keywords: generative ai, code generation, human-ai interaction
Dissenting Explanations: Leveraging Disagreement to Reduce Model Overreliance
Authors: Omer Reingold, Judy Hanwen Shen, Aditi Talati
Contact: jhshen@stanford.edu
Workshop: HCI & AI Workshop
Links: Paper
Keywords: explainability, model multiplicity
PRODIGY: Enabling In-context Learning Over Graphs
Authors: Qian Huang, Hongyu Ren, Peng Chen, Gregor Kržmanc, Daniel Zeng, Percy Liang, Jure Leskovec
Contact: qhwang@stanford.edu
Workshop: ICML 2023 Workshop on Structured Probabilistic Inference & Generative Modeling
Links: Paper | Website
Keywords: graph ml, in-context learning, pretraining
Thomas: Learning to Explore Human Preference via Probabilistic Reward Model
Authors: Sang T. Truong, Duc Nguyen, Tho Quan, Sanmi Koyejo
Contact: sttruong@cs.stanford.edu
Workshop: ICML 2023 Workshop on The Many Facets of Preference-Based Learning
Keywords: preference learning, active learning, bayesian neural networks, thompson sampling
Beyond Scale: The Diversity Coefficient as a Data Quality Metric Demonstrates LLMs are Pre-trained on Formally Diverse Data
Authors: Alycia Lee, Brando Miranda, Sanmi Koyejo
Contact: brando9@stanford.edu
Workshop: ICML data centric workshop
Links: Paper
Keywords: data centric, diversity, llm, foundation model
We look forward to seeing you at ICML!