Last updated: 2025-05-19 23:32:36. Maintained by Weisen Jiang.

citation publish date title (pdf) review authors
1068 2024-03-05 Scaling Rectified Flow Transformers for High-Resolution Image Synthesis link Patrick Esser, Sumith Kulal,..., Robin Rombach
708 2024-01-17 Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model link Lianghui Zhu, Bencheng Liao,..., Xinggang Wang
607 2023-08-04 MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities link Weihao Yu, Zhengyuan Yang,..., Lijuan Wang
602 2023-05-23 Improving Factuality and Reasoning in Language Models through Multiagent Debate link Yilun Du, Shuang Li,..., Igor Mordatch
481 2024-03-07 Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference link Wei-Lin Chiang, Lianmin Zheng,..., Ion Stoica
454 2023-09-11 NExT-GPT: Any-to-Any Multimodal LLM link Shengqiong Wu, Hao Fei,..., Tat-Seng Chua
424 2024-05-31 Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality link Tri Dao, Albert Gu
361 2023-10-02 ULTRAFEEDBACK: Boosting Language Models with Scaled AI Feedback link Ganqu Cui, Lifan Yuan,..., Maosong Sun
340 2024-02-14 DoRA: Weight-Decomposed Low-Rank Adaptation link Shih-yang Liu, Chien-Yi Wang,..., Min-Hung Chen
313 2024-02-06 HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal link Mantas Mazeika, Long Phan,..., Dan Hendrycks
310 2023-09-01 RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback link Harrison Lee, Samrat Phatale,..., Sushant Prakash
298 2024-01-18 Self-Rewarding Language Models link Weizhe Yuan, Richard Yuanzhe Pang,..., Jason E Weston
275 2024-01-02 Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models link Zixiang Chen, Yihe Deng,..., Quanquan Gu
258 2023-12-14 Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision link Collin Burns, Pavel Izmailov,..., Jeffrey Wu
253 2023-05-22 How Language Model Hallucinations Can Snowball link Muru Zhang, Ofir Press,..., Noah A. Smith
247 2024-01-19 Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads link Tianle Cai, Yuhong Li,..., Tri Dao
237 2023-12-21 VideoPoet: A Large Language Model for Zero-Shot Video Generation link Dan Kondratyuk, Lijun Yu,..., Lu Jiang
207 2024-01-03 GPT-4V(ision) is a Generalist Web Agent, if Grounded link Boyuan Zheng, Boyu Gou,..., Yu Su
200 2024-01-16 Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation link Haoran Xu, Amr Sharaf,..., Young Jin Kim
188 2023-10-14 A decoder-only foundation model for time-series forecasting link Abhimanyu Das, Weihao Kong,..., Yichen Zhou
186 2024-02-06 LESS: Selecting Influential Data for Targeted Instruction Tuning link Mengzhou Xia, Sadhika Malladi,..., Danqi Chen
183 2023-10-06 Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models link Andy Zhou, Kai Yan,..., Yu-Xiong Wang
179 2023-09-28 Promptbreeder: Self-Referential Self-Improvement via Prompt Evolution link Chrisantha Fernando, Dylan Sunil Banarse,..., Tim Rocktäschel
173 2024-03-06 GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection link Jiawei Zhao, Zhenyu Zhang,..., Yuandong Tian
167 2023-06-13 SqueezeLLM: Dense-and-Sparse Quantization link Sehoon Kim, Coleman Richard Charles Hooper,..., Kurt Keutzer
165 2024-02-04 Unified Training of Universal Time Series Forecasting Transformers link Gerald Woo, Chenghao Liu,..., Doyen Sahoo
161 2024-02-05 KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache link Zirui Liu, Jiayi Yuan,..., Xia Hu
157 2023-12-18 Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint link Wei Xiong, Hanze Dong,..., Tong Zhang
152 2023-09-29 AlphaZero-Like Tree-Search can Guide Large Language Model Decoding and Training link Ziyu Wan, Xidong Feng,..., Jun Wang
144 2024-02-23 Genie: Generative Interactive Environments link Jake Bruce, Michael D Dennis,..., Tim Rocktäschel
143 2024-03-05 NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models link Zeqian Ju, Yuancheng Wang,..., sheng zhao
142 2023-12-11 Gated Linear Attention Transformers with Hardware-Efficient Training link Songlin Yang, Bailin Wang,..., Yoon Kim
141 2024-02-19 LoRA+: Efficient Low Rank Adaptation of Large Models link Soufiane Hayou, Nikhil Ghosh, Bin Yu
140 2024-02-03 Break the Sequential Dependency of LLM Inference Using Lookahead Decoding link Yichao Fu, Peter Bailis,..., Hao Zhang
139 2023-04-19 Fundamental Limitations of Alignment in Large Language Models link Yotam Wolf, Noam Wies,..., Amnon Shashua
136 2023-11-07 The Linear Representation Hypothesis and the Geometry of Large Language Models link Kiho Park, Yo Joong Choe, Victor Veitch
136 2023-11-18 An Embodied Generalist Agent in 3D World link Jiangyong Huang, Silong Yong,..., Siyuan Huang
134 2024-04-16 Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study link Shusheng Xu, Wei Fu,..., Yi Wu
132 2024-02-21 LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens link Yiran Ding, Li Lyna Zhang,..., Mao Yang
128 2024-02-01 Executable Code Actions Elicit Better LLM Agents link Xingyao Wang, Yangyi Chen,..., Heng Ji
127 2023-09-25 Physics of Language Models: Part 3.1, Knowledge Storage and Extraction link Zeyuan Allen-Zhu, Yuanzhi Li
127 2024-02-02 TravelPlanner: A Benchmark for Real-World Planning with Language Agents link Jian Xie, Kai Zhang,..., Yu Su
124 2023-12-01 Nash Learning from Human Feedback link Remi Munos, Michal Valko,..., Bilal Piot
122 2024-02-15 Data Engineering for Scaling Language Models to 128K Context link Yao Fu, Rameswar Panda,..., Hao Peng
121 2024-01-26 EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty link Yuhui Li, Fangyun Wei,..., Hongyang Zhang
114 2024-01-22 Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs link Ling Yang, Zhaochen Yu,..., Bin CUI
111 2024-02-06 MOMENT: A Family of Open Time-series Foundation Models link Mononito Goswami, Konrad Szafer,..., Artur Dubrawski
109 2024-02-08 SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models link Dongyang Liu, Renrui Zhang,..., Peng Gao
106 2024-04-22 Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data link Fahim Tajwar, Anikait Singh,..., Aviral Kumar
101 2024-02-07 Fast Timing-Conditioned Latent Audio Diffusion link Zach Evans, CJ Carr,..., Jordi Pons
101 2024-01-02 LLM Maybe LongLM: SelfExtend LLM Context Window Without Tuning link Hongye Jin, Xiaotian Han,..., Xia Hu
100 2023-10-11 In-Context Unlearning: Language Models as Few-Shot Unlearners link Martin Pawelczyk, Seth Neel, Himabindu Lakkaraju
98 2024-02-12 Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models link Siddharth Karamcheti, Suraj Nair,..., Dorsa Sadigh
98 2023-11-02 RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation link Yufei Wang, Zhou Xian,..., Chuang Gan
96 2024-02-07 Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design link Andrew Campbell, Jason Yim,..., Tommi Jaakkola
95 2024-01-03 A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity link Andrew Lee, Xiaoyan Bai,..., Rada Mihalcea
95 2024-02-06 QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks link Albert Tseng, Jerry Chee,..., Christopher De Sa
94 2024-02-09 Debating with More Persuasive LLMs Leads to More Truthful Answers link Akbir Khan, John Hughes,..., Ethan Perez
94 2024-01-08 A Minimaximalist Approach to Reinforcement Learning from Human Feedback link Gokul Swamy, Christoph Dann,..., Alekh Agarwal
93 2024-01-22 WARM: On the Benefits of Weight Averaged Reward Models link Alexandre Rame, Nino Vieillard,..., Johan Ferret
92 2024-02-07 AlphaFold Meets Flow Matching for Generating Protein Ensembles link Bowen Jing, Bonnie Berger, Tommi Jaakkola
92 2024-04-30 Better & Faster Large Language Models via Multi-token Prediction link Fabian Gloeckle, Badr Youbi Idrissi,..., Gabriel Synnaeve
92 2023-12-04 Magicoder: Empowering Code Generation with OSS-Instruct link Yuxiang Wei, Zhe Wang,..., LINGMING ZHANG
92 2024-02-12 PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs link Soroush Nasiriany, Fei Xia,..., brian ichter
91 2024-02-07 MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark link Dongping Chen, Ruoxi Chen,..., Lichao Sun
91 2024-01-05 CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution link Alex Gu, Baptiste Roziere,..., Sida Wang
90 2024-01-11 Extreme Compression of Large Language Models via Additive Quantization link Vage Egiazarian, Andrei Panferov,..., Dan Alistarh
90 2024-03-14 3D-VLA: A 3D Vision-Language-Action Generative World Model link Haoyu Zhen, Xiaowen Qiu,..., Chuang Gan
88 2024-02-08 Generalized Preference Optimization: A Unified Approach to Offline Alignment link Yunhao Tang, Zhaohan Daniel Guo,..., Bilal Piot
87 2024-01-29 OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models link Fuzhao Xue, Zian Zheng,..., Yang You
87 2024-01-11 Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models link Asma Ghandeharioun, Avi Caciularu,..., Mor Geva
85 2023-07-20 SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models link Xiaoxuan Wang, Ziniu Hu,..., Wei Wang
85 2024-04-24 MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI link Kaining Ying, Fanqing Meng,..., Wenqi Shao
84 2024-03-11 Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews link Weixin Liang, Zachary Izzo,..., James Y. Zou
83 2023-11-11 In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering link Sheng Liu, Haotian Ye,..., James Y. Zou
82 2024-01-22 Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text link Abhimanyu Hans, Avi Schwarzschild,..., Tom Goldstein
79 2024-03-05 Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling link Yair Schiff, Chia Hsiang Kao,..., Volodymyr Kuleshov
79 2024-02-07 Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications link Boyi Wei, Kaixuan Huang,..., Peter Henderson
78 2023-09-01 Image Hijacks: Adversarial Images can Control Generative Models at Runtime link Luke Bailey, Euan Ong,..., Scott Emmons
78 2024-02-01 Repeat After Me: Transformers are Better than State Space Models at Copying link Samy Jelassi, David Brandfonbrener,..., eran malach
78 2023-10-08 Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity link Lu Yin, You Wu,..., Shiwei Liu
78 2023-10-29 Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game link Zelai Xu, Chao Yu,..., Yi Wu
76 2024-02-22 tinyBenchmarks: evaluating LLMs with fewer examples link Felipe Maia Polo, Lucas Weber,..., Mikhail Yurochkin
75 2024-06-16 QUEST: Query-Aware Sparsity for Efficient Long-Context LLM Inference link Jiaming Tang, Yilong Zhao,..., Song Han
75 2024-02-22 MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases link Zechun Liu, Changsheng Zhao,..., Vikas Chandra
73 2024-02-02 Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities link Zhifeng Kong, Arushi Goel,..., Bryan Catanzaro
71 2023-12-07 Chain of Code: Reasoning with a Language Model-Augmented Code Emulator link Chengshu Li, Jacky Liang,..., brian ichter
71 2024-03-11 Stealing part of a production language model link Nicholas Carlini, Daniel Paleka,..., Florian Tramèr
69 2023-12-31 Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws link Nikhil Sardana, Jacob Portes,..., Jonathan Frankle
69 2024-02-13 COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability link Xingang Guo, Fangxu Yu,..., Bin Hu
69 2024-02-06 Can Mamba Learn How To Learn? A Comparative Study on In-Context Learning Tasks link Jongho Park, Jaeseung Park,..., Dimitris Papailiopoulos
69 2024-02-06 BiLLM: Pushing the Limit of Post-Training Quantization for LLMs link Wei Huang, Yangdong Liu,..., XIAOJUAN QI
69 2023-10-25 Controlled Decoding from Language Models link Sidharth Mudgal, Jong Lee,..., Ahmad Beirami
67 2024-02-15 Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment link Rui Yang, Xiaoman Pan,..., Jianshu Chen
67 2023-10-05 MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation link Qian Huang, Jian Vora,..., Jure Leskovec
66 2023-09-12 Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models by Finding Problematic Prompts link Zhi-Yi Chin, Chieh Ming Jiang,..., Wei-Chen Chiu
65 2024-05-07 Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition link Hao Fei, Shengqiong Wu,..., Wynne Hsu
65 2024-03-05 Behavior Generation with Latent Actions link Seungjae Lee, Yibin Wang,..., Lerrel Pinto
64 2023-10-25 Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution link Aaron Lou, Chenlin Meng, Stefano Ermon
64 2024-02-13 IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation link Luke Melas-Kyriazi, Iro Laina,..., Filippos Kokkinos
62 2024-02-28 Simple linear attention language models balance the recall-throughput tradeoff link Simran Arora, Sabri Eyuboglu,..., Christopher Re
62 2023-11-04 Position: Levels of AGI for Operationalizing Progress on the Path to AGI link Meredith Ringel Morris, Jascha Sohl-Dickstein,..., Shane Legg
62 2023-08-20 Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models link Bilgehan Sel, Ahmad Tawaha,..., Ming Jin
62 2024-02-10 A Tale of Tails: Model Collapse as a Change of Scaling Laws link Elvis Dohmatob, Yunzhen Feng,..., Julia Kempe
60 2024-02-15 QuRating: Selecting High-Quality Data for Training Language Models link Alexander Wettig, Aatmik Gupta,..., Danqi Chen
60 2024-03-13 Human Alignment of Large Language Models through Online Preference Optimisation link Daniele Calandriello, Zhaohan Daniel Guo,..., Bilal Piot
60 2024-03-11 The Pitfalls of Next-Token Prediction link Gregor Bachmann, Vaishnavh Nagarajan
59 2024-02-08 WebLINX: Real-World Website Navigation with Multi-Turn Dialogue link Xing Han Lu, Zdeněk Kasner, Siva Reddy
59 2023-12-07 An LLM Compiler for Parallel Function Calling link Sehoon Kim, Suhong Moon,..., Amir Gholami
59 2024-03-05 MathScale: Scaling Instruction Tuning for Mathematical Reasoning link Zhengyang Tang, Xingxing Zhang,..., Furu Wei
59 2023-10-08 In-context Convergence of Transformers link Yu Huang, Yuan Cheng, Yingbin Liang
58 None Position: LLMs Can’t Plan, But Can Help Planning in LLM-Modulo Frameworks link Subbarao Kambhampati, Karthik Valmeekam,..., Anil B Murthy
57 2024-03-01 HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding link Zhaorun Chen, Zhuokai Zhao,..., Jiawei Zhou
57 2024-02-03 Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models link Yongshuo Zong, Ondrej Bohdal,..., Timothy Hospedales
56 2024-03-06 Stop Regressing: Training Value Functions via Classification for Scalable Deep RL link Jesse Farebrother, Jordi Orbay,..., Rishabh Agarwal
55 2024-01-04 Evolution of Heuristics: Towards Efficient Automatic Algorithm Design Using Large Language Model link Fei Liu, Tong Xialiang,..., Qingfu Zhang
55 2024-03-01 Provably Robust DPO: Aligning Language Models with Noisy Feedback link Sayak Ray Chowdhury, Anush Kini, Nagarajan Natarajan
55 2023-06-09 Prodigy: An Expeditiously Adaptive Parameter-Free Learner link Konstantin Mishchenko, Aaron Defazio
54 2024-02-03 BetterV: Controlled Verilog Generation with Discriminative Guidance link Zehua PEI, Huiling Zhen,..., Bei Yu
54 2024-02-13 LLaGA: Large Language and Graph Assistant link Runjin Chen, Tong Zhao,..., Zhangyang Wang
54 2024-02-11 GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting link Xiaoyu Zhou, Xingjian Ran,..., Ming-Hsuan Yang
53 2024-03-12 WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks? link Alexandre Drouin, Maxime Gasse,..., Alexandre Lacoste
53 2024-02-12 Scaling Laws for Fine-Grained Mixture of Experts link Jan Ludziejewski, Jakub Krajewski,..., Sebastian Jaszczur
51 2024-02-11 ODIN: Disentangled Reward Mitigates Hacking in RLHF link Lichang Chen, Chen Zhu,..., Bryan Catanzaro
51 2023-11-08 NExT-Chat: An LMM for Chat, Detection and Segmentation link Ao Zhang, Yuan Yao,..., Tat-Seng Chua
50 2023-07-31 Learning to Model the World With Language link Jessy Lin, Yuqing Du,..., Anca Dragan
50 2024-04-05 Score identity Distillation: Exponentially Fast Distillation of Pretrained Diffusion Models for One-Step Generation link Mingyuan Zhou, Huangjie Zheng,..., Hai Huang
50 2024-02-06 RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback link Yufei Wang, Zhanyi Sun,..., Zackory Erickson
50 2024-02-22 How Transformers Learn Causal Structure with Gradient Descent link Eshaan Nichani, Alex Damian, Jason D. Lee
50 2024-03-14 Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference link Piotr Nawrot, Adrian Łańcucki,..., Edoardo Ponti
50 2024-02-13 GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements link Alexander Havrilla, Sharath Chandra Raparthy,..., Roberta Raileanu
50 2024-02-29 ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL link Yifei Zhou, Andrea Zanette,..., Aviral Kumar
50 2023-10-11 Online Speculative Decoding link Xiaoxuan Liu, Lanxiang Hu,..., Hao Zhang
49 2024-02-14 Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference link Harry Dong, Xinyu Yang,..., Beidi Chen
49 2023-11-18 MagicPose: Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion link Di Chang, Yichun Shi,..., Mohammad Soleymani
48 2023-11-15 Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling link Bairu Hou, Yujian Liu,..., Yang Zhang
48 2024-01-21 Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers link Katherine Crowson, Stefan Andreas Baumann,..., Enrico Shippole
48 2024-02-13 Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast link Xiangming Gu, Xiaosen Zheng,..., Min Lin
47 2023-07-17 Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations link Yanda Chen, Ruiqi Zhong,..., Kathleen McKeown
47 2024-01-23 DsDm: Model-Aware Dataset Selection with Datamodels link Logan Engstrom, Axel Feldmann, Aleksander Madry
47 2024-02-14 MaxMin-RLHF: Alignment with Diverse Human Preferences link Souradip Chakraborty, Jiahao Qiu,..., Mengdi Wang
47 2023-10-16 ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models link Ziniu Li, Tian Xu,..., Zhi-Quan Luo
47 2024-01-31 On Prompt-Driven Safeguarding for Large Language Models link Chujie Zheng, Fan Yin,..., Nanyun Peng
47 2024-02-18 Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning link Long Qian, Juncheng Li,..., Siliang Tang
46 2023-06-30 Stay on Topic with Classifier-Free Guidance link Guillaume Sanchez, Alexander Spangher,..., Stella Biderman
46 2024-02-27 Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations link Jiaqi Zhai, Lucy Liao,..., Yu Shi
46 2024-05-13 Localizing Task Information for Improved Model Merging and Compression link Ke Wang, Nikolaos Dimitriadis,..., Pascal Frossard
46 2023-12-08 SparQ Attention: Bandwidth-Efficient LLM Inference link Luka Ribar, Ivan Chelombiev,..., Douglas Orr
46 2024-02-08 Dirichlet Flow Matching with Applications to DNA Sequence Design link Hannes Stark, Bowen Jing,..., Tommi Jaakkola
46 2022-09-30 Differentially Private Bias-Term Fine-tuning of Foundation Models link Zhiqi Bu, Yu-Xiang Wang,..., George Karypis
45 2024-02-19 FiT: Flexible Vision Transformer for Diffusion Model link Zeyu Lu, ZiDong Wang,..., LEI BAI
45 2023-10-11 InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining link Boxin Wang, Wei Ping,..., Bryan Catanzaro
45 2024-02-18 Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark link Yihua Zhang, Pingzhi Li,..., Tianlong Chen
44 2024-02-07 Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning link Hao Zhao, Maksym Andriushchenko,..., Nicolas Flammarion
43 2024-04-12 The Illusion of State in State-Space Models link William Merrill, Jackson Petty, Ashish Sabharwal
43 2024-02-02 Boximator: Generating Rich and Controllable Motions for Video Synthesis link Jiawei Wang, Yuchen Zhang,..., Hang Li
43 2024-02-28 Evaluating Quantized Large Language Models link Shiyao Li, Xuefei Ning,..., Yu Wang
42 2024-02-05 Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization link Yang Jin, Zhicheng Sun,..., Yadong MU
42 2024-04-18 Token-level Direct Preference Optimization link Yongcheng Zeng, Guoqing Liu,..., Jun Wang
42 2023-06-05 InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models link Lichang Chen, Jiuhai Chen,..., Tianyi Zhou
41 2024-01-23 In-Context Language Learning: Architectures and Algorithms link Ekin Akyürek, Bailin Wang,..., Jacob Andreas
41 2024-02-09 Iterated Denoising Energy Matching for Sampling from Boltzmann Densities link Tara Akhound-Sadegh, Jarrid Rector-Brooks,..., Alexander Tong
41 2024-01-10 InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks link Xueyu Hu, Ziyu Zhao,..., Fei Wu
40 2024-02-04 Transolver: A Fast Transformer Solver for PDEs on General Geometries link Haixu Wu, Huakun Luo,..., Mingsheng Long
40 2024-02-01 Merging Multi-Task Models via Weight-Ensembling Mixture of Experts link Anke Tang, Li Shen,..., Dacheng Tao
40 2024-01-30 Proactive Detection of Voice Cloning with Localized Watermarking link Robin San Roman, Pierre Fernandez,..., Tuan Tran
40 2023-10-02 Prompt-tuning Latent Diffusion Models for Inverse Problems link Hyungjin Chung, Jong Chul Ye,..., Mauricio Delbracio
39 2024-02-02 Challenges in Training PINNs: A Loss Landscape Perspective link Pratik Rathore, Weimu Lei,..., Madeleine Udell
39 2024-02-05 Large Language Models are Geographically Biased link Rohin Manvi, Samar Khanna,..., Stefano Ermon
39 2024-02-05 Flora: Low-Rank Adapters Are Secretly Gradient Compressors link Yongchang Hao, Yanshuai Cao, Lili Mou
38 2023-12-12 AI Control: Improving Safety Despite Intentional Subversion link Ryan Greenblatt, Buck Shlegeris,..., Fabien Roger
38 2024-03-19 RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content link Zhuowen Yuan, Zidi Xiong,..., Bo Li
37 2024-02-19 Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models link Christian Schlarmann, Naman Deep Singh,..., Matthias Hein
37 2024-02-08 Accurate LoRA-Finetuning Quantization of LLMs via Information Retention link Haotong Qin, Xudong Ma,..., Michele Magno
37 2024-02-06 AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls link Yu Du, Fangyun Wei, Hongyang Zhang
37 2024-02-05 Representation Surgery for Multi-Task Model Merging link Enneng Yang, Li Shen,..., Dacheng Tao
36 2024-05-02 SparseTSF: Modeling Long-term Time Series Forecasting with 1k Parameters link Shengsheng Lin, Weiwei Lin,..., Junjie Yang
36 2024-02-02 A Dynamical Model of Neural Scaling Laws link Blake Bordelon, Alexander Atanasov, Cengiz Pehlevan
36 2024-04-12 TSLANet: Rethinking Transformers for Time Series Representation Learning link Emadeldeen Eldele, Mohamed Ragab,..., Xiaoli Li
36 2023-09-13 Auto-Regressive Next-Token Predictors are Universal Learners link eran malach
35 2024-02-05 Decoding-time Realignment of Language Models link Tianlin Liu, Shangmin Guo,..., Mathieu Blondel
35 2024-02-07 Guiding LLMs The Right Way: Fast, Non-Invasive Constrained Generation link Luca Beurer-Kellner, Marc Fischer, Martin Vechev
35 2024-02-14 Feature Reuse and Scaling: Understanding Transfer Learning with Protein Language Models link Francesca-Zhoufan Li, Ava P Amini,..., Alex Xijie Lu
35 2024-03-17 MindEye2: Shared-Subject Models Enable fMRI-To-Image With 1 Hour of Data link Paul Steven Scotti, Mihir Tripathy,..., Tanishq Mathew Abraham
35 2024-02-05 Guidance with Spherical Gaussian Constraint for Conditional Diffusion link Lingxiao Yang, Shutong Ding,..., Ye Shi
35 2023-12-11 Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context link Xiang Cheng, Yuxin Chen, Suvrit Sra
34 2024-03-28 MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions link Kai Zhang, Yi Luan,..., Ming-Wei Chang
34 2023-12-06 Low-Cost High-Power Membership Inference Attacks link Sajjad Zarifzadeh, Philippe Liu, Reza Shokri
34 2024-04-23 NExT: Teaching Large Language Models to Reason about Code Execution link Ansong Ni, Miltiadis Allamanis,..., Pengcheng Yin
34 2024-02-27 Training-Free Long-Context Scaling of Large Language Models link Chenxin An, Fei Huang,..., Lingpeng Kong
34 None Position: Will we run out of data? Limits of LLM scaling based on human-generated data link Pablo Villalobos, Anson Ho,..., Marius Hobbhahn
34 2024-02-07 On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis link Jerry Yao-Chieh Hu, Thomas Lin,..., Han Liu
33 2024-01-22 DITTO: Diffusion Inference-Time T-Optimization for Music Generation link Zachary Novack, Julian McAuley,..., Nicholas J. Bryan
33 2022-10-10 Sequential Neural Score Estimation: Likelihood-Free Inference with Conditional Score Based Diffusion Models link Louis Sharrock, Jack Simons,..., Mark Beaumont
33 2024-03-04 Differentially Private Synthetic Data via Foundation Model APIs 2: Text link Chulin Xie, Zinan Lin,..., Sergey Yekhanin
33 2024-02-12 Rolling Diffusion Models link David Ruhe, Jonathan Heek,..., Emiel Hoogeboom
33 2024-06-07 FlowMM: Generating Materials with Riemannian Flow Matching link Benjamin Kurt Miller, Ricky T. Q. Chen,..., Brandon M Wood
33 2024-02-15 Language Models with Conformal Factuality Guarantees link Christopher Mohri, Tatsunori Hashimoto
33 2024-02-23 Fast Adversarial Attacks on Language Models In One GPU Minute link Vinu Sankar Sadasivan, Shoumik Saha,..., Soheil Feizi
33 2024-02-28 CogBench: a large language model walks into a psychology lab link Julian Coda-Forno, Marcel Binz,..., Eric Schulz
33 2024-02-28 Diffusion Language Models Are Versatile Protein Learners link Xinyou Wang, Zaixiang Zheng,..., Quanquan Gu
32 2024-02-20 A Touch, Vision, and Language Dataset for Multimodal Alignment link Letian Fu, Gaurav Datta,..., Ken Goldberg
32 2024-02-15 DE-COP: Detecting Copyrighted Content in Language Models Training Data link André Vicente Duarte, Xuandong Zhao,..., Lei Li
32 2023-05-17 Reprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs Sampling link Weijia Xu, Andrzej Banburski, Nebojsa Jojic
32 2023-12-11 Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes link Zhen Qin, Daoyuan Chen,..., Shuiguang Deng
31 2024-04-26 Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo link Stephen Zhao, Rob Brekelmans,..., Roger Baker Grosse
31 2024-05-18 Towards Modular LLMs by Building and Reusing a Library of LoRAs link Oleksiy Ostapenko, Zhan Su,..., Alessandro Sordoni
31 2024-02-01 Dense Reward for Free in Reinforcement Learning from Human Feedback link Alex James Chan, Hao Sun,..., Mihaela van der Schaar
31 2023-02-26 Diffusion Model-Augmented Behavioral Cloning link Shang-Fu Chen, Hsiang-Chun Wang,..., Shao-Hua Sun
31 2023-12-08 EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism link Yanxi Chen, Xuchen Pan,..., Jingren Zhou
30 2024-02-29 Watermark Stealing in Large Language Models link Nikola Jovanović, Robin Staab, Martin Vechev
30 2023-10-23 DOGE: Domain Reweighting with Generalization Estimation link Simin Fan, Matteo Pagliardini, Martin Jaggi
30 2024-02-08 How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis link Federico Bianchi, Patrick John Chia,..., James Zou
29 2023-10-05 Stochastic Interpolants with Data-Dependent Couplings link Michael Samuel Albergo, Mark Goldstein,..., Eric Vanden-Eijnden
29 2024-02-14 Transformers, parallel computation, and logarithmic depth link Clayton Sanford, Daniel Hsu, Matus Telgarsky
29 2024-02-13 Mixtures of Experts Unlock Parameter Scaling for Deep RL link Johan Samir Obando Ceron, Ghada Sokar,..., Pablo Samuel Castro
29 2024-04-18 RoboDreamer: Learning Compositional World Models for Robot Imagination link Siyuan Zhou, Yilun Du,..., Chuang Gan
29 2024-02-15 A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts link Kuang-Huei Lee, Xinyun Chen,..., Ian Fischer
29 2024-03-06 Accelerating Convergence of Score-Based Diffusion Models, Provably link Gen Li, Yu Huang,..., Yuxin Chen
29 2024-06-28 Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation link Danny Halawi, Alexander Wei,..., Jacob Steinhardt
28 2024-02-08 Self-Alignment of Large Language Models via Monopolylogue-based Social Scene Simulation link Xianghe Pang, Shuo Tang,..., Siheng Chen
28 2024-02-03 GliDe with a CaPE: A Low-Hassle Method to Accelerate Speculative Decoding link Cunxiao Du, Jing Jiang,..., Yang You
28 2024-01-11 DiffDA: a Diffusion model for weather-scale Data Assimilation link Langwen Huang, Lukas Gianinazzi,..., Torsten Hoefler
28 2024-03-11 Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement link Che Liu, Zhongwei Wan,..., Rossella Arcucci
28 2024-04-04 Uniform Memory Retrieval with Larger Capacity for Modern Hopfield Models link Dennis Wu, Jerry Yao-Chieh Hu,..., Han Liu
28 2024-04-16 Position: Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback link Vincent Conitzer, Rachel Freedman,..., William S. Zwicker
28 2024-02-03 A Closer Look at the Limitations of Instruction Tuning link Sreyan Ghosh, Chandra Kiran Reddy Evuru,..., Dinesh Manocha
27 2024-01-09 RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation link Mahdi Nikdan, Soroush Tabesh,..., Dan Alistarh
27 2023-10-26 Codebook Features: Sparse and Discrete Interpretability for Neural Networks link Alex Tamkin, Mohammad Taufeeque, Noah Goodman
27 2024-02-28 CLLMs: Consistency Large Language Models link Siqi Kou, Lanxiang Hu,..., Hao Zhang
27 2024-02-26 Asymmetry in Low-Rank Adapters of Foundation Models link Jiacheng Zhu, Kristjan Greenewald,..., Justin Solomon
27 2024-02-27 DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning link Siyuan Guo, Cheng Deng,..., Jun Wang
27 2024-02-01 Position: Bayesian Deep Learning is Needed in the Age of Large-Scale AI link Theodore Papamarkou, Maria Skoularidou,..., Ruqi Zhang
26 2024-04-15 All-in-one simulation-based inference link Manuel Gloeckler, Michael Deistler,..., Jakob H. Macke
26 2024-03-15 Repoformer: Selective Retrieval for Repository-Level Code Completion link Di Wu, Wasi Uddin Ahmad,..., Xiaofei Ma
26 2024-02-03 Position: Graph Foundation Models Are Already Here link Haitao Mao, Zhikai Chen,..., Jiliang Tang
26 2024-02-14 Premise Order Matters in Reasoning with Large Language Models link Xinyun Chen, Ryan Andrew Chi,..., Denny Zhou
26 2024-02-29 Dual Operating Modes of In-Context Learning link Ziqian Lin, Kangwook Lee
26 2024-02-09 Feedback Loops With Language Models Drive In-Context Reward Hacking link Alexander Pan, Erik Jones,..., Jacob Steinhardt
26 2023-09-18 Long-Tail Learning with Foundation Model: Heavy Fine-Tuning Hurts link Jiang-Xin Shi, Tong Wei,..., Yu-Feng Li
26 2024-05-16 LLM and Simulation as Bilevel Optimizers: A New Paradigm
to Advance Physical Scientific Discovery
link Pingchuan Ma, Tsun-Hsuan Wang,..., Wojciech Matusik
26 2024-02-05 The Benefits of Reusing Batches for Gradient Descent in
Two-Layer Networks: Breaking the Curse of Information and Leap Exponents
link Yatin Dandi, Emanuele Troiani,..., Florent Krzakala
26 2024-02-14 Position: Topological Deep Learning is the New Frontier for
Relational Learning
link Theodore Papamarkou, Tolga Birdal,..., Ghada Zamzmi
25 2024-04-10 What needs to go right for an induction head?
A mechanistic study of in-context learning circuits and their formation
link Aaditya K Singh, Ted Moskovitz,..., Andrew M Saxe
25 2024-02-09 Particle Denoising Diffusion Sampler link Angus Phillips, Hai-Dang Dau,..., Arnaud Doucet
25 2024-02-08 AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers link Reduan Achtibat, Sayed Mohammad Vakilzadeh Hatefi,..., Wojciech Samek
25 2024-03-21 Protein Conformation Generation via Force-Guided SE(3) Diffusion Models link Yan Wang, Lihao Wang,..., Quanquan Gu
25 2023-12-19 Curated LLM: Synergy of LLMs and Data Curation for
tabular augmentation in low-data regimes
link Nabeel Seedat, Nicolas Huynh,..., Mihaela van der Schaar
25 2024-04-04 Outlier-Efficient Hopfield Layers for Large Transformer-Based Models link Jerry Yao-Chieh Hu, Pei-Hsuan Chang,..., Han Liu
25 2024-01-29 Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in
RLHF
link Banghua Zhu, Michael Jordan, Jiantao Jiao
25 2023-10-09 Harmonic Self-Conditioned Flow Matching for joint Multi-Ligand Docking and
Binding Site Design
link Hannes Stark, Bowen Jing,..., Tommi Jaakkola
25 2023-06-07 Don't trust your eyes: on the (un)reliability of feature
visualizations
link Robert Geirhos, Roland S. Zimmermann,..., Been Kim
25 2024-05-05 Parameter-Efficient Fine-Tuning with Discrete Fourier Transform link Ziqi Gao, Qichao Wang,..., Jia Li
24 2024-03-02 SceneCraft: An LLM Agent for Synthesizing 3D Scenes as
Blender Code
link Ziniu Hu, Ahmet Iscen,..., Alireza Fathi
24 2023-10-26 CompeteAI: Understanding the Competition Dynamics of Large Language Model-based
Agents
link Qinlin Zhao, Jindong Wang,..., Xing Xie
24 2024-02-26 Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning link Michael Matthews, Michael Beukman,..., Jakob Nicolaus Foerster
24 2024-03-04 DéjàVu: KV-cache Streaming for Fast, Fault-tolerant Generative LLM Serving link Foteini Strati, Sara McAllister,..., Ana Klimovic
24 2023-10-05 Agent Instructs Large Language Models to be General Zero-Shot
Reasoners
link Nicholas Crispino, Kyle Montgomery,..., Chenguang Wang
24 2024-03-06 On the Origins of Linear Representations in Large Language
Models
link Yibo Jiang, Goutham Rajendran,..., Victor Veitch
24 2024-02-21 D-Flow: Differentiating through Flows for Controlled Generation link Heli Ben-Hamu, Omri Puny,..., Yaron Lipman
24 2024-03-18 Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs
Under Compression
link Junyuan Hong, Jinhao Duan,..., Bo Li
24 2024-04-22 Align Your Steps: Optimizing Sampling Schedules in Diffusion Models link Amirmojtaba Sabour, Sanja Fidler, Karsten Kreis
24 None CaM: Cache Merging for Memory-efficient LLMs Inference link Yuxin Zhang, Yuxuan Du,..., Rongrong Ji
24 2024-01-05 VoroNav: Voronoi-based Zero-shot Object Navigation with Large Language Model link Pengying Wu, Yao Mu,..., Chang Liu
23 2024-01-07 The Stronger the Diffusion Model, the Easier the Backdoor:
Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline
link Haonan Wang, Qianli Shen,..., Kenji Kawaguchi
23 2024-06-05 Pruner-Zero: Evolving Symbolic Pruning Metric From Scratch for Large
Language Models
link Peijie Dong, Lujun Li,..., Xiaowen Chu
23 None The Emergence of Reproducibility and Consistency in Diffusion Models link Huijie Zhang, Jinfan Zhou,..., Qing Qu
23 2024-02-08 In-Context Principle Learning from Mistakes link Tianjun Zhang, Aman Madaan,..., Uri Alon
23 2023-10-02 FedBPT: Efficient Federated Black-box Prompt Tuning for Large Language
Models
link Jingwei Sun, Ziyue Xu,..., Holger R Roth
23 2024-03-03 Theoretical insights for diffusion guidance: A case study for
Gaussian mixture models
link Yuchen Wu, Minshuo Chen,..., Yuting Wei
23 2023-10-09 Generalized Neural Collapse for a Large Number of Classes link Jiachen Jiang, Jinxin Zhou,..., Zhihui Zhu
23 2024-01-24 Can AI Assistants Know What They Don't Know? link Qinyuan Cheng, Tianxiang Sun,..., Xipeng Qiu
22 2024-02-15 SAMformer: Unlocking the Potential of Transformers in Time Series
Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention
link Romain Ilbert, Ambroise Odonnat,..., Ievgen Redko
22 2024-02-08 Memory Consolidation Enables Long-Context Video Understanding link Ivana Balazevic, Yuge Shi,..., Olivier J Henaff
22 2024-02-27 Variational Learning is Effective for Large Deep Networks link Yuesong Shen, Nico Daheim,..., Thomas Möllenhoff
22 2024-02-02 Online conformal prediction with decaying step sizes link Anastasios Nikolas Angelopoulos, Rina Barber, Stephen Bates
22 2024-02-13 A Dense Reward View on Aligning Text-to-Image Diffusion with
Preference
link Shentao Yang, Tianqi Chen, Mingyuan Zhou
22 2024-02-15 OptiMUS: Scalable Optimization Modeling with (MI)LP Solvers and Large
Language Models
link Ali AhmadiTeshnizi, Wenzhi Gao, Madeleine Udell
22 2024-03-06 DPOT: Auto-Regressive Denoising Operator Transformer for Large-Scale PDE Pre-Training link Zhongkai Hao, Chang Su,..., Jun Zhu
22 2023-10-04 Assessing Large Language Models on Climate Information link Jannis Bulian, Mike S. Schäfer,..., Nadine Strauss
22 2024-01-18 Improving fine-grained understanding in image-text pre-training link Ioana Bica, Anastasija Ilic,..., Jovana Mitrovic
22 2024-03-21 An Analysis of Linear Time Series Forecasting Models link William Toner, Luke Nicholas Darlow
22 2024-06-22 video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models link Guangzhi Sun, Wenyi Yu,..., Chao Zhang
22 None Position: TrustLLM: Trustworthiness in Large Language Models link Yue Huang, Lichao Sun,..., Yue Zhao
22 2023-10-11 A Resilient and Accessible Distribution-Preserving Watermark for Large Language
Models
link Yihan Wu, Zhengmian Hu,..., Heng Huang
22 2024-02-19 Model Tailor: Mitigating Catastrophic Forgetting in Multi-modal Large Language
Models
link Didi Zhu, Zhongyi Sun,..., Kun Kuang
22 2023-05-27 CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers link Dachuan Shi, Chaofan Tao,..., Jiaqi Wang
21 2024-02-22 Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion link Yujia Huang, Adishree Ghatare,..., Yisong Yue
21 2024-03-30 Linguistic Calibration of Long-Form Generations link Neil Band, Xuechen Li,..., Tatsunori Hashimoto
21 None R2E: Turning any Github Repository into a Programming Agent
Environment
link Naman Jain, Manish Shetty,..., Ion Stoica
21 2024-02-22 A Language Model’s Guide Through Latent Space link Dimitri von Rütte, Sotiris Anagnostidis,..., Thomas Hofmann
21 2024-03-26 Mechanistic Design and Scaling of Hybrid Architectures link Michael Poli, Armin W Thomas,..., Stefano Massaroli
21 2024-02-26 Feedback Efficient Online Fine-Tuning of Diffusion Models link Masatoshi Uehara, Yulai Zhao,..., Tommaso Biancalani
21 2024-05-28 AI Alignment with Changing and Influenceable Reward Functions link Micah Carroll, Davis Foote,..., Anca Dragan
21 2024-02-25 RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis link Yao Mu, Junting Chen,..., Ping Luo
21 2023-12-06 Generalization to New Sequential Decision Making Tasks with In-Context
Learning
link Sharath Chandra Raparthy, Eric Hambro,..., Roberta Raileanu
21 2024-03-03 In-Context Sharpness as Alerts: An Inner Representation Perspective for
Hallucination Mitigation
link Shiqi Chen, Miao Xiong,..., Junxian He
21 2024-05-21 How Universal Polynomial Bases Enhance Spectral Graph Neural Networks:
Heterophily, Over-smoothing, and Over-squashing
link Keke Huang, Yu Guang Wang,..., Pietro Lio
21 2024-02-03 Improving Diffusion Models for Inverse Problems Using Optimal Posterior
Covariance
link Xinyu Peng, Ziyang Zheng,..., Hongkai Xiong
21 2024-02-06 Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains link Junhong Shen, Neil Tenenholtz,..., Nicolo Fusi
21 2023-10-10 Conformal Prediction for Deep Classifier via Label Ranking link Jianguo Huang, HuaJun Xi,..., Hongxin Wei
21 2023-10-20 Equivariant Deep Weight Space Alignment link Aviv Navon, Aviv Shamsian,..., Haggai Maron
21 2024-02-07 A Sober Look at LLMs for Material Discovery: Are
They Actually Good for Bayesian Optimization Over Molecules?
link Agustinus Kristiadi, Felix Strieth-Kalthoff,..., Geoff Pleiss
21 2023-02-04 CosPGD: an efficient white-box adversarial attack for pixel-wise prediction
tasks
link Shashank Agnihotri, Steffen Jung, Margret Keuper
20 2024-03-19 Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data
Flow and Per-Block Quantization
link Haocheng Xi, Yuxiang Chen,..., Jun Zhu
20 2024-03-05 Time Weaver: A Conditional Time Series Generation Model link Sai Shankar Narasimhan, Shubhankar Agarwal,..., Sandeep P. Chinchali
20 2024-03-06 Conformal prediction for multi-dimensional time series by ellipsoidal sets link Chen Xu, Hanyang Jiang, Yao Xie
20 2024-02-23 Minimax Optimality of Score-based Diffusion Models: Beyond the Density
Lower Bound Assumptions
link Kaihong Zhang, Heqi Yin,..., Jingbo Liu
20 2024-06-22 Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language
Models without Training through Attention Calibration
link Zhongzhi Yu, Zheng Wang,..., Yingyan Celine Lin
20 2024-03-20 Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models
with Noisy Data
link Giannis Daras, Alex Dimakis, Constantinos Costis Daskalakis
20 2024-02-04 Selecting Large Language Model to Fine-tune via Rectified Scaling
Law
link Haowei Lin, Baizhou Huang,..., Yitao Liang
20 2024-02-08 Training Large Language Models for Reasoning through Reverse Curriculum
Reinforcement Learning
link Zhiheng Xi, Wenxiang Chen,..., Xuanjing Huang
20 2023-11-29 Should we be going MAD? A Look at Multi-Agent
Debate Strategies for LLMs
link Andries Petrus Smit, Nathan Grinsztajn,..., Arnu Pretorius
20 2023-04-03 Chain-of-Thought Predictive Control link Zhiwei Jia, Vineet Thumuluri,..., Hao Su
20 2020-11-29 Scaling Down Deep Learning with MNIST-1D link Samuel James Greydanus, Dmitry Kobak
20 2024-02-05 C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models link Mintong Kang, Nezihe Merve Gürel,..., Bo Li
20 2024-02-01 Efficient Exploration for LLMs link Vikranth Dwaracherla, Seyed Mohammad Asghari,..., Benjamin Van Roy
20 2024-02-01 Getting the most out of your tokenizer for pre-training
and domain adaptation
link Gautier Dagan, Gabriel Synnaeve, Baptiste Roziere
20 2024-02-14 SLEB: Streamlining LLMs through Redundancy Verification and Elimination of
Transformer Blocks
link Jiwon Song, Kyungseok Oh,..., Jae-Joon Kim
20 2024-02-26 Disentangled 3D Scene Generation with Layout Learning link Dave Epstein, Ben Poole,..., Aleksander Holynski
20 2023-12-28 Non-Vacuous Generalization Bounds for Large Language Models link Sanae Lotfi, Marc Anton Finzi,..., Andrew Gordon Wilson
19 None DRCT: Diffusion Reconstruction Contrastive Training towards Universal Detection of
Diffusion Generated Images
link Baoying Chen, Jishen Zeng,..., Rui Yang
19 2024-07-08 Scaling Exponents Across Parameterizations and Optimizers link Katie E Everett, Lechao Xiao,..., Jeffrey Pennington
19 2024-02-23 Deep Networks Always Grok and Here is Why link Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk
19 2024-02-28 Characterizing Truthfulness in Large Language Model Generations with Local
Intrinsic Dimension
link Fan Yin, Jayanth Srinivasa, Kai-Wei Chang
19 2022-06-10 Accelerated Algorithms for Constrained Nonconvex-Nonconcave Min-Max Optimization and Comonotone
Inclusion
link Yang Cai, Argyris Oikonomou, Weiqiang Zheng
19 2024-04-30 Modeling Caption Diversity in Contrastive Vision-Language Pretraining link Samuel Lavoie, Polina Kirichenko,..., Nicolas Ballas
19 2023-08-14 Position: Key Claims in LLM Research Have a Long
Tail of Footnotes
link Anna Rogers, Sasha Luccioni
19 2024-05-16 IBD-PSC: Input-level Backdoor Detection via Parameter-oriented Scaling Consistency link Linshan Hou, Ruili Feng,..., Yiming Li
19 2024-01-23 TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic
Tasks
link Zhiruo Wang, Graham Neubig, Daniel Fried
19 2023-10-11 A Theory of Non-Linear Feature Learning with One Gradient
Step in Two-Layer Neural Networks
link Behrad Moniri, Donghwan Lee,..., Edgar Dobriban
19 2022-11-09 Few-Shot Character Understanding in Movies as an Assessment to
Meta-Learning of Theory-of-Mind
link Mo Yu, Qiujing Wang,..., Jie Zhou
19 2023-08-25 Learning to Intervene on Concept Bottlenecks link David Steinmann, Wolfgang Stammer,..., Kristian Kersting
19 2024-02-26 Neural Operators with Localized Integral and Differential Kernels link Miguel Liu-Schiaffini, Julius Berner,..., Anima Anandkumar
19 2023-02-07 Graph Generation with Diffusion Mixture link Jaehyeong Jo, Dongki Kim, Sung Ju Hwang
19 2024-04-04 BiSHop: Bi-Directional Cellular Learning for Tabular Data with Generalized
Sparse Modern Hopfield Model
link Chenwei Xu, Yu-Chao Huang,..., Han Liu
18 2024-02-02 Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics
on the Attention Landscape
link Juno Kim, Taiji Suzuki
18 2023-09-28 Discovering Environments with XRM link Mohammad Pezeshki, Diane Bouchacourt,..., David Lopez-Paz
18 2024-09-03 Interpreting and Improving Large Language Models in Arithmetic Calculation link Wei Zhang, Chaoqun Wan,..., Jieping Ye
18 2024-04-02 Test-Time Model Adaptation with Only Forward Passes link Shuaicheng Niu, Chunyan Miao,..., Peilin Zhao
18 2023-12-02 Second-Order Uncertainty Quantification: A Distance-Based Approach link Yusuf Sale, Viktor Bengs,..., Eyke Hüllermeier
18 2024-02-14 Copyright Traps for Large Language Models link Matthieu Meeus, Igor Shilov,..., Yves-Alexandre de Montjoye
18 2024-02-07 Causal Representation Learning from Multiple Distributions: A General Setting link Kun Zhang, Shaoan Xie,..., Yujia Zheng
18 2024-04-17 Learning with 3D rotations, a hitchhiker's guide to SO(3) link Andreas René Geist, Jonas Frey,..., Georg Martius
18 2024-02-27 Case-Based or Rule-Based: How Do Transformers Do the Math? link Yi Hu, Xiaojuan Tang,..., Muhan Zhang
18 2024-05-13 PARDEN, Can You Repeat That? Defending against Jailbreaks via
Repetition
link Ziyang Zhang, Qizhen Zhang, Jakob Nicolaus Foerster
18 2024-03-18 Larimar: Large Language Models with Episodic Memory Control link Payel Das, Subhajit Chaudhury,..., Pin-Yu Chen
18 2024-02-13 Hybrid Inverse Reinforcement Learning link Juntao Ren, Gokul Swamy,..., Sanjiban Choudhury
18 2023-10-02 Fool Your (Vision and) Language Model with Embarrassingly Simple
Permutations
link Yongshuo Zong, Tingyang Yu,..., Timothy Hospedales
18 2024-02-05 Distinguishing the Knowable from the Unknowable with Language Models link Gustaf Ahdritz, Tian Qin,..., Benjamin L. Edelman
18 2024-02-19 In value-based deep reinforcement learning, a pruned network is
a good network
link Johan Samir Obando Ceron, Aaron Courville, Pablo Samuel Castro
18 2024-04-18 MolCRAFT: Structure-Based Drug Design in Continuous Parameter Space link Yanru Qu, Keyue Qiu,..., Wei-Ying Ma
18 2024-01-25 Adaptive Text Watermark for Large Language Models link Yepeng Liu, Yuheng Bu
18 2024-03-16 SelfIE: Self-Interpretation of Large Language Model Embeddings link Haozhe Chen, Carl Vondrick, Chengzhi Mao
18 2023-10-11 Language Models as Semantic Indexers link Bowen Jin, Hansi Zeng,..., Xianfeng Tang
18 2024-01-28 An Information-Theoretic Analysis of In-Context Learning link Hong Jun Jeon, Jason D. Lee,..., Benjamin Van Roy
18 2024-01-24 Conformal Prediction Sets Improve Human Decision Making link Jesse C. Cresswell, Yi Sui,..., Noël Vouitsis
18 2024-02-23 Foundation Policies with Hilbert Representations link Seohong Park, Tobias Kreiman, Sergey Levine
18 2024-02-03 Image Fusion via Vision-Language Model link Zixiang Zhao, Lilun Deng,..., Luc Van Gool
17 2024-05-22 Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam
Generation
link Gauthier Guinet, Behrooz Omidvar-Tehrani,..., Laurent Callot
17 2024-01-22 APT: Adaptive Pruning and Tuning Pretrained Language Models for
Efficient Training and Inference
link Bowen Zhao, Hannaneh Hajishirzi, Qingqing Cao
17 2024-05-03 Auto-Encoding Morph-Tokens for Multimodal LLM link Kaihang Pan, Siliang Tang,..., Hanwang Zhang
17 2024-03-01 Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson
of Reinforcement Learning
link Michal Nauman, Michał Bortkiewicz,..., Marek Cygan
17 2024-10-10 Can Looped Transformers Learn to Implement Multi-step Gradient Descent
for In-context Learning?
link Khashayar Gatmiry, Nikunj Saunshi,..., Sanjiv Kumar
17 2024-02-21 Privacy-Preserving Instructions for Aligning Large Language Models link Da Yu, Peter Kairouz,..., Zheng Xu
17 2023-08-31 On the Implicit Bias of Adam link Matias D. Cattaneo, Jason Matthew Klusowski, Boris Shigida
17 2023-05-18 Emergent Representations of Program Semantics in Language Models Trained
on Programs
link Charles Jin, Martin Rinard
17 2024-02-21 Do Efficient Transformers Really Save Computation? link Kai Yang, Jan Ackermann,..., Liwei Wang
17 2024-02-07 CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay link Natasha Butt, Blazej Manczak,..., Taco Cohen
17 2024-02-21 From Self-Attention to Markov Models: Unveiling the Dynamics of
Generative Transformers
link Muhammed Emrullah Ildiz, Yixiao Huang,..., Samet Oymak
17 2024-06-03 A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization link Sebastian Sanokowski, Sepp Hochreiter, Sebastian Lehner
17 2024-05-14 Compositional Text-to-Image Generation with Dense Blob Representations link Weili Nie, Sifei Liu,..., Arash Vahdat
17 2024-04-22 A Multimodal Automated Interpretability Agent link Tamar Rott Shaham, Sarah Schwettmann,..., Antonio Torralba
17 2023-10-16 A Computational Framework for Solving Wasserstein Lagrangian Flows link Kirill Neklyudov, Rob Brekelmans,..., Alireza Makhzani
17 2024-02-25 Equivariant Frames and the Impossibility of Continuous Canonicalization link Nadav Dym, Hannah Lawrence, Jonathan W. Siegel
17 2024-01-20 Make-A-Shape: a Ten-Million-scale 3D Shape Model link Ka-Hei Hui, Aditya Sanghi,..., Chi-Wing Fu
17 2024-05-18 AquaLoRA: Toward White-box Protection for Customized Stable Diffusion Models
via Watermark LoRA
link Weitao Feng, Wenbo Zhou,..., Nenghai Yu
17 2024-05-03 PICLe: Eliciting Diverse Behaviors from Large Language Models with
Persona In-Context Learning
link Hyeong Kyu Choi, Yixuan Li
16 2023-02-23 EquiPocket: an E(3)-Equivariant Geometric Graph Neural Network for Ligand
Binding Site Prediction
link Yang Zhang, Zhewei Wei,..., Wenbing Huang
16 2024-02-16 Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs link Yeonhong Park, Jake Hyun,..., Jae W. Lee
16 2024-02-07 Asymptotics of feature learning in two-layer networks after one
gradient-step
link Hugo Cui, Luca Pesce,..., Bruno Loureiro
16 2024-02-14 Instruction Tuning for Secure Code Generation link Jingxuan He, Mark Vero,..., Martin Vechev
16 2024-10-29 Cell2Sentence: Teaching Large Language Models the Language of Biology link Daniel Levine, Syed A Rizvi,..., David van Dijk
16 2024-02-29 Smooth Tchebycheff Scalarization for Multi-Objective Optimization link Xi Lin, Xiaoyuan Zhang,..., Qingfu Zhang
16 2024-06-11 Beyond ELBOs: A Large-Scale Evaluation of Variational Methods for
Sampling
link Denis Blessing, Xiaogang Jia,..., Gerhard Neumann
16 2024-02-13 eCeLLM: Generalizing Large Language Models for E-commerce from Large-scale,
High-quality Instruction Data
link Bo Peng, Xinyi Ling,..., Xia Ning
16 2024-02-12 Active Preference Learning for Large Language Models link William Muldrew, Peter Hayes,..., David Barber
16 2024-03-20 Probabilistic Forecasting with Stochastic Interpolants and Föllmer Processes link Yifan Chen, Mark Goldstein,..., Eric Vanden-Eijnden
16 2024-03-04 CATS: Enhancing Multivariate Time Series Forecasting by Constructing Auxiliary
Time Series as Exogenous Variables
link Jiecheng Lu, Xu Han,..., Shihao Yang
16 2024-05-06 To Each (Textual Sequence) Its Own: Improving Memorized-Data Unlearning
in Large Language Models
link George-Octavian Bărbulescu, Peter Triantafillou
16 2023-07-21 Universality of Linear Recurrences Followed by Non-linear Projections: Finite-Width
Guarantees and Benefits of Complex Eigenvalues
link Antonio Orvieto, Soham De,..., Samuel L Smith
16 2024-02-06 MusicRL: Aligning Music Generation to Human Preferences link Geoffrey Cideron, Sertan Girgin,..., Andrea Agostinelli
16 2024-02-12 Benchmarking and Building Long-Context Retrieval Models with LoCo and
M2-BERT
link Jon Saad-Falcon, Daniel Y Fu,..., Christopher Re
16 2024-02-05 Graph-enhanced Large Language Models in Asynchronous Plan Reasoning link Fangru Lin, Emanuele La Malfa,..., Janet B. Pierrehumbert
16 2023-12-20 Learning and Forgetting Unsafe Examples in Large Language Models link Jiachen Zhao, Zhun Deng,..., Mengye Ren
16 2024-09-09 TERD: A Unified Framework for Safeguarding Diffusion Models Against
Backdoors
link Yichuan Mo, Hui Huang,..., Yisen Wang
16 2024-02-05 Position: What Can Large Language Models Tell Us about
Time Series Analysis
link Ming Jin, YiFan Zhang,..., Qingsong Wen
16 2023-11-15 ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy link Kirill Vishniakov, Zhiqiang Shen, Zhuang Liu
16 2023-05-27 Matrix Information Theory for Self-Supervised Learning link Yifan Zhang, Zhiquan Tan,..., Yang Yuan
15 2024-03-07 Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks link Linyuan Gong, Sida Wang,..., Alvin Cheung
15 2023-11-23 Scalable AI Safety via Doubly-Efficient Debate link Jonah Brown-Cohen, Geoffrey Irving, Georgios Piliouras
15 2024-08-18 Parameterized Physics-informed Neural Networks for Parameterized PDEs link Woojin Cho, Minju Jo,..., Noseong Park
15 2023-12-26 Generalization in Kernel Regression Under Realistic Assumptions link Daniel Barzilai, Ohad Shamir
15 2024-02-27 RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences link Jie Cheng, Gang Xiong,..., Fei-Yue Wang
15 2024-06-01 Slow and Steady Wins the Race: Maintaining Plasticity with
Hare and Tortoise Networks
link Hojoon Lee, Hyeonseo Cho,..., Clare Lyle
15 2023-07-14 Graph Positional and Structural Encoder link Semih Cantürk, Renming Liu,..., Ladislav Rampášek
15 2024-06-14 Towards Scalable and Versatile Weight Space Learning link Konstantin Schürholt, Michael W. Mahoney, Damian Borth
15 2024-06-02 Full-Atom Peptide Design based on Multi-modal Flow Matching link Jiahan Li, Chaoran Cheng,..., Jianzhu Ma
15 None UniAudio: Towards Universal Audio Generation with Large Language Models link Dongchao Yang, Jinchuan Tian,..., Helen M. Meng
15 2023-09-08 Graph Neural Networks Use Graphs When They Shouldn't link Maya Bechler-Speicher, Ido Amos,..., Amir Globerson
15 2024-06-03 GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer link Ding Jia, Jianyuan Guo,..., Xinghao Chen
15 2024-05-20 Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using
Spatio-Temporal Slices
link Nathaniel Cohen, Vladimir Kulikov,..., Tomer Michaeli
14 2024-07-11 Position: Measure Dataset Diversity, Don't Just Claim It link Dora Zhao, Jerone Andrews,..., Alice Xiang
14 2024-02-04 Revisiting the Power of Prompt for Visual Tuning link Yuzhu Wang, Lechao Cheng,..., Meng Wang
14 2024-04-17 Decomposing and Editing Predictions by Modeling Model Computation link Harshay Shah, Andrew Ilyas, Aleksander Madry
14 2024-05-18 On the Trajectory Regularity of ODE-based Diffusion Sampling link Defang Chen, Zhenyu Zhou,..., Siwei Lyu
14 2024-04-11 Lyapunov-stable Neural Control for State and Output Feedback: A
Novel Formulation
link Lujie Yang, Hongkai Dai,..., Huan Zhang
14 2023-11-16 Structured Chemistry Reasoning with Large Language Models link Siru Ouyang, Zhuosheng Zhang,..., Lianhui Qin
14 2024-02-27 Automated Statistical Model Discovery with Language Models link Michael Y. Li, Emily Fox, Noah Goodman
14 2024-03-12 BAGEL: Bootstrapping Agents by Guiding Exploration with Language link Shikhar Murty, Christopher D Manning,..., Kenton Lee
14 2024-02-02 MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning
in Smaller Language Models
link Justin Chen, Swarnadeep Saha,..., Mohit Bansal
14 2024-02-23 How Do Nonlinear Transformers Learn and Generalize in In-Context
Learning?
link Hongkang Li, Meng Wang,..., Pin-Yu Chen
14 2024-02-04 Riemannian Preconditioned LoRA for Fine-Tuning Foundation Models link Fangzhao Zhang, Mert Pilanci
14 2023-06-05 Seizing Serendipity: Exploiting the Value of Past Success in
Off-Policy Actor-Critic
link Tianying Ji, Yu Luo,..., Huazhe Xu
14 2024-02-02 Simulation of Graph Algorithms with Looped Transformers link Artur Back de Luca, Kimon Fountoulakis
14 None Diffusion Models Encode the Intrinsic Dimension of Data Manifolds link Jan Pawel Stanczuk, Georgios Batzolis,..., Carola-Bibiane Schönlieb
14 2024-03-30 Privacy Backdoors: Stealing Data with Corrupted Pretrained Models link Shanglun Feng, Florian Tramèr
14 2024-02-23 Human vs. Generative AI in Content Creation Competition: Symbiosis
or Conflict?
link Fan Yao, Chuanhao Li,..., Haifeng Xu
14 2023-12-20 In-Context Reinforcement Learning for Variable Action Spaces link Viacheslav Sinii, Alexander Nikulin,..., Sergey Kolesnikov
14 2023-12-18 The Good, The Bad, and Why: Unveiling Emotions in
Generative AI
link Cheng Li, Jindong Wang,..., Xing Xie
14 2024-02-02 BAT: Learning to Reason about Spatial Sounds with Large
Language Models
link Zhisheng Zheng, Puyuan Peng,..., David Harwath
14 2023-10-11 LLark: A Multimodal Instruction-Following Language Model for Music link Joshua P Gardner, Simon Durand,..., Rachel M Bittner
14 2024-02-10 Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF link Han Shen, Zhuoran Yang, Tianyi Chen
14 2023-12-08 Membership Inference Attacks on Diffusion Models via Quantile Regression link Shuai Tang, Steven Wu,..., Aaron Roth
14 2024-02-28 Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for
Large Language Models
link Mingjia Huo, Sai Ashish Somayajula,..., Pengtao Xie
14 2023-11-15 Converting Transformers to Polynomial Form for Secure Inference Over
Homomorphic Encryption
link Itamar Zimerman, Moran Baruch,..., Lior Wolf
14 2023-12-06 Interpretability Illusions in the Generalization of Simplified Models link Dan Friedman, Andrew Kyle Lampinen,..., Asma Ghandeharioun
13 2024-02-11 How do Large Language Models Navigate Conflicts between Honesty
and Helpfulness?
link Ryan Liu, Theodore Sumers,..., Thomas L. Griffiths
13 2024-02-19 LoRA Training in the NTK Regime has No Spurious
Local Minima
link Uijeong Jang, Jason D. Lee, Ernest K. Ryu
13 2023-10-03 High-Probability Convergence for Composite and Distributed Stochastic Minimization and
Variational Inequalities with Heavy-Tailed Noise
link Eduard Gorbunov, Abdurakhmon Sadiev,..., Peter Richtárik
13 2024-02-05 Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation
Problem
link Maciej Wolczyk, Bartłomiej Cupiał,..., Piotr Miłoś
13 2024-02-26 CARTE: Pretraining and Transfer for Tabular Learning link Myung Jun Kim, Leo Grinsztajn, Gael Varoquaux
13 2023-06-02 Generalist Equivariant Transformer Towards 3D Molecular Interaction Learning link Xiangzhe Kong, Wenbing Huang, Yang Liu
13 2024-01-29 ReGAL: Refactoring Programs to Discover Generalizable Abstractions link Elias Stengel-Eskin, Archiki Prasad, Mohit Bansal
13 2024-02-05 On Least Square Estimation in Softmax Gating Mixture of
Experts
link Huy Nguyen, Nhat Ho, Alessandro Rinaldo
13 2024-02-05 Light and Optimal Schrödinger Bridge Matching link Nikita Gushchin, Sergei Kholkin,..., Alexander Korotin
13 2024-06-10 A Statistical Theory of Regularization-Based Continual Learning link Xuyang Zhao, Huiyuan Wang,..., Wei Lin
13 2024-04-14 Adversarial Robustness Limits via Scaling-Law and Human-Alignment Studies link Brian R. Bartoldson, James Diffenderfer,..., Bhavya Kailkhura
13 2024-05-02 On Mechanistic Knowledge Localization in Text-to-Image Generative Models link Samyadeep Basu, Keivan Rezaei,..., Soheil Feizi
13 2024-06-02 Is In-Context Learning in Large Language Models Bayesian? A
Martingale Perspective
link Fabian Falck, Ziyu Wang, Christopher C. Holmes
13 2024-04-16 Fewer Truncations Improve Language Modeling link Hantian Ding, Zijian Wang,..., Stefano Soatto
13 2024-03-13 A Sparsity Principle for Partially Observable Causal Representation Learning link Danru Xu, Dingling Yao,..., Sara Magliacane
13 2024-02-15 Representation Surgery: Theory and Practice of Affine Steering link Shashwat Singh, Shauli Ravfogel,..., Ponnurangam Kumaraguru
13 2024-05-14 Reinformer: Max-Return Sequence Modeling for Offline RL link Zifeng Zhuang, Dengyun Peng,..., Donglin Wang
13 2024-01-31 Do Language Models Exhibit the Same Cognitive Biases in
Problem Solving as Human Learners?
link Andreas Opedal, Alessandro Stolfo,..., Mrinmaya Sachan
13 2023-06-07 Catapults in SGD: spikes in the training loss and
their impact on generalization through feature learning
link Libin Zhu, Chaoyue Liu,..., Mikhail Belkin
13 2024-05-28 MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance link Yake Wei, Di Hu
13 2024-05-30 Proteus: Exploring Protein Structure Generation for Enhanced Designability and
Efficiency
link Chentong Wang, Yannan Qu,..., Longxing Cao
13 2023-11-24 StableSSM: Alleviating the Curse of Memory in State-space Models
through Stable Reparameterization
link Shida Wang, Qianxiao Li
13 2024-02-20 Revitalizing Multivariate Time Series Forecasting: Learnable Decomposition with Inter-Series
Dependencies and Intra-Series Variations Modeling
link Guoqi Yu, Jing Zou,..., Shujun Wang
13 2023-09-29 Information Flow in Self-Supervised Learning link Zhiquan Tan, Jingqin Yang,..., Yifan Zhang
13 2024-02-11 Self-Correcting Self-Consuming Loops for Generative Model Training link Nate Gillman, Michael Freeman,..., Chen Sun
13 2023-12-16 Image Restoration Through Generalized Ornstein-Uhlenbeck Bridge link Conghan Yue, Zhengwei Peng,..., Dongyu Zhang
12 2024-06-06 Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation link Can Yaras, Peng Wang,..., Qing Qu
12 2024-03-26 How Private are DP-SGD Implementations? link Lynn Chua, Badih Ghazi,..., Chiyuan Zhang
12 2023-06-15 ViP: A Differentially Private Foundation Model for Computer Vision link Yaodong Yu, Maziar Sanjabi,..., Chuan Guo
12 2024-02-16 Stochastic Localization via Iterative Posterior Sampling link Louis Grenioux, Maxence Noble,..., Alain Oliviero Durmus
12 2023-06-06 Designing Decision Support Systems using Counterfactual Prediction Sets link Eleni Straitouri, Manuel Gomez Rodriguez
12 2024-06-20 Understanding Finetuning for Factual Knowledge Extraction link Gaurav Rohit Ghosal, Tatsunori Hashimoto, Aditi Raghunathan
12 2024-02-22 Clifford-Steerable Convolutional Neural Networks link Maksim Zhdanov, David Ruhe,..., Patrick Forré
12 2024-04-02 What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks link Xingwu Chen, Difan Zou
12 2024-02-05 Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective link Wu Lin, Felix Dangel,..., Alireza Makhzani
12 2024-01-08 Sampling in Unit Time with Kernel Fisher-Rao Flow link Aimee Maurais, Youssef Marzouk
12 2024-02-15 Physics-Informed Neural Network Policy Iteration: Algorithms, Convergence, and Verification link Yiming Meng, Ruikun Zhou,..., Jun Liu
12 2024-05-08 VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context link Yunxin Li, Baotian Hu,..., Min Zhang
12 2024-05-12 Learning Reward for Robot Skills Using Large Language Models via Self-Alignment link Yuwei Zeng, Yao Mu, Lin Shao
12 2024-06-11 Transformers Provably Learn Sparse Token Selection While Fully-Connected Nets Cannot link Zixuan Wang, Stanley Wei,..., Jason D. Lee
12 2023-07-03 Trainable Transformer in Transformer link Abhishek Panigrahi, Sadhika Malladi,..., Sanjeev Arora
12 2024-06-03 Do Large Language Models Perform the Way People Expect? Measuring the Human Generalization Function link Keyon Vafa, Ashesh Rambachan, Sendhil Mullainathan
12 2024-06-28 Multimodal Prototyping for cancer survival prediction link Andrew H. Song, Richard J. Chen,..., Faisal Mahmood
12 2024-07-03 DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents link Yilun Xu, Gabriele Corso,..., Karsten Kreis
12 2024-02-06 Improved Generalization of Weight Space Networks via Augmentations link Aviv Shamsian, Aviv Navon,..., Haggai Maron
12 2022-12-21 Not Just Pretty Pictures: Toward Interventional Data Augmentation Using Text-to-Image Generators link Jianhao Yuan, Francesco Pinto,..., Philip Torr
12 2023-11-02 Gaussian Processes on Cellular Complexes link Mathieu Alain, So Takao,..., Marc Peter Deisenroth
12 2023-10-18 A connection between Tempering and Entropic Mirror Descent link Nicolas Chopin, Francesca Crucinio, Anna Korba
12 2023-12-15 Fast Decision Boundary based Out-of-Distribution Detector link Litian Liu, Yao Qin
12 2024-06-05 Graph Neural Network Explanations are Fragile link Jiate Li, Meng Pang,..., Binghui Wang
12 2024-03-05 PPFLOW: Target-Aware Peptide Design with Torsional Flow Matching link Haitao Lin, Odin Zhang,..., Stan Z. Li
12 2024-05-08 The Entropy Enigma: Success and Failure of Entropy Minimization link Ori Press, Ravid Shwartz-Ziv,..., Matthias Bethge
12 2024-06-02 FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning link Yuwei Fu, Haichao Zhang,..., Benoit Boulet
12 2024-05-30 SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation link Junjie Zhang, Chenjia Bai,..., Xuelong Li
12 2024-02-15 Accelerating Parallel Sampling of Diffusion Models link Zhiwei Tang, Jiasheng Tang,..., Tsung-Hui Chang
12 2024-06-10 Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset link Shijie Lian, Ziyi Zhang,..., Runmin Cong
12 2024-05-28 SleepFM: Multi-modal Representation Learning for Sleep Across Brain Activity, ECG and Respiratory Signals link Rahul Thapa, Bryan He,..., James Zou
12 2024-02-11 More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning link Kaiwen Wang, Owen Oertell,..., Wen Sun
12 2024-02-06 DySLIM: Dynamics Stable Learning by Invariant Measure for Chaotic Systems link Yair Schiff, Zhong Yi Wan,..., Leonardo Zepeda-Núñez
12 2023-09-29 Latent Space Symmetry Discovery link Jianke Yang, Nima Dehmamy,..., Rose Yu
12 2024-05-27 Q-value Regularized Transformer for Offline Reinforcement Learning link Shengchao Hu, Ziqing Fan,..., Dacheng Tao
12 2024-04-06 Multicalibration for Confidence Scoring in LLMs link Gianluca Detommaso, Martin Bertran Lopez,..., Aaron Roth
11 2024-02-22 ACE: Off-Policy Actor-Critic with Causality-Aware Entropy Regularization link Tianying Ji, Yongyuan Liang,..., Huazhe Xu
11 2024-02-08 Offline Actor-Critic Reinforcement Learning Scales to Large Models link Jost Tobias Springenberg, Abbas Abdolmaleki,..., Martin Riedmiller
11 2024-05-20 LSEnet: Lorentz Structural Entropy Neural Network for Deep Graph Clustering link Li Sun, Zhenhao Huang,..., Philip S. Yu
11 2023-10-19 Improved Operator Learning by Orthogonal Attention link Zipeng Xiao, Zhongkai Hao,..., Hang Su
11 2024-03-19 End-to-End Neuro-Symbolic Reinforcement Learning with Textual Explanations link Lirui Luo, Guoxi Zhang,..., Qing Li
11 2024-05-29 Locally Estimated Global Perturbations are Better than Local Perturbations for Federated Sharpness-aware Minimization link Ziqing Fan, Shengchao Hu,..., Yanfeng Wang
11 2024-05-28 FlashST: A Simple and Universal Prompt-Tuning Framework for Traffic Prediction link Zhonghang Li, Lianghao Xia,..., Chao Huang
11 2024-02-16 RLVF: Learning from Verbal Feedback without Overgeneralization link Moritz Pascal Stephan, Alexander Khazatsky,..., Chelsea Finn
11 2022-12-08 A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization link Ashwinee Panda, Xinyu Tang,..., Prateek Mittal
11 2024-01-05 AST-T5: Structure-Aware Pretraining for Code Generation and Understanding link Linyuan Gong, Mostafa Elhoushi, Alvin Cheung
11 2024-02-14 Is Epistemic Uncertainty Faithfully Represented by Evidential Deep Learning Methods? link Mira Juergens, Nis Meinert,..., Willem Waegeman
11 2024-01-17 Understanding Heterophily for Graph Neural Networks link Junfu Wang, Yuanfang Guo,..., Yunhong Wang
11 2024-04-24 Unifying Bayesian Flow Networks and Diffusion Models through Stochastic Differential Equations link Kaiwen Xue, Yuhao Zhou,..., Chongxuan Li
11 2024-02-02 InferCept: Efficient Intercept Support for Augmented Large Language Model Inference link Reyna Abhyankar, Zijian He,..., Yiying Zhang
11 2024-03-03 Critical windows: non-asymptotic theory for feature emergence in diffusion models link Marvin Li, Sitan Chen
11 2023-05-26 Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks link Atli Kosson, Bettina Messmer, Martin Jaggi
11 2024-02-23 Explorations of Self-Repair in Language Models link Cody Rushing, Neel Nanda
11 2024-02-13 Homomorphism Counts for Graph Neural Networks: All About That Basis link Emily Jin, Michael M. Bronstein,..., Matthias Lanzinger
11 2024-06-10 Compute Better Spent: Replacing Dense Layers with Structured Matrices link Shikai Qiu, Andres Potapczynski,..., Andrew Gordon Wilson
11 2024-02-01 Towards Efficient Exact Optimization of Language Model Alignment link Haozhe Ji, Cheng Lu,..., Minlie Huang
11 2024-02-16 Language Models as Science Tutors link Alexis Chevalier, Jiayi Geng,..., Danqi Chen
11 2024-04-11 Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models link Tanmay Gautam, Youngsuk Park,..., Wooseok Ha
11 2023-10-14 DPZero: Private Fine-Tuning of Language Models without Backpropagation link Liang Zhang, Bingcong Li,..., Niao He
11 2024-03-04 Reward Model Learning vs. Direct Policy Optimization: A Comparative Analysis of Learning from Human Preferences link Andi Nika, Debmalya Mandal,..., Adish Singla
11 2024-02-08 Time Series Diffusion in the Frequency Domain link Jonathan Crabbé, Nicolas Huynh,..., Mihaela van der Schaar
11 2024-02-05 Understanding Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation link Xinyi Wang, Alfonso Amayuelas,..., William Yang Wang
11 2024-05-11 Non-confusing Generation of Customized Concepts in Diffusion Models link Wang Lin, Jingyuan Chen,..., Hanwang Zhang
11 2024-01-24 ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models link Rohan Wadhawan, Hritik Bansal,..., Nanyun Peng
11 2024-02-01 Transforming and Combining Rewards for Aligning Large Language Models link Zihao Wang, Chirag Nagpal,..., Victor Veitch
11 2024-01-26 Residual Quantization with Implicit Neural Codebooks link Iris A.M. Huijben, Matthijs Douze,..., Jakob Verbeek
11 2024-02-02 Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise link Kwangjun Ahn, Zhiyu Zhang,..., Yan Dai
11 2024-02-22 Prompting a Pretrained Transformer Can Be a Universal Approximator link Aleksandar Petrov, Philip Torr, Adel Bibi
11 2024-01-04 Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained ReLU Features Model link Hien Dang, Tho Tran Huu,..., Nhat Ho
11 2023-03-15 Borda Regret Minimization for Generalized Linear Dueling Bandits link Yue Wu, Tao Jin,..., Quanquan Gu
11 2023-05-30 Plug-in Performative Optimization link Licong Lin, Tijana Zrnic
11 2024-03-27 Understanding the Learning Dynamics of Alignment with Human Feedback link Shawn Im, Yixuan Li
11 2023-11-28 Beyond Sole Strength: Customized Ensembles for Generalized Vision-Language Models link Zhihe Lu, Jiawang Bai,..., Xinchao Wang
11 2024-02-06 Neural Networks Learn Statistics of Increasing Complexity link Nora Belrose, Quintin Pope,..., Xiaoli Fern
10 2024-03-05 Active Statistical Inference link Tijana Zrnic, Emmanuel Candes
10 2023-10-18 Image Clustering with External Guidance link Yunfan Li, Peng Hu,..., Xi Peng
10 2023-07-13 Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks link Liam Collins, Hamed Hassani,..., Sanjay Shakkottai
10 2024-06-11 Flextron: Many-in-One Flexible Large Language Model link Ruisi Cai, Saurav Muralidharan,..., Pavlo Molchanov
10 None Relaxing the Accurate Imputation Assumption in Doubly Robust Learning for Debiased Collaborative Filtering link Haoxuan Li, Chunyuan Zheng,..., Xiao-Hua Zhou
10 2024-03-28 Regression with Multi-Expert Deferral link Anqi Mao, Mehryar Mohri, Yutao Zhong
10 2024-06-11 Failures Are Fated, But Can Be Faded: Characterizing and Mitigating Unwanted Behaviors in Large-Scale Vision and Language Models link Som Sagar, Aditya Taparia, Ransalu Senanayake
10 2024-02-27 Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings link Kevin Frans, Seohong Park,..., Sergey Levine
10 2023-11-01 Robust and Conjugate Gaussian Process Regression link Matias Altamirano, Francois-Xavier Briol, Jeremias Knoblauch
10 2024-02-07 CLIF: Complementary Leaky Integrate-and-Fire Neuron for Spiking Neural Networks link Yulong Huang, Xiaopeng Lin,..., Bojun Cheng
10 2023-07-11 Memorization Through the Lens of Curvature of Loss Function Around Samples link Isha Garg, Deepak Ravikumar, Kaushik Roy
10 2024-03-01 EfficientZero V2: Mastering Discrete and Continuous Control with Limited Data link Shengjie Wang, Shaohuai Liu,..., Yang Gao
10 None Learning Adaptive and View-Invariant Vision Transformer for Real-Time UAV Tracking link Yongxin Li, Mengyuan Liu,..., Shuiwang Li
10 2024-02-12 Weisfeiler-Leman at the margin: When more expressivity matters link Billy Joe Franks, Christopher Morris,..., Floris Geerts
10 2024-06-05 SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN link Kang You, Zekai Xu,..., Zhezhi He
10 None Purifying Quantization-conditioned Backdoors via Layer-wise Activation Correction with Distribution Approximation link Boheng Li, Yishuo Cai,..., Tianwei Zhang
10 2023-11-20 HexGen: Generative Inference of Large Language Model over Heterogeneous Environment link Youhe Jiang, Ran Yan,..., Binhang Yuan
10 2023-12-05 UPOCR: Towards Unified Pixel-Level OCR Interface link Dezhi Peng, Zhenhua Yang,..., Lianwen Jin
10 2024-05-15 ReconBoost: Boosting Can Achieve Modality Reconcilement link Cong Hua, Qianqian Xu,..., Qingming Huang
10 2024-05-20 Erasing the Bias: Fine-Tuning Foundation Models for Semi-Supervised Learning link Kai Gan, Tong Wei
10 2024-01-05 Graph2Tac: Online Representation Learning of Formal Math Concepts link Lasse Blaauwbroek, Mirek Olšák,..., Vasily Pestun
10 None Distribution Alignment Optimization through Neural Collapse for Long-tailed Classification link Jintong Gao, He Zhao,..., Hongyuan Zha
10 2023-10-22 A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts link Huy Nguyen, Pedram Akbarian,..., Nhat Ho
10 2024-06-04 What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding link Hongkang Li, Meng Wang,..., Pin-Yu Chen
10 2024-05-02 Uncertainty for Active Learning on Graphs link Dominik Fuchsgruber, Tom Wollschläger,..., Stephan Günnemann
10 2024-02-22 Comparing Graph Transformers via Positional Encodings link Mitchell Black, Zhengchao Wan,..., Yusu Wang
10 2024-02-28 Out-of-Domain Generalization in Dynamical Systems Reconstruction link Niclas Alexander Göring, Florian Hess,..., Daniel Durstewitz
10 2024-02-15 Exploration-Driven Policy Optimization in RLHF: Theoretical Insights on Efficient Data Utilization link Yihan Du, Anna Winnicki,..., R. Srikant
10 2024-01-25 Is Temperature Sample Efficient for Softmax Gaussian Mixture of Experts? link Huy Nguyen, Pedram Akbarian, Nhat Ho
10 2023-01-27 Single-Trajectory Distributionally Robust Reinforcement Learning link Zhipeng Liang, Xiaoteng Ma,..., Zhengyuan Zhou
10 None Equivariant Diffusion for Crystal Structure Prediction link Peijia Lin, Pin Chen,..., Yutong Lu
10 2023-04-16 An Empirical Study of Realized GNN Expressiveness link Yanbo Wang, Muhan Zhang
10 2024-04-12 On the Independence Assumption in Neurosymbolic Learning link Emile van Krieken, Pasquale Minervini,..., Antonio Vergari
10 2023-10-13 Split-and-Denoise: Protect large language model inference with local differential privacy link Peihua Mai, Ran Yan,..., Yan Pang
10 2024-02-23 Fair Resource Allocation in Multi-Task Learning link Hao Ban, Kaiyi Ji
10 2024-02-07 Multi-Sender Persuasion: A Computational Perspective link Safwan Hossain, Tonghan Wang,..., Haifeng Xu
10 2024-01-21 Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback link Songyang Gao, Qiming Ge,..., Dahua Lin
10 2024-06-02 Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection link Chentao Cao, Zhun Zhong,..., Bo Han