Last updated: 2025-04-16 04:08:34. Maintained by Weisen Jiang.

citation publish date title (pdf) review authors
922 2024-03-05 Scaling Rectified Flow Transformers for High-Resolution Image Synthesis link Patrick Esser, Sumith Kulal,..., Robin Rombach
634 2024-01-17 Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model link Lianghui Zhu, Bencheng Liao,..., Xinggang Wang
569 2023-08-04 MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities link Weihao Yu, Zhengyuan Yang,..., Lijuan Wang
549 2023-05-23 Improving Factuality and Reasoning in Language Models through Multiagent Debate link Yilun Du, Shuang Li,..., Igor Mordatch
417 2023-09-11 NExT-GPT: Any-to-Any Multimodal LLM link Shengqiong Wu, Hao Fei,..., Tat-Seng Chua
394 2024-03-07 Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference link Wei-Lin Chiang, Lianmin Zheng,..., Ion Stoica
364 2024-05-31 Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality link Tri Dao, Albert Gu
339 2023-10-02 ULTRAFEEDBACK: Boosting Language Models with Scaled AI Feedback link Ganqu Cui, Lifan Yuan,..., Maosong Sun
301 2024-02-14 DoRA: Weight-Decomposed Low-Rank Adaptation link Shih-yang Liu, Chien-Yi Wang,..., Min-Hung Chen
301 2023-09-01 RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback link Harrison Lee, Samrat Phatale,..., Sushant Prakash
283 2024-02-06 HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal link Mantas Mazeika, Long Phan,..., Dan Hendrycks
273 2024-01-18 Self-Rewarding Language Models link Weizhe Yuan, Richard Yuanzhe Pang,..., Jason E Weston
245 2024-01-02 Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models link Zixiang Chen, Yihe Deng,..., Quanquan Gu
237 2023-12-14 Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision link Collin Burns, Pavel Izmailov,..., Jeffrey Wu
234 2023-05-22 How Language Model Hallucinations Can Snowball link Muru Zhang, Ofir Press,..., Noah A. Smith
226 2024-01-19 Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads link Tianle Cai, Yuhong Li,..., Tri Dao
217 2023-12-21 VideoPoet: A Large Language Model for Zero-Shot Video Generation link Dan Kondratyuk, Lijun Yu,..., Lu Jiang
183 2024-01-03 GPT-4V(ision) is a Generalist Web Agent, if Grounded link Boyuan Zheng, Boyu Gou,..., Yu Su
176 2024-01-16 Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation link Haoran Xu, Amr Sharaf,..., Young Jin Kim
165 2023-10-06 Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models link Andy Zhou, Kai Yan,..., Yu-Xiong Wang
164 2023-10-14 A decoder-only foundation model for time-series forecasting link Abhimanyu Das, Weihao Kong,..., Yichen Zhou
163 2023-09-28 Promptbreeder: Self-Referential Self-Improvement via Prompt Evolution link Chrisantha Fernando, Dylan Sunil Banarse,..., Tim Rocktäschel
158 2024-03-06 GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection link Jiawei Zhao, Zhenyu Zhang,..., Yuandong Tian
156 2024-02-06 LESS: Selecting Influential Data for Targeted Instruction Tuning link Mengzhou Xia, Sadhika Malladi,..., Danqi Chen
149 2023-06-13 SqueezeLLM: Dense-and-Sparse Quantization link Sehoon Kim, Coleman Richard Charles Hooper,..., Kurt Keutzer
141 2024-02-05 KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache link Zirui Liu, Jiayi Yuan,..., Xia Hu
141 2023-12-18 Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint link Wei Xiong, Hanze Dong,..., Tong Zhang
140 2024-02-04 Unified Training of Universal Time Series Forecasting Transformers link Gerald Woo, Chenghao Liu,..., Doyen Sahoo
136 2023-09-29 AlphaZero-Like Tree-Search can Guide Large Language Model Decoding and Training link Ziyu Wan, Xidong Feng,..., Jun Wang
132 2023-04-19 Fundamental Limitations of Alignment in Large Language Models link Yotam Wolf, Noam Wies,..., Amnon Shashua
126 2024-03-05 NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models link Zeqian Ju, Yuancheng Wang,..., sheng zhao
126 2023-11-07 The Linear Representation Hypothesis and the Geometry of Large Language Models link Kiho Park, Yo Joong Choe, Victor Veitch
124 2023-12-11 Gated Linear Attention Transformers with Hardware-Efficient Training link Songlin Yang, Bailin Wang,..., Yoon Kim
123 2024-02-23 Genie: Generative Interactive Environments link Jake Bruce, Michael D Dennis,..., Tim Rocktäschel
123 2024-02-19 LoRA+: Efficient Low Rank Adaptation of Large Models link Soufiane Hayou, Nikhil Ghosh, Bin Yu
122 2024-02-03 Break the Sequential Dependency of LLM Inference Using Lookahead Decoding link Yichao Fu, Peter Bailis,..., Hao Zhang
122 2024-02-21 LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens link Yiran Ding, Li Lyna Zhang,..., Mao Yang
121 2023-11-18 An Embodied Generalist Agent in 3D World link Jiangyong Huang, Silong Yong,..., Siyuan Huang
117 2024-04-16 Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study link Shusheng Xu, Wei Fu,..., Yi Wu
117 2023-12-01 Nash Learning from Human Feedback link Remi Munos, Michal Valko,..., Bilal Piot
113 2023-09-25 Physics of Language Models: Part 3.1, Knowledge Storage and Extraction link Zeyuan Allen-Zhu, Yuanzhi Li
111 2024-02-02 TravelPlanner: A Benchmark for Real-World Planning with Language Agents link Jian Xie, Kai Zhang,..., Yu Su
110 2024-02-01 Executable Code Actions Elicit Better LLM Agents link Xingyao Wang, Yangyi Chen,..., Heng Ji
110 2024-02-15 Data Engineering for Scaling Language Models to 128K Context link Yao Fu, Rameswar Panda,..., Hao Peng
107 2024-01-26 EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty link Yuhui Li, Fangyun Wei,..., Hongyang Zhang
105 2024-01-22 Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs link Ling Yang, Zhaochen Yu,..., Bin CUI
101 2024-02-08 SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models link Dongyang Liu, Renrui Zhang,..., Peng Gao
97 2024-02-06 MOMENT: A Family of Open Time-series Foundation Models link Mononito Goswami, Konrad Szafer,..., Artur Dubrawski
95 2024-02-07 Fast Timing-Conditioned Latent Audio Diffusion link Zach Evans, CJ Carr,..., Jordi Pons
95 2024-04-22 Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data link Fahim Tajwar, Anikait Singh,..., Aviral Kumar
94 2023-10-11 In-Context Unlearning: Language Models as Few-Shot Unlearners link Martin Pawelczyk, Seth Neel, Himabindu Lakkaraju
92 2024-01-03 A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity link Andrew Lee, Xiaoyan Bai,..., Rada Mihalcea
92 2024-01-02 LLM Maybe LongLM: SelfExtend LLM Context Window Without Tuning link Hongye Jin, Xiaotian Han,..., Xia Hu
91 2024-01-08 A Minimaximalist Approach to Reinforcement Learning from Human Feedback link Gokul Swamy, Christoph Dann,..., Alekh Agarwal
90 2024-02-12 Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models link Siddharth Karamcheti, Suraj Nair,..., Dorsa Sadigh
87 2023-11-02 RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation link Yufei Wang, Zhou Xian,..., Chuang Gan
85 2024-01-22 WARM: On the Benefits of Weight Averaged Reward Models link Alexandre Rame, Nino Vieillard,..., Johan Ferret
83 2024-02-06 QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks link Albert Tseng, Jerry Chee,..., Christopher De Sa
83 2024-02-07 Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design link Andrew Campbell, Jason Yim,..., Tommi Jaakkola
82 2024-02-09 Debating with More Persuasive LLMs Leads to More Truthful Answers link Akbir Khan, John Hughes,..., Ethan Perez
80 2024-02-12 PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs link Soroush Nasiriany, Fei Xia,..., brian ichter
79 2024-02-07 AlphaFold Meets Flow Matching for Generating Protein Ensembles link Bowen Jing, Bonnie Berger, Tommi Jaakkola
79 2023-11-11 In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering link Sheng Liu, Haotian Ye,..., James Y. Zou
79 2023-12-04 Magicoder: Empowering Code Generation with OSS-Instruct link Yuxiang Wei, Zhe Wang,..., LINGMING ZHANG
79 2024-01-11 Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models link Asma Ghandeharioun, Avi Caciularu,..., Mor Geva
78 2024-04-30 Better & Faster Large Language Models via Multi-token Prediction link Fabian Gloeckle, Badr Youbi Idrissi,..., Gabriel Synnaeve
78 2024-01-11 Extreme Compression of Large Language Models via Additive Quantization link Vage Egiazarian, Andrei Panferov,..., Dan Alistarh
78 2024-02-07 Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications link Boyi Wei, Kaixuan Huang,..., Peter Henderson
78 2024-01-29 OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models link Fuzhao Xue, Zian Zheng,..., Yang You
77 2024-02-08 Generalized Preference Optimization: A Unified Approach to Offline Alignment link Yunhao Tang, Zhaohan Daniel Guo,..., Bilal Piot
77 2024-04-24 MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI link Kaining Ying, Fanqing Meng,..., Wenqi Shao
76 2023-09-01 Image Hijacks: Adversarial Images can Control Generative Models at Runtime link Luke Bailey, Euan Ong,..., Scott Emmons
76 2024-01-05 CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution link Alex Gu, Baptiste Roziere,..., Sida Wang
75 2024-02-07 MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark link Dongping Chen, Ruoxi Chen,..., Lichao Sun
73 2024-02-01 Repeat After Me: Transformers are Better than State Space Models at Copying link Samy Jelassi, David Brandfonbrener,..., eran malach
72 2023-07-20 SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models link Xiaoxuan Wang, Ziniu Hu,..., Wei Wang
71 2024-03-05 Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling link Yair Schiff, Chia Hsiang Kao,..., Volodymyr Kuleshov
71 2024-01-22 Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text link Abhimanyu Hans, Avi Schwarzschild,..., Tom Goldstein
70 2023-10-29 Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game link Zelai Xu, Chao Yu,..., Yi Wu
69 2023-10-08 Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity link Lu Yin, You Wu,..., Shiwei Liu
67 2024-06-16 QUEST: Query-Aware Sparsity for Efficient Long-Context LLM Inference link Jiaming Tang, Yilong Zhao,..., Song Han
66 2024-03-11 Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews link Weixin Liang, Zachary Izzo,..., James Y. Zou
66 2024-02-02 Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities link Zhifeng Kong, Arushi Goel,..., Bryan Catanzaro
66 2024-03-14 3D-VLA: A 3D Vision-Language-Action Generative World Model link Haoyu Zhen, Xiaowen Qiu,..., Chuang Gan
65 2023-12-07 Chain of Code: Reasoning with a Language Model-Augmented Code Emulator link Chengshu Li, Jacky Liang,..., brian ichter
64 2024-03-11 Stealing part of a production language model link Nicholas Carlini, Daniel Paleka,..., Florian Tramèr
62 2023-09-12 Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models by Finding Problematic Prompts link Zhi-Yi Chin, Chieh Ming Jiang,..., Wei-Chen Chiu
62 2024-02-10 A Tale of Tails: Model Collapse as a Change of Scaling Laws link Elvis Dohmatob, Yunzhen Feng,..., Julia Kempe
62 2023-10-25 Controlled Decoding from Language Models link Sidharth Mudgal, Jong Lee,..., Ahmad Beirami
62 2024-02-22 MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases link Zechun Liu, Changsheng Zhao,..., Vikas Chandra
61 2024-02-22 tinyBenchmarks: evaluating LLMs with fewer examples link Felipe Maia Polo, Lucas Weber,..., Mikhail Yurochkin
61 2024-02-13 COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability link Xingang Guo, Fangxu Yu,..., Bin Hu
61 2024-02-06 Can Mamba Learn How To Learn? A Comparative Study on In-Context Learning Tasks link Jongho Park, Jaeseung Park,..., Dimitris Papailiopoulos
60 2023-11-04 Position: Levels of AGI for Operationalizing Progress on the Path to AGI link Meredith Ringel Morris, Jascha Sohl-Dickstein,..., Shane Legg
60 2023-12-31 Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws link Nikhil Sardana, Jacob Portes,..., Jonathan Frankle
60 2024-02-15 Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment link Rui Yang, Xiaoman Pan,..., Jianshu Chen
58 2024-02-13 IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation link Luke Melas-Kyriazi, Iro Laina,..., Filippos Kokkinos
58 2024-03-13 Human Alignment of Large Language Models through Online Preference Optimisation link Daniele Calandriello, Zhaohan Daniel Guo,..., Bilal Piot
58 2024-02-06 BiLLM: Pushing the Limit of Post-Training Quantization for LLMs link Wei Huang, Yangdong Liu,..., XIAOJUAN QI
56 2024-03-05 Behavior Generation with Latent Actions link Seungjae Lee, Yibin Wang,..., Lerrel Pinto
56 2024-02-03 Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models link Yongshuo Zong, Ondrej Bohdal,..., Timothy Hospedales
55 2023-10-05 MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation link Qian Huang, Jian Vora,..., Jure Leskovec
54 2023-08-20 Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models link Bilgehan Sel, Ahmad Tawaha,..., Ming Jin
53 2024-03-01 HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding link Zhaorun Chen, Zhuokai Zhao,..., Jiawei Zhou
53 2024-03-05 MathScale: Scaling Instruction Tuning for Mathematical Reasoning link Zhengyang Tang, Xingxing Zhang,..., Furu Wei
53 2023-10-08 In-context Convergence of Transformers link Yu Huang, Yuan Cheng, Yingbin Liang
52 2024-03-06 Stop Regressing: Training Value Functions via Classification for Scalable Deep RL link Jesse Farebrother, Jordi Orbay,..., Rishabh Agarwal
52 2023-10-25 Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution link Aaron Lou, Chenlin Meng, Stefano Ermon
52 2024-02-15 QuRating: Selecting High-Quality Data for Training Language Models link Alexander Wettig, Aatmik Gupta,..., Danqi Chen
52 2024-03-11 The Pitfalls of Next-Token Prediction link Gregor Bachmann, Vaishnavh Nagarajan
51 2024-02-28 Simple linear attention language models balance the recall-throughput tradeoff link Simran Arora, Sabri Eyuboglu,..., Christopher Re
51 2024-02-08 WebLINX: Real-World Website Navigation with Multi-Turn Dialogue link Xing Han Lu, Zdeněk Kasner, Siva Reddy
50 2024-05-07 Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition link Hao Fei, Shengqiong Wu,..., Wynne Hsu
50 2024-04-05 Score identity Distillation: Exponentially Fast Distillation of Pretrained Diffusion Models for One-Step Generation link Mingyuan Zhou, Huangjie Zheng,..., Hai Huang
49 2024-03-01 Provably Robust DPO: Aligning Language Models with Noisy Feedback link Sayak Ray Chowdhury, Anush Kini, Nagarajan Natarajan
49 2023-06-09 Prodigy: An Expeditiously Adaptive Parameter-Free Learner link Konstantin Mishchenko, Aaron Defazio
48 2024-02-14 MaxMin-RLHF: Alignment with Diverse Human Preferences link Souradip Chakraborty, Jiahao Qiu,..., Mengdi Wang
48 2023-12-07 An LLM Compiler for Parallel Function Calling link Sehoon Kim, Suhong Moon,..., Amir Gholami
48 2024-03-14 Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference link Piotr Nawrot, Adrian Łańcucki,..., Edoardo Ponti
47 2024-03-12 WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks? link Alexandre Drouin, Maxime Gasse,..., Alexandre Lacoste
47 2024-02-13 LLaGA: Large Language and Graph Assistant link Runjin Chen, Tong Zhao,..., Zhangyang Wang
47 2023-11-08 NExT-Chat: An LMM for Chat, Detection and Segmentation link Ao Zhang, Yuan Yao,..., Tat-Seng Chua
46 2024-01-04 Evolution of Heuristics: Towards Efficient Automatic Algorithm Design Using Large Language Model link Fei Liu, Tong Xialiang,..., Qingfu Zhang
46 2024-02-13 GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements link Alexander Havrilla, Sharath Chandra Raparthy,..., Roberta Raileanu
46 2022-09-30 Differentially Private Bias-Term Fine-tuning of Foundation Models link Zhiqi Bu, Yu-Xiang Wang,..., George Karypis
45 2023-06-30 Stay on Topic with Classifier-Free Guidance link Guillaume Sanchez, Alexander Spangher,..., Stella Biderman
45 2023-07-17 Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations link Yanda Chen, Ruiqi Zhong,..., Kathleen McKeown
45 2024-02-19 FiT: Flexible Vision Transformer for Diffusion Model link Zeyu Lu, ZiDong Wang,..., LEI BAI
45 2023-10-11 Online Speculative Decoding link Xiaoxuan Liu, Lanxiang Hu,..., Hao Zhang
45 2023-11-18 MagicPose: Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion link Di Chang, Yichun Shi,..., Mohammad Soleymani
44 2023-07-31 Learning to Model the World With Language link Jessy Lin, Yuqing Du,..., Anca Dragan
44 2024-02-14 Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference link Harry Dong, Xinyu Yang,..., Beidi Chen
44 2024-02-12 Scaling Laws for Fine-Grained Mixture of Experts link Jan Ludziejewski, Jakub Krajewski,..., Sebastian Jaszczur
44 2024-02-13 Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast link Xiangming Gu, Xiaosen Zheng,..., Min Lin
43 2024-02-03 BetterV: Controlled Verilog Generation with Discriminative Guidance link Zehua PEI, Huiling Zhen,..., Bei Yu
43 2024-02-11 ODIN: Disentangled Reward Mitigates Hacking in RLHF link Lichang Chen, Chen Zhu,..., Bryan Catanzaro
43 2024-02-22 How Transformers Learn Causal Structure with Gradient Descent link Eshaan Nichani, Alex Damian, Jason D. Lee
43 2024-02-29 ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL link Yifei Zhou, Andrea Zanette,..., Aviral Kumar
42 2024-02-06 RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback link Yufei Wang, Zhanyi Sun,..., Zackory Erickson
42 2024-02-07 Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning link Hao Zhao, Maksym Andriushchenko,..., Nicolas Flammarion
42 2024-02-11 GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting link Xiaoyu Zhou, Xingjian Ran,..., Ming-Hsuan Yang
42 2024-01-31 On Prompt-Driven Safeguarding for Large Language Models link Chujie Zheng, Fan Yin,..., Nanyun Peng
41 2024-02-05 Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization link Yang Jin, Zhicheng Sun,..., Yadong MU
41 2024-04-12 The Illusion of State in State-Space Models link William Merrill, Jackson Petty, Ashish Sabharwal
41 2023-10-11 InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining link Boxin Wang, Wei Ping,..., Bryan Catanzaro
40 2023-11-15 Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling link Bairu Hou, Yujian Liu,..., Yang Zhang
40 2024-01-21 Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers link Katherine Crowson, Stefan Andreas Baumann,..., Enrico Shippole
40 2024-02-02 Boximator: Generating Rich and Controllable Motions for Video Synthesis link Jiawei Wang, Yuchen Zhang,..., Hang Li
40 2024-02-08 Dirichlet Flow Matching with Applications to DNA Sequence Design link Hannes Stark, Bowen Jing,..., Tommi Jaakkola
39 2024-01-23 DsDm: Model-Aware Dataset Selection with Datamodels link Logan Engstrom, Axel Feldmann, Aleksander Madry
39 2023-12-08 SparQ Attention: Bandwidth-Efficient LLM Inference link Luka Ribar, Ivan Chelombiev,..., Douglas Orr
39 2024-02-28 Evaluating Quantized Large Language Models link Shiyao Li, Xuefei Ning,..., Yu Wang
39 2024-02-18 Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark link Yihua Zhang, Pingzhi Li,..., Tianlong Chen
39 2024-02-18 Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning link Long Qian, Juncheng Li,..., Siliang Tang
38 None Position: LLMs Can’t Plan, But Can Help Planning in LLM-Modulo Frameworks link Subbarao Kambhampati, Karthik Valmeekam,..., Anil B Murthy
38 2024-05-13 Localizing Task Information for Improved Model Merging and Compression link Ke Wang, Nikolaos Dimitriadis,..., Pascal Frossard
38 2023-06-05 InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models link Lichang Chen, Jiuhai Chen,..., Tianyi Zhou
38 2024-01-23 In-Context Language Learning: Architectures and Algorithms link Ekin Akyürek, Bailin Wang,..., Jacob Andreas
36 2024-02-05 Large Language Models are Geographically Biased link Rohin Manvi, Samar Khanna,..., Stefano Ermon
36 2024-02-05 Flora: Low-Rank Adapters Are Secretly Gradient Compressors link Yongchang Hao, Yanshuai Cao, Lili Mou
36 2024-02-09 Iterated Denoising Energy Matching for Sampling from Boltzmann Densities link Tara Akhound-Sadegh, Jarrid Rector-Brooks,..., Alexander Tong
36 2024-01-10 InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks link Xueyu Hu, Ziyu Zhao,..., Fei Wu
36 2023-10-02 Prompt-tuning Latent Diffusion Models for Inverse Problems link Hyungjin Chung, Jong Chul Ye,..., Mauricio Delbracio
35 2024-02-05 Decoding-time Realignment of Language Models link Tianlin Liu, Shangmin Guo,..., Mathieu Blondel
35 2024-03-19 RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content link Zhuowen Yuan, Zidi Xiong,..., Bo Li
35 2024-02-05 Representation Surgery for Multi-Task Model Merging link Enneng Yang, Li Shen,..., Dacheng Tao
34 2024-02-01 Merging Multi-Task Models via Weight-Ensembling Mixture of Experts link Anke Tang, Li Shen,..., Dacheng Tao
34 2024-01-30 Proactive Detection of Voice Cloning with Localized Watermarking link Robin San Roman, Pierre Fernandez,..., Tuan Tran
34 2023-12-11 Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context link Xiang Cheng, Yuxin Chen, Suvrit Sra
34 2023-09-13 Auto-Regressive Next-Token Predictors are Universal Learners link eran malach
33 2024-02-02 Challenges in Training PINNs: A Loss Landscape Perspective link Pratik Rathore, Weimu Lei,..., Madeleine Udell
33 2024-05-02 SparseTSF: Modeling Long-term Time Series Forecasting with 1k Parameters link Shengsheng Lin, Weiwei Lin,..., Junjie Yang
33 2024-02-02 A Dynamical Model of Neural Scaling Laws link Blake Bordelon, Alexander Atanasov, Cengiz Pehlevan
33 2024-03-17 MindEye2: Shared-Subject Models Enable fMRI-To-Image With 1 Hour of Data link Paul Steven Scotti, Mihir Tripathy,..., Tanishq Mathew Abraham
33 2024-02-06 AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls link Yu Du, Fangyun Wei, Hongyang Zhang
33 2023-10-16 ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models link Ziniu Li, Tian Xu,..., Zhi-Quan Luo
32 2024-02-19 Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models link Christian Schlarmann, Naman Deep Singh,..., Matthias Hein
32 2023-12-12 AI Control: Improving Safety Despite Intentional Subversion link Ryan Greenblatt, Buck Shlegeris,..., Fabien Roger
32 2024-02-08 Accurate LoRA-Finetuning Quantization of LLMs via Information Retention link Haotong Qin, Xudong Ma,..., Michele Magno
32 2022-10-10 Sequential Neural Score Estimation: Likelihood-Free Inference with Conditional Score Based Diffusion Models link Louis Sharrock, Jack Simons,..., Mark Beaumont
32 2024-04-18 Token-level Direct Preference Optimization link Yongcheng Zeng, Guoqing Liu,..., Jun Wang
31 2023-05-17 Reprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs Sampling link Weijia Xu, Andrzej Banburski, Nebojsa Jojic
31 2024-02-07 On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis link Jerry Yao-Chieh Hu, Thomas Lin,..., Han Liu
30 2024-03-28 MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions link Kai Zhang, Yi Luan,..., Ming-Wei Chang
30 2024-03-04 Differentially Private Synthetic Data via Foundation Model APIs 2: Text link Chulin Xie, Zinan Lin,..., Sergey Yekhanin
30 2024-02-27 Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations link Jiaqi Zhai, Lucy Liao,..., Yu Shi
30 2024-02-15 Language Models with Conformal Factuality Guarantees link Christopher Mohri, Tatsunori Hashimoto
30 2023-12-11 Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes link Zhen Qin, Daoyuan Chen,..., Shuiguang Deng
29 2023-12-06 Low-Cost High-Power Membership Inference Attacks link Sajjad Zarifzadeh, Philippe Liu, Reza Shokri
29 2023-10-05 Stochastic Interpolants with Data-Dependent Couplings link Michael Samuel Albergo, Mark Goldstein,..., Eric Vanden-Eijnden
29 2024-04-23 NExT: Teaching Large Language Models to Reason about Code Execution link Ansong Ni, Miltiadis Allamanis,..., Pengcheng Yin
29 2024-02-15 DE-COP: Detecting Copyrighted Content in Language Models Training Data link André Vicente Duarte, Xuandong Zhao,..., Lei Li
29 2024-02-23 Fast Adversarial Attacks on Language Models In One GPU Minute link Vinu Sankar Sadasivan, Shoumik Saha,..., Soheil Feizi
29 2024-02-28 CogBench: a large language model walks into a psychology lab link Julian Coda-Forno, Marcel Binz,..., Eric Schulz
29 2024-02-14 Feature Reuse and Scaling: Understanding Transfer Learning with Protein Language Models link Francesca-Zhoufan Li, Ava P Amini,..., Alex Xijie Lu
29 2024-02-05 Guidance with Spherical Gaussian Constraint for Conditional Diffusion link Lingxiao Yang, Shutong Ding,..., Ye Shi
28 2024-04-26 Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo link Stephen Zhao, Rob Brekelmans,..., Roger Baker Grosse
28 2024-01-22 DITTO: Diffusion Inference-Time T-Optimization for Music Generation link Zachary Novack, Julian McAuley,..., Nicholas J. Bryan
28 2024-02-04 Transolver: A Fast Transformer Solver for PDEs on General Geometries link Haixu Wu, Huakun Luo,..., Mingsheng Long
28 2024-02-12 Rolling Diffusion Models link David Ruhe, Jonathan Heek,..., Emiel Hoogeboom
28 2024-02-27 Training-Free Long-Context Scaling of Large Language Models link Chenxin An, Fei Huang,..., Lingpeng Kong
28 2024-06-07 FlowMM: Generating Materials with Riemannian Flow Matching link Benjamin Kurt Miller, Ricky T. Q. Chen,..., Brandon M Wood
27 2024-02-14 Transformers, parallel computation, and logarithmic depth link Clayton Sanford, Daniel Hsu, Matus Telgarsky
27 2024-02-29 Watermark Stealing in Large Language Models link Nikola Jovanović, Robin Staab, Martin Vechev
27 2024-02-03 GliDe with a CaPE: A Low-Hassle Method to Accelerate Speculative Decoding link Cunxiao Du, Jing Jiang,..., Yang You
27 None Position: Will we run out of data? Limits of LLM scaling based on human-generated data link Pablo Villalobos, Anson Ho,..., Marius Hobbhahn
27 2024-02-15 A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts link Kuang-Huei Lee, Xinyun Chen,..., Ian Fischer
27 2024-02-01 Dense Reward for Free in Reinforcement Learning from Human Feedback link Alex James Chan, Hao Sun,..., Mihaela van der Schaar
27 2024-04-16 Position: Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback link Vincent Conitzer, Rachel Freedman,..., William S. Zwicker
26 2024-01-09 RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation link Mahdi Nikdan, Soroush Tabesh,..., Dan Alistarh
26 2023-02-26 Diffusion Model-Augmented Behavioral Cloning link Shang-Fu Chen, Hsiang-Chun Wang,..., Shao-Hua Sun
26 2024-02-28 Diffusion Language Models Are Versatile Protein Learners link Xinyou Wang, Zaixiang Zheng,..., Quanquan Gu
26 2024-03-06 Accelerating Convergence of Score-Based Diffusion Models, Provably link Gen Li, Yu Huang,..., Yuxin Chen
25 2024-02-20 A Touch, Vision, and Language Dataset for Multimodal Alignment link Letian Fu, Gaurav Datta,..., Ken Goldberg
25 2024-02-13 Mixtures of Experts Unlock Parameter Scaling for Deep RL link Johan Samir Obando Ceron, Ghada Sokar,..., Pablo Samuel Castro
25 2024-05-18 Towards Modular LLMs by Building and Reusing a Library of LoRAs link Oleksiy Ostapenko, Zhan Su,..., Alessandro Sordoni
25 2024-02-14 Premise Order Matters in Reasoning with Large Language Models link Xinyun Chen, Ryan Andrew Chi,..., Denny Zhou
25 2024-01-11 DiffDA: a Diffusion model for weather-scale Data Assimilation link Langwen Huang, Lukas Gianinazzi,..., Torsten Hoefler
25 2024-02-26 Asymmetry in Low-Rank Adapters of Foundation Models link Jiacheng Zhu, Kristjan Greenewald,..., Justin Solomon
25 2023-12-08 EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism link Yanxi Chen, Xuchen Pan,..., Jingren Zhou
25 2024-02-03 A Closer Look at the Limitations of Instruction Tuning link Sreyan Ghosh, Chandra Kiran Reddy Evuru,..., Dinesh Manocha
25 2024-02-08 How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis link Federico Bianchi, Patrick John Chia,..., James Zou
24 2024-02-08 Self-Alignment of Large Language Models via Monopolylogue-based Social Scene Simulation link Xianghe Pang, Shuo Tang,..., Siheng Chen
24 2024-02-07 Guiding LLMs The Right Way: Fast, Non-Invasive Constrained Generation link Luca Beurer-Kellner, Marc Fischer, Martin Vechev
24 2023-10-26 Codebook Features: Sparse and Discrete Interpretability for Neural Networks link Alex Tamkin, Mohammad Taufeeque, Noah Goodman
24 2024-02-28 CLLMs: Consistency Large Language Models link Siqi Kou, Lanxiang Hu,..., Hao Zhang
24 2024-03-21 Protein Conformation Generation via Force-Guided SE(3) Diffusion Models link Yan Wang, Lihao Wang,..., Quanquan Gu
24 2023-10-23 DOGE: Domain Reweighting with Generalization Estimation link Simin Fan, Matteo Pagliardini, Martin Jaggi
24 2024-03-11 Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement link Che Liu, Zhongwei Wan,..., Rossella Arcucci
24 2024-02-27 DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning link Siyuan Guo, Cheng Deng,..., Jun Wang
24 2024-04-04 Uniform Memory Retrieval with Larger Capacity for Modern Hopfield Models link Dennis Wu, Jerry Yao-Chieh Hu,..., Han Liu
24 2024-05-16 LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery link Pingchuan Ma, Tsun-Hsuan Wang,..., Wojciech Matusik
24 2023-06-07 Don't trust your eyes: on the (un)reliability of feature visualizations link Robert Geirhos, Roland S. Zimmermann,..., Been Kim
24 2024-02-14 Position: Topological Deep Learning is the New Frontier for Relational Learning link Theodore Papamarkou, Tolga Birdal,..., Ghada Zamzmi
24 2024-06-28 Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation link Danny Halawi, Alexander Wei,..., Jacob Steinhardt
23 2024-04-15 All-in-one simulation-based inference link Manuel Gloeckler, Michael Deistler,..., Jakob H. Macke
23 2024-03-15 Repoformer: Selective Retrieval for Repository-Level Code Completion link Di Wu, Wasi Uddin Ahmad,..., Xiaofei Ma
23 2024-02-03 Position: Graph Foundation Models Are Already Here link Haitao Mao, Zhikai Chen,..., Jiliang Tang
23 2024-04-10 What needs to go right for an induction head?
A mechanistic study of in-context learning circuits and their formation
link Aaditya K Singh, Ted Moskovitz,..., Andrew M Saxe
23 2024-03-06 On the Origins of Linear Representations in Large Language
Models
link Yibo Jiang, Goutham Rajendran,..., Victor Veitch
23 2024-04-12 TSLANet: Rethinking Transformers for Time Series Representation Learning link Emadeldeen Eldele, Mohamed Ragab,..., Xiaoli Li
23 2024-02-29 Dual Operating Modes of In-Context Learning link Ziqian Lin, Kangwook Lee
23 2024-04-04 Outlier-Efficient Hopfield Layers for Large Transformer-Based Models link Jerry Yao-Chieh Hu, Pei-Hsuan Chang,..., Han Liu
23 2024-02-05 The Benefits of Reusing Batches for Gradient Descent in
Two-Layer Networks: Breaking the Curse of Information and Leap Exponents
link Yatin Dandi, Emanuele Troiani,..., Florent Krzakala
22 2024-02-08 AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers link Reduan Achtibat, Sayed Mohammad Vakilzadeh Hatefi,..., Wojciech Samek
22 2024-02-08 In-Context Principle Learning from Mistakes link Tianjun Zhang, Aman Madaan,..., Uri Alon
22 2023-10-05 Agent Instructs Large Language Models to be General Zero-Shot
Reasoners
link Nicholas Crispino, Kyle Montgomery,..., Chenguang Wang
22 2023-12-19 Curated LLM: Synergy of LLMs and Data Curation for
tabular augmentation in low-data regimes
link Nabeel Seedat, Nicolas Huynh,..., Mihaela van der Schaar
22 2024-02-09 Feedback Loops With Language Models Drive In-Context Reward Hacking link Alexander Pan, Erik Jones,..., Jacob Steinhardt
22 2023-10-09 Harmonic Self-Conditioned Flow Matching for joint Multi-Ligand Docking and
Binding Site Design
link Hannes Stark, Bowen Jing,..., Tommi Jaakkola
22 None CaM: Cache Merging for Memory-efficient LLMs Inference link Yuxin Zhang, Yuxuan Du,..., Rongrong Ji
22 2024-02-01 Position: Bayesian Deep Learning is Needed in the Age
of Large-Scale AI
link Theodore Papamarkou, Maria Skoularidou,..., Ruqi Zhang
22 2023-05-27 CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers link Dachuan Shi, Chaofan Tao,..., Jiaqi Wang
22 2024-05-05 Parameter-Efficient Fine-Tuning with Discrete Fourier Transform link Ziqi Gao, Qichao Wang,..., Jia Li
22 2024-01-05 VoroNav: Voronoi-based Zero-shot Object Navigation with Large Language Model link Pengying Wu, Yao Mu,..., Chang Liu
21 2024-03-02 SceneCraft: An LLM Agent for Synthesizing 3D Scenes as
Blender Code
link Ziniu Hu, Ahmet Iscen,..., Alireza Fathi
21 2024-01-07 The Stronger the Diffusion Model, the Easier the Backdoor:
Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline
link Haonan Wang, Qianli Shen,..., Kenji Kawaguchi
21 2024-04-18 RoboDreamer: Learning Compositional World Models for Robot Imagination link Siyuan Zhou, Yilun Du,..., Chuang Gan
21 2024-02-02 Online conformal prediction with decaying step sizes link Anastasios Nikolas Angelopoulos, Rina Barber, Stephen Bates
21 2024-02-13 A Dense Reward View on Aligning Text-to-Image Diffusion with
Preference
link Shentao Yang, Tianqi Chen, Mingyuan Zhou
21 2024-03-03 Theoretical insights for diffusion guidance: A case study for
Gaussian mixture models
link Yuchen Wu, Minshuo Chen,..., Yuting Wei
21 2024-06-22 video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models link Guangzhi Sun, Wenyi Yu,..., Chao Zhang
21 2023-10-10 Conformal Prediction for Deep Classifier via Label Ranking link Jianguo Huang, HuaJun Xi,..., Hongxin Wei
20 2024-03-04 DéjàVu: KV-cache Streaming for Fast, Fault-tolerant Generative LLM Serving link Foteini Strati, Sara McAllister,..., Ana Klimovic
20 2020-11-29 Scaling Down Deep Learning with MNIST-1D link Samuel James Greydanus, Dmitry Kobak
20 2024-02-21 D-Flow: Differentiating through Flows for Controlled Generation link Heli Ben-Hamu, Omri Puny,..., Yaron Lipman
20 2024-03-03 In-Context Sharpness as Alerts: An Inner Representation Perspective for
Hallucination Mitigation
link Shiqi Chen, Miao Xiong,..., Junxian He
20 2024-01-18 Improving fine-grained understanding in image-text pre-training link Ioana Bica, Anastasija Ilic,..., Jovana Mitrovic
20 2024-03-21 An Analysis of Linear Time Series Forecasting Models link William Toner, Luke Nicholas Darlow
20 2024-02-01 Getting the most out of your tokenizer for pre-training
and domain adaptation
link Gautier Dagan, Gabriel Synnaeve, Baptiste Roziere
20 2024-03-18 Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs
Under Compression
link Junyuan Hong, Jinhao Duan,..., Bo Li
20 2024-01-29 Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in
RLHF
link Banghua Zhu, Michael Jordan, Jiantao Jiao
20 2024-02-14 SLEB: Streamlining LLMs through Redundancy Verification and Elimination of
Transformer Blocks
link Jiwon Song, Kyungseok Oh,..., Jae-Joon Kim
20 2024-04-22 Align Your Steps: Optimizing Sampling Schedules in Diffusion Models link Amirmojtaba Sabour, Sanja Fidler, Karsten Kreis
20 2023-10-20 Equivariant Deep Weight Space Alignment link Aviv Navon, Aviv Shamsian,..., Haggai Maron
20 2023-10-09 Generalized Neural Collapse for a Large Number of Classes link Jiachen Jiang, Jinxin Zhou,..., Zhihui Zhu
19 2024-02-15 SAMformer: Unlocking the Potential of Transformers in Time Series
Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention
link Romain Ilbert, Ambroise Odonnat,..., Ievgen Redko
19 2024-02-26 Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning link Michael Matthews, Michael Beukman,..., Jakob Nicolaus Foerster
19 2024-02-08 Memory Consolidation Enables Long-Context Video Understanding link Ivana Balazevic, Yuge Shi,..., Olivier J Henaff
19 2024-03-06 Conformal prediction for multi-dimensional time series by ellipsoidal sets link Chen Xu, Hanyang Jiang, Yao Xie
19 2024-02-23 Minimax Optimality of Score-based Diffusion Models: Beyond the Density
Lower Bound Assumptions
link Kaihong Zhang, Heqi Yin,..., Jingbo Liu
19 2024-02-27 Variational Learning is Effective for Large Deep Networks link Yuesong Shen, Nico Daheim,..., Thomas Möllenhoff
19 2024-03-30 Linguistic Calibration of Long-Form Generations link Neil Band, Xuechen Li,..., Tatsunori Hashimoto
19 2024-06-05 Pruner-Zero: Evolving Symbolic Pruning Metric From Scratch for Large
Language Models
link Peijie Dong, Lujun Li,..., Xiaowen Chu
19 2024-02-08 Training Large Language Models for Reasoning through Reverse Curriculum
Reinforcement Learning
link Zhiheng Xi, Wenxiang Chen,..., Xuanjing Huang
19 2023-04-03 Chain-of-Thought Predictive Control link Zhiwei Jia, Vineet Thumuluri,..., Hao Su
19 2024-02-09 Particle Denoising Diffusion Sampler link Angus Phillips, Hai-Dang Dau,..., Arnaud Doucet
19 2024-03-26 Mechanistic Design and Scaling of Hybrid Architectures link Michael Poli, Armin W Thomas,..., Stefano Massaroli
19 None The Emergence of Reproducibility and Consistency in Diffusion Models link Huijie Zhang, Jinfan Zhou,..., Qing Qu
19 2024-02-26 Feedback Efficient Online Fine-Tuning of Diffusion Models link Masatoshi Uehara, Yulai Zhao,..., Tommaso Biancalani
19 2022-06-10 Accelerated Algorithms for Constrained Nonconvex-Nonconcave Min-Max Optimization and Comonotone
Inclusion
link Yang Cai, Argyris Oikonomou, Weiqiang Zheng
19 2024-02-15 OptiMUS: Scalable Optimization Modeling with (MI)LP Solvers and Large
Language Models
link Ali AhmadiTeshnizi, Wenzhi Gao, Madeleine Udell
19 2023-12-06 Generalization to New Sequential Decision Making Tasks with In-Context
Learning
link Sharath Chandra Raparthy, Eric Hambro,..., Roberta Raileanu
19 2024-02-05 C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models link Mintong Kang, Nezihe Merve Gürel,..., Bo Li
19 2024-02-03 Improving Diffusion Models for Inverse Problems Using Optimal Posterior
Covariance
link Xinyu Peng, Ziyang Zheng,..., Hongkai Xiong
19 2023-10-11 A Theory of Non-Linear Feature Learning with One Gradient
Step in Two-Layer Neural Networks
link Behrad Moniri, Donghwan Lee,..., Edgar Dobriban
19 2022-11-09 Few-Shot Character Understanding in Movies as an Assessment to
Meta-Learning of Theory-of-Mind
link Mo Yu, Qiujing Wang,..., Jie Zhou
19 2023-10-11 A Resilient and Accessible Distribution-Preserving Watermark for Large Language
Models
link Yihan Wu, Zhengmian Hu,..., Heng Huang
19 2024-02-19 Model Tailor: Mitigating Catastrophic Forgetting in Multi-modal Large Language
Models
link Didi Zhu, Zhongyisun Sun,..., Kun Kuang
19 2024-01-24 Can AI Assistants Know What They Don't Know? link Qinyuan Cheng, Tianxiang Sun,..., Xipeng Qiu
19 2023-02-07 Graph Generation with Diffusion Mixture link Jaehyeong Jo, Dongki Kim, Sung Ju Hwang
19 2023-12-28 Non-Vacuous Generalization Bounds for Large Language Models link Sanae Lotfi, Marc Anton Finzi,..., Andrew Gordon Wilson
18 2023-10-26 CompeteAI: Understanding the Competition Dynamics of Large Language Model-based
Agents
link Qinlin Zhao, Jindong Wang,..., Xing Xie
18 2024-02-22 Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion link Yujia Huang, Adishree Ghatare,..., Yisong Yue
18 2024-06-22 Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language
Models without Training through Attention Calibration
link Zhongzhi Yu, Zheng Wang,..., Yingyan Celine Lin
18 2024-02-22 A Language Model’s Guide Through Latent Space link Dimitri von Rütte, Sotiris Anagnostidis,..., Thomas Hofmann
18 2024-02-19 In value-based deep reinforcement learning, a pruned network is
a good network
link Johan Samir Obando Ceron, Aaron Courville, Pablo Samuel Castro
18 2023-10-02 FedBPT: Efficient Federated Black-box Prompt Tuning for Large Language
Models
link Jingwei Sun, Ziyue Xu,..., Holger R Roth
18 2023-02-04 CosPGD: an efficient white-box adversarial attack for pixel-wise prediction
tasks
link Shashank Agnihotri, Steffen Jung, Margret Keuper
17 2024-01-22 APT: Adaptive Pruning and Tuning Pretrained Language Models for
Efficient Training and Inference
link Bowen Zhao, Hannaneh Hajishirzi, Qingqing Cao
17 2024-03-05 Time Weaver: A Conditional Time Series Generation Model link Sai Shankar Narasimhan, Shubhankar Agarwal,..., Sandeep P. Chinchali
17 None R2E: Turning any Github Repository into a Programming Agent
Environment
link Naman Jain, Manish Shetty,..., Ion Stoica
17 2024-03-20 Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models
with Noisy Data
link Giannis Daras, Alex Dimakis, Constantinos Costis Daskalakis
17 2024-10-10 Can Looped Transformers Learn to Implement Multi-step Gradient Descent
for In-context Learning?
link Khashayar Gatmiry, Nikunj Saunshi,..., Sanjiv Kumar
17 2024-04-17 Learning with 3D rotations, a hitchhiker's guide to SO(3) link Andreas René Geist, Jonas Frey,..., Georg Martius
17 2024-02-04 Selecting Large Language Model to Fine-tune via Rectified Scaling
Law
link Haowei Lin, Baizhou Huang,..., Yitao Liang
17 2024-02-27 Case-Based or Rule-Based: How Do Transformers Do the Math? link Yi Hu, Xiaojuan Tang,..., Muhan Zhang
17 2023-11-29 Should we be going MAD? A Look at Multi-Agent
Debate Strategies for LLMs
link Andries Petrus Smit, Nathan Grinsztajn,..., Arnu Pretorius
17 2023-05-18 Emergent Representations of Program Semantics in Language Models Trained
on Programs
link Charles Jin, Martin Rinard
17 2024-05-28 AI Alignment with Changing and Influenceable Reward Functions link Micah Carroll, Davis Foote,..., Anca Dragan
17 2023-10-02 Fool Your (Vision and) Language Model with Embarrassingly Simple
Permutations
link Yongshuo Zong, Tingyang Yu,..., Timothy Hospedales
17 2024-02-25 RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis link Yao Mu, Junting Chen,..., Ping Luo
17 2024-02-05 Distinguishing the Knowable from the Unknowable with Language Models link Gustaf Ahdritz, Tian Qin,..., Benjamin L. Edelman
17 2024-04-18 MolCRAFT: Structure-Based Drug Design in Continuous Parameter Space link Yanru Qu, Keyue Qiu,..., Wei-Ying Ma
17 2024-02-07 CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay link Natasha Butt, Blazej Manczak,..., Taco Cohen
17 2024-03-06 DPOT: Auto-Regressive Denoising Operator Transformer for Large-Scale PDE Pre-Training link Zhongkai Hao, Chang Su,..., Jun Zhu
17 None Position: TrustLLM: Trustworthiness in Large Language Models link Yue Huang, Lichao Sun,..., Yue Zhao
17 2024-01-23 TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic
Tasks
link Zhiruo Wang, Graham Neubig, Daniel Fried
17 2024-03-16 SelfIE: Self-Interpretation of Large Language Model Embeddings link Haozhe Chen, Carl Vondrick, Chengzhi Mao
17 2024-02-26 Disentangled 3D Scene Generation with Layout Learning link Dave Epstein, Ben Poole,..., Aleksander Holynski
17 2024-02-03 Image Fusion via Vision-Language Model link Zixiang Zhao, Lilun Deng,..., Luc Van Gool
16 2024-02-02 Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics
on the Attention Landscape
link Juno Kim, Taiji Suzuki
16 2024-04-02 Test-Time Model Adaptation with Only Forward Passes link Shuaicheng Niu, Chunyan Miao,..., Peilin Zhao
16 None DRCT: Diffusion Reconstruction Contrastive Training towards Universal Detection of
Diffusion Generated Images
link Baoying Chen, Jishen Zeng,..., Rui Yang
16 2024-03-19 Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data
Flow and Per-Block Quantization
link Haocheng Xi, Yuxiang Chen,..., Jun Zhu
16 2023-12-02 Second-Order Uncertainty Quantification: A Distance-Based Approach link Yusuf Sale, Viktor Bengs,..., Eyke Hüllermeier
16 2024-02-07 Asymptotics of feature learning in two-layer networks after one
gradient-step
link Hugo Cui, Luca Pesce,..., Bruno Loureiro
16 2024-02-07 Causal Representation Learning from Multiple Distributions: A General Setting link Kun Zhang, Shaoan Xie,..., Yujia Zheng
16 2024-07-08 Scaling Exponents Across Parameterizations and Optimizers link Katie E Everett, Lechao Xiao,..., Jeffrey Pennington
16 2024-02-28 Characterizing Truthfulness in Large Language Model Generations with Local
Intrinsic Dimension
link Fan Yin, Jayanth Srinivasa, Kai-Wei Chang
16 2024-02-21 Do Efficient Transformers Really Save Computation? link Kai Yang, Jan Ackermann,..., Liwei Wang
16 2023-10-04 Assessing Large Language Models on Climate Information link Jannis Bulian, Mike S. Schäfer,..., Nadine Strauss
16 2024-02-05 Graph-enhanced Large Language Models in Asynchronous Plan Reasoning link Fangru Lin, Emanuele La Malfa,..., Janet B. Pierrehumbert
16 2024-02-01 Efficient Exploration for LLMs link Vikranth Dwaracherla, Seyed Mohammad Asghari,..., Benjamin Van Roy
16 2023-10-16 A Computational Framework for Solving Wasserstein Lagrangian Flows link Kirill Neklyudov, Rob Brekelmans,..., Alireza Makhzani
16 2024-01-25 Adaptive Text Watermark for Large Language Models link Yepeng Liu, Yuheng Bu
16 2023-10-11 Language Models as Semantic Indexers link Bowen Jin, Hansi Zeng,..., Xianfeng Tang
16 2023-08-25 Learning to Intervene on Concept Bottlenecks link David Steinmann, Wolfgang Stammer,..., Kristian Kersting
16 2024-01-28 An Information-Theoretic Analysis of In-Context Learning link Hong Jun Jeon, Jason D. Lee,..., Benjamin Van Roy
16 2023-09-18 Long-Tail Learning with Foundation Model: Heavy Fine-Tuning Hurts link Jiang-Xin Shi, Tong Wei,..., Yu-Feng Li
16 2024-01-24 Conformal Prediction Sets Improve Human Decision Making link Jesse C. Cresswell, Yi Sui,..., Noël Vouitsis
16 2024-02-23 Foundation Policies with Hilbert Representations link Seohong Park, Tobias Kreiman, Sergey Levine
16 2023-05-27 Matrix Information Theory for Self-Supervised Learning link Yifan Zhang, Zhiquan Tan,..., Yang Yuan
15 2023-09-28 Discovering Environments with XRM link Mohammad Pezeshki, Diane Bouchacourt,..., David Lopez-Paz
15 2023-02-23 EquiPocket: an E(3)-Equivariant Geometric Graph Neural Network for Ligand
Binding Site Prediction
link Yang Zhang, Zhewei Wei,..., Wenbing Huang
15 2024-02-16 Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs link Yeonhong Park, Jake Hyun,..., Jae W. Lee
15 2024-02-14 Copyright Traps for Large Language Models link Matthieu Meeus, Igor Shilov,..., Yves-Alexandre de Montjoye
15 2024-05-13 PARDEN, Can You Repeat That? Defending against Jailbreaks via
Repetition
link Ziyang Zhang, Qizhen Zhang, Jakob Nicolaus Foerster
15 2024-02-23 Deep Networks Always Grok and Here is Why link Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk
15 2023-08-31 On the Implicit Bias of Adam link Matias D. Cattaneo, Jason Matthew Klusowski, Boris Shigida
15 2024-05-06 To Each (Textual Sequence) Its Own: Improving Memorized-Data Unlearning
in Large Language Models
link George-Octavian Bărbulescu, Peter Triantafillou
15 2024-04-30 Modeling Caption Diversity in Contrastive Vision-Language Pretraining link Samuel Lavoie, Polina Kirichenko,..., Nicolas Ballas
15 2024-02-21 From Self-Attention to Markov Models: Unveiling the Dynamics of
Generative Transformers
link Muhammed Emrullah Ildiz, Yixiao Huang,..., Samet Oymak
15 2024-05-16 IBD-PSC: Input-level Backdoor Detection via Parameter-oriented Scaling Consistency link Linshan Hou, Ruili Feng,..., Yiming Li
15 2024-05-14 Compositional Text-to-Image Generation with Dense Blob Representations link Weili Nie, Sifei Liu,..., Arash Vahdat
15 2023-12-20 Learning and Forgetting Unsafe Examples in Large Language Models link Jiachen Zhao, Zhun Deng,..., Mengye Ren
15 2024-05-18 AquaLoRA: Toward White-box Protection for Customized Stable Diffusion Models
via Watermark LoRA
link Weitao Feng, Wenbo Zhou,..., Nenghai Yu
15 2024-05-03 PICLe: Eliciting Diverse Behaviors from Large Language Models with
Persona In-Context Learning
link Hyeong Kyu Choi, Yixuan Li
15 2024-04-04 BiSHop: Bi-Directional Cellular Learning for Tabular Data with Generalized
Sparse Modern Hopfield Model
link Chenwei Xu, Yu-Chao Huang,..., Han Liu
14 2024-08-18 Parameterized Physics-informed Neural Networks for Parameterized PDEs link Woojin Cho, Minju Jo,..., Noseong Park
14 2024-09-03 Interpreting and Improving Large Language Models in Arithmetic Calculation link Wei Zhang, Chaoqun Wan,..., Jieping Ye
14 2023-12-26 Generalization in Kernel Regression Under Realistic Assumptions link Daniel Barzilai, Ohad Shamir
14 2024-03-01 Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson
of Reinforcement Learning
link Michal Nauman, Michał Bortkiewicz,..., Marek Cygan
14 2024-02-14 Instruction Tuning for Secure Code Generation link Jingxuan He, Mark Vero,..., Martin Vechev
14 2024-03-18 Larimar: Large Language Models with Episodic Memory Control link Payel Das, Subhajit Chaudhury,..., Pin-Yu Chen
14 2024-02-13 Hybrid Inverse Reinforcement Learning link Juntao Ren, Gokul Swamy,..., Sanjiban Choudhury
14 2024-03-04 CATS: Enhancing Multivariate Time Series Forecasting by Constructing Auxiliary
Time Series as Exogenous Variables
link Jiecheng Lu, Xu Han,..., Shihao Yang
14 2024-02-02 Simulation of Graph Algorithms with Looped Transformers link Artur Back de Luca, Kimon Fountoulakis
14 2024-02-12 Benchmarking and Building Long-Context Retrieval Models with LoCo and
M2-BERT
link Jon Saad-Falcon, Daniel Y Fu,..., Christopher Re
14 2023-08-14 Position: Key Claims in LLM Research Have a Long
Tail of Footnotes
link Anna Rogers, Sasha Luccioni
14 2024-03-30 Privacy Backdoors: Stealing Data with Corrupted Pretrained Models link Shanglun Feng, Florian Tramèr
14 2024-02-25 Equivariant Frames and the Impossibility of Continuous Canonicalization link Nadav Dym, Hannah Lawrence, Jonathan W. Siegel
14 None UniAudio: Towards Universal Audio Generation with Large Language Models link Dongchao Yang, Jinchuan Tian,..., Helen M. Meng
14 2023-12-18 The Good, The Bad, and Why: Unveiling Emotions in
Generative AI
link Cheng Li, Jindong Wang,..., Xing Xie
14 2024-02-06 Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains link Junhong Shen, Neil Tenenholtz,..., Nicolo Fusi
14 2024-01-20 Make-A-Shape: a Ten-Million-scale 3D Shape Model link Ka-Hei Hui, Aditya Sanghi,..., Chi-Wing Fu
14 2023-09-08 Graph Neural Networks Use Graphs When They Shouldn't link Maya Bechler-Speicher, Ido Amos,..., Amir Globerson
14 2024-06-03 GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer link Ding Jia, Jianyuan Guo,..., Xinghao Chen
13 2024-03-07 Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks link Linyuan Gong, Sida Wang,..., Alvin Cheung
13 2024-05-22 Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam
Generation
link Gauthier Guinet, Behrooz Omidvar-Tehrani,..., Laurent Callot
13 2024-02-05 Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation
Problem
link Maciej Wolczyk, Bartłomiej Cupiał,..., Piotr Miłoś
13 2024-02-29 Smooth Tchebycheff Scalarization for Multi-Objective Optimization link Xi Lin, Xiaoyuan Zhang,..., Qingfu Zhang
13 2024-02-21 Privacy-Preserving Instructions for Aligning Large Language Models link Da Yu, Peter Kairouz,..., Zheng Xu
13 2023-06-02 Generalist Equivariant Transformer Towards 3D Molecular Interaction Learning link Xiangzhe Kong, Wenbing Huang, Yang Liu
13 2024-05-18 On the Trajectory Regularity of ODE-based Diffusion Sampling link Defang Chen, Zhenyu Zhou,..., Siwei Lyu
13 2024-02-12 Active Preference Learning for Large Language Models link William Muldrew, Peter Hayes,..., David Barber
13 2024-02-27 Automated Statistical Model Discovery with Language Models link Michael Y. Li, Emily Fox, Noah Goodman
13 2024-01-29 ReGAL: Refactoring Programs to Discover Generalizable Abstractions link Elias Stengel-Eskin, Archiki Prasad, Mohit Bansal
13 2024-02-23 How Do Nonlinear Transformers Learn and Generalize in In-Context
Learning?
link Hongkang Li, Meng Wang,..., Pin-Yu Chen
13 2024-02-05 Light and Optimal Schrödinger Bridge Matching link Nikita Gushchin, Sergei Kholkin,..., Alexander Korotin
13 2023-07-21 Universality of Linear Recurrences Followed by Non-linear Projections: Finite-Width
Guarantees and Benefits of Complex Eigenvalues
link Antonio Orvieto, Soham De,..., Samuel L Smith
13 2023-06-05 Seizing Serendipity: Exploiting the Value of Past Success in
Off-Policy Actor-Critic
link Tianying Ji, Yu Luo,..., Huazhe Xu
13 2024-03-13 A Sparsity Principle for Partially Observable Causal Representation Learning link Danru Xu, Dingling Yao,..., Sara Magliacane
13 2024-06-02 Full-Atom Peptide Design based on Multi-modal Flow Matching link Jiahan Li, Chaoran Cheng,..., Jianzhu Ma
13 2024-06-03 A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization link Sebastian Sanokowski, Sepp Hochreiter, Sebastian Lehner
13 2024-04-22 A Multimodal Automated Interpretability Agent link Tamar Rott Shaham, Sarah Schwettmann,..., Antonio Torralba
13 2023-06-07 Catapults in SGD: spikes in the training loss and
their impact on generalization through feature learning
link Libin Zhu, Chaoyue Liu,..., Mikhail Belkin
13 2023-12-20 In-Context Reinforcement Learning for Variable Action Spaces link Viacheslav Sinii, Alexander Nikulin,..., Sergey Kolesnikov
13 2024-09-09 TERD: A Unified Framework for Safeguarding Diffusion Models Against
Backdoors
link Yichuan Mo, Hui Huang,..., Yisen Wang
13 2024-05-30 Proteus: Exploring Protein Structure Generation for Enhanced Designability and
Efficiency
link Chentong Wang, Yannan Qu,..., Longxing Cao
13 2023-12-08 Membership Inference Attacks on Diffusion Models via Quantile Regression link Shuai Tang, Steven Wu,..., Aaron Roth
13 2024-02-26 Neural Operators with Localized Integral and Differential Kernels link Miguel Liu-Schiaffini, Julius Berner,..., Anima Anandkumar
13 2023-11-15 Converting Transformers to Polynomial Form for Secure Inference Over
Homomorphic Encryption
link Itamar Zimerman, Moran Baruch,..., Lior Wolf
13 2023-11-15 ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy link Kirill Vishniakov, Zhiqiang Shen, Zhuang Liu
13 2024-02-07 A Sober Look at LLMs for Material Discovery: Are
They Actually Good for Bayesian Optimization Over Molecules?
link Agustinus Kristiadi, Felix Strieth-Kalthoff,..., Geoff Pleiss
13 2023-09-29 Information Flow in Self-Supervised Learning link Zhiquan Tan, Jingqin Yang,..., Yifan Zhang
12 2024-02-11 How do Large Language Models Navigate Conflicts between Honesty
and Helpfulness?
link Ryan Liu, Theodore Sumers,..., Thomas L. Griffiths
12 2024-02-19 LoRA Training in the NTK Regime has No Spurious
Local Minima
link Uijeong Jang, Jason D. Lee, Ernest K. Ryu
12 2024-03-26 How Private are DP-SGD Implementations? link Lynn Chua, Badih Ghazi,..., Chiyuan Zhang
12 2024-07-11 Position: Measure Dataset Diversity, Don't Just Claim It link Dora Zhao, Jerone Andrews,..., Alice Xiang
12 2023-06-15 ViP: A Differentially Private Foundation Model for Computer Vision link Yaodong Yu, Maziar Sanjabi,..., Chuan Guo
12 2023-06-06 Designing Decision Support Systems using Counterfactual Prediction Sets link Eleni Straitouri, Manuel Gomez Rodriguez
12 2024-05-03 Auto-Encoding Morph-Tokens for Multimodal LLM link Kaihang Pan, Siliang Tang,..., Hanwang Zhang
12 2024-04-17 Decomposing and Editing Predictions by Modeling Model Computation link Harshay Shah, Andrew Ilyas, Aleksander Madry
12 2024-06-11 Beyond ELBOs: A Large-Scale Evaluation of Variational Methods for
Sampling
link Denis Blessing, Xiaogang Jia,..., Gerhard Neumann
12 2024-02-13 eCeLLM: Generalizing Large Language Models for E-commerce from Large-scale,
High-quality Instruction Data
link Bo Peng, Xinyi Ling,..., Xia Ning
12 2023-11-16 Structured Chemistry Reasoning with Large Language Models link Siru Ouyang, Zhuosheng Zhang,..., Lianhui Qin
12 2024-03-20 Probabilistic Forecasting with Stochastic Interpolants and Föllmer Processes link Yifan Chen, Mark Goldstein,..., Eric Vanden-Eijnden
12 2024-02-05 Can We Remove the Square-Root in Adaptive Gradient Methods?
A Second-Order Perspective
link Wu Lin, Felix Dangel,..., Alireza Makhzani
12 2024-06-01 Slow and Steady Wins the Race: Maintaining Plasticity with
Hare and Tortoise Networks
link Hojoon Lee, Hyeonseo Cho,..., Clare Lyle
12 2024-02-05 On Least Square Estimation in Softmax Gating Mixture of
Experts
link Huy Nguyen, Nhat Ho, Alessandro Rinaldo
12 2023-07-03 Trainable Transformer in Transformer link Abhishek Panigrahi, Sadhika Malladi,..., Sanjeev Arora
12 2024-06-10 A Statistical Theory of Regularization-Based Continual Learning link Xuyang Zhao, Huiyuan Wang,..., Wei Lin
12 2024-04-14 Adversarial Robustness Limits via Scaling-Law and Human-Alignment Studies link Brian R. Bartoldson, James Diffenderfer,..., Bhavya Kailkhura
12 2024-05-02 On Mechanistic Knowledge Localization in Text-to-Image Generative Models link Samyadeep Basu, Keivan Rezaei,..., Soheil Feizi
12 2024-02-04 Riemannian Preconditioned LoRA for Fine-Tuning Foundation Models link Fangzhao Zhang, Mert Pilanci
12 2024-06-28 Multimodal Prototyping for cancer survival prediction link Andrew H. Song, Richard J. Chen,..., Faisal Mahmood
12 2024-02-15 Representation Surgery: Theory and Practice of Affine Steering link Shashwat Singh, Shauli Ravfogel,..., Ponnurangam Kumaraguru
12 2024-06-14 Towards Scalable and Versatile Weight Space Learning link Konstantin Schürholt, Michael W. Mahoney, Damian Borth
12 2024-01-31 Do Language Models Exhibit the Same Cognitive Biases in
Problem Solving as Human Learners?
link Andreas Opedal, Alessandro Stolfo,..., Mrinmaya Sachan
12 2024-02-23 Human vs. Generative AI in Content Creation Competition: Symbiosis
or Conflict?
link Fan Yao, Chuanhao Li,..., Haifeng Xu
12 2023-10-18 A connection between Tempering and Entropic Mirror Descent link Nicolas Chopin, Francesca Crucinio, Anna Korba
12 2024-05-21 How Universal Polynomial Bases Enhance Spectral Graph Neural Networks:
Heterophily, Over-smoothing, and Over-squashing
link Keke Huang, Yu Guang Wang,..., Pietro Lio
12 2024-05-28 MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance link Yake Wei, Di Hu
12 2023-10-11 LLark: A Multimodal Instruction-Following Language Model for Music link Joshua P Gardner, Simon Durand,..., Rachel M Bittner
12 2024-02-05 Position: What Can Large Language Models Tell Us about
Time Series Analysis
link Ming Jin, YiFan Zhang,..., Qingsong Wen
12 2024-02-10 Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF link Han Shen, Zhuoran Yang, Tianyi Chen
12 2024-02-11 More Benefits of Being Distributional: Second-Order Bounds for Reinforcement
Learning
link Kaiwen Wang, Owen Oertell,..., Wen Sun
12 2023-11-24 StableSSM: Alleviating the Curse of Memory in State-space Models
through Stable Reparameterization
link Shida Wang, Qianxiao Li
12 2024-05-27 Q-value Regularized Transformer for Offline Reinforcement Learning link Shengchao Hu, Ziqing Fan,..., Dacheng Tao
12 2024-05-20 Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using
Spatio-Temporal Slices
link Nathaniel Cohen, Vladimir Kulikov,..., Tomer Michaeli
12 2023-12-16 Image Restoration Through Generalized Ornstein-Uhlenbeck Bridge link Conghan Yue, Zhengwei Peng,..., Dongyu Zhang
11 2024-06-06 Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation link Can Yaras, Peng Wang,..., Qing Qu
11 2023-11-23 Scalable AI Safety via Doubly-Efficient Debate link Jonah Brown-Cohen, Geoffrey Irving, Georgios Piliouras
11 2024-02-08 Offline Actor-Critic Reinforcement Learning Scales to Large Models link Jost Tobias Springenberg, Abbas Abdolmaleki,..., Martin Riedmiller
11 2023-10-03 High-Probability Convergence for Composite and Distributed Stochastic Minimization and
Variational Inequalities with Heavy-Tailed Noise
link Eduard Gorbunov, Abdurakhmon Sadiev,..., Peter Richtárik
11 2024-03-19 End-to-End Neuro-Symbolic Reinforcement Learning with Textual Explanations link Lirui Luo, Guoxi Zhang,..., Qing Li
11 2024-02-27 RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences link Jie Cheng, Gang Xiong,..., Fei-Yue Wang
11 2024-02-04 Revisiting the Power of Prompt for Visual Tuning link Yuzhu Wang, Lechao Cheng,..., Meng Wang
11 2024-10-29 Cell2Sentence: Teaching Large Language Models the Language of Biology link Daniel Levine, Syed A Rizvi,..., David van Dijk
11 2022-12-08 A New Linear Scaling Rule for Private Adaptive Hyperparameter
Optimization
link Ashwinee Panda, Xinyu Tang,..., Prateek Mittal
11 2024-04-02 What Can Transformer Learn with Varying Depth? Case Studies
on Sequence Learning Tasks
link Xingwu Chen, Difan Zou
11 2024-01-05 AST-T5: Structure-Aware Pretraining for Code Generation and Understanding link Linyuan Gong, Mostafa Elhoushi, Alvin Cheung
11 2024-02-26 CARTE: Pretraining and Transfer for Tabular Learning link Myung Jun Kim, Leo Grinsztajn, Gael Varoquaux
11 2024-03-12 BAGEL: Bootstrapping Agents by Guiding Exploration with Language link Shikhar Murty, Christopher D Manning,..., Kenton Lee
11 2024-01-08 Sampling in Unit Time with Kernel Fisher-Rao Flow link Aimee Maurais, Youssef Marzouk
11 2024-02-15 Physics-Informed Neural Network Policy Iteration: Algorithms, Convergence, and Verification link Yiming Meng, Ruikun Zhou,..., Jun Liu
11 2024-05-08 VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems
in Visual Context
link Yunxin Li, Baotian Hu,..., Min Zhang
11 2024-05-12 Learning Reward for Robot Skills Using Large Language Models
via Self-Alignment
link Yuwei Zeng, Yao Mu, Lin Shao
11 2024-02-02 MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning
in Smaller Language Models
link Justin Chen, Swarnadeep Saha,..., Mohit Bansal
11 2024-06-11 Transformers Provably Learn Sparse Token Selection While Fully-Connected Nets
Cannot
link Zixuan Wang, Stanley Wei,..., Jason D. Lee
11 2024-06-03 Do Large Language Models Perform the Way People Expect?
Measuring the Human Generalization Function
link Keyon Vafa, Ashesh Rambachan, Sendhil Mullainathan
11 2024-06-10 Compute Better Spent: Replacing Dense Layers with Structured Matrices link Shikai Qiu, Andres Potapczynski,..., Andrew Gordon Wilson
11 2024-04-16 Fewer Truncations Improve Language Modeling link Hantian Ding, Zijian Wang,..., Stefano Soatto
11 2024-02-01 Towards Efficient Exact Optimization of Language Model Alignment link Haozhe Ji, Cheng Lu,..., Minlie Huang
11 2024-02-16 Language Models as Science Tutors link Alexis Chevalier, Jiayi Geng,..., Danqi Chen
11 2024-02-06 MusicRL: Aligning Music Generation to Human Preferences link Geoffrey Cideron, Sertan Girgin,..., Andrea Agostinelli
11 2024-02-06 Improved Generalization of Weight Space Networks via Augmentations link Aviv Shamsian, Aviv Navon,..., Haggai Maron
11 None Diffusion Models Encode the Intrinsic Dimension of Data Manifolds link Jan Pawel Stanczuk, Georgios Batzolis,..., Carola-Bibiane Schönlieb
11 2022-12-21 Not Just Pretty Pictures: Toward Interventional Data Augmentation Using
Text-to-Image Generators
link Jianhao Yuan, Francesco Pinto,..., Philip Torr
11 2023-11-02 Gaussian Processes on Cellular Complexes link Mathieu Alain, So Takao,..., Marc Peter Deisenroth
11 2023-12-15 Fast Decision Boundary based Out-of-Distribution Detector link Litian Liu, Yao Qin
11 2024-06-05 Graph Neural Network Explanations are Fragile link Jiate Li, Meng Pang,..., Binghui Wang
11 2024-01-24 ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal
Models
link Rohan Wadhawan, Hritik Bansal,..., Nanyun Peng
11 2024-02-01 Transforming and Combining Rewards for Aligning Large Language Models link Zihao Wang, Chirag Nagpal,..., Victor Veitch
11 2024-03-05 PPFLOW: Target-Aware Peptide Design with Torsional Flow Matching link Haitao Lin, Odin Zhang,..., Stan Z. Li
11 2024-05-08 The Entropy Enigma: Success and Failure of Entropy Minimization link Ori Press, Ravid Shwartz-Ziv,..., Matthias Bethge
11 2024-01-26 Residual Quantization with Implicit Neural Codebooks link Iris A.M. Huijben, Matthijs Douze,..., Jakob Verbeek
11 2024-06-10 Diving into Underwater: Segment Anything Model Guided Underwater Salient
Instance Segmentation and A Large-scale Dataset
link Shijie Lian, Ziyi Zhang,..., Runmin Cong
11 2024-05-28 SleepFM: Multi-modal Representation Learning for Sleep Across Brain Activity,
ECG and Respiratory Signals
link Rahul Thapa, Bryan He,..., James Zou
11 2024-02-28 Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for
Large Language Models
link Mingjia Huo, Sai Ashish Somayajula,..., Pengtao Xie
11 2023-05-30 Plug-in Performative Optimization link Licong Lin, Tijana Zrnic
11 2023-11-28 Beyond Sole Strength: Customized Ensembles for Generalized Vision-Language Models link Zhihe Lu, Jiawang Bai,..., Xinchao Wang
11 2023-09-29 Latent Space Symmetry Discovery link Jianke Yang, Nima Dehmamy,..., Rose Yu
11 2023-12-06 Interpretability Illusions in the Generalization of Simplified Models link Dan Friedman, Andrew Kyle Lampinen,..., Asma Ghandeharioun
11 2024-02-11 Self-Correcting Self-Consuming Loops for Generative Model Training link Nate Gillman, Michael Freeman,..., Chen Sun
10 2024-03-05 Active Statistical Inference link Tijana Zrnic, Emmanuel Candes
10 2023-07-13 Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks link Liam Collins, Hamed Hassani,..., Sanjay Shakkottai
10 2024-06-11 Flextron: Many-in-One Flexible Large Language Model link Ruisi Cai, Saurav Muralidharan,..., Pavlo Molchanov
10 2024-02-16 Stochastic Localization via Iterative Posterior Sampling link Louis Grenioux, Maxence Noble,..., Alain Oliviero Durmus
10 2024-05-29 Locally Estimated Global Perturbations are Better than Local Perturbations
for Federated Sharpness-aware Minimization
link Ziqing Fan, Shengchao Hu,..., Yanfeng Wang
10 2024-05-28 FlashST: A Simple and Universal Prompt-Tuning Framework for Traffic
Prediction
link Zhonghang Li, Lianghao Xia,..., Chao Huang
10 2024-06-05 SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN link Kang You, Zekai Xu,..., Zhezhi He
10 None Purifying Quantization-conditioned Backdoors via Layer-wise Activation Correction with Distribution
Approximation
link Boheng Li, Yishuo Cai,..., Tianwei Zhang
10 2024-04-11 Lyapunov-stable Neural Control for State and Output Feedback: A
Novel Formulation
link Lujie Yang, Hongkai Dai,..., Huan Zhang
10 2024-04-24 Unifying Bayesian Flow Networks and Diffusion Models through Stochastic
Differential Equations
link Kaiwen Xue, Yuhao Zhou,..., Chongxuan Li
10 2024-03-03 Critical windows: non-asymptotic theory for feature emergence in diffusion
models
link Marvin Li, Sitan Chen
10 2023-05-26 Rotational Equilibrium: How Weight Decay Balances Learning Across Neural
Networks
link Atli Kosson, Bettina Messmer, Martin Jaggi
10 2024-01-05 Graph2Tac: Online Representation Learning of Formal Math Concepts link Lasse Blaauwbroek, Mirek Olšák,..., Vasily Pestun
10 2023-10-22 A General Theory for Softmax Gating Multinomial Logistic Mixture
of Experts
link Huy Nguyen, Pedram Akbarian,..., Nhat Ho
10 2024-06-02 Is In-Context Learning in Large Language Models Bayesian? A
Martingale Perspective
link Fabian Falck, Ziyu Wang, Christopher C. Holmes
10 2024-02-28 Out-of-Domain Generalization in Dynamical Systems Reconstruction link Niclas Alexander Göring, Florian Hess,..., Daniel Durstewitz
10 2023-07-14 Graph Positional and Structural Encoder link Semih Cantürk, Renming Liu,..., Ladislav Rampášek
10 2024-07-03 DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents link Yilun Xu, Gabriele Corso,..., Karsten Kreis
10 2024-04-11 Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models link Tanmay Gautam, Youngsuk Park,..., Wooseok Ha
10 2024-02-15 Exploration-Driven Policy Optimization in RLHF: Theoretical Insights on Efficient
Data Utilization
link Yihan Du, Anna Winnicki,..., R. Srikant
10 2024-01-25 Is Temperature Sample Efficient for Softmax Gaussian Mixture of
Experts?
link Huy Nguyen, Pedram Akbarian, Nhat Ho
10 2023-10-14 DPZero: Private Fine-Tuning of Language Models without Backpropagation link Liang Zhang, Bingcong Li,..., Niao He
10 2023-01-27 Single-Trajectory Distributionally Robust Reinforcement Learning link Zhipeng Liang, Xiaoteng Ma,..., Zhengyuan Zhou
10 2024-02-05 Understanding Reasoning Ability of Language Models From the Perspective
of Reasoning Paths Aggregation
link Xinyi Wang, Alfonso Amayuelas,..., William Yang Wang
10 2024-06-02 FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning link Yuwei Fu, Haichao Zhang,..., Benoit Boulet
10 2024-04-12 On the Independence Assumption in Neurosymbolic Learning link Emile van Krieken, Pasquale Minervini,..., Antonio Vergari
10 2024-05-30 SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for
Embodied Manipulation
link Junjie Zhang, Chenjia Bai,..., Xuelong Li
10 2024-02-02 BAT: Learning to Reason about Spatial Sounds with Large
Language Models
link Zhisheng Zheng, Puyuan Peng,..., David Harwath
10 2024-02-02 Understanding Adam Optimizer via Online Learning of Updates: Adam
is FTRL in Disguise
link Kwangjun Ahn, Zhiyu Zhang,..., Yan Dai
10 2024-01-04 Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained ReLU
Features Model
link Hien Dang, Tho Tran Huu,..., Nhat Ho
10 2024-01-21 Linear Alignment: A Closed-form Solution for Aligning Human Preferences
without Tuning and Feedback
link Songyang Gao, Qiming Ge,..., Dahua Lin
10 2023-03-15 Borda Regret Minimization for Generalized Linear Dueling Bandits link Yue Wu, Tao Jin,..., Quanquan Gu
10 2024-03-27 Understanding the Learning Dynamics of Alignment with Human Feedback link Shawn Im, Yixuan Li
10 2024-02-06 Neural Networks Learn Statistics of Increasing Complexity link Nora Belrose, Quintin Pope,..., Xiaoli Fern
10 2024-04-06 Multicalibration for Confidence Scoring in LLMs link Gianluca Detommaso, Martin Bertran Lopez,..., Aaron Roth
10 2023-05-12 MoMo: Momentum Models for Adaptive Learning Rates link Fabian Schaipp, Ruben Ohana,..., Robert M. Gower
9 2024-05-06 Rethinking Data Shapley for Data Selection Tasks: Misleads and
Merits
link Jiachen T. Wang, Tianji Yang,..., Ruoxi Jia
9 2023-10-18 Image Clustering with External Guidance link Yunfan Li, Peng Hu,..., Xi Peng
9 2024-03-28 Regression with Multi-Expert Deferral link Anqi Mao, Mehryar Mohri, Yutao Zhong
9 2024-06-11 Failures Are Fated, But Can Be Faded: Characterizing and
Mitigating Unwanted Behaviors in Large-Scale Vision and Language Models
link Som Sagar, Aditya Taparia, Ransalu Senanayake
9 2024-02-27 Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings link Kevin Frans, Seohong Park,..., Sergey Levine
9 2024-02-22 Batch and match: black-box variational inference with a score-based
divergence
link Diana Cai, Chirag Modi,..., Lawrence K. Saul
9 2023-07-11 Memorization Through the Lens of Curvature of Loss Function
Around Samples
link Isha Garg, Deepak Ravikumar, Kaushik Roy
9 2024-03-01 EfficientZero V2: Mastering Discrete and Continuous Control with Limited
Data
link Shengjie Wang, Shaohuai Liu,..., Yang Gao
9 2024-05-26 High-Performance Temporal Reversible Spiking Neural Networks with $\mathcal{O}(L)$ Training
Memory and $\mathcal{O}(1)$ Inference Cost
link JiaKui Hu, Man Yao,..., Guoqi Li
9 2024-02-22 Clifford-Steerable Convolutional Neural Networks link Maksim Zhdanov, David Ruhe,..., Patrick Forré
9 2024-02-16 RLVF: Learning from Verbal Feedback without Overgeneralization link Moritz Pascal Stephan, Alexander Khazatsky,..., Chelsea Finn
9 2024-02-05 Retrieval-Augmented Score Distillation for Text-to-3D Generation link Junyoung Seo, Susung Hong,..., Seungryong Kim
9 2024-03-28 $H$-Consistency Guarantees for Regression link Anqi Mao, Mehryar Mohri, Yutao Zhong
9 2023-12-05 UPOCR: Towards Unified Pixel-Level OCR Interface link Dezhi Peng, Zhenhua Yang,..., Lianwen Jin
9 2024-02-02 InferCept: Efficient Intercept Support for Augmented Large Language Model
Inference
link Reyna Abhyankar, Zijian He,..., Yiying Zhang
9 2024-02-20 Diffusion Posterior Sampling is Computationally Intractable link Shivam Gupta, Ajil Jalal,..., Zhiyang Xun
9 2024-02-23 Explorations of Self-Repair in Language Models link Cody Rushing, Neel Nanda
9 2024-06-04 What Improves the Generalization of Graph Transformers? A Theoretical
Dive into the Self-attention and Positional Encoding
link Hongkang Li, Meng Wang,..., Pin-Yu Chen
9 2024-05-02 Uncertainty for Active Learning on Graphs link Dominik Fuchsgruber, Tom Wollschläger,..., Stephan Günnemann
9 2024-07-07 Harmony in Diversity: Merging Neural Networks with Canonical Correlation
Analysis
link Stefan Horoi, Albert Manuel Orozco Camacho,..., Guy Wolf
9 2023-05-30 Graph-based Time Series Clustering for End-to-End Hierarchical Forecasting link Andrea Cini, Danilo Mandic, Cesare Alippi
9 2024-03-04 Reward Model Learning vs. Direct Policy Optimization: A Comparative
Analysis of Learning from Human Preferences
link Andi Nika, Debmalya Mandal,..., Adish Singla
9 2024-05-14 Reinformer: Max-Return Sequence Modeling for Offline RL link Zifeng Zhuang, Dengyun Peng,..., Donglin Wang
9 2024-02-08 Time Series Diffusion in the Frequency Domain link Jonathan Crabbé, Nicolas Huynh,..., Mihaela van der Schaar
9 2024-02-23 On the Duality Between Sharpness-Aware Minimization and Adversarial Training link Yihao Zhang, Hangzhou He,..., Zeming Wei
9 2024-02-19 Towards Theoretical Understandings of Self-Consuming Generative Models link Shi Fu, Sen Zhang,..., Dacheng Tao
9 2023-04-16 An Empirical Study of Realized GNN Expressiveness link Yanbo Wang, Muhan Zhang
9 2024-05-11 Non-confusing Generation of Customized Concepts in Diffusion Models link Wang Lin, Jingyuan Chen,..., Hanwang Zhang
9 2024-02-15 Accelerating Parallel Sampling of Diffusion Models link Zhiwei Tang, Jiasheng Tang,..., Tsung-Hui Chang
9 2023-10-13 Split-and-Denoise: Protect large language model inference with local differential
privacy
link Peihua Mai, Ran Yan,..., Yan Pang
9 2023-11-17 Stable Differentiable Causal Discovery link Achille Nazaret, Justin Hong,..., David Blei
9 2023-12-13 The Relative Value of Prediction in Algorithmic Decision Making link Juan Carlos Perdomo
9 2024-02-07 Multi-Sender Persuasion: A Computational Perspective link Safwan Hossain, Tonghan Wang,..., Haifeng Xu
9 2024-02-22 Prompting a Pretrained Transformer Can Be a Universal Approximator link Aleksandar Petrov, Philip Torr, Adel Bibi
9 2023-10-03 Discovering Symmetry Breaking in Physical Systems with Relaxed Group
Convolution
link Rui Wang, Elyssa Hofgard,..., Tess Smidt
9 2023-10-26 HyperFields: Towards Zero-Shot Generation of NeRFs from Text link Sudarshan Babu, Richard Liu,..., Rana Hanocka
9 2024-01-30 StrokeNUWA—Tokenizing Strokes for Vector Graphic Synthesis link Zecheng Tang, Chenfei Wu,..., Nan Duan
9 2024-02-06 CasCast: Skillful High-resolution Precipitation Nowcasting via Cascaded Modelling link Junchao Gong, Lei Bai,..., Wanli Ouyang
9 2024-05-28 HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning link Shengchao Hu, Ziqing Fan,..., Dacheng Tao
9 2024-02-07 Data-efficient Large Vision Models through Sequential Autoregression link Zhiwei Hao, Jianyuan Guo,..., Chang Xu
9 2024-02-06 DySLIM: Dynamics Stable Learning by Invariant Measure for Chaotic
Systems
link Yair Schiff, Zhong Yi Wan,..., Leonardo Zepeda-Núñez
9 2024-02-20 Revitalizing Multivariate Time Series Forecasting: Learnable Decomposition with Inter-Series
Dependencies and Intra-Series Variations Modeling
link Guoqi Yu, Jing Zou,..., Shujun Wang
9 2024-02-02 Two Heads Are Better Than One: Boosting Graph Sparse
Training via Semantic and Topological Awareness
link Guibin Zhang, Yanwei Yue,..., Tianlong Chen