Last updated: 2024-12-09 08:43:37. Maintained by Weisen Jiang.

citation date review title (pdf) authors
407 2024-01-17 link Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model Lianghui Zhu, Bencheng Liao,..., Xinggang Wang
402 2023-05-23 link Improving Factuality and Reasoning in Language Models through Multiagent Debate Yilun Du, Shuang Li,..., Igor Mordatch
400 2024-03-05 link Scaling Rectified Flow Transformers for High-Resolution Image Synthesis Patrick Esser, Sumith Kulal,..., Robin Rombach
399 2023-08-04 link MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities Weihao Yu, Zhengyuan Yang,..., Lijuan Wang
308 2023-09-11 link NExT-GPT: Any-to-Any Multimodal LLM Shengqiong Wu, Hao Fei,..., Tat-Seng Chua
237 2024-03-07 link Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference Wei-Lin Chiang, Lianmin Zheng,..., Ion Stoica
207 2024-01-18 link Self-Rewarding Language Models Weizhe Yuan, Richard Yuanzhe Pang,..., Jason E Weston
202 2024-02-14 link DoRA: Weight-Decomposed Low-Rank Adaptation Shih-yang Liu, Chien-Yi Wang,..., Min-Hung Chen
197 2023-05-22 link How Language Model Hallucinations Can Snowball Muru Zhang, Ofir Press,..., Noah A. Smith
192 2024-05-31 link Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality Tri Dao, Albert Gu
187 2023-12-14 link Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision Collin Burns, Pavel Izmailov,..., Jeffrey Wu
170 2024-02-06 link HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal Mantas Mazeika, Long Phan,..., Dan Hendrycks
157 2023-11-06 link Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch Le Yu, Bowen Yu,..., Yongbin Li
155 2024-01-19 link Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads Tianle Cai, Yuhong Li,..., Tri Dao
135 2023-12-21 link VideoPoet: A Large Language Model for Zero-Shot Video Generation Dan Kondratyuk, Lijun Yu,..., Lu Jiang
124 2024-01-03 link GPT-4V(ision) is a Generalist Web Agent, if Grounded Boyuan Zheng, Boyu Gou,..., Yu Su
121 2023-09-28 link Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution Chrisantha Fernando, Dylan Sunil Banarse,..., Tim Rocktäschel
119 2023-06-13 link SqueezeLLM: Dense-and-Sparse Quantization Sehoon Kim, Coleman Richard Charles Hooper,..., Kurt Keutzer
119 2023-04-19 link Fundamental Limitations of Alignment in Large Language Models Yotam Wolf, Noam Wies,..., Amnon Shashua
116 2024-01-16 link Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation Haoran Xu, Amr Sharaf,..., Young Jin Kim
115 2023-10-14 link A decoder-only foundation model for time-series forecasting Abhimanyu Das, Weihao Kong,..., Yichen Zhou
104 2024-02-06 link LESS: Selecting Influential Data for Targeted Instruction Tuning Mengzhou Xia, Sadhika Malladi,..., Danqi Chen
103 2024-03-06 link GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Jiawei Zhao, Zhenyu Zhang,..., Yuandong Tian
96 2023-10-06 link Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models Andy Zhou, Kai Yan,..., Yu-Xiong Wang
95 2023-12-18 link Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint Wei Xiong, Hanze Dong,..., Tong Zhang
93 2024-03-05 link NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models Zeqian Ju, Yuancheng Wang,..., Sheng Zhao
91 2024-02-21 link LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Yiran Ding, Li Lyna Zhang,..., Mao Yang
87 2024-02-03 link Break the Sequential Dependency of LLM Inference Using Lookahead Decoding Yichao Fu, Peter Bailis,..., Hao Zhang
85 2023-12-11 link Gated Linear Attention Transformers with Hardware-Efficient Training Songlin Yang, Bailin Wang,..., Yoon Kim
84 2023-12-01 link Nash Learning from Human Feedback Remi Munos, Michal Valko,..., Bilal Piot
83 2024-02-08 link SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models Dongyang Liu, Renrui Zhang,..., Peng Gao
82 2024-02-19 link LoRA+: Efficient Low Rank Adaptation of Large Models Soufiane Hayou, Nikhil Ghosh, Bin Yu
80 2023-11-07 link The Linear Representation Hypothesis and the Geometry of Large Language Models Kiho Park, Yo Joong Choe, Victor Veitch
78 2024-02-15 link Data Engineering for Scaling Language Models to 128K Context Yao Fu, Rameswar Panda,..., Hao Peng
77 2024-02-04 link Unified Training of Universal Time Series Forecasting Transformers Gerald Woo, Chenghao Liu,..., Doyen Sahoo
76 2024-02-05 link KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache Zirui Liu, Jiayi Yuan,..., Xia Hu
73 2023-10-11 link In-Context Unlearning: Language Models as Few Shot Unlearners Martin Pawelczyk, Seth Neel, Himabindu Lakkaraju
72 2023-09-25 link Physics of Language Models: Part 3.1, Knowledge Storage and Extraction Zeyuan Allen-Zhu, Yuanzhi Li
72 2024-02-23 link Genie: Generative Interactive Environments Jake Bruce, Michael D Dennis,..., Tim Rocktäschel
70 2023-12-28 link Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels Haoning Wu, Zicheng Zhang,..., Weisi Lin
70 2024-04-16 link Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study Shusheng Xu, Wei Fu,..., Yi Wu
70 2024-01-22 link Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs Ling Yang, Zhaochen Yu,..., Bin Cui
69 2024-02-02 link TravelPlanner: A Benchmark for Real-World Planning with Language Agents Jian Xie, Kai Zhang,..., Yu Su
69 2024-01-22 link WARM: On the Benefits of Weight Averaged Reward Models Alexandre Rame, Nino Vieillard,..., Johan Ferret
69 2024-03-05 link The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning Nathaniel Li, Alexander Pan,..., Dan Hendrycks
67 2024-01-26 link EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty Yuhui Li, Fangyun Wei,..., Hongyang Zhang
67 2024-02-01 link Executable Code Actions Elicit Better LLM Agents Xingyao Wang, Yangyi Chen,..., Heng Ji
66 2024-02-07 link Fast Timing-Conditioned Latent Audio Diffusion Zach Evans, CJ Carr,..., Jordi Pons
65 2023-11-18 link An Embodied Generalist Agent in 3D World Jiangyong Huang, Silong Yong,..., Siyuan Huang
65 2024-01-02 link LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning Hongye Jin, Xiaotian Han,..., Xia Hu
65 2024-01-03 link A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity Andrew Lee, Xiaoyan Bai,..., Rada Mihalcea
64 2024-04-22 link Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data Fahim Tajwar, Anikait Singh,..., Aviral Kumar
59 2023-11-02 link RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation Yufei Wang, Zhou Xian,..., Chuang Gan
59 2024-02-12 link Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models Siddharth Karamcheti, Suraj Nair,..., Dorsa Sadigh
59 2024-01-08 link A Minimaximalist Approach to Reinforcement Learning from Human Feedback Gokul Swamy, Christoph Dann,..., Alekh Agarwal
59 2023-09-29 link Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training Ziyu Wan, Xidong Feng,..., Jun Wang
58 2024-02-22 link GaussianPro: 3D Gaussian Splatting with Progressive Propagation Kai Cheng, Xiaoxiao Long,..., Xuejin Chen
56 2024-02-12 link PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs Soroush Nasiriany, Fei Xia,..., Brian Ichter
56 2024-02-06 link MOMENT: A Family of Open Time-series Foundation Models Mononito Goswami, Konrad Szafer,..., Artur Dubrawski
56 2024-01-11 link Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models Asma Ghandeharioun, Avi Caciularu,..., Mor Geva
55 2023-09-01 link Image Hijacks: Adversarial Images can Control Generative Models at Runtime Luke Bailey, Euan Ong,..., Scott Emmons
54 2024-02-08 link Generalized Preference Optimization: A Unified Approach to Offline Alignment Yunhao Tang, Zhaohan Daniel Guo,..., Bilal Piot
54 2023-10-29 link Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game Zelai Xu, Chao Yu,..., Yi Wu
53 2024-02-07 link Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications Boyi Wei, Kaixuan Huang,..., Peter Henderson
52 2023-11-11 link In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering Sheng Liu, Haotian Ye,..., James Y. Zou
52 2023-07-20 link SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models Xiaoxuan Wang, Ziniu Hu,..., Wei Wang
52 2024-01-11 link Extreme Compression of Large Language Models via Additive Quantization Vage Egiazarian, Andrei Panferov,..., Dan Alistarh
51 2023-10-08 link Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity Lu Yin, You Wu,..., Shiwei Liu
50 2024-01-29 link OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models Fuzhao Xue, Zian Zheng,..., Yang You
50 2024-02-06 link Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks Jongho Park, Jaeseung Park,..., Dimitris Papailiopoulos
48 2024-01-16 link Scalable Pre-training of Large Autoregressive Image Models Alaaeldin El-Nouby, Michal Klein,..., Armand Joulin
48 2024-02-02 link Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities Zhifeng Kong, Arushi Goel,..., Bryan Catanzaro
47 2024-03-11 link Stealing Part of a Production Language Model Nicholas Carlini, Daniel Paleka,..., Florian Tramèr
47 2023-12-07 link Chain of Code: Reasoning with a Language Model-Augmented Code Emulator Chengshu Li, Jacky Liang,..., Brian Ichter
46 2024-02-07 link AlphaFold Meets Flow Matching for Generating Protein Ensembles Bowen Jing, Bonnie Berger, Tommi Jaakkola
46 2024-04-30 link Better & Faster Large Language Models via Multi-token Prediction Fabian Gloeckle, Badr Youbi Idrissi,..., Gabriel Synnaeve
46 2024-01-22 link Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text Abhimanyu Hans, Avi Schwarzschild,..., Tom Goldstein
46 2024-02-01 link Repeat After Me: Transformers are Better than State Space Models at Copying Samy Jelassi, David Brandfonbrener,..., Eran Malach
45 2024-04-24 link MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI Kaining Ying, Fanqing Meng,..., Wenqi Shao
44 2024-02-07 link Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design Andrew Campbell, Jason Yim,..., Tommi Jaakkola
44 2023-09-12 link Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models by Finding Problematic Prompts Zhi-Yi Chin, Chieh Ming Jiang,..., Wei-Chen Chiu
44 2024-03-11 link Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews Weixin Liang, Zachary Izzo,..., James Y. Zou
43 2023-12-31 link Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws Nikhil Sardana, Jacob Portes,..., Jonathan Frankle
42 2023-08-20 link Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models Bilgehan Sel, Ahmad Tawaha,..., Ming Jin
42 2023-10-08 link In-Context Convergence of Transformers Yu Huang, Yuan Cheng, Yingbin Liang
42 2023-12-04 link Magicoder: Empowering Code Generation with OSS-Instruct Yuxiang Wei, Zhe Wang,..., Lingming Zhang
41 2023-10-25 link Controlled Decoding from Language Models Sidharth Mudgal, Jong Lee,..., Ahmad Beirami
40 2024-02-22 link tinyBenchmarks: evaluating LLMs with fewer examples Felipe Maia Polo, Lucas Weber,..., Mikhail Yurochkin
40 2024-03-13 link Human Alignment of Large Language Models through Online Preference Optimisation Daniele Calandriello, Zhaohan Daniel Guo,..., Bilal Piot
40 2024-02-13 link COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability Xingang Guo, Fangxu Yu,..., Bin Hu
39 2023-07-17 link Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations Yanda Chen, Ruiqi Zhong,..., Kathleen McKeown
39 2024-02-10 link A Tale of Tails: Model Collapse as a Change of Scaling Laws Elvis Dohmatob, Yunzhen Feng,..., Julia Kempe
39 2024-01-05 link CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution Alex Gu, Baptiste Roziere,..., Sida Wang
39 2024-02-13 link IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation Luke Melas-Kyriazi, Iro Laina,..., Filippos Kokkinos
37 2024-02-15 link Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment Rui Yang, Xiaoman Pan,..., Jianshu Chen
37 2024-02-28 link Simple linear attention language models balance the recall-throughput tradeoff Simran Arora, Sabri Eyuboglu,..., Christopher Re
37 2024-02-06 link BiLLM: Pushing the Limit of Post-Training Quantization for LLMs Wei Huang, Yangdong Liu,..., Xiaojuan Qi
36 2023-06-30 link Stay on topic with Classifier-Free Guidance Guillaume Sanchez, Alexander Spangher,..., Stella Biderman
36 2024-03-05 link Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling Yair Schiff, Chia Hsiang Kao,..., Volodymyr Kuleshov
36 2023-05-24 link Robust Classification via a Single Diffusion Model Huanran Chen, Yinpeng Dong,..., Jun Zhu
36 2023-06-05 link InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models Lichang Chen, Jiuhai Chen,..., Tianyi Zhou
36 2023-07-31 link Learning to Model the World with Language Jessy Lin, Yuqing Du,..., Anca Dragan
35 2024-01-31 link On Prompt-Driven Safeguarding for Large Language Models Chujie Zheng, Fan Yin,..., Nanyun Peng
34 2023-10-19 link HumanTOMATO: Text-aligned Whole-body Motion Generation Shunlin Lu, Ling-Hao Chen,..., Heung-Yeung Shum
34 2024-02-22 link MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases Zechun Liu, Changsheng Zhao,..., Vikas Chandra
34 2024-03-06 link Stop Regressing: Training Value Functions via Classification for Scalable Deep RL Jesse Farebrother, Jordi Orbay,..., Rishabh Agarwal
33 2023-11-08 link NExT-Chat: An LMM for Chat, Detection and Segmentation Ao Zhang, Yuan Yao,..., Tat-Seng Chua
33 2023-06-09 link Prodigy: An Expeditiously Adaptive Parameter-Free Learner Konstantin Mishchenko, Aaron Defazio
33 2023-12-07 link An LLM Compiler for Parallel Function Calling Sehoon Kim, Suhong Moon,..., Amir Gholami
33 2023-10-11 link Online Speculative Decoding Xiaoxuan Liu, Lanxiang Hu,..., Hao Zhang
32 2024-03-05 link Behavior Generation with Latent Actions Seungjae Lee, Yibin Wang,..., Lerrel Pinto
32 2023-10-11 link InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining Boxin Wang, Wei Ping,..., Bryan Catanzaro
32 2024-03-11 link The pitfalls of next-token prediction Gregor Bachmann, Vaishnavh Nagarajan
32 2024-03-01 link HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding Zhaorun Chen, Zhuokai Zhao,..., Jiawei Zhou
31 2024-02-03 link Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models Yongshuo Zong, Ondrej Bohdal,..., Timothy Hospedales
31 2024-03-14 link 3D-VLA: A 3D Vision-Language-Action Generative World Model Haoyu Zhen, Xiaowen Qiu,..., Chuang Gan
30 2024-02-11 link GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting Xiaoyu Zhou, Xingjian Ran,..., Ming-Hsuan Yang
30 2024-02-13 link LLaGA: Large Language and Graph Assistant Runjin Chen, Tong Zhao,..., Zhangyang Wang
30 2023-12-11 link Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context Xiang Cheng, Yuxin Chen, Suvrit Sra
30 2023-11-15 link Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling Bairu Hou, Yujian Liu,..., Yang Zhang
30 2023-10-05 link MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation Qian Huang, Jian Vora,..., Jure Leskovec
30 2023-04-05 link Algorithm and Hardness for Dynamic Attention Maintenance in Large Language Models Jan van den Brand, Zhao Song, Tianyi Zhou
30 2024-02-07 link Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning Hao Zhao, Maksym Andriushchenko,..., Nicolas Flammarion
29 2024-02-14 link Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference Harry Dong, Xinyu Yang,..., Beidi Chen
29 2024-02-07 link MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark Dongping Chen, Ruoxi Chen,..., Lichao Sun
29 2024-01-23 link DsDm: Model-Aware Dataset Selection with Datamodels Logan Engstrom
29 2024-03-01 link Provably Robust DPO: Aligning Language Models with Noisy Feedback Sayak Ray Chowdhury, Anush Kini, Nagarajan Natarajan
29 2024-02-08 link WebLINX: Real-World Website Navigation with Multi-Turn Dialogue Xing Han Lu, Zdeněk Kasner, Siva Reddy
28 2024-01-04 link Evolution of Heuristics: Towards Efficient Automatic Algorithm Design Using Large Language Model Fei Liu, Tong Xialiang,..., Qingfu Zhang
28 2024-01-23 link In-Context Language Learning: Architectures and Algorithms Ekin Akyürek, Bailin Wang,..., Jacob Andreas
28 2024-01-21 link Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers Katherine Crowson, Stefan Andreas Baumann,..., Enrico Shippole
28 2024-04-05 link Score identity Distillation: Exponentially Fast Distillation of Pretrained Diffusion Models for One-Step Generation Mingyuan Zhou, Huangjie Zheng,..., Hai Huang
28 2024-02-12 link Scaling Laws for Fine-Grained Mixture of Experts Jan Ludziejewski, Jakub Krajewski,..., Sebastian Jaszczur
28 2024-02-08 link Dirichlet Flow Matching with Applications to DNA Sequence Design Hannes Stark, Bowen Jing,..., Tommi Jaakkola
27 2024-02-13 link Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast Xiangming Gu, Xiaosen Zheng,..., Min Lin
27 2023-05-17 link Reprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs Sampling Weijia Xu, Andrzej Banburski, Nebojsa Jojic
27 2024-03-14 link Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference Piotr Nawrot, Adrian Łańcucki,..., Edoardo Ponti
27 2024-02-05 link Large Language Models are Geographically Biased Rohin Manvi, Samar Khanna,..., Stefano Ermon
27 2024-03-12 link WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks? Alexandre Drouin, Maxime Gasse,..., Alexandre Lacoste
26 2024-02-11 link ODIN: Disentangled Reward Mitigates Hacking in RLHF Lichang Chen, Chen Zhu,..., Bryan Catanzaro
26 2023-09-13 link Auto-Regressive Next-Token Predictors are Universal Learners Eran Malach
26 2024-03-19 link RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content Zhuowen Yuan, Zidi Xiong,..., Bo Li
25 2024-02-04 link Timer: Generative Pre-trained Transformers Are Large Time Series Models Yong Liu, Haoran Zhang,..., Mingsheng Long
25 2023-10-25 link Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution Aaron Lou, Chenlin Meng, Stefano Ermon
24 2024-03-05 link MathScale: Scaling Instruction Tuning for Mathematical Reasoning Zhengyang Tang, Xingxing Zhang,..., Furu Wei
24 2023-10-02 link Prompt-tuning latent diffusion models for inverse problems Hyungjin Chung, Jong Chul Ye,..., Mauricio Delbracio
24 2024-02-29 link Watermark Stealing in Large Language Models Nikola Jovanović, Robin Staab, Martin Vechev
24 2024-02-06 link AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls Yu Du, Fangyun Wei, Hongyang Zhang
24 2022-10-10 link Sequential Neural Score Estimation: Likelihood-Free Inference with Conditional Score Based Diffusion Models Louis Sharrock, Jack Simons,..., Mark Beaumont
24 2024-02-14 link Feature Reuse and Scaling: Understanding Transfer Learning with Protein Language Models Francesca-Zhoufan Li, Ava P Amini,..., Alex Xijie Lu
23 2023-12-08 link SparQ Attention: Bandwidth-Efficient LLM Inference Luka Ribar, Ivan Chelombiev,..., Douglas Orr
23 2024-01-22 link DITTO: Diffusion Inference-Time T-Optimization for Music Generation Zachary Novack, Julian McAuley,..., Nicholas J. Bryan
23 2024-02-28 link Evaluating Quantized Large Language Models Shiyao Li, Xuefei Ning,..., Yu Wang
23 2024-01-09 link RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation Mahdi Nikdan, Soroush Tabesh,..., Dan Alistarh
23 2024-02-03 link BetterV: Controlled Verilog Generation with Discriminative Guidance Zehua Pei, Huiling Zhen,..., Bei Yu
23 2024-02-02 link A Dynamical Model of Neural Scaling Laws Blake Bordelon, Alexander Atanasov, Cengiz Pehlevan
22 2024-02-29 link ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL Yifei Zhou, Andrea Zanette,..., Aviral Kumar
22 2024-04-04 link Outlier-Efficient Hopfield Layers for Large Transformer-Based Models Jerry Yao-Chieh Hu, Pei-Hsuan Chang,..., Han Liu
22 2024-02-06 link RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback Yufei Wang, Zhanyi Sun,..., Zackory Erickson
22 2024-04-12 link The Illusion of State in State-Space Models William Merrill, Jackson Petty, Ashish Sabharwal
22 2024-01-30 link Proactive Detection of Voice Cloning with Localized Watermarking Robin San Roman, Pierre Fernandez,..., Tuan Tran
22 2024-03-06 link Accelerating Convergence of Score-Based Diffusion Models, Provably Gen Li, Yu Huang,..., Yuxin Chen
22 2023-11-18 link MagicPose: Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion Di Chang, Yichun Shi,..., Mohammad Soleymani
21 2024-02-02 link Boximator: Generating Rich and Controllable Motions for Video Synthesis Jiawei Wang, Yuchen Zhang,..., Hang Li
21 2024-02-05 link Flora: Low-Rank Adapters Are Secretly Gradient Compressors Yongchang Hao, Yanshuai Cao, Lili Mou
21 2024-02-09 link Iterated Denoising Energy Matching for Sampling from Boltzmann Densities Tara Akhound-Sadegh, Jarrid Rector-Brooks,..., Alexander Tong
21 2024-02-18 link Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning Long Qian, Juncheng Li,..., Siliang Tang
21 2023-06-07 link Don't trust your eyes: on the (un)reliability of feature visualizations Robert Geirhos, Roland S. Zimmermann,..., Been Kim
21 2024-02-05 link Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization Yang Jin, Zhicheng Sun,..., Yadong Mu
21 2024-02-07 link On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis Jerry Yao-Chieh Hu, Thomas Lin,..., Han Liu
21 2024-02-27 link Training-Free Long-Context Scaling of Large Language Models Chenxin An, Fei Huang,..., Lingpeng Kong
21 2024-02-27 link Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations Jiaqi Zhai, Lucy Liao,..., Yu Shi
21 2024-02-05 link Decoding-time Realignment of Language Models Tianlin Liu, Shangmin Guo,..., Mathieu Blondel
20 2024-02-28 link CogBench: a large language model walks into a psychology lab Julian Coda-Forno, Marcel Binz,..., Eric Schulz
20 2023-02-26 link Diffusion Model-Augmented Behavioral Cloning Shang-Fu Chen, Hsiang-Chun Wang,..., Shao-Hua Sun
20 2024-02-02 link Challenges in Training PINNs: A Loss Landscape Perspective Pratik Rathore, Weimu Lei,..., Madeleine Udell
20 2023-10-16 link ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models Ziniu Li, Tian Xu,..., Zhi-Quan Luo
20 2024-02-18 link Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark Yihua Zhang, Pingzhi Li,..., Tianlong Chen
20 2024-04-18 link Token-level Direct Preference Optimization Yongcheng Zeng, Guoqing Liu,..., Jun Wang
20 2024-02-13 link GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements Alexander Havrilla, Sharath Chandra Raparthy,..., Roberta Raileanu
19 2024-02-05 link C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models Mintong Kang, Nezihe Merve Gürel,..., Bo Li
19 2024-02-14 link Transformers, parallel computation, and logarithmic depth Clayton Sanford, Daniel Hsu, Matus Telgarsky
19 2024-03-17 link MindEye2: Shared-Subject Models Enable fMRI-To-Image With 1 Hour of Data Paul Steven Scotti, Mihir Tripathy,..., Tanishq Mathew Abraham
19 2024-02-14 link Premise Order Matters in Reasoning with Large Language Models Xinyun Chen, Ryan Andrew Chi,..., Denny Zhou
19 2024-02-26 link Asymmetry in Low-Rank Adapters of Foundation Models Jiacheng Zhu, Kristjan Greenewald,..., Justin Solomon
19 2023-10-05 link Stochastic interpolants with data-dependent couplings Michael Samuel Albergo, Mark Goldstein,..., Eric Vanden-Eijnden
19 2024-01-11 link DiffDA: a diffusion model for weather-scale data assimilation Langwen Huang, Lukas Gianinazzi,..., Torsten Hoefler
19 2024-02-01 link Merging Multi-Task Models via Weight-Ensembling Mixture of Experts Anke Tang, Li Shen,..., Dacheng Tao
19 2023-10-26 link Codebook Features: Sparse and Discrete Interpretability for Neural Networks Alex Tamkin, Mohammad Taufeeque, Noah Goodman
18 2023-10-09 link Generalized Neural Collapse for a Large Number of Classes Jiachen Jiang, Jinxin Zhou,..., Zhihui Zhu
18 2024-02-01 link Dense Reward for Free in Reinforcement Learning from Human Feedback Alex James Chan, Hao Sun,..., Mihaela van der Schaar
18 2023-12-12 link AI Control: Improving Safety Despite Intentional Subversion Ryan Greenblatt, Buck Shlegeris,..., Fabien Roger
18 2024-02-05 link The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents Yatin Dandi, Emanuele Troiani,..., Florent Krzakala
18 2024-02-08 link Accurate LoRA-Finetuning Quantization of LLMs via Information Retention Haotong Qin, Xudong Ma,..., Michele Magno
18 2023-12-11 link Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes Zhen Qin, Daoyuan Chen,..., Shuiguang Deng
18 2024-02-15 link DE-COP: Detecting Copyrighted Content in Language Models Training Data André Vicente Duarte, Xuandong Zhao,..., Lei Li
18 2024-02-15 link Language Models with Conformal Factuality Guarantees Christopher Mohri, Tatsunori Hashimoto
17 2024-02-03 link A Closer Look at the Limitations of Instruction Tuning Sreyan Ghosh, Chandra Kiran Reddy Evuru,..., Dinesh Manocha
17 2024-02-21 link D-Flow: Differentiating through Flows for Controlled Generation Heli Ben-Hamu, Omri Puny,..., Yaron Lipman
17 2020-11-29 link Scaling Down Deep Learning with MNIST-1D Samuel James Greydanus, Dmitry Kobak
17 2024-02-08 link In-Context Principle Learning from Mistakes Tianjun Zhang, Aman Madaan,..., Uri Alon
17 2024-01-05 link VoroNav: Voronoi-based Zero-shot Object Navigation with Large Language Model Pengying Wu, Yao Mu,..., Chang Liu
17 2024-02-05 link Guidance with Spherical Gaussian Constraint for Conditional Diffusion Lingxiao Yang, Shutong Ding,..., Ye Shi
17 2024-04-10 link What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation Aaditya K Singh, Ted Moskovitz,..., Andrew M Saxe
17 2024-02-13 link Mixtures of Experts Unlock Parameter Scaling for Deep RL Johan Samir Obando Ceron, Ghada Sokar,..., Pablo Samuel Castro
17 2024-01-10 link InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks Xueyu Hu, Ziyu Zhao,..., Fei Wu
16 2024-03-03 link Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models Yuchen Wu, Minshuo Chen,..., Yuting Wei
16 2023-10-10 link Conformal Prediction for Deep Classifier via Label Ranking Jianguo Huang, HuaJun Xi,..., Hongxin Wei
16 2024-02-07 link Navigating Complexity: Toward Lossless Graph Condensation via Expanding Window Matching Yuchen Zhang, Tianle Zhang,..., Yang You
16 2024-02-20 link A Touch, Vision, and Language Dataset for Multimodal Alignment Letian Fu, Gaurav Datta,..., Ken Goldberg
16 2022-11-09 link Few-Shot Character Understanding in Movies as an Assessment to Meta-Learning of Theory-of-Mind Mo Yu, Qiujing Wang,..., Jie Zhou
16 2023-10-11 link A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks Behrad Moniri, Donghwan Lee,..., Edgar Dobriban
16 2023-10-23 link DoGE: Domain Reweighting with Generalization Estimation Simin Fan, Matteo Pagliardini, Martin Jaggi
16 2023-12-19 link Curated LLM: Synergy of LLMs and Data Curation for tabular augmentation in ultra low-data regimes Nabeel Seedat, Nicolas Huynh,..., Mihaela van der Schaar
16 2024-03-04 link Differentially Private Synthetic Data via Foundation Model APIs 2: Text Chulin Xie, Zinan Lin,..., Sergey Yekhanin
16 2024-04-16 link Position: Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback Vincent Conitzer, Rachel Freedman,..., William S. Zwicker
16 2024-02-08 link How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis Federico Bianchi, Patrick John Chia,..., James Zou
15 2023-12-06 link Generalization to New Sequential Decision Making Tasks with In-Context Learning Sharath Chandra Raparthy, Eric Hambro,..., Roberta Raileanu
15 2024-04-15 link All-in-one simulation-based inference Manuel Gloeckler, Michael Deistler,..., Jakob H. Macke
15 2024-02-20 link VideoPrism: A Foundational Visual Encoder for Video Understanding Long Zhao, Nitesh Bharadwaj Gundavarapu,..., Boqing Gong
15 2023-12-08 link EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism Yanxi Chen, Xuchen Pan,..., Jingren Zhou
15 2024-03-06 link On the Origins of Linear Representations in Large Language Models Yibo Jiang, Goutham Rajendran,..., Victor Veitch
15 2024-05-02 link SparseTSF: Modeling Long-term Time Series Forecasting with 1k Parameters Shengsheng Lin, Weiwei Lin,..., Junjie Yang
15 2023-12-06 link Low-Cost High-Power Membership Inference Attacks Sajjad Zarifzadeh, Philippe Liu, Reza Shokri
15 2024-02-23 link Fast Adversarial Attacks on Language Models In One GPU Minute Vinu Sankar Sadasivan, Shoumik Saha,..., Soheil Feizi
15 2024-02-04 link Transolver: A Fast Transformer Solver for PDEs on General Geometries Haixu Wu, Huakun Luo,..., Mingsheng Long
15 2023-04-03 link Chain-of-Thought Predictive Control Zhiwei Jia, Vineet Thumuluri,..., Hao Su
15 2024-02-26 link Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning Michael Matthews, Michael Beukman,..., Jakob Nicolaus Foerster
15 2024-05-18 link Towards Modular LLMs by Building and Reusing a Library of LoRAs Oleksiy Ostapenko, Zhan Su,..., Alessandro Sordoni
15 2024-01-28 link An Information-Theoretic Analysis of In-Context Learning Hong Jun Jeon, Jason D. Lee,..., Benjamin Van Roy
15 2023-05-27 link CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers Dachuan Shi, Chaofan Tao,..., Jiaqi Wang
15 2024-02-08 link Self-Alignment of Large Language Models via Monopolylogue-based Social Scene Simulation Xianghe Pang, Shuo Tang,..., Siheng Chen
15 2024-02-28 link CLLMs: Consistency Large Language Models Siqi Kou, Lanxiang Hu,..., Hao Zhang
14 2024-05-13 link Localizing Task Information for Improved Model Merging and Compression Ke Wang, Nikolaos Dimitriadis,..., Pascal Frossard
14 2023-10-16 link A Computational Framework for Solving Wasserstein Lagrangian Flows Kirill Neklyudov, Rob Brekelmans,..., Alireza Makhzani
14 2024-02-15 link Zero-Shot Unsupervised and Text-Based Audio Editing Using DDPM Inversion Hila Manor, Tomer Michaeli
14 2024-02-08 link Memory Consolidation Enables Long-Context Video Understanding Ivana Balazevic, Yuge Shi,..., Olivier J Henaff
14 2024-02-15 link A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts Kuang-Huei Lee, Xinyun Chen,..., Ian Fischer
14 2023-10-02 link Fool Your (Vision and) Language Model With Embarrassingly Simple
Permutations
Yongshuo Zong, Tingyang Yu,..., Timothy Hospedales
14 2024-01-29 link Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in
RLHF
Banghua Zhu, Michael Jordan, Jiantao Jiao
14 2024-02-26 link Feedback Efficient Online Fine-Tuning of Diffusion Models Masatoshi Uehara, Yulai Zhao,..., Tommaso Biancalani
14 2023-06-02 link Revisiting the Role of Language Priors in Vision-Language Models Zhiqiu Lin, Xinyue Chen,..., Deva Ramanan
14 2024-01-18 link Improving fine-grained understanding in image-text pre-training Ioana Bica, Anastasija Ilic,..., Jovana Mitrovic
14 2024-03-06 link DPOT: Auto-Regressive Denoising Operator Transformer for Large-Scale PDE Pre-Training Zhongkai Hao, Chang Su,..., Jun Zhu
14 2024-01-24 link Can AI Assistants Know What They Don't Know? Qinyuan Cheng, Tianxiang Sun,..., Xipeng Qiu
14 2024-02-19 link Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for
Robust Large Vision-Language Models
Christian Schlarmann, Naman Deep Singh,..., Matthias Hein
14 2024-02-07 link CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay Natasha Butt, Blazej Manczak,..., Taco Cohen
14 2024-02-14 link Position: Topological Deep Learning is the New Frontier for
Relational Learning
Theodore Papamarkou, Tolga Birdal,..., Ghada Zamzmi
13 2024-03-11 link Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical
Knowledge Enhancement
Che Liu, Zhongwei Wan,..., Rossella Arcucci
13 2024-02-19 link In value-based deep reinforcement learning, a pruned network is
a good network
Johan Samir Obando Ceron, Aaron Courville, Pablo Samuel Castro
13 2024-02-27 link DS-Agent: Automated Data Science by Empowering Large Language Models
with Case-Based Reasoning
Siyuan Guo, Cheng Deng,..., Jun Wang
13 2024-03-03 link In-Context Sharpness as Alerts: An Inner Representation Perspective for
Hallucination Mitigation
Shiqi Chen, Miao Xiong,..., Junxian He
13 2024-02-05 link Distinguishing the Knowable from the Unknowable with Language Models Gustaf Ahdritz, Tian Qin,..., Benjamin L. Edelman
13 2023-07-21 link Universality of Linear Recurrences Followed by Non-linear Projections: Finite-Width
Guarantees and Benefits of Complex Eigenvalues
Antonio Orvieto, Soham De,..., Samuel L Smith
13 2024-02-19 link Model Tailor: Mitigating Catastrophic Forgetting in Multi-modal Large Language
Models
Didi Zhu, Zhongyisun Sun,..., Kun Kuang
13 2024-02-13 link A Dense Reward View on Aligning Text-to-Image Diffusion with
Preference
Shentao Yang, Tianqi Chen, Mingyuan Zhou
13 2023-09-18 link Long-Tail Learning with Foundation Model: Heavy Fine-Tuning Hurts Jiang-Xin Shi, Tong Wei,..., Yu-Feng Li
13 2023-12-20 link Learning and Forgetting Unsafe Examples in Large Language Models Jiachen Zhao, Zhun Deng,..., Mengye Ren
13 2024-02-05 link Representation Surgery for Multi-Task Model Merging Enneng Yang, Li Shen,..., Dacheng Tao
13 2023-10-20 link Equivariant Deep Weight Space Alignment Aviv Navon, Aviv Shamsian,..., Haggai Maron
13 2024-02-03 link Improving Diffusion Models for Inverse Problems Using Optimal Posterior
Covariance
Xinyu Peng, Ziyang Zheng,..., Hongkai Xiong
13 2024-03-16 link SelfIE: Self-Interpretation of Large Language Model Embeddings Haozhe Chen, Carl Vondrick, Chengzhi Mao
13 2024-02-03 link GliDe with a CaPE: A Low-Hassle Method to Accelerate
Speculative Decoding
Cunxiao Du, Jing Jiang,..., Yang You
13 2024-04-26 link Probabilistic Inference in Language Models via Twisted Sequential Monte
Carlo
Stephen Zhao, Rob Brekelmans,..., Roger Baker Grosse
13 2024-02-21 link Do Efficient Transformers Really Save Computation? Kai Yang, Jan Ackermann,..., Liwei Wang
13 2024-02-26 link Disentangled 3D Scene Generation with Layout Learning Dave Epstein, Ben Poole,..., Aleksander Holynski
13 2023-02-23 link EquiPocket: an E(3)-Equivariant Geometric Graph Neural Network for Ligand
Binding Site Prediction
Yang Zhang, Zhewei Wei,..., Wenbing Huang
13 2024-02-15 link SAMformer: Unlocking the Potential of Transformers in Time Series
Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention
Romain Ilbert, Ambroise Odonnat,..., Ievgen Redko
13 2024-03-15 link Repoformer: Selective Retrieval for Repository-Level Code Completion Di Wu, Wasi Uddin Ahmad,..., Xiaofei Ma
13 2024-02-23 link Foundation Policies with Hilbert Representations Seohong Park, Tobias Kreiman, Sergey Levine
13 2024-02-02 link Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics
on the Attention Landscape
Juno Kim, Taiji Suzuki
13 2024-03-28 link MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions Kai Zhang, Yi Luan,..., Ming-Wei Chang
13 2023-10-16 link Unifying Image Processing as Visual Prompting Question Answering Yihao Liu, Xiangyu Chen,..., Chao Dong
13 2023-12-28 link Non-Vacuous Generalization Bounds for Large Language Models Sanae Lotfi, Marc Anton Finzi,..., Andrew Gordon Wilson
13 2024-02-01 link Position: Bayesian Deep Learning is Needed in the Age
of Large-Scale AI
Theodore Papamarkou, Maria Skoularidou,..., Ruqi Zhang
13 2023-05-27 link Matrix Information Theory for Self-Supervised Learning Yifan Zhang, Zhiquan Tan,..., Yang Yuan
12 2024-02-08 link Training Large Language Models for Reasoning through Reverse Curriculum
Reinforcement Learning
Zhiheng Xi, Wenxiang Chen,..., Xuanjing Huang
12 2024-02-06 link DistiLLM: Towards Streamlined Distillation for Large Language Models Jongwoo Ko, Sungnyun Kim,..., Se-Young Yun
12 2024-01-29 link ReGAL: Refactoring Programs to Discover Generalizable Abstractions Elias Stengel-Eskin, Archiki Prasad, Mohit Bansal
12 2023-10-06 link On the Embedding Collapse when Scaling up Recommendation Models Xingzhuo Guo, Junwei Pan,..., Mingsheng Long
12 2024-03-04 link DéjàVu: KV-cache Streaming for Fast, Fault-tolerant Generative LLM Serving Foteini Strati, Sara McAllister,..., Ana Klimovic
12 2023-07-03 link Trainable Transformer in Transformer Abhishek Panigrahi, Sadhika Malladi,..., Sanjeev Arora
12 2024-01-19 link Equivariant Graph Neural Operator for Modeling 3D Dynamics Minkai Xu, Jiaqi Han,..., Anima Anandkumar
12 2024-04-04 link BiSHop: Bi-Directional Cellular Learning for Tabular Data with Generalized
Sparse Modern Hopfield Model
Chenwei Xu, Yu-Chao Huang,..., Han Liu
12 2024-03-21 link Protein Conformation Generation via Force-Guided SE(3) Diffusion Models Yan Wang, Lihao Wang,..., Quanquan Gu
12 2024-01-24 link Conformal Prediction Sets Improve Human Decision Making Jesse C. Cresswell, Yi Sui,..., Noël Vouitsis
12 2024-03-21 link An Analysis of Linear Time Series Forecasting Models William Toner, Luke Nicholas Darlow
12 2024-02-23 link How Do Nonlinear Transformers Learn and Generalize in In-Context
Learning?
Hongkang Li, Meng Wang,..., Pin-Yu Chen
12 2024-03-02 link SceneCraft: An LLM Agent for Synthesizing 3D Scene as
Blender Code
Ziniu Hu, Ahmet Iscen,..., Alireza Fathi
12 2023-12-02 link Second-Order Uncertainty Quantification: A Distance-Based Approach Yusuf Sale, Viktor Bengs,..., Eyke Hüllermeier
12 2024-05-02 link MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts Jianan Zhou, Zhiguang Cao,..., Xu Chi
12 2024-05-05 link Parameter-Efficient Fine-Tuning with Discrete Fourier Transform Ziqi Gao, Qichao Wang,..., Jia Li
11 2024-02-02 link Online conformal prediction with decaying step sizes Anastasios Nikolas Angelopoulos, Rina Barber, Stephen Bates
11 2024-04-12 link TSLANet: Rethinking Transformers for Time Series Representation Learning Emadeldeen Eldele, Mohamed Ragab,..., Xiaoli Li
11 2023-11-15 link ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy Kirill Vishniakov, Zhiqiang Shen, Zhuang Liu
11 2024-02-25 link Equivariant Frames and the Impossibility of Continuous Canonicalization Nadav Dym, Hannah Lawrence, Jonathan W. Siegel
11 2024-02-05 link Graph-enhanced Large Language Models in Asynchronous Plan Reasoning Fangru Lin, Emanuele La Malfa,..., Janet B. Pierrehumbert
11 2024-03-26 link Mechanistic Design and Scaling of Hybrid Architectures Michael Poli, Armin W Thomas,..., Stefano Massaroli
11 2024-03-04 link CATS: Enhancing Multivariate Time Series Forecasting by Constructing Auxiliary
Time Series as Exogenous Variables
Jiecheng Lu, Xu Han,..., Shihao Yang
11 2024-10-29 link Cell2Sentence: Teaching Large Language Models the Language of Biology Daniel Levine, Syed A Rizvi,..., David van Dijk
11 2024-01-07 link The Stronger the Diffusion Model, the Easier the Backdoor:
Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline
Haonan Wang, Qianli Shen,..., Kenji Kawaguchi
11 2024-02-25 link RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis Yao Mu, Junting Chen,..., Ping Luo
11 2024-03-30 link Linguistic Calibration of Long-Form Generations Neil Band, Xuechen Li,..., Tatsunori Hashimoto
11 2024-02-12 link Benchmarking and Building Long-Context Retrieval Models with LoCo and
M2-BERT
Jon Saad-Falcon, Daniel Y Fu,..., Christopher Re
11 2024-02-08 link AttnLRP: Attention-Aware Layer-wise Relevance Propagation for Transformers Reduan Achtibat, Sayed Mohammad Vakilzadeh Hatefi,..., Wojciech Samek
11 2024-05-03 link Auto-Encoding Morph-Tokens for Multimodal LLM Kaihang Pan, Siliang Tang,..., Hanwang Zhang
11 2024-02-21 link From Self-Attention to Markov Models: Unveiling the Dynamics of
Generative Transformers
Muhammed Emrullah Ildiz, Yixiao Huang,..., Samet Oymak
11 2023-12-26 link Generalization in Kernel Regression Under Realistic Assumptions Daniel Barzilai, Ohad Shamir
11 2024-02-27 link Case-Based or Rule-Based: How Do Transformers Do the Math? Yi Hu, Xiaojuan Tang,..., Muhan Zhang
11 2024-02-07 link Asymptotics of feature learning in two-layer networks after one
gradient-step
Hugo Cui, Luca Pesce,..., Bruno Loureiro
11 2024-02-09 link Feedback Loops With Language Models Drive In-Context Reward Hacking Alexander Pan, Erik Jones,..., Jacob Steinhardt
11 2024-05-30 link Why Larger Language Models Do In-context Learning Differently? Zhenmei Shi, Junyi Wei,..., Yingyu Liang
11 2024-02-04 link Selecting Large Language Model to Fine-tune via Rectified Scaling
Law
Haowei Lin, Baizhou Huang,..., Yitao Liang
11 2024-01-23 link TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic
Tasks
Zhiruo Wang, Graham Neubig, Daniel Fried
11 2024-02-01 link Efficient Exploration for LLMs Vikranth Dwaracherla, Seyed Mohammad Asghari,..., Benjamin Van Roy
11 2023-06-07 link Catapults in SGD: spikes in the training loss and
their impact on generalization through feature learning
Libin Zhu, Chaoyue Liu,..., Mikhail Belkin
11 2024-02-26 link Neural Operators with Localized Integral and Differential Kernels Miguel Liu-Schiaffini, Julius Berner,..., Anima Anandkumar
11 2023-06-02 link Generalist Equivariant Transformer Towards 3D Molecular Interaction Learning Xiangzhe Kong, Wenbing Huang, Yang Liu
10 2024-03-06 link Conformal prediction for multi-dimensional time series by ellipsoidal sets Chen Xu, Hanyang Jiang, Yao Xie
10 2023-10-02 link Cooperative Graph Neural Networks Ben Finkelshtein, Xingyue Huang,..., Ismail Ilkan Ceylan
10 2024-01-22 link APT: Adaptive Pruning and Tuning Pretrained Language Models for
Efficient Training and Inference
Bowen Zhao, Hannaneh Hajishirzi, Qingqing Cao
10 2024-04-17 link Learning with 3D rotations, a hitchhiker's guide to SO(3) Andreas René Geist, Jonas Frey,..., Georg Martius
10 2024-02-14 link Copyright Traps for Large Language Models Matthieu Meeus, Igor Shilov,..., Yves-Alexandre de Montjoye
10 2024-02-13 link Hybrid Inverse Reinforcement Learning Juntao Ren, Gokul Swamy,..., Sanjiban Choudhury
10 2024-03-01 link Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson
of Reinforcement Learning
Michal Nauman, Michał Bortkiewicz,..., Marek Cygan
10 2024-01-25 link Adaptive Text Watermark for Large Language Models Yepeng Liu, Yuheng Bu
10 2023-08-25 link Learning to Intervene on Concept Bottlenecks David Steinmann, Wolfgang Stammer,..., Kristian Kersting
10 2023-06-15 link ViP: A Differentially Private Foundation Model for Computer Vision Yaodong Yu, Maziar Sanjabi,..., Chuan Guo
10 2023-10-02 link FedBPT: Efficient Federated Black-box Prompt Tuning for Large Language
Models
Jingwei Sun, Ziyue Xu,..., Holger R Roth
10 2024-01-31 link Do Language Models Exhibit the Same Cognitive Biases in
Problem Solving as Human Learners?
Andreas Opedal, Alessandro Stolfo,..., Mrinmaya Sachan
10 2024-05-06 link To Each (Textual Sequence) Its Own: Improving Memorized-Data Unlearning
in Large Language Models
George-Octavian Bărbulescu, Peter Triantafillou
10 2023-03-15 link Borda Regret Minimization for Generalized Linear Dueling Bandits Yue Wu, Tao Jin,..., Quanquan Gu
10 2023-10-04 link Assessing Large Language Models on Climate Information Jannis Bulian, Mike S. Schäfer,..., Nadine Strauss
10 2024-02-11 link More Benefits of Being Distributional: Second-Order Bounds for Reinforcement
Learning
Kaiwen Wang, Owen Oertell,..., Wen Sun
10 2024-06-05 link Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for Large
Language Models
Peijie Dong, Lujun Li,..., Xiaowen Chu
10 2023-02-07 link Graph Generation with Diffusion Mixture Jaehyeong Jo, Dongki Kim, Sung Ju Hwang
10 2024-03-20 link Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models
with Noisy Data
Giannis Daras, Alex Dimakis, Constantinos Costis Daskalakis
10 2023-12-20 link In-Context Reinforcement Learning for Variable Action Spaces Viacheslav Sinii, Alexander Nikulin,..., Sergey Kolesnikov
10 2023-10-18 link A connection between Tempering and Entropic Mirror Descent Nicolas Chopin, Francesca Crucinio, Anna Korba
10 2023-06-05 link Seizing Serendipity: Exploiting the Value of Past Success in
Off-Policy Actor-Critic
Tianying Ji, Yu Luo,..., Huazhe Xu
10 2024-02-03 link Position: Graph Foundation Models Are Already Here Haitao Mao, Zhikai Chen,..., Jiliang Tang
10 2023-05-30 link Plug-in Performative Optimization Licong Lin, Tijana Zrnic
10 2024-02-22 link Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion Yujia Huang, Adishree Ghatare,..., Yisong Yue
10 2023-09-29 link Information Flow in Self-Supervised Learning Zhiquan Tan, Jingqin Yang,..., Yifan Zhang
10 2024-04-22 link A Multimodal Automated Interpretability Agent Tamar Rott Shaham, Sarah Schwettmann,..., Antonio Torralba
9 2024-01-04 link Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained ReLU
Feature Model
Hien Dang, Tho Tran Huu,..., Nhat Ho
9 2024-02-04 link Riemannian Preconditioned LoRA for Fine-Tuning Foundation Models Fangzhao Zhang, Mert Pilanci
9 2024-04-06 link Multicalibration for Confidence Scoring in LLMs Gianluca Detommaso, Martin Bertran Lopez,..., Aaron Roth
9 2024-02-23 link Deep Networks Always Grok and Here is Why Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk
9 2023-10-11 link Language Models As Semantic Indexers Bowen Jin, Hansi Zeng,..., Xianfeng Tang
9 2024-02-28 link Characterizing Truthfulness in Large Language Model Generations with Local
Intrinsic Dimension
Fan Yin, Jayanth Srinivasa, Kai-Wei Chang
9 2024-01-05 link AST-T5: Structure-Aware Pretraining for Code Generation and Understanding Linyuan Gong, Mostafa Elhoushi, Alvin Cheung
9 2023-09-28 link Discovering environments with XRM Mohammad Pezeshki, Diane Bouchacourt,..., David Lopez-Paz
9 2024-05-30 link Proteus: Exploring Protein Structure Generation for Enhanced Designability and
Efficiency
Chentong Wang, Yannan Qu,..., Longxing Cao
9 2024-02-27 link Variational Learning is Effective for Large Deep Networks Yuesong Shen, Nico Daheim,..., Thomas Möllenhoff
9 2024-05-22 link Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam
Generation
Gauthier Guinet, Behrooz Omidvar-Tehrani,..., Laurent Callot
9 2023-06-06 link Designing Decision Support Systems Using Counterfactual Prediction Sets Eleni Straitouri, Manuel Gomez Rodriguez
9 2024-03-05 link Time Weaver: A Conditional Time Series Generation Model Sai Shankar Narasimhan, Shubhankar Agarwal,..., Sandeep P. Chinchali
9 2024-04-02 link Test-Time Model Adaptation with Only Forward Passes Shuaicheng Niu, Chunyan Miao,..., Peilin Zhao
9 2024-02-07 link A Sober Look at LLMs for Material Discovery: Are
They Actually Good for Bayesian Optimization Over Molecules?
Agustinus Kristiadi, Felix Strieth-Kalthoff,..., Geoff Pleiss
9 2022-12-08 link A New Linear Scaling Rule for Private Adaptive Hyperparameter
Optimization
Ashwinee Panda, Xinyu Tang,..., Prateek Mittal
9 None link Characterizing Large Language Model Geometry Solves Toxicity Detection and
Generation
Randall Balestriero, Romain Cosentino, Sarath Shekkizhar
9 2024-05-08 link VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems
in Visual Context
Yunxin Li, Baotian Hu,..., Min Zhang
9 2024-02-21 link Privacy-Preserving Instructions for Aligning Large Language Models Da Yu, Peter Kairouz,..., Zheng Xu
9 2023-11-24 link StableSSM: Alleviating the Curse of Memory in State-space Models
through Stable Reparameterization
Shida Wang, Qianxiao Li
9 2024-01-18 link Exploration and Anti-Exploration with Distributional Random Network Distillation Kai Yang, Jian Tao,..., Xiu Li
9 2024-02-02 link Simulation of Graph Algorithms with Looped Transformers Artur Back de Luca, Kimon Fountoulakis
9 2024-02-07 link Guiding LLMs The Right Way: Fast, Non-Invasive Constrained Generation Luca Beurer-Kellner, Marc Fischer, Martin Vechev
9 2024-03-27 link Understanding the Learning Dynamics of Alignment with Human Feedback Shawn Im, Yixuan Li
9 2024-03-30 link Privacy Backdoors: Stealing Data with Corrupted Pretrained Models Shanglun Feng, Florian Tramèr
9 2024-02-12 link Rolling Diffusion Models David Ruhe, Jonathan Heek,..., Emiel Hoogeboom
9 2024-03-04 link Wukong: Towards a Scaling Law for Large-Scale Recommendation Buyun Zhang, Liang Luo,..., Wenlin Chen
9 2024-01-08 link Sampling in Unit Time with Kernel Fisher-Rao Flow Aimee Maurais, Youssef Marzouk
9 2023-08-31 link On the Implicit Bias of Adam Matias D. Cattaneo, Jason Matthew Klusowski, Boris Shigida
9 2024-02-01 link Getting the most out of your tokenizer for pre-training
and domain adaptation
Gautier Dagan, Gabriel Synnaeve, Baptiste Roziere
9 2023-09-08 link Graph Neural Networks Use Graphs When They Shouldn't Maya Bechler-Speicher, Ido Amos,..., Amir Globerson
9 2024-02-14 link Instruction Tuning for Secure Code Generation Jingxuan He, Mark Vero,..., Martin Vechev
8 2023-04-16 link An Empirical Study of Realized GNN Expressiveness Yanbo Wang, Muhan Zhang
8 2024-04-16 link Fewer Truncations Improve Language Modeling Hantian Ding, Zijian Wang,..., Stefano Soatto
8 2023-11-16 link Structured Chemistry Reasoning with Large Language Models Siru Ouyang, Zhuosheng Zhang,..., Lianhui Qin
8 2024-03-28 link Regression with Multi-Expert Deferral Anqi Mao, Mehryar Mohri, Yutao Zhong
8 2024-02-09 link Particle Denoising Diffusion Sampler Angus Phillips, Hai-Dang Dau,..., Arnaud Doucet
8 2024-05-28 link AI Alignment with Changing and Influenceable Reward Functions Micah Carroll, Davis Foote,..., Anca Dragan
8 2024-01-21 link Linear Alignment: A Closed-form Solution for Aligning Human Preferences
without Tuning and Feedback
Songyang Gao, Qiming Ge,..., Dahua Lin
8 2023-12-06 link Interpretability Illusions in the Generalization of Simplified Models Dan Friedman, Andrew Kyle Lampinen,..., Asma Ghandeharioun
8 2023-05-26 link Rotational Equilibrium: How Weight Decay Balances Learning Across Neural
Networks
Atli Kosson, Bettina Messmer, Martin Jaggi
8 2024-02-04 link TimeSiam: A Pre-Training Framework for Siamese Time-Series Modeling Jiaxiang Dong, Haixu Wu,..., Mingsheng Long
8 2024-05-03 link PICLe: Eliciting Diverse Behaviors from Large Language Models with
Persona In-Context Learning
Hyeong Kyu Choi, Yixuan Li
8 2024-02-15 link Physics-Informed Neural Network Policy Iteration: Algorithms, Convergence, and Verification Yiming Meng, Ruikun Zhou,..., Jun Liu
8 2024-04-18 link RoboDreamer: Learning Compositional World Models for Robot Imagination Siyuan Zhou, Yilun Du,..., Chuang Gan
8 2024-02-05 link Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation
Problem
Maciej Wolczyk, Bartłomiej Cupiał,..., Piotr Miłoś
8 2024-02-15 link OptiMUS: Scalable Optimization Modeling with (MI)LP Solvers and Large
Language Models
Ali AhmadiTeshnizi, Wenzhi Gao, Madeleine Udell
8 2024-05-29 link Locally Estimated Global Perturbations are Better than Local Perturbations
for Federated Sharpness-aware Minimization
Ziqing Fan, Shengchao Hu,..., Yanfeng Wang
8 2024-02-04 link LQER: Low-Rank Quantization Error Reconstruction for LLMs Cheng Zhang, Jianyi Cheng,..., Yiren Zhao
8 2024-03-20 link Probabilistic Forecasting with Stochastic Interpolants and Föllmer Processes Yifan Chen, Mark Goldstein,..., Eric Vanden-Eijnden
8 2024-02-15 link Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling Raunaq Bhirangi, Chenyu Wang,..., Lerrel Pinto
8 2024-02-01 link Transforming and Combining Rewards for Aligning Large Language Models Zihao Wang, Chirag Nagpal,..., Victor Veitch
8 2024-06-04 link What Improves the Generalization of Graph Transformers? A Theoretical
Dive into the Self-attention and Positional Encoding
Hongkang Li, Meng Wang,..., Pin-Yu Chen
8 2023-11-27 link Swallowing the Bitter Pill: Simplified Scalable Conformer Generation Yuyang Wang, Ahmed A. A. Elhag,..., Miguel Ángel Bautista
8 2023-10-14 link Mastering Robot Manipulation with Multimodal Prompts through Pretraining and
Multi-task Fine-tuning
Jiachen Li, Qiaozi Gao,..., William Yang Wang
8 2023-01-27 link Single-Trajectory Distributionally Robust Reinforcement Learning Zhipeng Liang, Xiaoteng Ma,..., Zhengyuan Zhou
8 2023-08-14 link Position: Key Claims in LLM Research Have a Long
Tail of Footnotes
Anna Rogers, Sasha Luccioni
8 2024-02-02 link Understanding Adam Optimizer via Online Learning of Updates: Adam
is FTRL in Disguise
Kwangjun Ahn, Zhiyu Zhang,..., Yan Dai
8 2024-04-23 link NExT: Teaching Large Language Models to Reason about Code
Execution
Ansong Ni, Miltiadis Allamanis,..., Pengcheng Yin
8 2022-12-15 link Integrating Multimodal Data for Joint Generative Modeling of Complex
Dynamics
Manuel Brenner, Florian Hess,..., Daniel Durstewitz
8 2024-04-17 link Decomposing and Editing Predictions by Modeling Model Computation Harshay Shah, Andrew Ilyas, Aleksander Madry
8 2024-02-19 link LoRA Training in the NTK Regime has No Spurious
Local Minima
Uijeong Jang, Jason D. Lee, Ernest K. Ryu
8 2024-02-22 link Clifford-Steerable Convolutional Neural Networks Maksim Zhdanov, David Ruhe,..., Patrick Forré
8 2024-02-14 link SLEB: Streamlining LLMs through Redundancy Verification and Elimination of
Transformer Blocks
Jiwon Song, Kyungseok Oh,..., Jae-Joon Kim
8 2024-02-10 link Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF Han Shen, Zhuoran Yang, Tianyi Chen
8 2024-02-07 link MEMORYLLM: Towards Self-Updatable Large Language Models Yu Wang, Yifan Gao,..., Julian McAuley
8 2024-02-06 link In-context learning agents are asymmetric belief updaters Johannes A. Schubert, Akshay Kumar Jagadish,..., Eric Schulz
8 2023-12-08 link Membership Inference Attacks on Diffusion Models via Quantile Regression Shuai Tang, Steven Wu,..., Aaron Roth
8 2024-05-28 link FlashST: A Simple and Universal Prompt-Tuning Framework for Traffic
Prediction
Zhonghang Li, Lianghao Xia,..., Chao Huang
8 2024-02-13 link eCeLLM: Generalizing Large Language Models for E-commerce from Large-scale,
High-quality Instruction Data
Bo Peng, Xinyi Ling,..., Xia Ning
8 2023-11-15 link Converting Transformers to Polynomial Form for Secure Inference Over
Homomorphic Encryption
Itamar Zimerman, Moran Baruch,..., Lior Wolf
8 2024-02-04 link Revisiting the Power of Prompt for Visual Tuning Yuzhu Wang, Lechao Cheng,..., Meng Wang
8 2024-02-28 link Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for
Large Language Models
Mingjia Huo, Sai Ashish Somayajula,..., Pengtao Xie
8 2024-02-06 link MusicRL: Aligning Music Generation to Human Preferences Geoffrey Cideron, Sertan Girgin,..., Andrea Agostinelli
8 2023-11-28 link Beyond Sole Strength: Customized Ensembles for Generalized Vision-Language Models Zhihe Lu, Jiawang Bai,..., Xinchao Wang
8 2024-03-01 link Shifted Interpolation for Differential Privacy Jinho Bok, Weijie J Su, Jason Altschuler
8 2023-05-12 link MoMo: Momentum Models for Adaptive Learning Rates Fabian Schaipp, Ruben Ohana,..., Robert M. Gower
8 2024-03-18 link Larimar: Large Language Models with Episodic Memory Control Payel Das, Subhajit Chaudhury,..., Pin-Yu Chen
8 2024-05-16 link LLM and Simulation as Bilevel Optimizers: A New Paradigm
to Advance Physical Scientific Discovery
Pingchuan Ma, Tsun-Hsuan Wang,..., Wojciech Matusik
7 2024-02-02 link Two Heads Are Better Than One: Boosting Graph Sparse
Training via Semantic and Topological Awareness
Guibin Zhang, Yanwei Yue,..., Tianlong Chen
7 2023-07-11 link Memorization Through the Lens of Curvature of Loss Function
Around Samples
Isha Garg, Deepak Ravikumar, Kaushik Roy
7 2024-04-25 link In-Context Freeze-Thaw Bayesian Optimization for Hyperparameter Optimization Herilalaina Rakotoarison, Steven Adriaensen,..., Frank Hutter
7 2024-02-16 link Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs Yeonhong Park, Jake Hyun,..., Jae W. Lee
7 2023-11-17 link Stable Differentiable Causal Discovery Achille Nazaret, Justin Hong,..., David Blei
7 2024-02-22 link Q-Probe: A Lightweight Approach to Reward Maximization for Language
Models
Kenneth Li, Samy Jelassi,..., David Brandfonbrener
7 2023-11-02 link Gaussian Processes on Cellular Complexes Mathieu Alain, So Takao,..., Marc Peter Deisenroth
7 2024-05-13 link PARDEN, Can You Repeat That? Defending against Jailbreaks via
Repetition
Ziyang Zhang, Qizhen Zhang, Jakob Nicolaus Foerster
7 2024-02-22 link CoLoRA: Continuous low-rank adaptation for reduced implicit neural modeling
of parameterized partial differential equations
Jules Berman, Benjamin Peherstorfer
7 2023-12-18 link Harnessing the Power of Neural Operators with Automatically Encoded
Conservation Laws
Ning Liu, Yiming Fan,..., Yue Yu
7 2024-02-02 link BAT: Learning to Reason about Spatial Sounds with Large
Language Models
Zhisheng Zheng, Puyuan Peng,..., David Harwath
7 2024-02-12 link Weisfeiler-Leman at the margin: When more expressivity matters Billy Joe Franks, Christopher Morris,..., Floris Geerts
7 2023-10-26 link HyperFields: Towards Zero-Shot Generation of NeRFs from Text Sudarshan Babu, Richard Liu,..., Rana Hanocka
7 2024-03-12 link BAGEL: Bootstrapping Agents by Guiding Exploration with Language Shikhar Murty, Christopher D Manning,..., Kenton Lee
7 2024-03-03 link Critical windows: non-asymptotic theory for feature emergence in diffusion
models
Marvin Li, Sitan Chen
7 2023-11-01 link Robust and Conjugate Gaussian Process Regression Matias Altamirano, Francois-Xavier Briol, Jeremias Knoblauch
7 2024-02-11 link How do Large Language Models Navigate Conflicts between Honesty
and Helpfulness?
Ryan Liu, Theodore Sumers,..., Thomas L. Griffiths
7 2024-01-29 link Two Stones Hit One Bird: Bilevel Positional Encoding for
Better Length Extrapolation
Zhenyu He, Guhao Feng,..., Di He
7 2023-10-13 link Split-and-Denoise: Protect large language model inference with local differential
privacy
Peihua Mai, Ran Yan,..., Yan Pang
7 2024-04-22 link Align Your Steps: Optimizing Sampling Schedules in Diffusion Models Amirmojtaba Sabour, Sanja Fidler, Karsten Kreis
7 2024-06-02 link Is In-Context Learning in Large Language Models Bayesian? A
Martingale Perspective
Fabian Falck, Ziyu Wang, Christopher C. Holmes
7 2024-02-05 link Can We Remove the Square-Root in Adaptive Gradient Methods?
A Second-Order Perspective
Wu Lin, Felix Dangel,..., Alireza Makhzani
7 2024-03-28 link H-Consistency Guarantees for Regression Anqi Mao, Mehryar Mohri, Yutao Zhong
7 2023-06-19 link Hyperbolic Active Learning for Semantic Segmentation under Domain Shift Luca Franco, Paolo Mandica,..., Fabio Galasso
7 2024-02-13 link Unsupervised Evaluation of Code LLMs with Round-Trip Correctness Miltiadis Allamanis, Sheena Panthaplackel, Pengcheng Yin
7 2024-02-06 link Neural Networks Learn Statistics of Increasing Complexity Nora Belrose, Quintin Pope,..., Xiaoli Fern
7 2024-02-28 link Out-of-Domain Generalization in Dynamical Systems Reconstruction Niclas Alexander Göring, Florian Hess,..., Daniel Durstewitz
7 2024-06-06 link Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation Can Yaras, Peng Wang,..., Qing Qu
7 2023-03-25 link DiracDiffusion: Denoising and Incremental Reconstruction with Assured Data-Consistency Zalan Fabian, Berk Tinaz, Mahdi Soltanolkotabi
7 2024-02-06 link Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains Junhong Shen, Neil Tenenholtz,..., Nicolo Fusi
7 2024-02-27 link Automated Statistical Model Discovery with Language Models Michael Y. Li, Emily Fox, Noah Goodman
7 2024-02-12 link Active Preference Learning for Large Language Models William Muldrew, Peter Hayes,..., David Barber
7 2024-06-11 link Failures Are Fated, But Can Be Faded: Characterizing and
Mitigating Unwanted Behaviors in Large-Scale Vision and Language Models
Som Sagar, Aditya Taparia, Ransalu Senanayake
7 2023-11-23 link Scalable AI Safety via Doubly-Efficient Debate Jonah Brown-Cohen, Geoffrey Irving, Georgios Piliouras
7 2024-05-14 link Compositional Text-to-Image Generation with Dense Blob Representations Weili Nie, Sifei Liu,..., Arash Vahdat
7 2023-10-26 link CompeteAI: Understanding the Competition Dynamics of Large Language Model-based
Agents
Qinlin Zhao, Jindong Wang,..., Xing Xie
7 2024-03-19 link Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data
Flow and Per-Block Quantization
Haocheng Xi, Yuxiang Chen,..., Jun Zhu
7 2024-06-05 link Graph Neural Network Explanations are Fragile Jiate Li, Meng Pang,..., Binghui Wang
7 2024-06-07 link FlowMM: Generating Materials with Riemannian Flow Matching Benjamin Kurt Miller, Ricky T. Q. Chen,..., Brandon M Wood
7 2024-02-22 link Prompting a Pretrained Transformer Can Be a Universal Approximator Aleksandar Petrov, Philip Torr, Adel Bibi
7 2023-10-14 link DPZero: Private Fine-Tuning of Language Models without Backpropagation Liang Zhang, Bingcong Li,..., Niao He
7 2023-08-28 link Rate-Optimal Policy Optimization for Linear Markov Decision Processes Uri Sherman, Alon Cohen,..., Yishay Mansour
7 2024-02-16 link RLVF: Learning from Verbal Feedback without Overgeneralization Moritz Pascal Stephan, Alexander Khazatsky,..., Chelsea Finn
7 2023-10-03 link Discovering Symmetry Breaking in Physical Systems with Relaxed Group
Convolution
Rui Wang, Elyssa Hofgard,..., Tess Smidt
7 2024-06-07 link Projecting Molecules into Synthesizable Chemical Spaces Shitong Luo, Wenhao Gao,..., Jianzhu Ma
7 2023-10-11 link LLark: A Multimodal Instruction-Following Language Model for Music Joshua P Gardner, Simon Durand,..., Rachel M Bittner
6 2024-02-16 link Graph-based Forecasting with Missing Data through Spatiotemporal Downsampling Ivan Marisca, Cesare Alippi, Filippo Maria Bianchi
6 2024-02-03 link Vanilla Bayesian Optimization Performs Great in High Dimensions Carl Hvarfner, Erik Orm Hellsten, Luigi Nardi
6 2024-02-23 link On the Duality Between Sharpness-Aware Minimization and Adversarial Training Yihao Zhang, Hangzhou He,..., Zeming Wei
6 2024-05-08 link The Entropy Enigma: Success and Failure of Entropy Minimization Ori Press, Ravid Shwartz-Ziv,..., Matthias Bethge
6 2023-12-22 link How Smooth Is Attention? Valérie Castin, Pierre Ablin, Gabriel Peyré
6 2024-03-13 link A Sparsity Principle for Partially Observable Causal Representation Learning Danru Xu, Dingling Yao,..., Sara Magliacane
6 2024-05-02 link On Mechanistic Knowledge Localization in Text-to-Image Generative Models Samyadeep Basu, Keivan Rezaei,..., Soheil Feizi
6 2024-02-05 link DRED: Zero-Shot Transfer in Reinforcement Learning via Data-Regularised Environment Design Samuel Garcin, James Doran,..., Stefano V Albrecht
6 2023-12-19 link Emergence of In-Context Reinforcement Learning from Noise Distillation Ilya Zisman, Vladislav Kurenkov,..., Sergey Kolesnikov
6 2024-02-29 link Smooth Tchebycheff Scalarization for Multi-Objective Optimization Xi Lin, Xiaoyuan Zhang,..., Qingfu Zhang
6 2023-11-09 link Model-Based Minimum Bayes Risk Decoding for Text Generation Yuu Jinnai, Tetsuro Morimura,..., Kenshi Abe
6 2024-02-02 link Mapping the Multiverse of Latent Representations Jeremy Wayland, Corinna Coupette, Bastian Rieck
6 2023-12-11 link Grokking Group Multiplication with Cosets Dashiell Stander, Qinan Yu,..., Stella Biderman
6 2024-02-08 link Offline Actor-Critic Reinforcement Learning Scales to Large Models Jost Tobias Springenberg, Abbas Abdolmaleki,..., Martin Riedmiller
6 2024-02-05 link Rethinking Optimization and Architecture for Tiny Language Models Yehui Tang, Kai Han,..., Yunhe Wang
6 2023-12-15 link Fast Decision Boundary based Out-of-Distribution Detector Litian Liu, Yao Qin
6 2024-02-15 link Recovering the Pre-Fine-Tuning Weights of Generative Models Eliahu Horwitz, Jonathan Kahana, Yedid Hoshen
6 2024-02-16 link Stochastic Localization via Iterative Posterior Sampling Louis Grenioux, Maxence Noble,..., Alain Oliviero Durmus
6 2024-03-27 link A Geometric Explanation of the Likelihood OOD Detection Paradox Hamidreza Kamkari, Brendan Leigh Ross,..., Gabriel Loaiza-Ganem
6 2023-05-30 link Graph-based Time Series Clustering for End-to-End Hierarchical Forecasting Andrea Cini, Danilo Mandic, Cesare Alippi
6 2024-06-11 link Transformers Provably Learn Sparse Token Selection While Fully-Connected Nets Cannot Zixuan Wang, Stanley Wei,..., Jason D. Lee
6 2024-05-28 link HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning Shengchao Hu, Ziqing Fan,..., Dacheng Tao
6 2023-06-28 link Towards a Better Theoretical Understanding of Independent Subnetwork Training Egor Shulgin, Peter Richtárik
6 2024-04-15 link Large Language Models Can Automatically Engineer Features for Few-Shot Tabular Learning Sungwon Han, Jinsung Yoon,..., Tomas Pfister
6 2024-02-22 link Batch and match: black-box variational inference with a score-based divergence Diana Cai, Chirag Modi,..., Lawrence K. Saul
6 2024-06-01 link Slow and Steady Wins the Race: Maintaining Plasticity with Hare and Tortoise Networks Hojoon Lee, Hyeonseo Cho,..., Clare Lyle
6 2023-12-06 link Improving Gradient-guided Nested Sampling for Posterior Inference Pablo Lemos, Nikolay Malkin,..., Laurence Perreault-Levasseur
6 2024-02-12 link Tuning-Free Stochastic Optimization Ahmed Khaled, Chi Jin
6 2024-06-11 link Beyond ELBOs: A Large-Scale Evaluation of Variational Methods for Sampling Denis Blessing, Xiaogang Jia,..., Gerhard Neumann
6 2024-06-02 link Full-Atom Peptide Design based on Multi-modal Flow Matching Jiahan Li, Chaoran Cheng,..., Jianzhu Ma
6 2024-03-07 link Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks Linyuan Gong, Sida Wang,..., Alvin Cheung
6 2024-02-27 link Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings Kevin Frans, Seohong Park,..., Sergey Levine
6 2023-07-13 link Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks Liam Collins, Hamed Hassani,..., Sanjay Shakkottai
6 2024-05-16 link IBD-PSC: Input-level Backdoor Detection via Parameter-oriented Scaling Consistency Linshan Hou, Ruili Feng,..., Yiming Li
6 2023-12-16 link Image Restoration Through Generalized Ornstein-Uhlenbeck Bridge Conghan Yue, Zhengwei Peng,..., Dongyu Zhang
6 2023-11-06 link Sample Complexity Bounds for Estimating Probability Divergences under Invariances Behrooz Tahmasebi, Stefanie Jegelka
6 2024-02-19 link Towards Theoretical Understandings of Self-Consuming Generative Models Shi Fu, Sen Zhang,..., Dacheng Tao
6 2024-01-20 link Make-A-Shape: a Ten-Million-scale 3D Shape Model Ka-Hei Hui, Aditya Sanghi,..., Chi-Wing Fu
6 2024-02-09 link CLIPZyme: Reaction-Conditioned Virtual Screening of Enzymes Peter Mikhael, Itamar Chinn, Regina Barzilay
6 2024-06-20 link Revealing Vision-Language Integration in the Brain with Multimodal Networks Vighnesh Subramaniam, Colin Conwell,..., Andrei Barbu
6 2024-03-25 link Enabling Uncertainty Estimation in Iterative Neural Networks Nikita Durasov, Doruk Oner,..., Pascal Fua
6 2024-03-05 link PPFlow: Target-Aware Peptide Design with Torsional Flow Matching Haitao Lin, Odin Zhang,..., Stan Z. Li
6 2024-02-07 link Causal Representation Learning from Multiple Distributions: A General Setting Kun Zhang, Shaoan Xie,..., Yujia Zheng
6 2024-02-06 link CasCast: Skillful High-resolution Precipitation Nowcasting via Cascaded Modelling Junchao Gong, Lei Bai,..., Wanli Ouyang
6 2023-12-18 link The Good, The Bad, and Why: Unveiling Emotions in Generative AI Cheng Li, Jindong Wang,..., Xing Xie
6 2024-04-30 link Modeling Caption Diversity in Contrastive Vision-Language Pretraining Samuel Lavoie, Polina Kirichenko,..., Nicolas Ballas
6 2024-02-01 link Towards Efficient Exact Optimization of Language Model Alignment Haozhe Ji, Cheng Lu,..., Minlie Huang
6 2024-02-17 link How to Make the Gradients Small Privately: Improved Rates for Differentially Private Non-Convex Optimization Andrew Lowy, Jonathan Ullman, Stephen Wright
6 2024-04-24 link Unifying Bayesian Flow Networks and Diffusion Models through Stochastic Differential Equations Kaiwen Xue, Yuhao Zhou,..., Chongxuan Li
6 2024-05-10 link MH-pFLID: Model Heterogeneous personalized Federated Learning via Injection and Distillation for Medical Data Analysis Luyuan Xie, Manqing Lin,..., Zhonghai Wu
6 2023-10-24 link Neural Collapse in Multi-label Learning with Pick-all-label Loss Pengyu Li, Xiao Li,..., Qing Qu
6 2024-01-30 link Arrows of Time for Large Language Models Vassilis Papadopoulos, Jérémie Wenger, Clément Hongler
6 2024-02-05 link Position: What Can Large Language Models Tell Us about Time Series Analysis Ming Jin, YiFan Zhang,..., Qingsong Wen
6 2024-04-10 link Superposition Prompting: Improving and Accelerating Retrieval-Augmented Generation Thomas Merth, Qichen Fu,..., Mahyar Najibi
6 2024-06-10 link Compute Better Spent: Replacing Dense Layers with Structured Matrices Shikai Qiu, Andres Potapczynski,..., Andrew Gordon Wilson
6 2024-02-05 link See More Details: Efficient Image Super-Resolution by Experts Mining Eduard Zamfir, Zongwei Wu,..., Radu Timofte
6 2023-09-29 link Latent Space Symmetry Discovery Jianke Yang, Nima Dehmamy,..., Rose Yu
6 2024-05-09 link Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning Shibo Jie, Yehui Tang,..., Yunhe Wang
6 2024-04-01 link Optimal Ridge Regularization for Out-of-Distribution Prediction Pratik Patil, Jin-Hong Du, Ryan Tibshirani
6 2023-11-29 link Should we be going MAD? A Look at Multi-Agent Debate Strategies for LLMs Andries Petrus Smit, Nathan Grinsztajn,..., Arnu Pretorius
6 2024-05-27 link Q-value Regularized Transformer for Offline Reinforcement Learning Shengchao Hu, Ziqing Fan,..., Dacheng Tao
6 2024-02-09 link Understanding the Effects of Iterative Prompting on Truthfulness Satyapriya Krishna, Chirag Agarwal, Himabindu Lakkaraju
6 2024-05-18 link AquaLoRA: Toward White-box Protection for Customized Stable Diffusion Models via Watermark LoRA Weitao Feng, Wenbo Zhou,..., Nenghai Yu
6 2023-12-13 link The Relative Value of Prediction in Algorithmic Decision Making Juan Carlos Perdomo
6 2022-08-31 link Be Your Own Neighborhood: Detecting Adversarial Example by the Neighborhood Relations Built on Self-Supervised Learning Zhiyuan He, Yijun Yang,..., Tsung-Yi Ho
6 2024-02-06 link DySLIM: Dynamics Stable Learning by Invariant Measure for Chaotic Systems Yair Schiff, Zhong Yi Wan,..., Leonardo Zepeda-Núñez
6 2023-07-14 link Graph Positional and Structural Encoder Semih Cantürk, Renming Liu,..., Ladislav Rampášek
6 2024-02-15 link Accelerating Parallel Sampling of Diffusion Models Zhiwei Tang, Jiasheng Tang,..., Tsung-Hui Chang
6 2024-05-28 link SleepFM: Multi-modal Representation Learning for Sleep Across Brain Activity, ECG and Respiratory Signals Rahul Thapa, Bryan He,..., James Zou
6 2024-05-09 link Outlier-robust Kalman Filtering through Generalised Bayes Gerardo Duran-Martin, Matias Altamirano,..., Kevin Patrick Murphy
6 2024-05-18 link On the Trajectory Regularity of ODE-based Diffusion Sampling Defang Chen, Zhenyu Zhou,..., Siwei Lyu
6 2024-02-07 link Data-efficient Large Vision Models through Sequential Autoregression Zhiwei Hao, Jianyuan Guo,..., Chang Xu
6 2024-03-05 link Active Statistical Inference Tijana Zrnic, Emmanuel Candes
6 2024-03-19 link Listenable Maps for Audio Classifiers Francesco Paissan, Mirco Ravanelli, Cem Subakan
6 2023-05-26 link Selective Mixup Helps with Distribution Shifts, But Not (Only) because of Mixup Damien Teney, Jindong Wang, Ehsan Abbasnejad
6 2024-01-26 link Residual Quantization with Implicit Neural Codebooks Iris A.M. Huijben, Matthijs Douze,..., Jakob Verbeek
6 2024-02-17 link Offline Training of Language Model Agents with Functions as Learnable Weights Shaokun Zhang, Jieyu Zhang,..., Qingyun Wu