407 |
2024-01-17 |
link |
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model |
Lianghui Zhu, Bencheng Liao,..., Xinggang Wang |
402 |
2023-05-23 |
link |
Improving Factuality and Reasoning in Language Models through Multiagent Debate |
Yilun Du, Shuang Li,..., Igor Mordatch |
400 |
2024-03-05 |
link |
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis |
Patrick Esser, Sumith Kulal,..., Robin Rombach |
399 |
2023-08-04 |
link |
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities |
Weihao Yu, Zhengyuan Yang,..., Lijuan Wang |
308 |
2023-09-11 |
link |
NExT-GPT: Any-to-Any Multimodal LLM |
Shengqiong Wu, Hao Fei,..., Tat-Seng Chua |
237 |
2024-03-07 |
link |
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference |
Wei-Lin Chiang, Lianmin Zheng,..., Ion Stoica |
207 |
2024-01-18 |
link |
Self-Rewarding Language Models |
Weizhe Yuan, Richard Yuanzhe Pang,..., Jason E Weston |
202 |
2024-02-14 |
link |
DoRA: Weight-Decomposed Low-Rank Adaptation |
Shih-yang Liu, Chien-Yi Wang,..., Min-Hung Chen |
197 |
2023-05-22 |
link |
How Language Model Hallucinations Can Snowball |
Muru Zhang, Ofir Press,..., Noah A. Smith |
192 |
2024-05-31 |
link |
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality |
Tri Dao, Albert Gu |
187 |
2023-12-14 |
link |
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision |
Collin Burns, Pavel Izmailov,..., Jeffrey Wu |
170 |
2024-02-06 |
link |
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal |
Mantas Mazeika, Long Phan,..., Dan Hendrycks |
157 |
2023-11-06 |
link |
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch |
Le Yu, Bowen Yu,..., Yongbin Li |
155 |
2024-01-19 |
link |
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads |
Tianle Cai, Yuhong Li,..., Tri Dao |
135 |
2023-12-21 |
link |
VideoPoet: A Large Language Model for Zero-Shot Video Generation |
Dan Kondratyuk, Lijun Yu,..., Lu Jiang |
124 |
2024-01-03 |
link |
GPT-4V(ision) is a Generalist Web Agent, if Grounded |
Boyuan Zheng, Boyu Gou,..., Yu Su |
121 |
2023-09-28 |
link |
Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution |
Chrisantha Fernando, Dylan Sunil Banarse,..., Tim Rocktäschel |
119 |
2023-06-13 |
link |
SqueezeLLM: Dense-and-Sparse Quantization |
Sehoon Kim, Coleman Richard Charles Hooper,..., Kurt Keutzer |
119 |
2023-04-19 |
link |
Fundamental Limitations of Alignment in Large Language Models |
Yotam Wolf, Noam Wies,..., Amnon Shashua |
116 |
2024-01-16 |
link |
Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation |
Haoran Xu, Amr Sharaf,..., Young Jin Kim |
115 |
2023-10-14 |
link |
A decoder-only foundation model for time-series forecasting |
Abhimanyu Das, Weihao Kong,..., Yichen Zhou |
104 |
2024-02-06 |
link |
LESS: Selecting Influential Data for Targeted Instruction Tuning |
Mengzhou Xia, Sadhika Malladi,..., Danqi Chen |
103 |
2024-03-06 |
link |
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection |
Jiawei Zhao, Zhenyu Zhang,..., Yuandong Tian |
96 |
2023-10-06 |
link |
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models |
Andy Zhou, Kai Yan,..., Yu-Xiong Wang |
95 |
2023-12-18 |
link |
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint |
Wei Xiong, Hanze Dong,..., Tong Zhang |
93 |
2024-03-05 |
link |
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models |
Zeqian Ju, Yuancheng Wang,..., Sheng Zhao |
91 |
2024-02-21 |
link |
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens |
Yiran Ding, Li Lyna Zhang,..., Mao Yang |
87 |
2024-02-03 |
link |
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding |
Yichao Fu, Peter Bailis,..., Hao Zhang |
85 |
2023-12-11 |
link |
Gated Linear Attention Transformers with Hardware-Efficient Training |
Songlin Yang, Bailin Wang,..., Yoon Kim |
84 |
2023-12-01 |
link |
Nash Learning from Human Feedback |
Remi Munos, Michal Valko,..., Bilal Piot |
83 |
2024-02-08 |
link |
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models |
Dongyang Liu, Renrui Zhang,..., Peng Gao |
82 |
2024-02-19 |
link |
LoRA+: Efficient Low Rank Adaptation of Large Models |
Soufiane Hayou, Nikhil Ghosh, Bin Yu |
80 |
2023-11-07 |
link |
The Linear Representation Hypothesis and the Geometry of Large Language Models |
Kiho Park, Yo Joong Choe, Victor Veitch |
78 |
2024-02-15 |
link |
Data Engineering for Scaling Language Models to 128K Context |
Yao Fu, Rameswar Panda,..., Hao Peng |
77 |
2024-02-04 |
link |
Unified Training of Universal Time Series Forecasting Transformers |
Gerald Woo, Chenghao Liu,..., Doyen Sahoo |
76 |
2024-02-05 |
link |
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache |
Zirui Liu, Jiayi Yuan,..., Xia Hu |
73 |
2023-10-11 |
link |
In-Context Unlearning: Language Models as Few Shot Unlearners |
Martin Pawelczyk, Seth Neel, Himabindu Lakkaraju |
72 |
2023-09-25 |
link |
Physics of Language Models: Part 3.1, Knowledge Storage and Extraction |
Zeyuan Allen-Zhu, Yuanzhi Li |
72 |
2024-02-23 |
link |
Genie: Generative Interactive Environments |
Jake Bruce, Michael D Dennis,..., Tim Rocktäschel |
70 |
2023-12-28 |
link |
Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels |
Haoning Wu, Zicheng Zhang,..., Weisi Lin |
70 |
2024-04-16 |
link |
Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study |
Shusheng Xu, Wei Fu,..., Yi Wu |
70 |
2024-01-22 |
link |
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs |
Ling Yang, Zhaochen Yu,..., Bin Cui |
69 |
2024-02-02 |
link |
TravelPlanner: A Benchmark for Real-World Planning with Language Agents |
Jian Xie, Kai Zhang,..., Yu Su |
69 |
2024-01-22 |
link |
WARM: On the Benefits of Weight Averaged Reward Models |
Alexandre Rame, Nino Vieillard,..., Johan Ferret |
69 |
2024-03-05 |
link |
The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning |
Nathaniel Li, Alexander Pan,..., Dan Hendrycks |
67 |
2024-01-26 |
link |
EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty |
Yuhui Li, Fangyun Wei,..., Hongyang Zhang |
67 |
2024-02-01 |
link |
Executable Code Actions Elicit Better LLM Agents |
Xingyao Wang, Yangyi Chen,..., Heng Ji |
66 |
2024-02-07 |
link |
Fast Timing-Conditioned Latent Audio Diffusion |
Zach Evans, CJ Carr,..., Jordi Pons |
65 |
2023-11-18 |
link |
An Embodied Generalist Agent in 3D World |
Jiangyong Huang, Silong Yong,..., Siyuan Huang |
65 |
2024-01-02 |
link |
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning |
Hongye Jin, Xiaotian Han,..., Xia Hu |
65 |
2024-01-03 |
link |
A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity |
Andrew Lee, Xiaoyan Bai,..., Rada Mihalcea |
64 |
2024-04-22 |
link |
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data |
Fahim Tajwar, Anikait Singh,..., Aviral Kumar |
59 |
2023-11-02 |
link |
RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation |
Yufei Wang, Zhou Xian,..., Chuang Gan |
59 |
2024-02-12 |
link |
Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models |
Siddharth Karamcheti, Suraj Nair,..., Dorsa Sadigh |
59 |
2024-01-08 |
link |
A Minimaximalist Approach to Reinforcement Learning from Human Feedback |
Gokul Swamy, Christoph Dann,..., Alekh Agarwal |
59 |
2023-09-29 |
link |
Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training |
Ziyu Wan, Xidong Feng,..., Jun Wang |
58 |
2024-02-22 |
link |
GaussianPro: 3D Gaussian Splatting with Progressive Propagation |
Kai Cheng, Xiaoxiao Long,..., Xuejin Chen |
56 |
2024-02-12 |
link |
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs |
Soroush Nasiriany, Fei Xia,..., Brian Ichter |
56 |
2024-02-06 |
link |
MOMENT: A Family of Open Time-series Foundation Models |
Mononito Goswami, Konrad Szafer,..., Artur Dubrawski |
56 |
2024-01-11 |
link |
Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models |
Asma Ghandeharioun, Avi Caciularu,..., Mor Geva |
55 |
2023-09-01 |
link |
Image Hijacks: Adversarial Images can Control Generative Models at Runtime |
Luke Bailey, Euan Ong,..., Scott Emmons |
54 |
2024-02-08 |
link |
Generalized Preference Optimization: A Unified Approach to Offline Alignment |
Yunhao Tang, Zhaohan Daniel Guo,..., Bilal Piot |
54 |
2023-10-29 |
link |
Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game |
Zelai Xu, Chao Yu,..., Yi Wu |
53 |
2024-02-07 |
link |
Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications |
Boyi Wei, Kaixuan Huang,..., Peter Henderson |
52 |
2023-11-11 |
link |
In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering |
Sheng Liu, Haotian Ye,..., James Y. Zou |
52 |
2023-07-20 |
link |
SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models |
Xiaoxuan Wang, Ziniu Hu,..., Wei Wang |
52 |
2024-01-11 |
link |
Extreme Compression of Large Language Models via Additive Quantization |
Vage Egiazarian, Andrei Panferov,..., Dan Alistarh |
51 |
2023-10-08 |
link |
Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity |
Lu Yin, You Wu,..., Shiwei Liu |
50 |
2024-01-29 |
link |
OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models |
Fuzhao Xue, Zian Zheng,..., Yang You |
50 |
2024-02-06 |
link |
Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks |
Jongho Park, Jaeseung Park,..., Dimitris Papailiopoulos |
48 |
2024-01-16 |
link |
Scalable Pre-training of Large Autoregressive Image Models |
Alaaeldin El-Nouby, Michal Klein,..., Armand Joulin |
48 |
2024-02-02 |
link |
Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities |
Zhifeng Kong, Arushi Goel,..., Bryan Catanzaro |
47 |
2024-03-11 |
link |
Stealing Part of a Production Language Model |
Nicholas Carlini, Daniel Paleka,..., Florian Tramèr |
47 |
2023-12-07 |
link |
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator |
Chengshu Li, Jacky Liang,..., Brian Ichter |
46 |
2024-02-07 |
link |
AlphaFold Meets Flow Matching for Generating Protein Ensembles |
Bowen Jing, Bonnie Berger, Tommi Jaakkola |
46 |
2024-04-30 |
link |
Better & Faster Large Language Models via Multi-token Prediction |
Fabian Gloeckle, Badr Youbi Idrissi,..., Gabriel Synnaeve |
46 |
2024-01-22 |
link |
Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text |
Abhimanyu Hans, Avi Schwarzschild,..., Tom Goldstein |
46 |
2024-02-01 |
link |
Repeat After Me: Transformers are Better than State Space Models at Copying |
Samy Jelassi, David Brandfonbrener,..., Eran Malach |
45 |
2024-04-24 |
link |
MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI |
Kaining Ying, Fanqing Meng,..., Wenqi Shao |
44 |
2024-02-07 |
link |
Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design |
Andrew Campbell, Jason Yim,..., Tommi Jaakkola |
44 |
2023-09-12 |
link |
Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models by Finding Problematic Prompts |
Zhi-Yi Chin, Chieh Ming Jiang,..., Wei-Chen Chiu |
44 |
2024-03-11 |
link |
Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews |
Weixin Liang, Zachary Izzo,..., James Y. Zou |
43 |
2023-12-31 |
link |
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws |
Nikhil Sardana, Jacob Portes,..., Jonathan Frankle |
42 |
2023-08-20 |
link |
Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models |
Bilgehan Sel, Ahmad Tawaha,..., Ming Jin |
42 |
2023-10-08 |
link |
In-Context Convergence of Transformers |
Yu Huang, Yuan Cheng, Yingbin Liang |
42 |
2023-12-04 |
link |
Magicoder: Empowering Code Generation with OSS-Instruct |
Yuxiang Wei, Zhe Wang,..., Lingming Zhang |
41 |
2023-10-25 |
link |
Controlled Decoding from Language Models |
Sidharth Mudgal, Jong Lee,..., Ahmad Beirami |
40 |
2024-02-22 |
link |
tinyBenchmarks: evaluating LLMs with fewer examples |
Felipe Maia Polo, Lucas Weber,..., Mikhail Yurochkin |
40 |
2024-03-13 |
link |
Human Alignment of Large Language Models through Online Preference Optimisation |
Daniele Calandriello, Zhaohan Daniel Guo,..., Bilal Piot |
40 |
2024-02-13 |
link |
COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability |
Xingang Guo, Fangxu Yu,..., Bin Hu |
39 |
2023-07-17 |
link |
Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations |
Yanda Chen, Ruiqi Zhong,..., Kathleen McKeown |
39 |
2024-02-10 |
link |
A Tale of Tails: Model Collapse as a Change of Scaling Laws |
Elvis Dohmatob, Yunzhen Feng,..., Julia Kempe |
39 |
2024-01-05 |
link |
CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution |
Alex Gu, Baptiste Roziere,..., Sida Wang |
39 |
2024-02-13 |
link |
IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation |
Luke Melas-Kyriazi, Iro Laina,..., Filippos Kokkinos |
37 |
2024-02-15 |
link |
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment |
Rui Yang, Xiaoman Pan,..., Jianshu Chen |
37 |
2024-02-28 |
link |
Simple linear attention language models balance the recall-throughput tradeoff |
Simran Arora, Sabri Eyuboglu,..., Christopher Re |
37 |
2024-02-06 |
link |
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs |
Wei Huang, Yangdong Liu,..., Xiaojuan Qi |
36 |
2023-06-30 |
link |
Stay on topic with Classifier-Free Guidance |
Guillaume Sanchez, Alexander Spangher,..., Stella Biderman |
36 |
2024-03-05 |
link |
Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling |
Yair Schiff, Chia Hsiang Kao,..., Volodymyr Kuleshov |
36 |
2023-05-24 |
link |
Robust Classification via a Single Diffusion Model |
Huanran Chen, Yinpeng Dong,..., Jun Zhu |
36 |
2023-06-05 |
link |
InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models |
Lichang Chen, Jiuhai Chen,..., Tianyi Zhou |
36 |
2023-07-31 |
link |
Learning to Model the World with Language |
Jessy Lin, Yuqing Du,..., Anca Dragan |
35 |
2024-01-31 |
link |
On Prompt-Driven Safeguarding for Large Language Models |
Chujie Zheng, Fan Yin,..., Nanyun Peng |
34 |
2023-10-19 |
link |
HumanTOMATO: Text-aligned Whole-body Motion Generation |
Shunlin Lu, Ling-Hao Chen,..., Heung-Yeung Shum |
34 |
2024-02-22 |
link |
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases |
Zechun Liu, Changsheng Zhao,..., Vikas Chandra |
34 |
2024-03-06 |
link |
Stop Regressing: Training Value Functions via Classification for Scalable Deep RL |
Jesse Farebrother, Jordi Orbay,..., Rishabh Agarwal |
33 |
2023-11-08 |
link |
NExT-Chat: An LMM for Chat, Detection and Segmentation |
Ao Zhang, Yuan Yao,..., Tat-Seng Chua |
33 |
2023-06-09 |
link |
Prodigy: An Expeditiously Adaptive Parameter-Free Learner |
Konstantin Mishchenko, Aaron Defazio |
33 |
2023-12-07 |
link |
An LLM Compiler for Parallel Function Calling |
Sehoon Kim, Suhong Moon,..., Amir Gholami |
33 |
2023-10-11 |
link |
Online Speculative Decoding |
Xiaoxuan Liu, Lanxiang Hu,..., Hao Zhang |
32 |
2024-03-05 |
link |
Behavior Generation with Latent Actions |
Seungjae Lee, Yibin Wang,..., Lerrel Pinto |
32 |
2023-10-11 |
link |
InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining |
Boxin Wang, Wei Ping,..., Bryan Catanzaro |
32 |
2024-03-11 |
link |
The pitfalls of next-token prediction |
Gregor Bachmann, Vaishnavh Nagarajan |
32 |
2024-03-01 |
link |
HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding |
Zhaorun Chen, Zhuokai Zhao,..., Jiawei Zhou |
31 |
2024-02-03 |
link |
Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models |
Yongshuo Zong, Ondrej Bohdal,..., Timothy Hospedales |
31 |
2024-03-14 |
link |
3D-VLA: A 3D Vision-Language-Action Generative World Model |
Haoyu Zhen, Xiaowen Qiu,..., Chuang Gan |
30 |
2024-02-11 |
link |
GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting |
Xiaoyu Zhou, Xingjian Ran,..., Ming-Hsuan Yang |
30 |
2024-02-13 |
link |
LLaGA: Large Language and Graph Assistant |
Runjin Chen, Tong Zhao,..., Zhangyang Wang |
30 |
2023-12-11 |
link |
Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context |
Xiang Cheng, Yuxin Chen, Suvrit Sra |
30 |
2023-11-15 |
link |
Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling |
Bairu Hou, Yujian Liu,..., Yang Zhang |
30 |
2023-10-05 |
link |
MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation |
Qian Huang, Jian Vora,..., Jure Leskovec |
30 |
2023-04-05 |
link |
Algorithm and Hardness for Dynamic Attention Maintenance in Large Language Models |
Jan van den Brand, Zhao Song, Tianyi Zhou |
30 |
2024-02-07 |
link |
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning |
Hao Zhao, Maksym Andriushchenko,..., Nicolas Flammarion |
29 |
2024-02-14 |
link |
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference |
Harry Dong, Xinyu Yang,..., Beidi Chen |
29 |
2024-02-07 |
link |
MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark |
Dongping Chen, Ruoxi Chen,..., Lichao Sun |
29 |
2024-01-23 |
link |
DsDm: Model-Aware Dataset Selection with Datamodels |
Logan Engstrom |
29 |
2024-03-01 |
link |
Provably Robust DPO: Aligning Language Models with Noisy Feedback |
Sayak Ray Chowdhury, Anush Kini, Nagarajan Natarajan |
29 |
2024-02-08 |
link |
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue |
Xing Han Lu, Zdeněk Kasner, Siva Reddy |
28 |
2024-01-04 |
link |
Evolution of Heuristics: Towards Efficient Automatic Algorithm Design Using Large Language Model |
Fei Liu, Tong Xialiang,..., Qingfu Zhang |
28 |
2024-01-23 |
link |
In-Context Language Learning: Architectures and Algorithms |
Ekin Akyürek, Bailin Wang,..., Jacob Andreas |
28 |
2024-01-21 |
link |
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers |
Katherine Crowson, Stefan Andreas Baumann,..., Enrico Shippole |
28 |
2024-04-05 |
link |
Score identity Distillation: Exponentially Fast Distillation of Pretrained Diffusion Models for One-Step Generation |
Mingyuan Zhou, Huangjie Zheng,..., Hai Huang |
28 |
2024-02-12 |
link |
Scaling Laws for Fine-Grained Mixture of Experts |
Jan Ludziejewski, Jakub Krajewski,..., Sebastian Jaszczur |
28 |
2024-02-08 |
link |
Dirichlet Flow Matching with Applications to DNA Sequence Design |
Hannes Stark, Bowen Jing,..., Tommi Jaakkola |
27 |
2024-02-13 |
link |
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast |
Xiangming Gu, Xiaosen Zheng,..., Min Lin |
27 |
2023-05-17 |
link |
Reprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs Sampling |
Weijia Xu, Andrzej Banburski, Nebojsa Jojic |
27 |
2024-03-14 |
link |
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference |
Piotr Nawrot, Adrian Łańcucki,..., Edoardo Ponti |
27 |
2024-02-05 |
link |
Large Language Models are Geographically Biased |
Rohin Manvi, Samar Khanna,..., Stefano Ermon |
27 |
2024-03-12 |
link |
WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks? |
Alexandre Drouin, Maxime Gasse,..., Alexandre Lacoste |
26 |
2024-02-11 |
link |
ODIN: Disentangled Reward Mitigates Hacking in RLHF |
Lichang Chen, Chen Zhu,..., Bryan Catanzaro |
26 |
2023-09-13 |
link |
Auto-Regressive Next-Token Predictors are Universal Learners |
Eran Malach |
26 |
2024-03-19 |
link |
RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content |
Zhuowen Yuan, Zidi Xiong,..., Bo Li |
25 |
2024-02-04 |
link |
Timer: Generative Pre-trained Transformers Are Large Time Series Models |
Yong Liu, Haoran Zhang,..., Mingsheng Long |
25 |
2023-10-25 |
link |
Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution |
Aaron Lou, Chenlin Meng, Stefano Ermon |
24 |
2024-03-05 |
link |
MathScale: Scaling Instruction Tuning for Mathematical Reasoning |
Zhengyang Tang, Xingxing Zhang,..., Furu Wei |
24 |
2023-10-02 |
link |
Prompt-tuning latent diffusion models for inverse problems |
Hyungjin Chung, Jong Chul Ye,..., Mauricio Delbracio |
24 |
2024-02-29 |
link |
Watermark Stealing in Large Language Models |
Nikola Jovanović, Robin Staab, Martin Vechev |
24 |
2024-02-06 |
link |
AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls |
Yu Du, Fangyun Wei, Hongyang Zhang |
24 |
2022-10-10 |
link |
Sequential Neural Score Estimation: Likelihood-Free Inference with Conditional Score Based Diffusion Models |
Louis Sharrock, Jack Simons,..., Mark Beaumont |
24 |
2024-02-14 |
link |
Feature Reuse and Scaling: Understanding Transfer Learning with Protein Language Models |
Francesca-Zhoufan Li, Ava P Amini,..., Alex Xijie Lu |
23 |
2023-12-08 |
link |
SparQ Attention: Bandwidth-Efficient LLM Inference |
Luka Ribar, Ivan Chelombiev,..., Douglas Orr |
23 |
2024-01-22 |
link |
DITTO: Diffusion Inference-Time T-Optimization for Music Generation |
Zachary Novack, Julian McAuley,..., Nicholas J. Bryan |
23 |
2024-02-28 |
link |
Evaluating Quantized Large Language Models |
Shiyao Li, Xuefei Ning,..., Yu Wang |
23 |
2024-01-09 |
link |
RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation |
Mahdi Nikdan, Soroush Tabesh,..., Dan Alistarh |
23 |
2024-02-03 |
link |
BetterV: Controlled Verilog Generation with Discriminative Guidance |
Zehua Pei, Huiling Zhen,..., Bei Yu |
23 |
2024-02-02 |
link |
A Dynamical Model of Neural Scaling Laws |
Blake Bordelon, Alexander Atanasov, Cengiz Pehlevan |
22 |
2024-02-29 |
link |
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL |
Yifei Zhou, Andrea Zanette,..., Aviral Kumar |
22 |
2024-04-04 |
link |
Outlier-Efficient Hopfield Layers for Large Transformer-Based Models |
Jerry Yao-Chieh Hu, Pei-Hsuan Chang,..., Han Liu |
22 |
2024-02-06 |
link |
RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback |
Yufei Wang, Zhanyi Sun,..., Zackory Erickson |
22 |
2024-04-12 |
link |
The Illusion of State in State-Space Models |
William Merrill, Jackson Petty, Ashish Sabharwal |
22 |
2024-01-30 |
link |
Proactive Detection of Voice Cloning with Localized Watermarking |
Robin San Roman, Pierre Fernandez,..., Tuan Tran |
22 |
2024-03-06 |
link |
Accelerating Convergence of Score-Based Diffusion Models, Provably |
Gen Li, Yu Huang,..., Yuxin Chen |
22 |
2023-11-18 |
link |
MagicPose: Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion |
Di Chang, Yichun Shi,..., Mohammad Soleymani |
21 |
2024-02-02 |
link |
Boximator: Generating Rich and Controllable Motions for Video Synthesis |
Jiawei Wang, Yuchen Zhang,..., Hang Li |
21 |
2024-02-05 |
link |
Flora: Low-Rank Adapters Are Secretly Gradient Compressors |
Yongchang Hao, Yanshuai Cao, Lili Mou |
21 |
2024-02-09 |
link |
Iterated Denoising Energy Matching for Sampling from Boltzmann Densities |
Tara Akhound-Sadegh, Jarrid Rector-Brooks,..., Alexander Tong |
21 |
2024-02-18 |
link |
Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning |
Long Qian, Juncheng Li,..., Siliang Tang |
21 |
2023-06-07 |
link |
Don't trust your eyes: on the (un)reliability of feature visualizations |
Robert Geirhos, Roland S. Zimmermann,..., Been Kim |
21 |
2024-02-05 |
link |
Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization |
Yang Jin, Zhicheng Sun,..., Yadong Mu |
21 |
2024-02-07 |
link |
On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis |
Jerry Yao-Chieh Hu, Thomas Lin,..., Han Liu |
21 |
2024-02-27 |
link |
Training-Free Long-Context Scaling of Large Language Models |
Chenxin An, Fei Huang,..., Lingpeng Kong |
21 |
2024-02-27 |
link |
Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations |
Jiaqi Zhai, Lucy Liao,..., Yu Shi |
21 |
2024-02-05 |
link |
Decoding-time Realignment of Language Models |
Tianlin Liu, Shangmin Guo,..., Mathieu Blondel |
20 |
2024-02-28 |
link |
CogBench: a large language model walks into a psychology lab |
Julian Coda-Forno, Marcel Binz,..., Eric Schulz |
20 |
2023-02-26 |
link |
Diffusion Model-Augmented Behavioral Cloning |
Shang-Fu Chen, Hsiang-Chun Wang,..., Shao-Hua Sun |
20 |
2024-02-02 |
link |
Challenges in Training PINNs: A Loss Landscape Perspective |
Pratik Rathore, Weimu Lei,..., Madeleine Udell |
20 |
2023-10-16 |
link |
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models |
Ziniu Li, Tian Xu,..., Zhi-Quan Luo |
20 |
2024-02-18 |
link |
Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark |
Yihua Zhang, Pingzhi Li,..., Tianlong Chen |
20 |
2024-04-18 |
link |
Token-level Direct Preference Optimization |
Yongcheng Zeng, Guoqing Liu,..., Jun Wang |
20 |
2024-02-13 |
link |
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements |
Alexander Havrilla, Sharath Chandra Raparthy,..., Roberta Raileanu |
19 |
2024-02-05 |
link |
C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models |
Mintong Kang, Nezihe Merve Gürel,..., Bo Li |
19 |
2024-02-14 |
link |
Transformers, parallel computation, and logarithmic depth |
Clayton Sanford, Daniel Hsu, Matus Telgarsky |
19 |
2024-03-17 |
link |
MindEye2: Shared-Subject Models Enable fMRI-To-Image With 1 Hour of Data |
Paul Steven Scotti, Mihir Tripathy,..., Tanishq Mathew Abraham |
19 |
2024-02-14 |
link |
Premise Order Matters in Reasoning with Large Language Models |
Xinyun Chen, Ryan Andrew Chi,..., Denny Zhou |
19 |
2024-02-26 |
link |
Asymmetry in Low-Rank Adapters of Foundation Models |
Jiacheng Zhu, Kristjan Greenewald,..., Justin Solomon |
19 |
2023-10-05 |
link |
Stochastic interpolants with data-dependent couplings |
Michael Samuel Albergo, Mark Goldstein,..., Eric Vanden-Eijnden |
19 |
2024-01-11 |
link |
DiffDA: a diffusion model for weather-scale data assimilation |
Langwen Huang, Lukas Gianinazzi,..., Torsten Hoefler |
19 |
2024-02-01 |
link |
Merging Multi-Task Models via Weight-Ensembling Mixture of Experts |
Anke Tang, Li Shen,..., Dacheng Tao |
19 |
2023-10-26 |
link |
Codebook Features: Sparse and Discrete Interpretability for Neural Networks |
Alex Tamkin, Mohammad Taufeeque, Noah Goodman |
18 |
2023-10-09 |
link |
Generalized Neural Collapse for a Large Number of Classes |
Jiachen Jiang, Jinxin Zhou,..., Zhihui Zhu |
18 |
2024-02-01 |
link |
Dense Reward for Free in Reinforcement Learning from Human Feedback |
Alex James Chan, Hao Sun,..., Mihaela van der Schaar |
18 |
2023-12-12 |
link |
AI Control: Improving Safety Despite Intentional Subversion |
Ryan Greenblatt, Buck Shlegeris,..., Fabien Roger |
18 |
2024-02-05 |
link |
The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents |
Yatin Dandi, Emanuele Troiani,..., Florent Krzakala |
18 |
2024-02-08 |
link |
Accurate LoRA-Finetuning Quantization of LLMs via Information Retention |
Haotong Qin, Xudong Ma,..., Michele Magno |
18 |
2023-12-11 |
link |
Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes |
Zhen Qin, Daoyuan Chen,..., Shuiguang Deng |
18 |
2024-02-15 |
link |
DE-COP: Detecting Copyrighted Content in Language Models Training Data |
André Vicente Duarte, Xuandong Zhao,..., Lei Li |
18 |
2024-02-15 |
link |
Language Models with Conformal Factuality Guarantees |
Christopher Mohri, Tatsunori Hashimoto |
17 |
2024-02-03 |
link |
A Closer Look at the Limitations of Instruction Tuning |
Sreyan Ghosh, Chandra Kiran Reddy Evuru,..., Dinesh Manocha |
17 |
2024-02-21 |
link |
D-Flow: Differentiating through Flows for Controlled Generation |
Heli Ben-Hamu, Omri Puny,..., Yaron Lipman |
17 |
2020-11-29 |
link |
Scaling Down Deep Learning with MNIST-1D |
Samuel James Greydanus, Dmitry Kobak |
17 |
2024-02-08 |
link |
In-Context Principle Learning from Mistakes |
Tianjun Zhang, Aman Madaan,..., Uri Alon |
17 |
2024-01-05 |
link |
VoroNav: Voronoi-based Zero-shot Object Navigation with Large Language Model |
Pengying Wu, Yao Mu,..., Chang Liu |
17 |
2024-02-05 |
link |
Guidance with Spherical Gaussian Constraint for Conditional Diffusion |
Lingxiao Yang, Shutong Ding,..., Ye Shi |
17 |
2024-04-10 |
link |
What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation |
Aaditya K Singh, Ted Moskovitz,..., Andrew M Saxe |
17 |
2024-02-13 |
link |
Mixtures of Experts Unlock Parameter Scaling for Deep RL |
Johan Samir Obando Ceron, Ghada Sokar,..., Pablo Samuel Castro |
17 |
2024-01-10 |
link |
InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks |
Xueyu Hu, Ziyu Zhao,..., Fei Wu |
16 |
2024-03-03 |
link |
Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models |
Yuchen Wu, Minshuo Chen,..., Yuting Wei |
16 |
2023-10-10 |
link |
Conformal Prediction for Deep Classifier via Label Ranking |
Jianguo Huang, HuaJun Xi,..., Hongxin Wei |
16 |
2024-02-07 |
link |
Navigating Complexity: Toward Lossless Graph Condensation via Expanding Window Matching |
Yuchen Zhang, Tianle Zhang,..., Yang You |
16 |
2024-02-20 |
link |
A Touch, Vision, and Language Dataset for Multimodal Alignment |
Letian Fu, Gaurav Datta,..., Ken Goldberg |
16 |
2022-11-09 |
link |
Few-Shot Character Understanding in Movies as an Assessment to Meta-Learning of Theory-of-Mind |
Mo Yu, Qiujing Wang,..., Jie Zhou |
16 |
2023-10-11 |
link |
A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks |
Behrad Moniri, Donghwan Lee,..., Edgar Dobriban |
16 |
2023-10-23 |
link |
DoGE: Domain Reweighting with Generalization Estimation |
Simin Fan, Matteo Pagliardini, Martin Jaggi |
16 |
2023-12-19 |
link |
Curated LLM: Synergy of LLMs and Data Curation for tabular augmentation in ultra low-data regimes |
Nabeel Seedat, Nicolas Huynh,..., Mihaela van der Schaar |
16 |
2024-03-04 |
link |
Differentially Private Synthetic Data via Foundation Model APIs 2: Text |
Chulin Xie, Zinan Lin,..., Sergey Yekhanin |
16 |
2024-04-16 |
link |
Position: Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback |
Vincent Conitzer, Rachel Freedman,..., William S. Zwicker |
16 |
2024-02-08 |
link |
How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis |
Federico Bianchi, Patrick John Chia,..., James Zou |
15 |
2023-12-06 |
link |
Generalization to New Sequential Decision Making Tasks with In-Context Learning |
Sharath Chandra Raparthy, Eric Hambro,..., Roberta Raileanu |
15 |
2024-04-15 |
link |
All-in-one simulation-based inference |
Manuel Gloeckler, Michael Deistler,..., Jakob H. Macke |
15 |
2024-02-20 |
link |
VideoPrism: A Foundational Visual Encoder for Video Understanding |
Long Zhao, Nitesh Bharadwaj Gundavarapu,..., Boqing Gong |
15 |
2023-12-08 |
link |
EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism |
Yanxi Chen, Xuchen Pan,..., Jingren Zhou |
15 |
2024-03-06 |
link |
On the Origins of Linear Representations in Large Language Models |
Yibo Jiang, Goutham Rajendran,..., Victor Veitch |
15 |
2024-05-02 |
link |
SparseTSF: Modeling Long-term Time Series Forecasting with 1k Parameters |
Shengsheng Lin, Weiwei Lin,..., Junjie Yang |
15 |
2023-12-06 |
link |
Low-Cost High-Power Membership Inference Attacks |
Sajjad Zarifzadeh, Philippe Liu, Reza Shokri |
15 |
2024-02-23 |
link |
Fast Adversarial Attacks on Language Models In One GPU Minute |
Vinu Sankar Sadasivan, Shoumik Saha,..., Soheil Feizi |
15 |
2024-02-04 |
link |
Transolver: A Fast Transformer Solver for PDEs on General Geometries |
Haixu Wu, Huakun Luo,..., Mingsheng Long |
15 |
2023-04-03 |
link |
Chain-of-Thought Predictive Control |
Zhiwei Jia, Vineet Thumuluri,..., Hao Su |
15 |
2024-02-26 |
link |
Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning |
Michael Matthews, Michael Beukman,..., Jakob Nicolaus Foerster |
15 |
2024-05-18 |
link |
Towards Modular LLMs by Building and Reusing a Library of LoRAs |
Oleksiy Ostapenko, Zhan Su,..., Alessandro Sordoni |
15 |
2024-01-28 |
link |
An Information-Theoretic Analysis of In-Context Learning |
Hong Jun Jeon, Jason D. Lee,..., Benjamin Van Roy |
15 |
2023-05-27 |
link |
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers |
Dachuan Shi, Chaofan Tao,..., Jiaqi Wang |
15 |
2024-02-08 |
link |
Self-Alignment of Large Language Models via Monopolylogue-based Social Scene Simulation |
Xianghe Pang, Shuo Tang,..., Siheng Chen |
15 |
2024-02-28 |
link |
CLLMs: Consistency Large Language Models |
Siqi Kou, Lanxiang Hu,..., Hao Zhang |
14 |
2024-05-13 |
link |
Localizing Task Information for Improved Model Merging and Compression |
Ke Wang, Nikolaos Dimitriadis,..., Pascal Frossard |
14 |
2023-10-16 |
link |
A Computational Framework for Solving Wasserstein Lagrangian Flows |
Kirill Neklyudov, Rob Brekelmans,..., Alireza Makhzani |
14 |
2024-02-15 |
link |
Zero-Shot Unsupervised and Text-Based Audio Editing Using DDPM Inversion |
Hila Manor, Tomer Michaeli |
14 |
2024-02-08 |
link |
Memory Consolidation Enables Long-Context Video Understanding |
Ivana Balazevic, Yuge Shi,..., Olivier J Henaff |
14 |
2024-02-15 |
link |
A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts |
Kuang-Huei Lee, Xinyun Chen,..., Ian Fischer |
14 |
2023-10-02 |
link |
Fool Your (Vision and) Language Model With Embarrassingly Simple Permutations |
Yongshuo Zong, Tingyang Yu,..., Timothy Hospedales |
14 |
2024-01-29 |
link |
Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF |
Banghua Zhu, Michael Jordan, Jiantao Jiao |
14 |
2024-02-26 |
link |
Feedback Efficient Online Fine-Tuning of Diffusion Models |
Masatoshi Uehara, Yulai Zhao,..., Tommaso Biancalani |
14 |
2023-06-02 |
link |
Revisiting the Role of Language Priors in Vision-Language Models |
Zhiqiu Lin, Xinyue Chen,..., Deva Ramanan |
14 |
2024-01-18 |
link |
Improving fine-grained understanding in image-text pre-training |
Ioana Bica, Anastasija Ilic,..., Jovana Mitrovic |
14 |
2024-03-06 |
link |
DPOT: Auto-Regressive Denoising Operator Transformer for Large-Scale PDE Pre-Training |
Zhongkai Hao, Chang Su,..., Jun Zhu |
14 |
2024-01-24 |
link |
Can AI Assistants Know What They Don't Know? |
Qinyuan Cheng, Tianxiang Sun,..., Xipeng Qiu |
14 |
2024-02-19 |
link |
Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models |
Christian Schlarmann, Naman Deep Singh,..., Matthias Hein |
14 |
2024-02-07 |
link |
CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay |
Natasha Butt, Blazej Manczak,..., Taco Cohen |
14 |
2024-02-14 |
link |
Position: Topological Deep Learning is the New Frontier for Relational Learning |
Theodore Papamarkou, Tolga Birdal,..., Ghada Zamzmi |
13 |
2024-03-11 |
link |
Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement |
Che Liu, Zhongwei Wan,..., Rossella Arcucci |
13 |
2024-02-19 |
link |
In value-based deep reinforcement learning, a pruned network is a good network |
Johan Samir Obando Ceron, Aaron Courville, Pablo Samuel Castro |
13 |
2024-02-27 |
link |
DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning |
Siyuan Guo, Cheng Deng,..., Jun Wang |
13 |
2024-03-03 |
link |
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation |
Shiqi Chen, Miao Xiong,..., Junxian He |
13 |
2024-02-05 |
link |
Distinguishing the Knowable from the Unknowable with Language Models |
Gustaf Ahdritz, Tian Qin,..., Benjamin L. Edelman |
13 |
2023-07-21 |
link |
Universality of Linear Recurrences Followed by Non-linear Projections: Finite-Width Guarantees and Benefits of Complex Eigenvalues |
Antonio Orvieto, Soham De,..., Samuel L Smith |
13 |
2024-02-19 |
link |
Model Tailor: Mitigating Catastrophic Forgetting in Multi-modal Large Language Models |
Didi Zhu, Zhongyisun Sun,..., Kun Kuang |
13 |
2024-02-13 |
link |
A Dense Reward View on Aligning Text-to-Image Diffusion with Preference |
Shentao Yang, Tianqi Chen, Mingyuan Zhou |
13 |
2023-09-18 |
link |
Long-Tail Learning with Foundation Model: Heavy Fine-Tuning Hurts |
Jiang-Xin Shi, Tong Wei,..., Yu-Feng Li |
13 |
2023-12-20 |
link |
Learning and Forgetting Unsafe Examples in Large Language Models |
Jiachen Zhao, Zhun Deng,..., Mengye Ren |
13 |
2024-02-05 |
link |
Representation Surgery for Multi-Task Model Merging |
Enneng Yang, Li Shen,..., Dacheng Tao |
13 |
2023-10-20 |
link |
Equivariant Deep Weight Space Alignment |
Aviv Navon, Aviv Shamsian,..., Haggai Maron |
13 |
2024-02-03 |
link |
Improving Diffusion Models for Inverse Problems Using Optimal Posterior Covariance |
Xinyu Peng, Ziyang Zheng,..., Hongkai Xiong |
13 |
2024-03-16 |
link |
SelfIE: Self-Interpretation of Large Language Model Embeddings |
Haozhe Chen, Carl Vondrick, Chengzhi Mao |
13 |
2024-02-03 |
link |
GliDe with a CaPE: A Low-Hassle Method to Accelerate Speculative Decoding |
Cunxiao Du, Jing Jiang,..., Yang You |
13 |
2024-04-26 |
link |
Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo |
Stephen Zhao, Rob Brekelmans,..., Roger Baker Grosse |
13 |
2024-02-21 |
link |
Do Efficient Transformers Really Save Computation? |
Kai Yang, Jan Ackermann,..., Liwei Wang |
13 |
2024-02-26 |
link |
Disentangled 3D Scene Generation with Layout Learning |
Dave Epstein, Ben Poole,..., Aleksander Holynski |
13 |
2023-02-23 |
link |
EquiPocket: an E(3)-Equivariant Geometric Graph Neural Network for Ligand Binding Site Prediction |
Yang Zhang, Zhewei Wei,..., Wenbing Huang |
13 |
2024-02-15 |
link |
SAMformer: Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention |
Romain Ilbert, Ambroise Odonnat,..., Ievgen Redko |
13 |
2024-03-15 |
link |
Repoformer: Selective Retrieval for Repository-Level Code Completion |
Di Wu, Wasi Uddin Ahmad,..., Xiaofei Ma |
13 |
2024-02-23 |
link |
Foundation Policies with Hilbert Representations |
Seohong Park, Tobias Kreiman, Sergey Levine |
13 |
2024-02-02 |
link |
Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape |
Juno Kim, Taiji Suzuki |
13 |
2024-03-28 |
link |
MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions |
Kai Zhang, Yi Luan,..., Ming-Wei Chang |
13 |
2023-10-16 |
link |
Unifying Image Processing as Visual Prompting Question Answering |
Yihao Liu, Xiangyu Chen,..., Chao Dong |
13 |
2023-12-28 |
link |
Non-Vacuous Generalization Bounds for Large Language Models |
Sanae Lotfi, Marc Anton Finzi,..., Andrew Gordon Wilson |
13 |
2024-02-01 |
link |
Position: Bayesian Deep Learning is Needed in the Age of Large-Scale AI |
Theodore Papamarkou, Maria Skoularidou,..., Ruqi Zhang |
13 |
2023-05-27 |
link |
Matrix Information Theory for Self-Supervised Learning |
Yifan Zhang, Zhiquan Tan,..., Yang Yuan |
12 |
2024-02-08 |
link |
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning |
Zhiheng Xi, Wenxiang Chen,..., Xuanjing Huang |
12 |
2024-02-06 |
link |
DistiLLM: Towards Streamlined Distillation for Large Language Models |
Jongwoo Ko, Sungnyun Kim,..., Se-Young Yun |
12 |
2024-01-29 |
link |
ReGAL: Refactoring Programs to Discover Generalizable Abstractions |
Elias Stengel-Eskin, Archiki Prasad, Mohit Bansal |
12 |
2023-10-06 |
link |
On the Embedding Collapse when Scaling up Recommendation Models |
Xingzhuo Guo, Junwei Pan,..., Mingsheng Long |
12 |
2024-03-04 |
link |
DéjàVu: KV-cache Streaming for Fast, Fault-tolerant Generative LLM Serving |
Foteini Strati, Sara McAllister,..., Ana Klimovic |
12 |
2023-07-03 |
link |
Trainable Transformer in Transformer |
Abhishek Panigrahi, Sadhika Malladi,..., Sanjeev Arora |
12 |
2024-01-19 |
link |
Equivariant Graph Neural Operator for Modeling 3D Dynamics |
Minkai Xu, Jiaqi Han,..., Anima Anandkumar |
12 |
2024-04-04 |
link |
BiSHop: Bi-Directional Cellular Learning for Tabular Data with Generalized Sparse Modern Hopfield Model |
Chenwei Xu, Yu-Chao Huang,..., Han Liu |
12 |
2024-03-21 |
link |
Protein Conformation Generation via Force-Guided SE(3) Diffusion Models |
Yan Wang, Lihao Wang,..., Quanquan Gu |
12 |
2024-01-24 |
link |
Conformal Prediction Sets Improve Human Decision Making |
Jesse C. Cresswell, Yi Sui,..., Noël Vouitsis |
12 |
2024-03-21 |
link |
An Analysis of Linear Time Series Forecasting Models |
William Toner, Luke Nicholas Darlow |
12 |
2024-02-23 |
link |
How Do Nonlinear Transformers Learn and Generalize in In-Context Learning? |
Hongkang Li, Meng Wang,..., Pin-Yu Chen |
12 |
2024-03-02 |
link |
SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code |
Ziniu Hu, Ahmet Iscen,..., Alireza Fathi |
12 |
2023-12-02 |
link |
Second-Order Uncertainty Quantification: A Distance-Based Approach |
Yusuf Sale, Viktor Bengs,..., Eyke Hüllermeier |
12 |
2024-05-02 |
link |
MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts |
Jianan Zhou, Zhiguang Cao,..., Xu Chi |
12 |
2024-05-05 |
link |
Parameter-Efficient Fine-Tuning with Discrete Fourier Transform |
Ziqi Gao, Qichao Wang,..., Jia Li |
11 |
2024-02-02 |
link |
Online conformal prediction with decaying step sizes |
Anastasios Nikolas Angelopoulos, Rina Barber, Stephen Bates |
11 |
2024-04-12 |
link |
TSLANet: Rethinking Transformers for Time Series Representation Learning |
Emadeldeen Eldele, Mohamed Ragab,..., Xiaoli Li |
11 |
2023-11-15 |
link |
ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy |
Kirill Vishniakov, Zhiqiang Shen, Zhuang Liu |
11 |
2024-02-25 |
link |
Equivariant Frames and the Impossibility of Continuous Canonicalization |
Nadav Dym, Hannah Lawrence, Jonathan W. Siegel |
11 |
2024-02-05 |
link |
Graph-enhanced Large Language Models in Asynchronous Plan Reasoning |
Fangru Lin, Emanuele La Malfa,..., Janet B. Pierrehumbert |
11 |
2024-03-26 |
link |
Mechanistic Design and Scaling of Hybrid Architectures |
Michael Poli, Armin W Thomas,..., Stefano Massaroli |
11 |
2024-03-04 |
link |
CATS: Enhancing Multivariate Time Series Forecasting by Constructing Auxiliary Time Series as Exogenous Variables |
Jiecheng Lu, Xu Han,..., Shihao Yang |
11 |
2024-10-29 |
link |
Cell2Sentence: Teaching Large Language Models the Language of Biology |
Daniel Levine, Syed A Rizvi,..., David van Dijk |
11 |
2024-01-07 |
link |
The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline |
Haonan Wang, Qianli Shen,..., Kenji Kawaguchi |
11 |
2024-02-25 |
link |
RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis |
Yao Mu, Junting Chen,..., Ping Luo |
11 |
2024-03-30 |
link |
Linguistic Calibration of Long-Form Generations |
Neil Band, Xuechen Li,..., Tatsunori Hashimoto |
11 |
2024-02-12 |
link |
Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT |
Jon Saad-Falcon, Daniel Y Fu,..., Christopher Re |
11 |
2024-02-08 |
link |
AttnLRP: Attention-Aware Layer-wise Relevance Propagation for Transformers |
Reduan Achtibat, Sayed Mohammad Vakilzadeh Hatefi,..., Wojciech Samek |
11 |
2024-05-03 |
link |
Auto-Encoding Morph-Tokens for Multimodal LLM |
Kaihang Pan, Siliang Tang,..., Hanwang Zhang |
11 |
2024-02-21 |
link |
From Self-Attention to Markov Models: Unveiling the Dynamics of Generative Transformers |
Muhammed Emrullah Ildiz, Yixiao Huang,..., Samet Oymak |
11 |
2023-12-26 |
link |
Generalization in Kernel Regression Under Realistic Assumptions |
Daniel Barzilai, Ohad Shamir |
11 |
2024-02-27 |
link |
Case-Based or Rule-Based: How Do Transformers Do the Math? |
Yi Hu, Xiaojuan Tang,..., Muhan Zhang |
11 |
2024-02-07 |
link |
Asymptotics of feature learning in two-layer networks after one gradient-step |
Hugo Cui, Luca Pesce,..., Bruno Loureiro |
11 |
2024-02-09 |
link |
Feedback Loops With Language Models Drive In-Context Reward Hacking |
Alexander Pan, Erik Jones,..., Jacob Steinhardt |
11 |
2024-05-30 |
link |
Why Larger Language Models Do In-context Learning Differently? |
Zhenmei Shi, Junyi Wei,..., Yingyu Liang |
11 |
2024-02-04 |
link |
Selecting Large Language Model to Fine-tune via Rectified Scaling Law |
Haowei Lin, Baizhou Huang,..., Yitao Liang |
11 |
2024-01-23 |
link |
TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic Tasks |
Zhiruo Wang, Graham Neubig, Daniel Fried |
11 |
2024-02-01 |
link |
Efficient Exploration for LLMs |
Vikranth Dwaracherla, Seyed Mohammad Asghari,..., Benjamin Van Roy |
11 |
2023-06-07 |
link |
Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning |
Libin Zhu, Chaoyue Liu,..., Mikhail Belkin |
11 |
2024-02-26 |
link |
Neural Operators with Localized Integral and Differential Kernels |
Miguel Liu-Schiaffini, Julius Berner,..., Anima Anandkumar |
11 |
2023-06-02 |
link |
Generalist Equivariant Transformer Towards 3D Molecular Interaction Learning |
Xiangzhe Kong, Wenbing Huang, Yang Liu |
10 |
2024-03-06 |
link |
Conformal prediction for multi-dimensional time series by ellipsoidal sets |
Chen Xu, Hanyang Jiang, Yao Xie |
10 |
2023-10-02 |
link |
Cooperative Graph Neural Networks |
Ben Finkelshtein, Xingyue Huang,..., Ismail Ilkan Ceylan |
10 |
2024-01-22 |
link |
APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference |
Bowen Zhao, Hannaneh Hajishirzi, Qingqing Cao |
10 |
2024-04-17 |
link |
Learning with 3D rotations, a hitchhiker's guide to SO(3) |
Andreas René Geist, Jonas Frey,..., Georg Martius |
10 |
2024-02-14 |
link |
Copyright Traps for Large Language Models |
Matthieu Meeus, Igor Shilov,..., Yves-Alexandre de Montjoye |
10 |
2024-02-13 |
link |
Hybrid Inverse Reinforcement Learning |
Juntao Ren, Gokul Swamy,..., Sanjiban Choudhury |
10 |
2024-03-01 |
link |
Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning |
Michal Nauman, Michał Bortkiewicz,..., Marek Cygan |
10 |
2024-01-25 |
link |
Adaptive Text Watermark for Large Language Models |
Yepeng Liu, Yuheng Bu |
10 |
2023-08-25 |
link |
Learning to Intervene on Concept Bottlenecks |
David Steinmann, Wolfgang Stammer,..., Kristian Kersting |
10 |
2023-06-15 |
link |
ViP: A Differentially Private Foundation Model for Computer Vision |
Yaodong Yu, Maziar Sanjabi,..., Chuan Guo |
10 |
2023-10-02 |
link |
FedBPT: Efficient Federated Black-box Prompt Tuning for Large Language Models |
Jingwei Sun, Ziyue Xu,..., Holger R Roth |
10 |
2024-01-31 |
link |
Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners? |
Andreas Opedal, Alessandro Stolfo,..., Mrinmaya Sachan |
10 |
2024-05-06 |
link |
To Each (Textual Sequence) Its Own: Improving Memorized-Data Unlearning in Large Language Models |
George-Octavian Bărbulescu, Peter Triantafillou |
10 |
2023-03-15 |
link |
Borda Regret Minimization for Generalized Linear Dueling Bandits |
Yue Wu, Tao Jin,..., Quanquan Gu |
10 |
2023-10-04 |
link |
Assessing Large Language Models on Climate Information |
Jannis Bulian, Mike S. Schäfer,..., Nadine Strauss |
10 |
2024-02-11 |
link |
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning |
Kaiwen Wang, Owen Oertell,..., Wen Sun |
10 |
2024-06-05 |
link |
Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for Large Language Models |
Peijie Dong, Lujun Li,..., Xiaowen Chu |
10 |
2023-02-07 |
link |
Graph Generation with Diffusion Mixture |
Jaehyeong Jo, Dongki Kim, Sung Ju Hwang |
10 |
2024-03-20 |
link |
Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models with Noisy Data |
Giannis Daras, Alex Dimakis, Constantinos Costis Daskalakis |
10 |
2023-12-20 |
link |
In-Context Reinforcement Learning for Variable Action Spaces |
Viacheslav Sinii, Alexander Nikulin,..., Sergey Kolesnikov |
10 |
2023-10-18 |
link |
A connection between Tempering and Entropic Mirror Descent |
Nicolas Chopin, Francesca Crucinio, Anna Korba |
10 |
2023-06-05 |
link |
Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic |
Tianying Ji, Yu Luo,..., Huazhe Xu |
10 |
2024-02-03 |
link |
Position: Graph Foundation Models Are Already Here |
Haitao Mao, Zhikai Chen,..., Jiliang Tang |
10 |
2023-05-30 |
link |
Plug-in Performative Optimization |
Licong Lin, Tijana Zrnic |
10 |
2024-02-22 |
link |
Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion |
Yujia Huang, Adishree Ghatare,..., Yisong Yue |
10 |
2023-09-29 |
link |
Information Flow in Self-Supervised Learning |
Zhiquan Tan, Jingqin Yang,..., Yifan Zhang |
10 |
2024-04-22 |
link |
A Multimodal Automated Interpretability Agent |
Tamar Rott Shaham, Sarah Schwettmann,..., Antonio Torralba |
9 |
2024-01-04 |
link |
Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained ReLU Feature Model |
Hien Dang, Tho Tran Huu,..., Nhat Ho |
9 |
2024-02-04 |
link |
Riemannian Preconditioned LoRA for Fine-Tuning Foundation Models |
Fangzhao Zhang, Mert Pilanci |
9 |
2024-04-06 |
link |
Multicalibration for Confidence Scoring in LLMs |
Gianluca Detommaso, Martin Bertran Lopez,..., Aaron Roth |
9 |
2024-02-23 |
link |
Deep Networks Always Grok and Here is Why |
Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk |
9 |
2023-10-11 |
link |
Language Models As Semantic Indexers |
Bowen Jin, Hansi Zeng,..., Xianfeng Tang |
9 |
2024-02-28 |
link |
Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension |
Fan Yin, Jayanth Srinivasa, Kai-Wei Chang |
9 |
2024-01-05 |
link |
AST-T5: Structure-Aware Pretraining for Code Generation and Understanding |
Linyuan Gong, Mostafa Elhoushi, Alvin Cheung |
9 |
2023-09-28 |
link |
Discovering environments with XRM |
Mohammad Pezeshki, Diane Bouchacourt,..., David Lopez-Paz |
9 |
2024-05-30 |
link |
Proteus: Exploring Protein Structure Generation for Enhanced Designability and Efficiency |
Chentong Wang, Yannan Qu,..., Longxing Cao |
9 |
2024-02-27 |
link |
Variational Learning is Effective for Large Deep Networks |
Yuesong Shen, Nico Daheim,..., Thomas Möllenhoff |
9 |
2024-05-22 |
link |
Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation |
Gauthier Guinet, Behrooz Omidvar-Tehrani,..., Laurent Callot |
9 |
2023-06-06 |
link |
Designing Decision Support Systems Using Counterfactual Prediction Sets |
Eleni Straitouri, Manuel Gomez Rodriguez |
9 |
2024-03-05 |
link |
Time Weaver: A Conditional Time Series Generation Model |
Sai Shankar Narasimhan, Shubhankar Agarwal,..., Sandeep P. Chinchali |
9 |
2024-04-02 |
link |
Test-Time Model Adaptation with Only Forward Passes |
Shuaicheng Niu, Chunyan Miao,..., Peilin Zhao |
9 |
2024-02-07 |
link |
A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules? |
Agustinus Kristiadi, Felix Strieth-Kalthoff,..., Geoff Pleiss |
9 |
2022-12-08 |
link |
A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization |
Ashwinee Panda, Xinyu Tang,..., Prateek Mittal |
9 |
None |
link |
Characterizing Large Language Model Geometry Solves Toxicity Detection and Generation |
Randall Balestriero, Romain Cosentino, Sarath Shekkizhar |
9 |
2024-05-08 |
link |
VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context |
Yunxin Li, Baotian Hu,..., Min Zhang |
9 |
2024-02-21 |
link |
Privacy-Preserving Instructions for Aligning Large Language Models |
Da Yu, Peter Kairouz,..., Zheng Xu |
9 |
2023-11-24 |
link |
StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization |
Shida Wang, Qianxiao Li |
9 |
2024-01-18 |
link |
Exploration and Anti-Exploration with Distributional Random Network Distillation |
Kai Yang, Jian Tao,..., Xiu Li |
9 |
2024-02-02 |
link |
Simulation of Graph Algorithms with Looped Transformers |
Artur Back de Luca, Kimon Fountoulakis |
9 |
2024-02-07 |
link |
Guiding LLMs The Right Way: Fast, Non-Invasive Constrained Generation |
Luca Beurer-Kellner, Marc Fischer, Martin Vechev |
9 |
2024-03-27 |
link |
Understanding the Learning Dynamics of Alignment with Human Feedback |
Shawn Im, Yixuan Li |
9 |
2024-03-30 |
link |
Privacy Backdoors: Stealing Data with Corrupted Pretrained Models |
Shanglun Feng, Florian Tramèr |
9 |
2024-02-12 |
link |
Rolling Diffusion Models |
David Ruhe, Jonathan Heek,..., Emiel Hoogeboom |
9 |
2024-03-04 |
link |
Wukong: Towards a Scaling Law for Large-Scale Recommendation |
Buyun Zhang, Liang Luo,..., Wenlin Chen |
9 |
2024-01-08 |
link |
Sampling in Unit Time with Kernel Fisher-Rao Flow |
Aimee Maurais, Youssef Marzouk |
9 |
2023-08-31 |
link |
On the Implicit Bias of Adam |
Matias D. Cattaneo, Jason Matthew Klusowski, Boris Shigida |
9 |
2024-02-01 |
link |
Getting the most out of your tokenizer for pre-training and domain adaptation |
Gautier Dagan, Gabriel Synnaeve, Baptiste Roziere |
9 |
2023-09-08 |
link |
Graph Neural Networks Use Graphs When They Shouldn't |
Maya Bechler-Speicher, Ido Amos,..., Amir Globerson |
9 |
2024-02-14 |
link |
Instruction Tuning for Secure Code Generation |
Jingxuan He, Mark Vero,..., Martin Vechev |
8 |
2023-04-16 |
link |
An Empirical Study of Realized GNN Expressiveness |
Yanbo Wang, Muhan Zhang |
8 |
2024-04-16 |
link |
Fewer Truncations Improve Language Modeling |
Hantian Ding, Zijian Wang,..., Stefano Soatto |
8 |
2023-11-16 |
link |
Structured Chemistry Reasoning with Large Language Models |
Siru Ouyang, Zhuosheng Zhang,..., Lianhui Qin |
8 |
2024-03-28 |
link |
Regression with Multi-Expert Deferral |
Anqi Mao, Mehryar Mohri, Yutao Zhong |
8 |
2024-02-09 |
link |
Particle Denoising Diffusion Sampler |
Angus Phillips, Hai-Dang Dau,..., Arnaud Doucet |
8 |
2024-05-28 |
link |
AI Alignment with Changing and Influenceable Reward Functions |
Micah Carroll, Davis Foote,..., Anca Dragan |
8 |
2024-01-21 |
link |
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback |
Songyang Gao, Qiming Ge,..., Dahua Lin |
8 |
2023-12-06 |
link |
Interpretability Illusions in the Generalization of Simplified Models |
Dan Friedman, Andrew Kyle Lampinen,..., Asma Ghandeharioun |
8 |
2023-05-26 |
link |
Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks |
Atli Kosson, Bettina Messmer, Martin Jaggi |
8 |
2024-02-04 |
link |
TimeSiam: A Pre-Training Framework for Siamese Time-Series Modeling |
Jiaxiang Dong, Haixu Wu,..., Mingsheng Long |
8 |
2024-05-03 |
link |
PICLe: Eliciting Diverse Behaviors from Large Language Models with Persona In-Context Learning |
Hyeong Kyu Choi, Yixuan Li |
8 |
2024-02-15 |
link |
Physics-Informed Neural Network Policy Iteration: Algorithms, Convergence, and Verification |
Yiming Meng, Ruikun Zhou,..., Jun Liu |
8 |
2024-04-18 |
link |
RoboDreamer: Learning Compositional World Models for Robot Imagination |
Siyuan Zhou, Yilun Du,..., Chuang Gan |
8 |
2024-02-05 |
link |
Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem |
Maciej Wolczyk, Bartłomiej Cupiał,..., Piotr Miłoś |
8 |
2024-02-15 |
link |
OptiMUS: Scalable Optimization Modeling with (MI)LP Solvers and Large Language Models |
Ali AhmadiTeshnizi, Wenzhi Gao, Madeleine Udell |
8 |
2024-05-29 |
link |
Locally Estimated Global Perturbations are Better than Local Perturbations for Federated Sharpness-aware Minimization |
Ziqing Fan, Shengchao Hu,..., Yanfeng Wang |
8 |
2024-02-04 |
link |
LQER: Low-Rank Quantization Error Reconstruction for LLMs |
Cheng Zhang, Jianyi Cheng,..., Yiren Zhao |
8 |
2024-03-20 |
link |
Probabilistic Forecasting with Stochastic Interpolants and Föllmer Processes |
Yifan Chen, Mark Goldstein,..., Eric Vanden-Eijnden |
8 |
2024-02-15 |
link |
Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling |
Raunaq Bhirangi, Chenyu Wang,..., Lerrel Pinto |
8 |
2024-02-01 |
link |
Transforming and Combining Rewards for Aligning Large Language Models |
Zihao Wang, Chirag Nagpal,..., Victor Veitch |
8 |
2024-06-04 |
link |
What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding |
Hongkang Li, Meng Wang,..., Pin-Yu Chen |
8 |
2023-11-27 |
link |
Swallowing the Bitter Pill: Simplified Scalable Conformer Generation |
Yuyang Wang, Ahmed A. A. Elhag,..., Miguel Ángel Bautista |
8 |
2023-10-14 |
link |
Mastering Robot Manipulation with Multimodal Prompts through Pretraining and Multi-task Fine-tuning |
Jiachen Li, Qiaozi Gao,..., William Yang Wang |
8 |
2023-01-27 |
link |
Single-Trajectory Distributionally Robust Reinforcement Learning |
Zhipeng Liang, Xiaoteng Ma,..., Zhengyuan Zhou |
8 |
2023-08-14 |
link |
Position: Key Claims in LLM Research Have a Long Tail of Footnotes |
Anna Rogers, Sasha Luccioni |
8 |
2024-02-02 |
link |
Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise |
Kwangjun Ahn, Zhiyu Zhang,..., Yan Dai |
8 |
2024-04-23 |
link |
NExT: Teaching Large Language Models to Reason about Code Execution |
Ansong Ni, Miltiadis Allamanis,..., Pengcheng Yin |
8 |
2022-12-15 |
link |
Integrating Multimodal Data for Joint Generative Modeling of Complex Dynamics |
Manuel Brenner, Florian Hess,..., Daniel Durstewitz |
8 |
2024-04-17 |
link |
Decomposing and Editing Predictions by Modeling Model Computation |
Harshay Shah, Andrew Ilyas, Aleksander Madry |
8 |
2024-02-19 |
link |
LoRA Training in the NTK Regime has No Spurious Local Minima |
Uijeong Jang, Jason D. Lee, Ernest K. Ryu |
8 |
2024-02-22 |
link |
Clifford-Steerable Convolutional Neural Networks |
Maksim Zhdanov, David Ruhe,..., Patrick Forré |
8 |
2024-02-14 |
link |
SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks |
Jiwon Song, Kyungseok Oh,..., Jae-Joon Kim |
8 |
2024-02-10 |
link |
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF |
Han Shen, Zhuoran Yang, Tianyi Chen |
8 |
2024-02-07 |
link |
MEMORYLLM: Towards Self-Updatable Large Language Models |
Yu Wang, Yifan Gao,..., Julian McAuley |
8 |
2024-02-06 |
link |
In-context learning agents are asymmetric belief updaters |
Johannes A. Schubert, Akshay Kumar Jagadish,..., Eric Schulz |
8 |
2023-12-08 |
link |
Membership Inference Attacks on Diffusion Models via Quantile Regression |
Shuai Tang, Steven Wu,..., Aaron Roth |
8 |
2024-05-28 |
link |
FlashST: A Simple and Universal Prompt-Tuning Framework for Traffic Prediction |
Zhonghang Li, Lianghao Xia,..., Chao Huang |
8 |
2024-02-13 |
link |
eCeLLM: Generalizing Large Language Models for E-commerce from Large-scale, High-quality Instruction Data |
Bo Peng, Xinyi Ling,..., Xia Ning |
8 |
2023-11-15 |
link |
Converting Transformers to Polynomial Form for Secure Inference Over Homomorphic Encryption |
Itamar Zimerman, Moran Baruch,..., Lior Wolf |
8 |
2024-02-04 |
link |
Revisiting the Power of Prompt for Visual Tuning |
Yuzhu Wang, Lechao Cheng,..., Meng Wang |
8 |
2024-02-28 |
link |
Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models |
Mingjia Huo, Sai Ashish Somayajula,..., Pengtao Xie |
8 |
2024-02-06 |
link |
MusicRL: Aligning Music Generation to Human Preferences |
Geoffrey Cideron, Sertan Girgin,..., Andrea Agostinelli |
8 |
2023-11-28 |
link |
Beyond Sole Strength: Customized Ensembles for Generalized Vision-Language Models |
Zhihe Lu, Jiawang Bai,..., Xinchao Wang |
8 |
2024-03-01 |
link |
Shifted Interpolation for Differential Privacy |
Jinho Bok, Weijie J Su, Jason Altschuler |
8 |
2023-05-12 |
link |
MoMo: Momentum Models for Adaptive Learning Rates |
Fabian Schaipp, Ruben Ohana,..., Robert M. Gower |
8 |
2024-03-18 |
link |
Larimar: Large Language Models with Episodic Memory Control |
Payel Das, Subhajit Chaudhury,..., Pin-Yu Chen |
8 |
2024-05-16 |
link |
LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery |
Pingchuan Ma, Tsun-Hsuan Wang,..., Wojciech Matusik |
7 |
2024-02-02 |
link |
Two Heads Are Better Than One: Boosting Graph Sparse Training via Semantic and Topological Awareness |
Guibin Zhang, Yanwei Yue,..., Tianlong Chen |
7 |
2023-07-11 |
link |
Memorization Through the Lens of Curvature of Loss Function Around Samples |
Isha Garg, Deepak Ravikumar, Kaushik Roy |
7 |
2024-04-25 |
link |
In-Context Freeze-Thaw Bayesian Optimization for Hyperparameter Optimization |
Herilalaina Rakotoarison, Steven Adriaensen,..., Frank Hutter |
7 |
2024-02-16 |
link |
Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs |
Yeonhong Park, Jake Hyun,..., Jae W. Lee |
7 |
2023-11-17 |
link |
Stable Differentiable Causal Discovery |
Achille Nazaret, Justin Hong,..., David Blei |
7 |
2024-02-22 |
link |
Q-Probe: A Lightweight Approach to Reward Maximization for Language Models |
Kenneth Li, Samy Jelassi,..., David Brandfonbrener |
7 |
2023-11-02 |
link |
Gaussian Processes on Cellular Complexes |
Mathieu Alain, So Takao,..., Marc Peter Deisenroth |
7 |
2024-05-13 |
link |
PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition |
Ziyang Zhang, Qizhen Zhang, Jakob Nicolaus Foerster |
7 |
2024-02-22 |
link |
CoLoRA: Continuous low-rank adaptation for reduced implicit neural modeling of parameterized partial differential equations |
Jules Berman, Benjamin Peherstorfer |
7 |
2023-12-18 |
link |
Harnessing the Power of Neural Operators with Automatically Encoded Conservation Laws |
Ning Liu, Yiming Fan,..., Yue Yu |
7 |
2024-02-02 |
link |
BAT: Learning to Reason about Spatial Sounds with Large Language Models |
Zhisheng Zheng, Puyuan Peng,..., David Harwath |
7 |
2024-02-12 |
link |
Weisfeiler-Leman at the margin: When more expressivity matters |
Billy Joe Franks, Christopher Morris,..., Floris Geerts |
7 |
2023-10-26 |
link |
HyperFields: Towards Zero-Shot Generation of NeRFs from Text |
Sudarshan Babu, Richard Liu,..., Rana Hanocka |
7 |
2024-03-12 |
link |
BAGEL: Bootstrapping Agents by Guiding Exploration with Language |
Shikhar Murty, Christopher D Manning,..., Kenton Lee |
7 |
2024-03-03 |
link |
Critical windows: non-asymptotic theory for feature emergence in diffusion models |
Marvin Li, Sitan Chen |
7 |
2023-11-01 |
link |
Robust and Conjugate Gaussian Process Regression |
Matias Altamirano, Francois-Xavier Briol, Jeremias Knoblauch |
7 |
2024-02-11 |
link |
How do Large Language Models Navigate Conflicts between Honesty and Helpfulness? |
Ryan Liu, Theodore Sumers,..., Thomas L. Griffiths |
7 |
2024-01-29 |
link |
Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation |
Zhenyu He, Guhao Feng,..., Di He |
7 |
2023-10-13 |
link |
Split-and-Denoise: Protect large language model inference with local differential privacy |
Peihua Mai, Ran Yan,..., Yan Pang |
7 |
2024-04-22 |
link |
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models |
Amirmojtaba Sabour, Sanja Fidler, Karsten Kreis |
7 |
2024-06-02 |
link |
Is In-Context Learning in Large Language Models Bayesian? A Martingale Perspective |
Fabian Falck, Ziyu Wang, Christopher C. Holmes |
7 |
2024-02-05 |
link |
Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective |
Wu Lin, Felix Dangel,..., Alireza Makhzani |
7 |
2024-03-28 |
link |
H-Consistency Guarantees for Regression |
Anqi Mao, Mehryar Mohri, Yutao Zhong |
7 |
2023-06-19 |
link |
Hyperbolic Active Learning for Semantic Segmentation under Domain Shift |
Luca Franco, Paolo Mandica,..., Fabio Galasso |
7 |
2024-02-13 |
link |
Unsupervised Evaluation of Code LLMs with Round-Trip Correctness |
Miltiadis Allamanis, Sheena Panthaplackel, Pengcheng Yin |
7 |
2024-02-06 |
link |
Neural Networks Learn Statistics of Increasing Complexity |
Nora Belrose, Quintin Pope,..., Xiaoli Fern |
7 |
2024-02-28 |
link |
Out-of-Domain Generalization in Dynamical Systems Reconstruction |
Niclas Alexander Göring, Florian Hess,..., Daniel Durstewitz |
7 |
2024-06-06 |
link |
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation |
Can Yaras, Peng Wang,..., Qing Qu |
7 |
2023-03-25 |
link |
DiracDiffusion: Denoising and Incremental Reconstruction with Assured Data-Consistency |
Zalan Fabian, Berk Tinaz, Mahdi Soltanolkotabi |
7 |
2024-02-06 |
link |
Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains |
Junhong Shen, Neil Tenenholtz,..., Nicolo Fusi |
7 |
2024-02-27 |
link |
Automated Statistical Model Discovery with Language Models |
Michael Y. Li, Emily Fox, Noah Goodman |
7 |
2024-02-12 |
link |
Active Preference Learning for Large Language Models |
William Muldrew, Peter Hayes,..., David Barber |
7 |
2024-06-11 |
link |
Failures Are Fated, But Can Be Faded: Characterizing and Mitigating Unwanted Behaviors in Large-Scale Vision and Language Models |
Som Sagar, Aditya Taparia, Ransalu Senanayake |
7 |
2023-11-23 |
link |
Scalable AI Safety via Doubly-Efficient Debate |
Jonah Brown-Cohen, Geoffrey Irving, Georgios Piliouras |
7 |
2024-05-14 |
link |
Compositional Text-to-Image Generation with Dense Blob Representations |
Weili Nie, Sifei Liu,..., Arash Vahdat |
7 |
2023-10-26 |
link |
CompeteAI: Understanding the Competition Dynamics of Large Language Model-based Agents |
Qinlin Zhao, Jindong Wang,..., Xing Xie |
7 |
2024-03-19 |
link |
Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization |
Haocheng Xi, Yuxiang Chen,..., Jun Zhu |
7 |
2024-06-05 |
link |
Graph Neural Network Explanations are Fragile |
Jiate Li, Meng Pang,..., Binghui Wang |
7 |
2024-06-07 |
link |
FlowMM: Generating Materials with Riemannian Flow Matching |
Benjamin Kurt Miller, Ricky T. Q. Chen,..., Brandon M Wood |
7 |
2024-02-22 |
link |
Prompting a Pretrained Transformer Can Be a Universal Approximator |
Aleksandar Petrov, Philip Torr, Adel Bibi |
7 |
2023-10-14 |
link |
DPZero: Private Fine-Tuning of Language Models without Backpropagation |
Liang Zhang, Bingcong Li,..., Niao He |
7 |
2023-08-28 |
link |
Rate-Optimal Policy Optimization for Linear Markov Decision Processes |
Uri Sherman, Alon Cohen,..., Yishay Mansour |
7 |
2024-02-16 |
link |
RLVF: Learning from Verbal Feedback without Overgeneralization |
Moritz Pascal Stephan, Alexander Khazatsky,..., Chelsea Finn |
7 |
2023-10-03 |
link |
Discovering Symmetry Breaking in Physical Systems with Relaxed Group Convolution |
Rui Wang, Elyssa Hofgard,..., Tess Smidt |
7 |
2024-06-07 |
link |
Projecting Molecules into Synthesizable Chemical Spaces |
Shitong Luo, Wenhao Gao,..., Jianzhu Ma |
7 |
2023-10-11 |
link |
LLark: A Multimodal Instruction-Following Language Model for Music |
Joshua P Gardner, Simon Durand,..., Rachel M Bittner |
6 |
2024-02-16 |
link |
Graph-based Forecasting with Missing Data through Spatiotemporal Downsampling |
Ivan Marisca, Cesare Alippi, Filippo Maria Bianchi |
6 |
2024-02-03 |
link |
Vanilla Bayesian Optimization Performs Great in High Dimensions |
Carl Hvarfner, Erik Orm Hellsten, Luigi Nardi |
6 |
2024-02-23 |
link |
On the Duality Between Sharpness-Aware Minimization and Adversarial Training |
Yihao Zhang, Hangzhou He,..., Zeming Wei |
6 |
2024-05-08 |
link |
The Entropy Enigma: Success and Failure of Entropy Minimization |
Ori Press, Ravid Shwartz-Ziv,..., Matthias Bethge |
6 |
2023-12-22 |
link |
How Smooth Is Attention? |
Valérie Castin, Pierre Ablin, Gabriel Peyré |
6 |
2024-03-13 |
link |
A Sparsity Principle for Partially Observable Causal Representation Learning |
Danru Xu, Dingling Yao,..., Sara Magliacane |
6 |
2024-05-02 |
link |
On Mechanistic Knowledge Localization in Text-to-Image Generative Models |
Samyadeep Basu, Keivan Rezaei,..., Soheil Feizi |
6 |
2024-02-05 |
link |
DRED: Zero-Shot Transfer in Reinforcement Learning via Data-Regularised Environment Design |
Samuel Garcin, James Doran,..., Stefano V Albrecht |
6 |
2023-12-19 |
link |
Emergence of In-Context Reinforcement Learning from Noise Distillation |
Ilya Zisman, Vladislav Kurenkov,..., Sergey Kolesnikov |
6 |
2024-02-29 |
link |
Smooth Tchebycheff Scalarization for Multi-Objective Optimization |
Xi Lin, Xiaoyuan Zhang,..., Qingfu Zhang |
6 |
2023-11-09 |
link |
Model-Based Minimum Bayes Risk Decoding for Text Generation |
Yuu Jinnai, Tetsuro Morimura,..., Kenshi Abe |
6 |
2024-02-02 |
link |
Mapping the Multiverse of Latent Representations |
Jeremy Wayland, Corinna Coupette, Bastian Rieck |
6 |
2023-12-11 |
link |
Grokking Group Multiplication with Cosets |
Dashiell Stander, Qinan Yu,..., Stella Biderman |
6 |
2024-02-08 |
link |
Offline Actor-Critic Reinforcement Learning Scales to Large Models |
Jost Tobias Springenberg, Abbas Abdolmaleki,..., Martin Riedmiller |
6 |
2024-02-05 |
link |
Rethinking Optimization and Architecture for Tiny Language Models |
Yehui Tang, Kai Han,..., Yunhe Wang |
6 |
2023-12-15 |
link |
Fast Decision Boundary based Out-of-Distribution Detector |
Litian Liu, Yao Qin |
6 |
2024-02-15 |
link |
Recovering the Pre-Fine-Tuning Weights of Generative Models |
Eliahu Horwitz, Jonathan Kahana, Yedid Hoshen |
6 |
2024-02-16 |
link |
Stochastic Localization via Iterative Posterior Sampling |
Louis Grenioux, Maxence Noble,..., Alain Oliviero Durmus |
6 |
2024-03-27 |
link |
A Geometric Explanation of the Likelihood OOD Detection Paradox |
Hamidreza Kamkari, Brendan Leigh Ross,..., Gabriel Loaiza-Ganem |
6 |
2023-05-30 |
link |
Graph-based Time Series Clustering for End-to-End Hierarchical Forecasting |
Andrea Cini, Danilo Mandic, Cesare Alippi |
6 |
2024-06-11 |
link |
Transformers Provably Learn Sparse Token Selection While Fully-Connected Nets Cannot |
Zixuan Wang, Stanley Wei,..., Jason D. Lee |
6 |
2024-05-28 |
link |
HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning |
Shengchao Hu, Ziqing Fan,..., Dacheng Tao |
6 |
2023-06-28 |
link |
Towards a Better Theoretical Understanding of Independent Subnetwork Training |
Egor Shulgin, Peter Richtárik |
6 |
2024-04-15 |
link |
Large Language Models Can Automatically Engineer Features for Few-Shot Tabular Learning |
Sungwon Han, Jinsung Yoon,..., Tomas Pfister |
6 |
2024-02-22 |
link |
Batch and match: black-box variational inference with a score-based divergence |
Diana Cai, Chirag Modi,..., Lawrence K. Saul |
6 |
2024-06-01 |
link |
Slow and Steady Wins the Race: Maintaining Plasticity with Hare and Tortoise Networks |
Hojoon Lee, Hyeonseo Cho,..., Clare Lyle |
6 |
2023-12-06 |
link |
Improving Gradient-guided Nested Sampling for Posterior Inference |
Pablo Lemos, Nikolay Malkin,..., Laurence Perreault-Levasseur |
6 |
2024-02-12 |
link |
Tuning-Free Stochastic Optimization |
Ahmed Khaled, Chi Jin |
6 |
2024-06-11 |
link |
Beyond ELBOs: A Large-Scale Evaluation of Variational Methods for Sampling |
Denis Blessing, Xiaogang Jia,..., Gerhard Neumann |
6 |
2024-06-02 |
link |
Full-Atom Peptide Design based on Multi-modal Flow Matching |
Jiahan Li, Chaoran Cheng,..., Jianzhu Ma |
6 |
2024-03-07 |
link |
Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks |
Linyuan Gong, Sida Wang,..., Alvin Cheung |
6 |
2024-02-27 |
link |
Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings |
Kevin Frans, Seohong Park,..., Sergey Levine |
6 |
2023-07-13 |
link |
Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks |
Liam Collins, Hamed Hassani,..., Sanjay Shakkottai |
6 |
2024-05-16 |
link |
IBD-PSC: Input-level Backdoor Detection via Parameter-oriented Scaling Consistency |
Linshan Hou, Ruili Feng,..., Yiming Li |
6 |
2023-12-16 |
link |
Image Restoration Through Generalized Ornstein-Uhlenbeck Bridge |
Conghan Yue, Zhengwei Peng,..., Dongyu Zhang |
6 |
2023-11-06 |
link |
Sample Complexity Bounds for Estimating Probability Divergences under Invariances |
Behrooz Tahmasebi, Stefanie Jegelka |
6 |
2024-02-19 |
link |
Towards Theoretical Understandings of Self-Consuming Generative Models |
Shi Fu, Sen Zhang,..., Dacheng Tao |
6 |
2024-01-20 |
link |
Make-A-Shape: a Ten-Million-scale 3D Shape Model |
Ka-Hei Hui, Aditya Sanghi,..., Chi-Wing Fu |
6 |
2024-02-09 |
link |
CLIPZyme: Reaction-Conditioned Virtual Screening of Enzymes |
Peter Mikhael, Itamar Chinn, Regina Barzilay |
6 |
2024-06-20 |
link |
Revealing Vision-Language Integration in the Brain with Multimodal Networks |
Vighnesh Subramaniam, Colin Conwell,..., Andrei Barbu |
6 |
2024-03-25 |
link |
Enabling Uncertainty Estimation in Iterative Neural Networks |
Nikita Durasov, Doruk Oner,..., Pascal Fua |
6 |
2024-03-05 |
link |
PPFlow: Target-Aware Peptide Design with Torsional Flow Matching |
Haitao Lin, Odin Zhang,..., Stan Z. Li |
6 |
2024-02-07 |
link |
Causal Representation Learning from Multiple Distributions: A General Setting |
Kun Zhang, Shaoan Xie,..., Yujia Zheng |
6 |
2024-02-06 |
link |
CasCast: Skillful High-resolution Precipitation Nowcasting via Cascaded Modelling |
Junchao Gong, Lei Bai,..., Wanli Ouyang |
6 |
2023-12-18 |
link |
The Good, The Bad, and Why: Unveiling Emotions in Generative AI |
Cheng Li, Jindong Wang,..., Xing Xie |
6 |
2024-04-30 |
link |
Modeling Caption Diversity in Contrastive Vision-Language Pretraining |
Samuel Lavoie, Polina Kirichenko,..., Nicolas Ballas |
6 |
2024-02-01 |
link |
Towards Efficient Exact Optimization of Language Model Alignment |
Haozhe Ji, Cheng Lu,..., Minlie Huang |
6 |
2024-02-17 |
link |
How to Make the Gradients Small Privately: Improved Rates for Differentially Private Non-Convex Optimization |
Andrew Lowy, Jonathan Ullman, Stephen Wright |
6 |
2024-04-24 |
link |
Unifying Bayesian Flow Networks and Diffusion Models through Stochastic Differential Equations |
Kaiwen Xue, Yuhao Zhou,..., Chongxuan Li |
6 |
2024-05-10 |
link |
MH-pFLID: Model Heterogeneous personalized Federated Learning via Injection and Distillation for Medical Data Analysis |
Luyuan Xie, Manqing Lin,..., Zhonghai Wu |
6 |
2023-10-24 |
link |
Neural Collapse in Multi-label Learning with Pick-all-label Loss |
Pengyu Li, Xiao Li,..., Qing Qu |
6 |
2024-01-30 |
link |
Arrows of Time for Large Language Models |
Vassilis Papadopoulos, Jérémie Wenger, Clément Hongler |
6 |
2024-02-05 |
link |
Position: What Can Large Language Models Tell Us about Time Series Analysis |
Ming Jin, YiFan Zhang,..., Qingsong Wen |
6 |
2024-04-10 |
link |
Superposition Prompting: Improving and Accelerating Retrieval-Augmented Generation |
Thomas Merth, Qichen Fu,..., Mahyar Najibi |
6 |
2024-06-10 |
link |
Compute Better Spent: Replacing Dense Layers with Structured Matrices |
Shikai Qiu, Andres Potapczynski,..., Andrew Gordon Wilson |
6 |
2024-02-05 |
link |
See More Details: Efficient Image Super-Resolution by Experts Mining |
Eduard Zamfir, Zongwei Wu,..., Radu Timofte |
6 |
2023-09-29 |
link |
Latent Space Symmetry Discovery |
Jianke Yang, Nima Dehmamy,..., Rose Yu |
6 |
2024-05-09 |
link |
Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning |
Shibo Jie, Yehui Tang,..., Yunhe Wang |
6 |
2024-04-01 |
link |
Optimal Ridge Regularization for Out-of-Distribution Prediction |
Pratik Patil, Jin-Hong Du, Ryan Tibshirani |
6 |
2023-11-29 |
link |
Should we be going MAD? A Look at Multi-Agent Debate Strategies for LLMs |
Andries Petrus Smit, Nathan Grinsztajn,..., Arnu Pretorius |
6 |
2024-05-27 |
link |
Q-value Regularized Transformer for Offline Reinforcement Learning |
Shengchao Hu, Ziqing Fan,..., Dacheng Tao |
6 |
2024-02-09 |
link |
Understanding the Effects of Iterative Prompting on Truthfulness |
Satyapriya Krishna, Chirag Agarwal, Himabindu Lakkaraju |
6 |
2024-05-18 |
link |
AquaLoRA: Toward White-box Protection for Customized Stable Diffusion Models via Watermark LoRA |
Weitao Feng, Wenbo Zhou,..., Nenghai Yu |
6 |
2023-12-13 |
link |
The Relative Value of Prediction in Algorithmic Decision Making |
Juan Carlos Perdomo |
6 |
2022-08-31 |
link |
Be Your Own Neighborhood: Detecting Adversarial Example by the Neighborhood Relations Built on Self-Supervised Learning |
Zhiyuan He, Yijun Yang,..., Tsung-Yi Ho |
6 |
2024-02-06 |
link |
DySLIM: Dynamics Stable Learning by Invariant Measure for Chaotic Systems |
Yair Schiff, Zhong Yi Wan,..., Leonardo Zepeda-Núñez |
6 |
2023-07-14 |
link |
Graph Positional and Structural Encoder |
Semih Cantürk, Renming Liu,..., Ladislav Rampášek |
6 |
2024-02-15 |
link |
Accelerating Parallel Sampling of Diffusion Models |
Zhiwei Tang, Jiasheng Tang,..., Tsung-Hui Chang |
6 |
2024-05-28 |
link |
SleepFM: Multi-modal Representation Learning for Sleep Across Brain Activity, ECG and Respiratory Signals |
Rahul Thapa, Bryan He,..., James Zou |
6 |
2024-05-09 |
link |
Outlier-robust Kalman Filtering through Generalised Bayes |
Gerardo Duran-Martin, Matias Altamirano,..., Kevin Patrick Murphy |
6 |
2024-05-18 |
link |
On the Trajectory Regularity of ODE-based Diffusion Sampling |
Defang Chen, Zhenyu Zhou,..., Siwei Lyu |
6 |
2024-02-07 |
link |
Data-efficient Large Vision Models through Sequential Autoregression |
Zhiwei Hao, Jianyuan Guo,..., Chang Xu |
6 |
2024-03-05 |
link |
Active Statistical Inference |
Tijana Zrnic, Emmanuel Candes |
6 |
2024-03-19 |
link |
Listenable Maps for Audio Classifiers |
Francesco Paissan, Mirco Ravanelli, Cem Subakan |
6 |
2023-05-26 |
link |
Selective Mixup Helps with Distribution Shifts, But Not (Only) because of Mixup |
Damien Teney, Jindong Wang, Ehsan Abbasnejad |
6 |
2024-01-26 |
link |
Residual Quantization with Implicit Neural Codebooks |
Iris A.M. Huijben, Matthijs Douze,..., Jakob Verbeek |
6 |
2024-02-17 |
link |
Offline Training of Language Model Agents with Functions as Learnable Weights |
Shaokun Zhang, Jieyu Zhang,..., Qingyun Wu |