Last updated: 2025-04-16 04:06:19. Maintained by Weisen Jiang.

citation publish date title (pdf) review authors
2092 2023-12-01 Mamba: Linear-Time Sequence Modeling with Selective State Spaces link Albert Gu, Tri Dao
550 None AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversations link Qingyun Wu, Gagan Bansal,..., Chi Wang
356 2023-11-20 GPQA: A Graduate-Level Google-Proof Q&A Benchmark link David Rein, Betty Li Hou,..., Samuel R. Bowman
335 2023-10-25 Zephyr: Direct Distillation of LM Alignment link Lewis Tunstall, Edward Emanuel Beeching,..., Thomas Wolf
256 2024-04-09 MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies link Shengding Hu, Yuge Tu,..., Maosong Sun
202 2024-04-09 RULER: What’s the Real Context Size of Your Long-Context Language Models? link Cheng-Ping Hsieh, Simeng Sun,..., Boris Ginsburg
189 2023-06-28 Towards Measuring the Representation of Subjective Global Opinions in Language Models link Esin DURMUS, Karina Nguyen,..., Deep Ganguli
172 2023-07-25 LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition link Chengsong Huang, Qian Liu,..., Min Lin
157 2023-10-10 The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets link Samuel Marks, Max Tegmark
157 2024-03-15 RAFT: Adapting Language Model to Domain Specific RAG link Tianjun Zhang, Shishir G Patil,..., Joseph E. Gonzalez
153 2023-09-06 Certifying LLM Safety against Adversarial Prompting link Aounon Kumar, Chirag Agarwal,..., Himabindu Lakkaraju
149 2024-04-09 LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders link Parishad BehnamGhader, Vaibhav Adlakha,..., Siva Reddy
142 2023-04-10 Is ChatGPT a Good Sentiment Analyzer? link Zengzhi Wang, Qiming Xie,..., Rui Xia
131 2023-10-05 A Long Way to Go: Investigating Length Correlations in RLHF link Prasann Singhal, Tanya Goyal,..., Greg Durrett
123 2024-01-11 TOFU: A Task of Fictitious Unlearning for LLMs link Pratyush Maini, Zhili Feng,..., J Zico Kolter
122 2024-04-18 From $r$ to $Q^*$: Your Language Model is Secretly a Q-Function link Rafael Rafailov, Joey Hejna,..., Chelsea Finn
116 2024-02-27 Tower: An Open Multilingual Large Language Model for Translation-Related Tasks link Duarte Miguel Alves, José Pombal,..., Andre Martins
110 2024-04-08 Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning link Ruiqi Zhang, Licong Lin,..., Song Mei
99 2023-10-03 A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration link Zijun Liu, Yanzhe Zhang,..., Diyi Yang
94 2024-03-14 Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking link Eric Zelikman, Georges Raif Harik,..., Noah Goodman
93 2024-02-09 V-STaR: Training Verifiers for Self-Taught Reasoners link Arian Hosseini, Xingdi Yuan,..., Rishabh Agarwal
86 2023-11-17 A Language Agent for Autonomous Driving link Jiageng Mao, Junjie Ye,..., Yue Wang
78 2023-10-16 OpenAgents: An Open Platform for Language Agents in the Wild link Tianbao Xie, Fan Zhou,..., Tao Yu
77 2024-04-11 Best Practices and Lessons Learned on Synthetic Data link Ruibo Liu, Jerry Wei,..., Andrew M. Dai
74 2023-04-03 Inspecting and Editing Knowledge Representations in Language Models link Evan Hernandez, Belinda Z. Li, Jacob Andreas
74 2023-12-14 Helping or Herding? Reward Model Ensembles Mitigate but do not Eliminate Reward Hacking link Jacob Eisenstein, Chirag Nagpal,..., Jonathan Berant
74 2023-09-26 VideoDirectorGPT: Consistent Multi-Scene Video Generation via LLM-Guided Planning link Han Lin, Abhay Zala,..., Mohit Bansal
72 2024-01-12 Fine-grained Hallucination Detection and Editing for Language Models link Abhika Mishra, Akari Asai,..., Hannaneh Hajishirzi
71 2024-02-12 Do Membership Inference Attacks Work on Large Language Models? link Michael Duan, Anshuman Suri,..., Hannaneh Hajishirzi
70 2024-04-11 AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs link Zeyi Liao, Huan Sun
67 2023-12-11 LLM360: Towards Fully Transparent Open-Source LLMs link Zhengzhong Liu, Aurick Qiao,..., Eric P. Xing
66 2024-04-08 Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence link Bo Peng, Daniel Goldstein,..., Rui-Jie Zhu
64 2024-01-27 MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries link Yixuan Tang, Yi Yang
61 2024-04-03 JailBreakV: A Benchmark for Assessing the Robustness of MultiModal Large Language Models against Jailbreak Attacks link Weidi Luo, Siyuan Ma,..., Chaowei Xiao
58 2024-03-25 Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators link Yinhong Liu, Han Zhou,..., Nigel Collier
58 2023-11-16 HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs link Junying Chen, Xidong Wang,..., Benyou Wang
57 2024-02-27 Massive Activations in Large Language Models link Mingjie Sun, Xinlei Chen,..., Zhuang Liu
53 2023-02-11 A Reparameterized Discrete Diffusion Model for Text Generation link Lin Zheng, Jianbo Yuan,..., Lingpeng Kong
52 2024-03-31 RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation link Chi-Min Chan, Chunpu Xu,..., Jie Fu
52 2024-04-24 Let’s Think Dot by Dot: Hidden computation in transformer language models link Jacob Pfau, William Merrill, Samuel R. Bowman
52 2024-03-12 Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM link Sainbayar Sukhbaatar, Olga Golovneva,..., Xian Li
50 2023-09-27 Large Language Model Routing with Benchmark Datasets link Tal Shnitzer, Anthony Ou,..., Mikhail Yurochkin
48 2024-04-27 Continual Pre-Training for Cross-Lingual LLM Adaptation: Enhancing Japanese Language Capabilities link Kazuki Fujii, Taishi Nakamura,..., Naoaki Okazaki
47 2024-04-01 Mapping the Increasing Use of LLMs in Scientific Papers link Weixin Liang, Yaohui Zhang,..., James Y. Zou
47 2024-04-01 LLM as a Mastermind: A Survey of Strategic Reasoning with Large Language Models link Yadong Zhang, Shaoguang Mao,..., Furu Wei
46 2024-04-01 Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data link Matthias Gerstgrasser, Rylan Schaeffer,..., Sanmi Koyejo
45 2024-02-21 Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping link Lucas Lehnert, Sainbayar Sukhbaatar,..., Yuandong Tian
45 2024-01-30 Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens link Jiacheng Liu, Sewon Min,..., Hannaneh Hajishirzi
44 2024-04-09 Autonomous Evaluation and Refinement of Digital Agents link Jiayi Pan, Yichi Zhang,..., Alane Suhr
44 2024-04-18 TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding link Hanshi Sun, Zhuoming Chen,..., Beidi Chen
43 2024-01-16 Tuning Language Models by Proxy link Alisa Liu, Xiaochuang Han,..., Noah A. Smith
42 2023-10-03 Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation link Eric Zelikman, Eliana Lorch,..., Adam Tauman Kalai
41 2023-09-30 Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration link Qiushi Sun, Zhangyue Yin,..., Lingpeng Kong
40 2024-04-11 Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models link Haotian Zhang, Haoxuan You,..., Yinfei Yang
38 2023-10-23 AutoDAN: Interpretable Gradient-Based Adversarial Attacks on Large Language Models link Sicheng Zhu, Ruiyi Zhang,..., Tong Sun
38 2023-09-29 Suspicion Agent: Playing Imperfect Information Games with Theory of Mind Aware GPT-4 link Jiaxian Guo, Bo Yang,..., Yutaka Matsuo
37 2024-04-11 HGRN2: Gated Linear RNNs with State Expansion link Zhen Qin, Songlin Yang,..., Yiran Zhong
36 2024-04-09 VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding? link Junpeng Liu, Yifan Song,..., Xiang Yue
35 2024-04-01 FABLES: Evaluating faithfulness and content selection in book-length summarization link Yekyung Kim, Yapei Chang,..., Mohit Iyyer
35 2024-04-01 Stream of Search (SoS): Learning to Search in Language link Kanishk Gandhi, Denise H J Lee,..., Noah Goodman
34 2023-10-16 CLIN: A Continually Learning Language Agent for Rapid Task Adaptation and Generalization link Bodhisattwa Prasad Majumder, Bhavana Dalvi Mishra,..., Peter Clark
34 2023-07-13 Effective Prompt Extraction from Language Models link Yiming Zhang, Nicholas Carlini, Daphne Ippolito
34 2024-01-24 MambaByte: Token-free Selective State Space Model link Junxiong Wang, Tushaar Gangavarapu,..., Alexander M Rush
32 2024-04-01 What is in Your Safe Data? Identifying Benign Data that Breaks Safety link Luxi He, Mengzhou Xia, Peter Henderson
31 2023-08-15 RAVEN: In-Context Learning with Retrieval-Augmented Encoder-Decoder Language Models link Jie Huang, Wei Ping,..., Bryan Catanzaro
31 2024-07-25 Keep the Cost Down: A Review on Methods to Optimize LLM’s KV-Cache Consumption link Shi Luohe, Hongyi Zhang,..., hai zhao
31 2024-02-14 LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset link Botao Yu, Frazier N. Baker,..., Huan Sun
31 2024-03-26 Have Faith in Faithfulness: Going Beyond Circuit Overlap When Finding Model Mechanisms link Michael Hanna, Sandro Pezzelle, Yonatan Belinkov
30 2024-03-27 Rejection Improves Reliability: Training LLMs to Refuse Unknown Questions Using RL from Knowledge Feedback link Hongshen Xu, Zichen Zhu,..., Kai Yu
30 2024-04-29 MileBench: Benchmarking MLLMs in Long Context link Song Dingjie, Shunian Chen,..., Benyou Wang
30 2024-04-02 Beyond Accuracy: Evaluating the Reasoning Behavior of Large Language Models - A Survey link Philipp Mondorf, Barbara Plank
29 2024-03-24 The N+ Implementation Details of RLHF with PPO: A Case Study on TL;DR Summarization link Shengyi Huang, Michael Noukhovitch,..., Lewis Tunstall
29 2024-02-13 On Limitations of the Transformer Architecture link Binghui Peng, Srini Narayanan, Christos Papadimitriou
29 2024-03-20 Reverse Training to Nurse the Reversal Curse link Olga Golovneva, Zeyuan Allen-Zhu,..., Sainbayar Sukhbaatar
28 2024-05-10 LLM Discussion: Enhancing the Creativity of Large Language Models via Discussion Framework and Role-Play link Li-Chun Lu, Shou-Jen Chen,..., Shao-Hua Sun
28 2024-04-04 Locating and Editing Factual Associations in Mamba link Arnab Sen Sharma, David Atkinson, David Bau
28 2023-10-18 Understanding Retrieval Augmentation for Long-Form Question Answering link Hung-Ting Chen, Fangyuan Xu,..., Eunsol Choi
28 2024-04-07 How bad is training on synthetic data? A statistical analysis of language model collapse link Mohamed El Amine Seddik, Suei-Wen Chen,..., Merouane Abdelkader DEBBAH
27 2023-07-12 Instruction Mining: Instruction Data Selection for Tuning Large Language Models link Yihan Cao, Yanbin Kang,..., Lichao Sun
26 2023-05-10 Bot or Human? Detecting ChatGPT Imposters with A Single Question link Hong Wang, Xuan Luo,..., Xifeng Yan
25 2024-07-16 Self-Guide: Better Task-Specific Instruction Following via Self-Synthetic Finetuning link Chenyang Zhao, Xueying Jia,..., Tongshuang Wu
24 2024-04-11 From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples link Robert Vacareanu, Vlad Andrei Negru,..., Mihai Surdeanu
24 2024-07-22 Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability link Zhuoyan Xu, Zhenmei Shi, Yingyu Liang
24 2023-12-02 Eliciting Latent Knowledge from "Quirky" Language Models link Alex Troy Mallen, Madeline Brumley,..., Nora Belrose
23 2024-04-15 Compression Represents Intelligence Linearly link Yuzhen Huang, Jinghan Zhang,..., Junxian He
23 2023-10-05 SteP: Stacked LLM Policies for Web Actions link Paloma Sodhi, S.R.K Branavan,..., Ryan McDonald
23 2024-03-28 STaR-GATE: Teaching Language Models to Ask Clarifying Questions link Chinmaya Andukuri, Jan-Philipp Fränken,..., Noah Goodman
22 2024-03-18 What Are Tools Anyway? A Survey from the Language Model Perspective link Zhiruo Wang, Zhoujun Cheng,..., Graham Neubig
22 2023-09-26 Don't throw away your value model! Generating more preferable text with Value-Guided Monte-Carlo Tree Search decoding link Jiacheng Liu, Andrew Cohen,..., Asli Celikyilmaz
21 2024-04-17 Pack of LLMs: Model Fusion at Test-Time via Perplexity Optimization link Costas Mavromatis, Petros Karypis, George Karypis
21 2023-10-06 An In-Context Learning Agent for Formal Theorem-Proving link Amitayush Thakur, George Tsoukalas,..., Swarat Chaudhuri
21 2023-06-23 Bring Your Own Data! Self-Sensitivity Evaluation for Large Language Models link Neel Jain, Khalid Saifullah,..., Tom Goldstein
21 2024-03-28 Top Leaderboard Ranking = Top Coding Proficiency, Always? EvoEval: Evolving Coding Benchmarks via LLM link Chunqiu Steven Xia, Yinlin Deng, LINGMING ZHANG
21 2024-02-07 Hydra: Sequentially-Dependent Draft Heads for Medusa Decoding link Zachary Ankner, Rishab Parthasarathy,..., William Brandon
20 2024-03-14 Logits of API-Protected LLMs Leak Proprietary Information link Matthew Finlayson, Xiang Ren, Swabha Swayamdipta
20 2024-03-30 Multi-hop Question Answering under Temporal Knowledge Editing link Keyuan Cheng, Gang Lin,..., Di Wang
19 2024-04-16 Can Language Models Solve Olympiad Programming? link Ben Shi, Michael Tang,..., Shunyu Yao
19 2024-04-03 Auxiliary task demands mask the capabilities of smaller language models link Jennifer Hu, Michael Frank
19 2024-04-16 CULTURE-GEN: Revealing Global Cultural Perception in Language Models through Natural Language Prompting link Huihan Li, Liwei Jiang,..., Yejin Choi
19 2024-04-15 A Survey on Deep Learning for Theorem Proving link Zhaoyu Li, Jialiang Sun,..., Xujie Si
18 2024-04-01 Do Language Models Plan Ahead for Future Tokens? link Wilson Wu, John Xavier Morris, Lionel Levine
18 2024-03-17 StateFlow: Enhancing LLM Task-Solving through State-Driven Workflows link Yiran Wu, Tianwei Yue,..., Qingyun Wu
17 2024-04-08 LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models link Shibo Hao, Yi Gu,..., Zhiting Hu
17 2024-05-10 SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models link Haojie Duanmu, Zhihang Yuan,..., Dahua Lin
17 2023-12-01 Instruction-tuning Aligns LLMs to the Human Brain link Khai Loong Aw, Syrielle Montariol,..., Antoine Bosselut
17 2024-03-21 Emergent World Models and Latent Variable Estimation in Chess-Playing Language Models link Adam Karvonen
17 2024-03-31 The Larger the Better? Improved LLM Code-Generation via Budget Reallocation link Michael Hassid, Tal Remez,..., Yossi Adi
17 2024-04-04 Evaluating LLMs at Detecting Errors in LLM Responses link Ryo Kamoi, Sarkar Snigdha Sarathi Das,..., Rui Zhang
16 2024-04-08 Best-of-Venom: Attacking RLHF by Injecting Poisoned Preference Data link Tim Baumgärtner, Yang Gao,..., Donald Metzler
16 2023-06-05 Early Weight Averaging meets High Learning Rates for LLM Pre-training link Sunny Sanyal, Atula Tejaswi Neerkaje,..., sujay sanghavi
16 2024-01-21 With Greater Text Comes Greater Necessity: Inference-Time Training Helps Long Text Generation link Yan Wang, Dongyang Ma, Deng Cai
16 2024-04-15 Impact of Preference Noise on the Alignment Performance of Generative Language Models link Yang Gao, Dana Alon, Donald Metzler
16 2023-12-11 Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions link Federico Cassano, Luisa Li,..., Arjun Guha
16 2024-04-04 How Easily do Irrelevant Inputs Skew the Responses of Large Language Models? link Siye Wu, Jian Xie,..., Yanghua Xiao
16 2024-05-10 Linearizing Large Language Models link Jean Mercat, Igor Vasiljevic,..., Thomas Kollar
15 2024-06-05 Does your data spark joy? Performance gains from domain upsampling at the end of training link Cody Blakeney, Mansheej Paul,..., Jonathan Frankle
15 2024-08-17 How Susceptible are LLMs to Influence in Prompts? link Sotiris Anagnostidis, Jannis Bulian
15 2024-04-05 Chinese Tiny LLM: Pretraining a Chinese-Centered Large Language Model link Xeron Du, Zhouliang Yu,..., Ge Zhang
15 2023-10-09 Guiding Language Model Reasoning with Planning Tokens link Xinyi Wang, Lucas Caccia,..., Alessandro Sordoni
14 2023-10-05 DISTFLASHATTN: Distributed Memory-efficient Attention for Long-context LLMs Training link Dacheng Li, Rulin Shao,..., Hao Zhang
14 2024-05-06 Lory: Fully Differentiable Mixture-of-Experts for Autoregressive Language Model Pre-training link Zexuan Zhong, Mengzhou Xia,..., Mike Lewis
14 2024-06-20 Timo: Towards Better Temporal Reasoning for Language Models link Zhaochen Su, Jun Zhang,..., Yu Cheng
14 2024-01-31 Deductive Beam Search: Decoding Deducible Rationale for Chain-of-Thought Reasoning link Tinghui Zhu, Kai Zhang,..., Yu Su
14 2024-08-12 Evaluating Language Models for Efficient Code Generation link Jiawei Liu, Songrun Xie,..., LINGMING ZHANG
13 2024-02-06 Task Success is not Enough: Investigating the Use of Video-Language Models as Behavior Critics for Catching Undesirable Agent Behaviors link Lin Guan, Yifan Zhou,..., Subbarao Kambhampati
13 2024-04-02 Risks from Language Models for Automated Mental Healthcare: Ethics and Structure for Implementation link Declan Grabb, Max Lamparth, Nina Vasan
13 2024-04-01 Source-Aware Training Enables Knowledge Attribution in Language Models link Muhammad Khalifa, David Wadden,..., Hao Peng
13 2024-02-26 StructLM: Towards Building Generalist Models for Structured Knowledge Grounding link Alex Zhuang, Ge Zhang,..., Wenhu Chen
13 2024-07-11 Automata-based constraints for language model decoding link Terry Koo, Frederick Liu, Luheng He
13 2024-02-22 Stop Reasoning! When Multimodal LLM with Chain-of-Thought Reasoning Meets Adversarial Image link Zefeng Wang, Zhen Han,..., Jindong Gu
13 2024-04-01 IsoBench: Benchmarking Multimodal Foundation Models on Isomorphic Representations link Deqing Fu, Ruohao Guo,..., Willie Neiswanger
12 2024-04-16 Forcing Diffuse Distributions out of Language Models link Yiming Zhang, Avi Schwarzschild,..., Daphne Ippolito
12 2023-06-19 SynerGPT: In-Context Learning for Personalized Drug Synergy Prediction and Drug Design link Carl Edwards, Aakanksha Naik,..., Tom Hope
12 2024-04-25 List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs link An Yan, Zhengyuan Yang,..., Lijuan Wang
12 2024-03-07 How Far Are We from Intelligent Visual Deductive Reasoning? link Yizhe Zhang, Richard He Bai,..., Navdeep Jaitly
11 2024-07-16 Trust No Bot: Discovering Personal Disclosures in Human-LLM Conversations in the Wild link Niloofar Mireshghallah, Maria Antoniak,..., Golnoosh Farnadi
11 2023-08-24 CALM : A Multi-task Benchmark for Comprehensive Assessment of Language Model Bias link Vipul Gupta, Pranav Narayanan Venkit,..., Rebecca J. Passonneau
11 2023-10-02 Resolving Knowledge Conflicts in Large Language Models link Yike Wang, Shangbin Feng,..., Yulia Tsvetkov
11 2024-08-05 LLM economicus? Mapping the Behavioral Biases of LLMs via Utility Theory link Jillian Ross, Yoon Kim, Andrew Lo
11 2024-04-09 Elephants Never Forget: Memorization and Learning of Tabular Data in Large Language Models link Sebastian Bordt, Harsha Nori,..., Rich Caruana
11 2024-06-11 Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? link Xingyu Fu, Muyu He,..., Dan Roth
10 2024-06-20 Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning link Lynn Chua, Badih Ghazi,..., Chiyuan Zhang
10 2024-08-08 Trans-Tokenization and Cross-lingual Vocabulary Transfers: Language Adaptation of LLMs for Low-Resource NLP link François Remy, Pieter Delobelle,..., Thomas Demeester
10 2024-05-01 AdaMoLE: Fine-Tuning Large Language Models with Adaptive Mixture of Low-Rank Adaptation Experts link Zefang Liu, Jiahua Luo
10 2024-03-18 EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents link Abhay Zala, Jaemin Cho,..., Mohit Bansal
10 2023-10-18 DiagrammerGPT: Generating Open-Domain, Open-Platform Diagrams via LLM Planning link Abhay Zala, Han Lin,..., Mohit Bansal
10 2024-04-01 Exploring the Mystery of Influential Data for Mathematical Reasoning link Xinzhe Ni, Yeyun Gong,..., Weizhu Chen
10 2023-05-22 Should We Attend More or Less? Modulating Attention for Fairness link Abdelrahman Zayed, Goncalo Mordido,..., Sarath Chandar
9 2024-09-01 Automatic Pseudo-Harmful Prompt Generation for Evaluating False Refusals in Large Language Models link Bang An, Sicheng Zhu,..., Furong Huang
9 2024-01-29 NoFunEval: Funny How Code LMs Falter on Requirements Beyond Functional Correctness link Manav Singhal, Tushar Aggarwal,..., Aditya Kanade
9 2024-02-02 Can MLLMs Perform Text-to-Image In-Context Learning? link Yuchen Zeng, Wonjun Kang,..., Kangwook Lee
9 2024-03-23 IllusionVQA: A Challenging Optical Illusion Dataset for Vision Language Models link Haz Sameen Shahgir, Khondker Salman Sayeed,..., Rifat Shahriyar
9 2023-05-23 Are Language Models Robust Coreference Resolvers? link Nghia T. Le, Alan Ritter
9 2024-01-22 The Curious Case of Nonverbal Abstract Reasoning with Multi-Modal Large Language Models link Kian Ahrabian, Zhivar Sourati,..., Jay Pujara
9 2024-03-19 Dated Data: Tracing Knowledge Cutoffs in Large Language Models link Jeffrey Cheng, Marc Marone,..., Benjamin Van Durme
9 2024-04-12 CATS: Context-Aware Thresholding for Sparsity in Large Language Models link Donghyun Lee, Jaeyong Lee,..., Azalia Mirhoseini
8 2024-04-09 Khayyam Challenge (PersianMMLU): Is Your LLM Truly Wise to The Persian Language? link Omid Ghahroodi, Marzia Nouri,..., Mohammad Hossein Rohban
8 2024-07-22 Benchmarks as Microscopes: A Call for Model Metrology link Michael Saxon, Ari Holtzman,..., Naomi Saphra
8 2024-08-13 Evaluating Cultural Adaptability of a Large Language Model via Simulation of Synthetic Personas link Louis Kwok, Michal Bravansky, Lewis Griffin
8 2024-05-02 NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment link Gerald Shen, Zhilin Wang,..., Oleksii Kuchaiev
8 2024-04-05 Prompt Public Large Language Models to Synthesize Data for Private On-device Applications link Shanshan Wu, Zheng Xu,..., Daniel Ramage
8 2023-11-15 Towards Verifiable Text Generation with Symbolic References link Lucas Torroba Hennigen, Zejiang Shen,..., Yoon Kim
8 2024-04-04 Fakes of Varying Shades: How Warning Affects Human Perception and Engagement Regarding LLM Hallucinations link Mahjabin Nahar, Haeseung Seo,..., Dongwon Lee
8 2024-02-24 Empowering Large Language Model Agents through Action Learning link Haiteng Zhao, Chang Ma,..., Hongxia Yang
8 2024-03-13 Scattered Mixture-of-Experts Implementation link Shawn Tan, Yikang Shen,..., Aaron Courville
8 2023-09-15 "Merge Conflicts!'" Exploring the Impacts of External Knowledge Distractors to Parametric Knowledge Graphs link Cheng Qian, Xinran Zhao, Tongshuang Wu
8 2024-03-30 ProLLM: Protein Chain-of-Thoughts Enhanced LLM for Protein-Protein Interaction Prediction link Mingyu Jin, Haochen Xue,..., Yongfeng Zhang
7 2024-02-19 ChatGPT Based Data Augmentation for Improved Parameter-Efficient Debiasing of LLMs link Pengrui Han, Rafal Dariusz Kocielnik,..., Anima Anandkumar
7 2024-04-05 Counting Like Transformers: Compiling Temporal Counting Logic Into Softmax Transformers link Andy Yang, David Chiang
7 2024-08-27 Implicit Geometry of Next-token Prediction: From Language Sparsity Patterns to Model Representations link Yize Zhao, Tina Behnia,..., Christos Thrampoulidis
7 2024-05-03 Optimising Calls to Large Language Models with Uncertainty-Based Two-Tier Selection link Guillem Ramírez, Alexandra Birch, Ivan Titov
7 2024-04-11 Why do small language models underperform? Studying Language Model Saturation via the Softmax Bottleneck link Nathan Godey, Éric Villemonte de la Clergerie, Benoît Sagot
7 2024-03-13 PAPERCLIP: Associating Astronomical Observations and Natural Language with Multi-Modal Models link Siddharth Mishra-Sharma, YIDING SONG, Jesse Thaler
6 2023-10-27 TarGEN: Targeted Data Generation with Large Language Models link Himanshu Gupta, Kevin Scaria,..., Swaroop Mishra
6 2024-04-01 Prompt-prompted Adaptive Structured Pruning for Efficient LLM Generation link Harry Dong, Beidi Chen, Yuejie Chi
6 2024-04-03 An Incomplete Loop: Instruction Inference, Instruction Following, and In-Context Learning in Language Models link Emmy Liu, Graham Neubig, Jacob Andreas
6 2024-06-11 MBBQ: A Dataset for Cross-Lingual Comparison of Stereotypes in Generative LLMs link Vera Neplenbroek, Arianna Bisazza, Raquel Fernández
6 2024-04-01 Large Language Models are Capable of Offering Cognitive Reappraisal, if Guided link Hongli Zhan, Allen Zheng,..., Desmond Ong
6 2024-05-04 Beyond Relevance: Evaluate and Improve Retrievers on Perspective Awareness link Xinran Zhao, Tong Chen,..., Tongshuang Wu
6 2024-05-19 Hummer: Towards Limited Competitive Preference Dataset link Yusen Wu, Li Jiang,..., Xiaotie Deng
6 2024-02-13 Measuring and Controlling Instruction (In)Stability in Language Model Dialogs link Kenneth Li, Tianle Liu,..., Martin Wattenberg
6 2024-05-10 LMD3: Language Model Data Density Dependence link John Kirchenbauer, Garrett Honke,..., David Andre
6 2023-11-07 Uncovering Intermediate Variables in Transformers using Circuit Probing link Michael A. Lepori, Thomas Serre, Ellie Pavlick
6 2023-05-24 Using Natural Language Explanations to Rescale Human Judgments link Manya Wadhwa, Jifan Chen,..., Greg Durrett
6 2023-11-09 Efficient Parallelization Layouts for Large-Scale Distributed Model Training link Johannes Hagemann, Samuel Weinbach,..., Gerard de Melo
6 2024-04-01 Will the Real Linda Please Stand up...to Large Language Models? Examining the Representativeness Heuristic in LLMs link Pengda Wang, Zilin Xiao,..., Frederick L. Oswald
5 2024-05-15 PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models link Devansh Jain, Priyanshu Kumar,..., Maarten Sap
5 2024-11-25 Predicting Emergent Capabilities by Finetuning link Charlie Victor Snell, Eric Wallace,..., Sergey Levine
5 2024-08-16 See What LLMs Cannot Answer: A Self-Challenge Framework for Uncovering LLM Weaknesses link Yulong Chen, Yang Liu,..., Yue Zhang
5 None ReAct Meets ActRe: Autonomous Annotation of Agent Trajectories for Contrastive Self-Training link Zonghan Yang, Peng Li,..., Yang Liu
5 2023-12-28 LLM4Causal: Democratized Causal Tools for Everyone via Large Language Model link Haitao Jiang, Lin Ge,..., Rui Song
5 2024-04-03 From Narratives to Numbers: Valid Inference Using Language Model Predictions from Verbal Autopsies link Shuxian Fan, Adam Visokay,..., Tyler McCormick
5 2024-04-02 Helmsman of the Masses? Evaluate the Opinion Leadership of Large Language Models in the Werewolf Game link Silin Du, Xiaowei Zhang
5 2024-03-31 Learning to Plan for Language Modeling from Unlabeled Data link Nathan Cornille, Marie-Francine Moens, Florian Mai
5 2024-03-29 MANGO: A Benchmark for Evaluating Mapping and Navigation Abilities of Large Language Models link Peng Ding, Jiading Fang,..., Matthew Walter
5 None On Robustness-Accuracy Characterization of Language Models using Synthetic Datasets link Ching-Yun Ko, Pin-Yu Chen,..., Luca Daniel
4 2024-01-24 TPD: Enhancing Student Language Model Reasoning via Principle Discovery and Guidance link Haorui Wang, Rongzhi Zhang,..., Chao Zhang
4 2024-09-17 CoCA: Regaining Safety-awareness of Multimodal Large Language Models with Constitutional Calibration link Jiahui Gao, Renjie Pi,..., Zhenguo Li
4 2024-02-28 Multi-FAct: Assessing Factuality of Multilingual LLMs using FActScore link Sheikh Shafayat, Eunsu Kim,..., Alice Oh
4 2024-04-01 PairEval: Open-domain Dialogue Evaluation Metric with Pairwise Comparisons link ChaeHun Park, Minseok Choi,..., Jaegul Choo
4 2024-04-18 AmbigDocs: Reasoning across Documents on Different Entities under the Same Name link Yoonsang Lee, Xi Ye, Eunsol Choi
4 2024-04-21 Iteratively Prompting Multimodal LLMs to Reproduce Natural and AI-Generated Images link Ali Naseh, Katherine Thai,..., Amir Houmansadr
4 2024-08-15 Web Retrieval Agents for Evidence-Based Misinformation Detection link Jacob-Junqi Tian, Hao Yu,..., Kellin Pelrine
4 2024-11-06 Crystal: Illuminating LLM Abilities on Language and Code link Tianhua Tao, Junbo Li,..., Zhengzhong Liu
4 2024-04-04 Unveiling LLMs: The Evolution of Latent Representations in a Dynamic Knowledge Graph link Marco Bronzini, Carlo Nicolini,..., Andrea Passerini
4 2024-05-02 D2PO: Discriminator-Guided DPO with Response Evaluation Models link Prasann Singhal, Nathan Lambert,..., Greg Durrett
4 2024-01-04 SPEER: Sentence-Level Planning of Long Clinical Summaries via Embedded Entity Retrieval link Griffin Thomas Adams, Jason Zucker, Noémie Elhadad
4 2024-08-12 Long-Form Answers to Visual Questions from Blind and Low Vision People link Mina Huh, Fangyuan Xu,..., Amy Pavel
4 2024-04-03 Scalable Model Editing via Customized Expert Networks link Zihan Yao, Yu He,..., Ming Li
4 2024-03-31 CHOPS: CHat with custOmer Profile Systems for Customer Service with LLMs link Jingzhe Shi, Jialuo Li,..., Lei Li
4 2024-08-10 Your Context Is Not an Array: Unveiling Random Access Limitations in Transformers link MohammadReza Ebrahimi, Sunny Panchal, Roland Memisevic
4 2023-05-21 Description-Based Text Similarity link Shauli Ravfogel, Valentina Pyatkin,..., Yoav Goldberg
4 2024-05-27 On Fairness of Low-Rank Adaptation of Large Models link Zhoujie Ding, Ken Liu,..., Sanmi Koyejo
3 2024-08-09 How Well Do LLMs Identify Cultural Unity in Diversity? link Jialin Li, Junli Wang,..., Ming Jiang
3 2024-10-07 Cookbook: A framework for improving LLM generative abilities via programmatic data generating templates link Avanika Narayan, Mayee F Chen,..., Christopher Re
3 2024-05-30 How Multilingual are Large Language Models Fine-tuned for Translation? link Aquia Richburg, Marine Carpuat
3 2024-04-15 Personalized Collaborative Fine-Tuning for On-Device Large Language Models link Nicolas Wagner, Dongyang Fan, Martin Jaggi
3 2024-08-26 Crowd-Calibrator: Can Annotator Disagreement Inform Calibration in Subjective Tasks? link Urja Khurana, Eric Nalisnick,..., Swabha Swayamdipta
3 2024-08-02 Talk Less, Interact Better: Evaluating In-context Conversational Adaptation in Multimodal LLMs link Yilun Hua, Yoav Artzi
3 2024-04-27 Building a Large Japanese Web Corpus for Large Language Models link Naoaki Okazaki, Kakeru Hattori,..., Sakae Mizuki
3 2024-04-04 PRobELM: Plausibility Ranking Evaluation for Language Models link Moy Yuan, Eric Chamoun,..., Andreas Vlachos
3 2023-11-14 AI-generated text boundary detection with RoFT link Laida Kushnareva, Tatiana Gaintseva,..., Irina Piontkovskaya
3 2024-04-12 Look at the Text: Instruction-Tuned Language Models are More Robust Multiple Choice Selectors than You Think link Xinpeng Wang, Chengzhi Hu,..., Barbara Plank
3 2024-07-12 Large Language Models as Biomedical Hypothesis Generators: A Comprehensive Evaluation link Biqing Qi, Kaiyan Zhang,..., Bowen Zhou
3 2024-04-06 Language Models as Critical Thinking Tools: A Case Study of Philosophers link Andre Ye, Jared Moore,..., Amy X Zhang
2 2024-02-05 UniMem: Towards a Unified View of Long-Context Large Language Models link Junjie Fang, Likai Tang,..., Maosong Sun
2 2024-04-30 Revenge of the Fallen? Recurrent Models Match Transformers at Predicting Human Language Comprehension Metrics link James Michaelov, Catherine Arnett, Ben Bergen
2 2023-07-13 Does Collaborative Human–LM Dialogue Generation Help Information Extraction from Human–Human Dialogues? link Bo-Ru Lu, Nikita Haduong,..., Mari Ostendorf
2 2024-03-22 CoLLEGe: Concept Embedding Generation for Large Language Models link Ryan Teehan, Brenden Lake, Mengye Ren
2 2024-12-18 What makes a good metric? Evaluating automatic metrics for text-to-image consistency link Candace Ross, Melissa Hall,..., Adina Williams
2 2024-04-16 DeStein: Navigating Detoxification of Language Models via Universal Steering Pairs and Head-wise Activation Fusion link Yu Li, Han Jiang,..., Zhihua Wei
2 2023-07-15 CA-LoRA: Adapting Existing LoRA for Compressed LLMs to Enable Efficient Multi-Tasking on Personal Devices link Weilin Zhao, Yuxiang Huang,..., Maosong Sun
2 2024-05-01 WorkBench: a Benchmark Dataset for Agents in a Realistic Workplace Setting link Olly Styles, Sam Miller,..., Bertie Vidgen
2 2024-08-10 Investigating Instruction Tuning Large Language Models on Graphs link Kerui Zhu, Bo-Wei Huang,..., Jiawei Han
2 2024-08-13 StyleTalker: Finetuning Audio Language Model and Style-Based Text-to-Speech Model for Fast Spoken Dialogue Generation link Yinghao Aaron Li, Xilin Jiang,..., Nima Mesgarani
2 2024-04-09 Characterizing Multimodal Long-form Summarization: A Case Study on Financial Reports link Tianyu Cao, Natraj Raman,..., Chenhao Tan
2 2024-04-19 Stronger Random Baselines for In-Context Learning link Gregory Yauney, David Mimno
2 2024-08-23 LalaEval: A Holistic Human Evaluation Framework for Domain-Specific Large Language Models link Chongyan Sun, Ken Lin,..., Zhen Wang
2 2024-04-24 Studying Large Language Model Behaviors Under Context-Memory Conflicts With Real Documents link Evgenii Kortukov, Alexander Rubinstein,..., Seong Joon Oh
2 2024-07-12 Does Incomplete Syntax Influence Korean Language Model? Focusing on Word Order and Case Markers link Jong Myoung Kim, Young-Jun Lee,..., Sangkeun Jung
2 2024-05-30 Reasoning about concepts with LLMs: Inconsistencies abound link Rosario Uceda Sosa, Karthikeyan Natesan Ramamurthy,..., Moninder Singh
2 2024-03-20 Information-Theoretic Distillation for Reference-less Summarization link Jaehun Jung, Ximing Lu,..., Yejin Choi
2 2024-04-01 LITE: Modeling Environmental Ecosystems with Multimodal Large Language Models link Haoran Li, Junqi Liu,..., Huaxiu Yao
2 2024-08-09 Tabular Transfer Learning via Prompting LLMs link Jaehyun Nam, Woomin Song,..., Jinwoo Shin
1 2024-07-18 Latent Causal Probing: A Formal Perspective on Probing with Causal Models of Data link Charles Jin
1 2024-03-11 3M-Diffusion: Latent Multi-Modal Diffusion for Language-Guided Molecular Structure Generation link Huaisheng Zhu, Teng Xiao, Vasant G Honavar
1 2024-08-14 Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability link Jiri Hron, Laura A Culp,..., Jeffrey Pennington
1 2024-04-08 GeniL: A Multilingual Dataset on Generalizing Language link Aida Mostafazadeh Davani, Sagar Gubbi Venkatesh,..., Vinodkumar Prabhakaran
1 2024-08-06 LAMPO: Large Language Models as Preference Machines for Few-shot Ordinal Classification link Zhen Qin, Junru Wu,..., Xuanhui Wang
1 2024-09-03 Unforgettable Generalization in Language Models link Eric Zhang, Leshem Choshen, Jacob Andreas
1 2024-08-05 ExoViP: Step-by-step Verification and Exploration with Exoskeleton Modules for Compositional Visual Reasoning link Yuxuan Wang, Alan Yuille,..., Zilong Zheng
1 None Pairwise Proximal Policy Optimization: Language Model Alignment with Comparative RL link Tianhao Wu, Banghua Zhu,..., Jiantao Jiao
1 2024-07-13 Cohesive Conversations: Enhancing Authenticity in Multi-Agent Simulated Dialogues link KuanChao Chu, Yi-Pei Chen, Hideki Nakayama
1 None Redesigning Information Markets in the Era of Language Models link Martin Weiss, Nasim Rahaman,..., Christopher Pal
1 2024-03-29 Measuring Taiwanese Mandarin Language Understanding link Po-Heng Chen, Sijia Cheng,..., Yun-Nung Chen
1 2024-04-17 AgentKit: Structured LLM Reasoning with Dynamic Graphs link Yue Wu, Yewen Fan,..., Tom Mitchell
0 2024-05-08 ACORN: Aspect-wise Commonsense Reasoning Explanation Evaluation link Ana Brassard, Benjamin Heinzerling,..., Kentaro Inui
0 None BumbleBee: Dynamic KV-Cache Streaming Submodular Summarization for Infinite-Context Transformers link Lilly Kumari, Shengjie Wang,..., Jeff Bilmes
0 2024-04-01 Forklift: An Extensible Neural Lifter link Jordi Armengol-Estapé, Rodrigo C. O. Rocha,..., Michael O'Boyle
0 None Evaluating the Adversarial Robustness of Retrieval-Based In-Context Learning for Large Language Models link Simon Chi Lok Yu, Jie He,..., Jeff Z. Pan
0 2024-08-06 Data Checklist: On Unit-Testing Datasets with Usable Information link Heidi Chenyu Zhang, Shabnam Behzad,..., Dan Jurafsky
0 2023-10-22 O3D: Offline Data-driven Discovery and Distillation for Sequential Decision-Making with Large Language Models link Yuchen Xiao, Yanchao Sun,..., Sumitra Ganesh
0 2024-08-09 FUSE-ing Language Models: Zero-Shot Adapter Discovery for Prompt Optimization Across Tokenizers link Joshua Nathaniel Williams, J Zico Kolter
0 2024-10-08 Generating Synthetic Datasets for Few-shot Prompt Tuning link Xu Guo, Zilin Du,..., Chunyan Miao
0 2024-04-11 Does In-Context Learning Really Learn? Rethinking How Large Language Models Respond and Solve Tasks via In-Context Learning link Quanyu Long, Yin Wu,..., Sinno Jialin Pan
0 2023-12-01 Nonparametric Variational Regularisation of Pretrained Transformers link Fabio James Fehr, James Henderson
0 2024-03-04 CatCode: A Comprehensive Evaluation Framework for LLMs On the Mixture of Code and Text link Zhenru Lin, Yiqun Yao, Yang Yuan
0 None Handling Open-Vocabulary Constructs in Formalizing Specifications: Retrieval Augmented Parsing with Expert Knowledge link Mohammad Saqib Hasan, Sayontan Ghosh,..., Niranjan Balasubramanian
0 2024-06-20 Information Guided Regularization for Fine-tuning Language Models link Mandar Sharma, Nikhil Muralidhar,..., Naren Ramakrishnan
0 2024-10-08 Does RoBERTa Perform Better than BERT in Continual Learning: An Attention Sink Perspective link Xueying Bai, Yifan Sun, Niranjan Balasubramanian
0 None Factual and Tailored Recommendation Endorsements using Language Models and Reinforcement Learning link Jihwan Jeong, Yinlam Chow,..., Craig Boutilier
0 2023-07-27 Exploiting the Potential of Seq2Seq Models as Robust Few-Shot Learners link Jihyeon Lee, Dain Kim,..., Kyoung-Woon On
0 2024-07-11 HDT: Hierarchical Document Transformer link Haoyu He, Markus Flicke,..., Andreas Geiger
0 2024-07-16 InstructAV: Instruction Fine-tuning Large Language Models for Authorship Verification link Yujia Hu, Zhiqiang Hu,..., Roy Ka-Wei Lee
0 2024-05-17 Prompt Exploration with Prompt Regression link Michael Feffer, Ronald Xu,..., Mikhail Yurochkin