Last updated: 2025-04-16 04:17:23. Maintained by Weisen Jiang.

citation publish date title (pdf) review authors
530 2023-06-08 Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models link Maaz, Muhammad,..., Fahad
473 2023-05-29 Large Language Models are not Fair Evaluators link Wang, Peiyi,..., Zhifang
448 2023-08-28 LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding link Bai, Yushi,..., Juanzi
322 2024-02-01 OLMo: Accelerating the Science of Language Models link Groeneveld, Dirk,..., Hannaneh
237 2024-01-12 How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs link Zeng, Yi,..., Weiyan
230 2023-12-14 Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations link Wang, Peiyi,..., Zhifang
218 2024-01-31 Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research link Soldaini, Luca,..., Kyle
202 2024-01-11 DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models link Dai, Damai,..., Wenfeng
182 2023-04-22 LaMP: When Large Language Models Meet Personalization link Salemi, Alireza,..., Hamed
176 2024-02-12 Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model link Üstün, Ahmet,..., Sara
163 2023-10-10 LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression link Jiang, Huiqiang,..., Lili
143 None VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks link Koh, Jing Yu,..., Daniel
141 2023-12-31 Improving Text Embeddings with Large Language Models link Wang, Liang,..., Furu
137 2023-12-09 Steering Llama 2 via Contrastive Activation Addition link Rimsky, Nina,..., Alexander
135 2023-09-27 Navigate through Enigmatic Labyrinth A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future link Chu, Zheng,..., Ting
125 2023-09-18 Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM link Cao, Bochuan,..., Jinghui
124 2023-07-20 L-Eval: Instituting Standardized Evaluation for Long Context Language Models link An, Chenxin,..., Xipeng
124 2023-07-16 ChatDev: Communicative Agents for Software Development link Qian, Chen,..., Maosong
117 2024-01-17 SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents link Cheng, Kanzhi,..., Zhiyong
114 2023-06-16 Full Parameter Fine-tuning for Large Language Models with Limited Resources link Lv, Kai,..., Xipeng
113 2023-08-31 The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants link Bandarkar, Lucas,..., Madian
110 2023-02-23 Active Prompting with Chain-of-Thought for Large Language Models link Diao, Shizhe,..., Tong
110 2023-10-03 Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View link Zhang, Jintian,..., Shumin
109 2023-10-09 How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition link Dong, Guanting,..., Jingren
106 2024-01-25 WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models link He, Hongliang,..., Dong
105 2024-02-21 OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems link He, Chaoqun,..., Maosong
105 2024-02-19 AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling link Zhan, Jun,..., Xipeng
105 2024-02-09 Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning link Singh, Shivalika,..., Sara
103 2023-09-22 ReConcile: Round-Table Conference Improves Reasoning via Consensus among Diverse LLMs link Chen, Justin,..., Mohit
103 2023-11-15 Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization link Zhang, Zhexin,..., Minlie
99 2023-11-08 LooGLE: Can Long-Context Language Models Understand Long Contexts? link Li, Jiaqi,..., Muhan
94 2023-12-12 LLM in a flash: Efficient Large Language Model Inference with Limited Memory link Alizadeh, Keivan,..., Mehrdad
86 2023-09-04 Are Emergent Abilities in Large Language Models just In-Context Learning? link Lu, Sheng,..., Iryna
84 2024-02-16 Do Llamas Work in English? On the Latent Language of Multilingual Transformers link Wendler, Chris,..., Robert
79 2023-10-27 InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews link Wang, Xintao,..., Yanghua
78 2023-05-24 Who Wrote this Code? Watermarking for Code Generation link Lee, Taehyun,..., Gunhee
78 2024-02-14 SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding link Xu, Zhangchen,..., Radha
78 2023-09-13 SafetyBench: Evaluating the Safety of Large Language Models link Zhang, Zhexin,..., Minlie
76 2024-02-19 ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs link Jiang, Fengqing,..., Radha
74 2024-04-25 LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding link Elhoushi, Mostafa,..., Carole-Jean
71 2023-05-23 Having Beer after Prayer? Measuring Cultural Bias in Large Language Models link Naous, Tarek,..., Wei
70 2024-02-26 Do Large Language Models Latently Perform Multi-Hop Reasoning? link Yang, Sohee,..., Sebastian
69 2023-11-07 Black-Box Prompt Optimization: Aligning Large Language Models without Model Training link Cheng, Jiale,..., Minlie
69 2023-12-31 RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models link Niu, Cheng,..., Tong
67 2024-01-17 ReFT: Reasoning with Reinforced Fine-Tuning link Trung, Luong,..., Hang
67 2024-01-12 Large Language Models Can Learn Temporal Reasoning link Xiong, Siheng,..., Faramarz
65 2024-02-01 Don't Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration link Feng, Shangbin,..., Yulia
64 2024-02-28 Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards link Wang, Haoxiang,..., Tong
64 2024-01-14 CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges link Zhang, Kechi,..., Zhi
64 2024-02-01 When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards link Alzahrani, Norah,..., Haidar
62 2024-01-19 Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences link Wang, Xiyao,..., Furong
62 2024-02-19 Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models link Levy, Mosh,..., Yoav
60 2024-02-22 MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues link Bai, Ge,..., Wanli
60 2024-02-27 Evaluating Very Long-Term Conversational Memory of LLM Agents link Maharana, Adyasha,..., Yuwei
59 2023-06-10 Boosting Language Models Reasoning with Chain-of-Knowledge Prompting link Wang, Jianing,..., Ming
57 2024-03-04 Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents link Song, Yifan,..., Bill Yuchen
57 2023-08-17 MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language Models link Wen, Yilin,..., Jimeng
56 2024-01-02 CharacterEval: A Chinese Benchmark for Role-Playing Conversational Agent Evaluation link Tu, Quan,..., Rui
56 2024-02-01 Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning link Li, Ming,..., Tianyi
55 2023-10-16 EconAgent: Large Language Model-Empowered Agents for Simulating Macroeconomic Activities link Li, Nian,..., Qingmin
54 2024-02-12 AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension link Yang, Qian,..., Jingren
54 2024-01-04 LLaMA Pro: Progressive LLaMA with Block Expansion link Wu, Chengyue,..., Ping
52 2024-01-23 Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment link Lu, Keming,..., Jingren
51 2023-08-30 Quantifying Uncertainty in Answers from any Language Model and Enhancing their Trustworthiness link Chen, Jiuhai,..., Jonas
50 2024-03-25 VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild link Peng, Puyuan,..., David
50 2024-02-26 Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models link Röttger, Paul,..., Dirk
49 2023-05-23 SciMON: Scientific Inspiration Machines Optimized for Novelty link Wang, Qingyun,..., Tom
49 2024-03-21 Detoxifying Large Language Models via Knowledge Editing link Wang, Mengru,..., Huajun
49 2024-02-28 Rethinking the Bounds of LLM Reasoning: Are Multi-Agent Discussions the Key? link Wang, Qineng,..., Yangqiu
49 2024-01-29 Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling link Maini, Pratyush,..., Navdeep
46 2024-03-01 Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models link Li, Lei,..., Qi
45 2024-01-12 Relying on the Unreliable: The Impact of Language Models' Reluctance to Express Uncertainty link Zhou, Kaitlyn,..., Maarten
45 2024-02-26 Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models link Tang, Tianyi,..., Ji-Rong
45 2023-12-31 DocLLM: A Layout-Aware Generative Language Model for Multimodal Document Understanding link Wang, Dongsheng,..., Xiaomo
45 2024-02-18 Pride and Prejudice: LLM Amplifies Self-Bias in Self-Refinement link Xu, Wenda,..., William
42 2024-03-29 Can LLMs Learn from Previous Mistakes? Investigating LLMs' Errors to Boost for Reasoning link Tong, Yongqi,..., Jingbo
42 2023-12-22 NPHardEval: Dynamic Benchmark on Reasoning Ability of Large Language Models via Complexity Classes link Fan, Lizhou,..., Yongfeng
42 2024-05-18 MapCoder: Multi-Agent Code Generation for Competitive Problem Solving link Islam, Md. Ashraful,..., Md Rizwan
41 2024-02-29 GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers link Li, Qintong,..., Wei
41 2023-12-22 VIEScore: Towards Explainable Metrics for Conditional Image Synthesis Evaluation link Ku, Max,..., Wenhu
41 2023-12-14 The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Persuasive Conversation link Xu, Rongwu,..., Han
40 2024-02-26 Long-Context Language Modeling with Parallel Context Encoding link Yen, Howard,..., Danqi
40 2024-02-16 Quantifying the Persona Effect in LLM Simulations link Hu, Tiancheng,..., Nigel
40 2023-06-03 MultiLegalPile: A 689GB Multilingual Legal Corpus link Niklaus, Joel,..., Daniel
39 2023-05-22 MAGE: Machine-generated Text Detection in the Wild link Li, Yafu,..., Yue
39 2024-01-04 Self-Contrast: Better Reflection Through Inconsistent Solving Perspectives link Zhang, Wenqi,..., Weiming
39 2023-11-30 AlignBench: Benchmarking Chinese Alignment of Large Language Models link Liu, Xiao,..., Jie
38 2024-02-26 MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs link Lu, Zimu,..., Hongsheng
38 2024-02-05 Unified Hallucination Detection for Multimodal Large Language Models link Chen, Xiang,..., Huajun
38 2024-05-28 Faithful Logical Reasoning via Symbolic Chain-of-Thought link Xu, Jundong,..., Wynne
37 2024-02-28 FOFO: A Benchmark to Evaluate LLMs' Format-Following Capability link Xia, Congying,..., Caiming
37 2024-02-14 Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation link Zhang, Xiaoying,..., Helen
37 2023-12-28 Experiential Co-Learning of Software-Developing Agents link Qian, Chen,..., Maosong
36 2024-02-27 TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space link Zhang, Shaolei,..., Yang
36 2024-02-20 Investigating Cultural Alignment of Large Language Models link AlKhamissi, Badr,..., Mona
35 2024-02-22 Unintended Impacts of LLM Alignment on Global Representation link Ryan, Michael J,..., Diyi
34 2024-01-11 GroundingGPT: Language Enhanced Multi-modal Grounding Model link Li, Zhaowei,..., Tao
34 2024-01-06 The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language Models link Li, Junyi,..., Ji-Rong
32 2024-02-21 Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning link Yang, Zhaorui,..., Qian
32 2024-05-26 M³CoT: A Novel Benchmark for Multi-Domain Multi-step Multi-modal Chain-of-Thought link Chen, Qiguang,..., Wanxiang
32 2024-02-24 PRP: Propagating Universal Perturbations to Attack Large Language Model Guard-Rails link Mangaokar, Neal,..., Atul
32 2024-05-13 RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors link Dugan, Liam,..., Chris
31 2023-05-24 Harnessing the Power of Large Language Models for Natural Language to First-Order Logic Translation link Yang, Yuan,..., Faramarz
31 2024-02-23 Machine Unlearning of Pre-trained Large Language Models link Yao, Jin,..., Xiang
31 2023-11-09 Agent Lumos: Unified and Modular Training for Open-Source Language Agents link Yin, Da,..., Bill Yuchen
31 2024-02-19 Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic link Bhardwaj, Rishabh,..., Soujanya
30 2024-02-27 Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization link Zhang, Wenqi,..., Weiming
30 2024-02-16 When is Tree Search Useful for LLM Planning? It Depends on the Discriminator link Chen, Ziru,..., Huan
29 None LoRAMoE: Alleviating World Knowledge Forgetting in Large Language Models via MoE-Style Plugin link Dou, Shihan,..., Xuanjing
29 2024-01-16 MMToM-QA: Multimodal Theory of Mind Question Answering link Jin, Chuanyang,..., Tianmin
28 2023-06-20 Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts link Nguyen, Xuan-Phi,..., Lidong
28 2024-02-17 M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection link Wang, Yuxia,..., Preslav
28 2024-02-20 Instruction-tuned Language Models are Better Knowledge Learners link Jiang, Zhengbao,..., Srini
28 2023-11-14 CodeScope: An Execution-based Multilingual Multitask Multidimensional Benchmark for Evaluating LLMs on Code Understanding and Generation link Yan, Weixiang,..., Shuiguang
28 2023-11-16 Think Twice: Perspective-Taking Improves Large Language Models' Theory-of-Mind Capabilities link Wilf, Alex,..., Louis-Philippe
28 2023-12-26 Aligning Large Language Models with Human Preferences through Representation Engineering link Liu, Wenhao,..., Xuanjing
28 2024-03-06 Quantifying Contamination in Evaluating Code Generation Capabilities of Language Models link Riddell, Martin,..., Arman
28 2024-01-22 PsySafe: A Comprehensive Framework for Psychological-based Attack, Defense, and Evaluation of Multi-agent System Safety link Zhang, Zaibin,..., Feng
27 2024-01-10 AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning link Qiao, Shuofei,..., Huajun
27 2023-11-15 Symbol-LLM: Towards Foundational Symbol-centric Interface For Large Language Models link Xu, Fangzhi,..., Jun
26 2023-10-05 InstructProtein: Aligning Human and Protein Language via Knowledge Instruction link Wang, Zeyuan,..., Huajun
26 2023-09-29 Enhancing Large Language Models in Coding Through Multi-Perspective Self-Consistency link Huang, Baizhou,..., Nan
26 2023-10-03 OceanGPT: A Large Language Model for Ocean Science Tasks link Bi, Zhen,..., Huajun
26 2023-11-15 PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning link Zhang, Zhihan,..., Francesco
26 2023-11-10 ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences link Tian, Yuanhe,..., Yongdong
26 2024-03-20 An Entropy-based Text Watermarking Detection Method link Lu, Yijian,..., Irwin
25 2024-03-02 Mitigating Catastrophic Forgetting in Large Language Models with Self-Synthesized Rehearsal link Huang, Jianheng,..., Jinsong
25 2024-01-14 CANDLE: Iterative Conceptualization and Instantiation Distillation from Large Language Models for Commonsense Reasoning link Wang, Weiqi,..., Yangqiu
25 2023-07-03 Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models link Duan, Jinhao,..., Kaidi
25 2023-10-19 Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models link Wang, Wenxuan,..., Michael
25 2024-01-12 The Unreasonable Effectiveness of Easy Training Data for Hard Tasks link Hase, Peter,..., Sarah
24 2024-02-19 What Evidence Do Language Models Find Convincing? link Wan, Alexander,..., Dan
24 2024-02-27 RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations link Huang, Jing,..., Atticus
24 2024-01-12 MAPO: Advancing Multilingual Reasoning through Multilingual-Alignment-as-Preference Optimization link She, Shuaijie,..., Jiajun
24 2024-01-13 Bridging the Preference Gap between Retrievers and LLMs link Ke, Zixuan,..., Michael
24 2024-03-16 DIALECTBENCH: An NLP Benchmark for Dialects, Varieties, and Closely-Related Languages link Faisal, Fahim,..., Antonios
23 2024-02-16 BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation link Du, DaYou,..., Ningyi
23 2024-03-12 KnowCoder: Coding Structured Knowledge into LLMs for Universal Information Extraction link Li, Zixuan,..., Xueqi
23 2024-01-14 MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation link Chen, Jiaqi,..., Kwan-Yee
23 2024-05-31 Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training link Fang, Feiteng,..., Ruifeng
23 2024-03-27 Measuring Political Bias in Large Language Models: What Is Said and How It Is Said link Bang, Yejin,..., Pascale
23 2023-12-07 Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use link Chen, Yuhan,..., Rui
23 2024-03-11 IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages link Khan, Mohammed Safi Ur Rahman,..., Mitesh M.
22 2023-11-07 PrivLM-Bench: A Multi-level Privacy Evaluation Benchmark for Language Models link Li, Haoran,..., Yangqiu
22 2024-07-01 FineSurE: Fine-grained Summarization Evaluation using LLMs link Song, Hwanjun,..., Saab
22 2024-02-06 Training Language Models to Generate Text with Citations via Fine-grained Rewards link Huang, Chengyu,..., Wenya
22 2023-08-31 RepCodec: A Speech Representation Codec for Speech Tokenization link Huang, Zhichao,..., Tom
22 2024-02-22 Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models link Lu, Xudong,..., Hongsheng
22 2024-02-20 Advancing Large Language Models to Capture Varied Speaking Styles and Respond Properly in Spoken Conversations link Lin, Guan-Ting,..., Hung-yi
22 2024-03-05 Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion Attribution link Plaza-del-Arco, Flor Miriam,..., Dirk
22 2023-05-22 Word Embeddings Are Steers for Language Models link Han, Chi,..., Heng
21 2024-01-12 Navigating the Metrics Maze: Reconciling Score Magnitudes and Accuracies link Kocmi, Tom,..., Matt
21 2024-02-16 DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows link Patel, Ajay,..., Chris
21 2023-10-31 FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models link Jiang, Yuxin,..., Wei
21 2024-06-21 Generate-then-Ground in Retrieval-Augmented Generation for Multi-hop Question Answering link Shi, Zhengliang,..., Zhaochun
21 2023-12-23 PokeMQA: Programmable knowledge editing for Multi-hop Question Answering link Gu, Hengrui,..., Xin
21 2024-02-23 ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition link Ye, Lu,..., Yang
21 2024-07-26 AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents link Trivedi, Harsh,..., Niranjan
20 2024-02-10 GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators link Hu, Yuchen,..., EngSiong
20 2023-11-15 Explore Spurious Correlations at the Concept Level in Language Models for Text Classification link Zhou, Yuhang,..., Furong
20 2023-11-15 Exploring the Potential of Large Language Models in Computational Argumentation link Chen, Guizhen,..., Lidong
20 2023-10-10 Exploring Memorization in Fine-tuned Language Models link Zeng, Shenglai,..., Dawei
20 2024-08-06 Synthesizing Text-to-SQL Data from Weak and Strong LLMs link Yang, Jiaxi,..., Chang
20 2023-12-21 T-Eval: Evaluating the Tool Utilization Capability of Large Language Models Step by Step link Chen, Zehui,..., Feng
20 2023-10-28 DetermLR: Augmenting LLM-based Logical Reasoning from Indeterminacy to Determinacy link Sun, Hongda,..., Rui
20 2024-04-25 IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages link Singh, Harman,..., Partha
20 2024-02-16 Multi-modal Preference Alignment Remedies Degradation of Visual Instruction Tuning on Language Models link Li, Shengzhi,..., Shichao
19 2024-02-19 Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs link Tan, Jiejun,..., Ji-Rong
19 2023-11-13 On Measuring Faithfulness or Self-consistency of Natural Language Explanations link Parcalabescu, Letitia,..., Anette
19 2024-02-26 Leveraging Large Language Models for Learning Complex Legal Concepts through Storytelling link Jiang, Hang,..., Jad
19 2024-02-19 Are LLM-based Evaluators Confusing NLG Quality Criteria? link Hu, Xinyu,..., Xiaojun
19 2024-02-19 Artifacts or Abduction: How Do LLMs Answer Multiple-Choice Questions Without the Question? link Balepur, Nishant,..., Rachel
19 2024-02-23 Advancing Parameter Efficiency in Fine-tuning via Representation Editing link Wu, Muling,..., Xuanjing
19 2024-02-19 CausalGym: Benchmarking causal interpretability methods on linguistic tasks link Arora, Aryaman,..., Christopher
19 2024-02-15 Why are Sensitive Functions Hard for Transformers? link Hahn, Michael,..., Mark
18 2024-02-14 Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents link Qian, Cheng,..., Maosong
18 2024-06-06 VISTA: Visualized Text Embedding For Universal Multi-Modal Retrieval link Zhou, Junjie,..., Yongping
18 2024-03-25 Attribute First, then Generate: Locally-attributable Grounded Text Generation link Slobodkin, Aviv,..., Ido
18 2024-02-01 A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains link Jacovi, Alon,..., Mor
18 2023-12-20 WaveCoder: Widespread And Versatile Enhancement For Code Large Language Models By Instruction Tuning link Yu, Zhaojian,..., Qiufeng
18 2024-02-23 KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models link Yu, Zhuohao,..., Shikun
18 2023-06-21 ARIES: A Corpus of Scientific Paper Edits Made in Response to Peer Reviews link D'Arcy, Mike,..., Doug
18 2024-01-22 Revisiting Demonstration Selection Strategies in In-Context Learning link Peng, Keqin,..., Dacheng
18 2024-06-05 BadAgent: Inserting and Activating Backdoor Attacks in LLM Agents link Wang, Yifei,..., Shengsheng
18 2024-03-15 DRAGIN: Dynamic Retrieval Augmented Generation based on the Real-time Information Needs of Large Language Models link Su, Weihang,..., Yiqun
18 2023-11-30 CritiqueLLM: Towards an Informative Critique Generation Model for Evaluation of Large Language Model Generation link Ke, Pei,..., Minlie
18 2024-03-29 Latxa: An Open Language Model and Evaluation Suite for Basque link Etxaniz, Julen,..., Aitor
17 2024-02-25 Citation-Enhanced Generation for LLM-based Chatbots link Li, Weitao,..., Yang
17 2024-02-19 Learning to Edit: Aligning LLMs with Knowledge Editing link Jiang, Yuxin,..., Wei
17 2023-11-26 UHGEval: Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained Generation link Liang, Xun,..., Haiying
17 2023-10-09 MuggleMath: Assessing the Impact of Query and Response Augmentation on Math Reasoning link Li, Chengpeng,..., Chang
17 2024-05-17 Layer-Condensed KV Cache for Efficient Inference of Large Language Models link Wu, Haoyi,..., Kewei
17 2024-04-23 LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models link Parmar, Mihir,..., Chitta
16 2024-02-21 GradSafe: Detecting Jailbreak Prompts for LLMs via Safety-Critical Gradient Analysis link Xie, Yueqi,..., Neil
16 2023-12-20 Time is Encoded in the Weights of Finetuned Language Models link Nylund, Kai,..., Noah
16 2024-02-18 Stumbling Blocks: Stress Testing the Robustness of Machine-Generated Text Detectors Under Attacks link Wang, Yichen,..., Tianxing
16 2024-02-21 Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models link He, Zhiwei,..., Rui
16 2024-03-05 CoGenesis: A Framework Collaborating Large and Small Language Models for Secure Context-Aware Instruction Following link Zhang, Kaiyan,..., Bowen
16 2024-02-14 Spectral Filters, Dark Signals, and Attention Sinks link Cancedda, Nicola
16 2024-02-16 Large Language Models as Zero-shot Dialogue State Tracker through Function Calling link Li, Zekun,..., Paul
16 2024-02-23 On the Multi-turn Instruction Following for Conversational Web Agents link Deng, Yang,..., Tat-Seng
16 2024-06-13 Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning? link Su, Zhaochen,..., Min
16 2023-11-16 Reducing Privacy Risks in Online Self-Disclosures with Language Models link Dou, Yao,..., Wei
16 2024-06-06 Prototypical Reward Network for Data-Efficient RLHF link Zhang, Jinghan,..., Kunpeng
16 2024-06-06 Confabulation: The Surprising Value of Large Language Model Hallucinations link Sui, Peiqi,..., Richard
16 2024-01-12 Mission: Impossible Language Models link Kallini, Julie,..., Christopher
15 2024-02-26 HealMe: Harnessing Cognitive Reframing in Large Language Models for Psychotherapy link Xiao, Mengxi,..., Jimin
15 2024-06-24 UniCoder: Scaling Code Large Language Model via Universal Code link Sun, Tao,..., Zhoujun
15 2024-02-16 ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages link Ye, Junjie,..., Xuanjing
15 2024-02-13 PreFLMR: Scaling Up Fine-Grained Late-Interaction Multi-modal Retrievers link Lin, Weizhe,..., Bill
15 2023-11-14 A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated Texts link Tripto, Nafis Irtiza,..., Dongwon
15 2024-02-18 Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs link Wang, Siyuan,..., Xiang
15 2023-11-16 On the Impact of Calibration Data in Post-training Quantization and Pruning link Williams, Miles,..., Nikolaos
15 2024-04-04 Learning to Plan and Generate Text with Citations link Fierro, Constanza,..., Mirella
15 2024-02-18 LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative Tasks link Wang, Hanqing,..., Maosong
15 2024-03-09 Calibrating Large Language Models Using Their Generations Only link Ulmer, Dennis,..., Seong
14 2023-05-10 ANALOGYKB: Unlocking Analogical Reasoning of Language Models with A Million-scale Knowledge Base link Yuan, Siyu,..., Deqing
14 2024-02-11 Generalizing Conversational Dense Retrieval via LLM-Cognition Data Augmentation link Chen, Haonan,..., Ziliang
14 2024-01-12 INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning link Zhu, Yutao,..., Zhicheng
14 None Llama2Vec: Unsupervised Adaptation of Large Language Models for Dense Retrieval link Liu, Zheng,..., Defu
14 2024-05-21 ProtT3: Protein-to-Text Generation for Text-based Protein Understanding link Liu, Zhiyuan,..., Tat-Seng
14 2024-02-18 Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals link Ortu, Francesco,..., Bernhard
14 2024-02-08 OpenToM: A Comprehensive Benchmark for Evaluating Theory-of-Mind Reasoning Capabilities of Large Language Models link Xu, Hainiu,..., Yulan
14 2024-02-14 Towards Privacy-Aware Sign Language Translation at Scale link Rust, Phillip,..., Jean
14 2024-03-11 ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis link Liu, Yanming,..., Xuhong
14 2024-06-05 Text-like Encoding of Collaborative Information in Large Language Models for Recommendation link Zhang, Yang,..., Xiangnan
14 2024-02-20 OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification link Peng, Yifan,..., Shinji
14 2024-02-16 Generative Cross-Modal Retrieval: Memorizing Images in Multimodal Language Models for Retrieval and Beyond link Li, Yongqi,..., Tat-Seng
14 2023-05-22 Iterative Forward Tuning Boosts In-Context Learning in Language Models link Yang, Jiaxi,..., Yongbin
13 2024-05-31 Open Ko-LLM Leaderboard: Evaluating Large Language Models in Korean with Ko-H5 Benchmark link Park, Chanjun,..., Hwalsuk
13 2024-02-01 What Does the Bot Say? Opportunities and Risks of Large Language Models in Social Media Bot Detection link Feng, Shangbin,..., Yulia
13 2024-01-12 AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters link Lucy, Li,..., Jesse
13 2024-06-12 Multimodal Table Understanding link Zheng, Mingyu,..., Weiping
13 2024-01-15 MM-SAP: A Comprehensive Benchmark for Assessing Self-Awareness of Multimodal Large Language Models in Perception link Wang, Yuhao,..., Yu
13 2023-10-13 Improving Large Language Models in Event Relation Logical Prediction link Chen, Meiqi,..., Dongsheng
13 2024-03-04 To Generate or to Retrieve? On the Effectiveness of Artificial Contexts for Medical Open-Domain Question Answering link Frisoni, Giacomo,..., Zaiqiao
13 2024-02-19 Revisiting Knowledge Distillation for Autoregressive Language Models link Zhong, Qihuang,..., Dacheng
13 2024-01-12 Don't Rank, Combine! Combining Machine Translation Hypotheses Using Quality Estimation link Vernikos, Giorgos,..., Andrei
13 2024-02-23 API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs link Basu, Kinjal,..., Luis
13 2023-11-15 Never Lost in the Middle: Mastering Long-Context Question Answering with Position-Agnostic Decompositional Training link He, Junqing,..., Jiaxing
13 2024-04-09 Cendol: Open Instruction-tuned Generative Large Language Models for Indonesian Languages link Cahyawijaya, Samuel,..., Pascale
12 2024-02-16 AbsInstruct: Eliciting Abstraction Ability from LLMs through Explanation Tuning with Plausibility Estimation link Wang, Zhaowei,..., Simon
12 2024-02-18 Benchmarking Knowledge Boundary for Large Language Models: A Different Perspective on Model Evaluation link Yin, Xunjian,..., Xiaojun
12 2024-04-15 Is Table Retrieval a Solved Problem? Exploring Join-Aware Multi-Table Retrieval link Chen, Peter Baile,..., Dan
12 2024-02-24 PRoLoRA: Partial Rotation Empowers More Parameter-Efficient LoRA link Wang, Sheng,..., Chuan
12 2024-02-08 TimeArena: Shaping Efficient Multitasking Language Agents in a Time-Aware Simulation link Zhang, Yikai,..., Jiangjie
12 2023-03-06 XCodeEval: An Execution-based Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval link Khan, Mohammad Abdullah Matin,..., Shafiq
12 2024-01-26 ProxyQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models link Tan, Haochen,..., Linqi
12 2023-12-04 A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia link Monea, Giovanni,..., Robert
12 2024-02-21 Analysing The Impact of Sequence Composition on Language Model Pre-Training link Zhao, Yu,..., Pasquale
12 2024-08-07 NACL: A General and Effective KV Cache Eviction Framework for LLM at Inference Time link Chen, Yilong,..., Hua
12 2023-11-19 Towards Real-World Writing Assistance: A Chinese Character Checking Benchmark with Faked and Misspelled Characters link Li, Yinghui,..., Ying
12 2024-02-19 Direct Large Language Model Alignment Through Self-Rewarding Contrastive Prompt Distillation link Liu, Aiwei,..., Lijie
12 2024-06-04 Multimodal Reasoning with Multimodal Knowledge Graph link Lee, Junlin,..., Min
12 2024-03-12 Complex Reasoning over Logical Queries on Commonsense Knowledge Graphs link Fang, Tianqing,..., Antoine
12 2024-02-16 Exploring Precision and Recall to assess the quality and
diversity of LLMs
link Le Bronnec, Florian,..., Alexandre
12 2024-07-25 Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning link Wang, Tianduo,..., Wei
12 2024-06-20 On the Representational Capacity of Neural Language Models with
Chain-of-Thought Reasoning
link Nowak, Franz,..., Ryan
12 2024-02-20 Can Large Language Models be Good Emotional Supporter? Mitigating
Preference Bias on Emotional Support Conversation
link Kang, Dongjin,..., Jinyoung
12 2024-02-19 Emulated Disalignment: Safety Alignment for Large Language Models May
Backfire!
link Zhou, Zhanhui,..., Yu
12 2023-05-09 COKE: A Cognitive Knowledge Graph for Machine Theory of
Mind
link Wu, Jincenzi,..., Minlie
11 2024-02-24 Multimodal Instruction Tuning with Conditional Mixture of LoRA link Shen, Ying,..., Lifu
11 2024-02-06 Tuning Large Multimodal Models for Videos using Reinforcement Learning
from AI Feedback
link Ahn, Daechul,..., Jonghyun
11 2024-02-15 Grounding Language Model with Chunking-Free In-Context Retrieval link Qian, Hongjin,..., Zhicheng
11 2024-07-01 IBSEN: Director-Actor Agent Collaboration for Controllable and Interactive Drama
Script Generation
link Han, Senyu,..., Kai
11 2024-05-26 M-RAG: Reinforcing Large Language Model Performance through Retrieval-Augmented Generation
with Multiple Partitions
link Wang, Zheng,..., Wei
11 2023-11-16 RLHFPoison: Reward Poisoning Attack for Reinforcement Learning with Human
Feedback in Large Language Models
link Wang, Jiongxiao,..., Chaowei
11 2024-06-10 FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning
Using a Large Multimodal Model
link Lee, Yebin,..., Myungjoo
11 2023-11-15 Temporal Knowledge Question Answering via Abstract Reasoning Induction link Chen, Ziyang,..., Min
11 None Jailbreak Open-Sourced Large Language Models via Enforced Decoding link Zhang, Hangfan,..., Dinghao
11 2024-06-12 TasTe: Teaching Large Language Models to Translate through Self-Reflection link Wang, Yutong,..., Min
11 2024-01-09 Narrowing the Knowledge Evaluation Gap: Open-Domain Question Answering with
Multi-Granularity Answers
link Yona, Gal,..., Mor
11 2024-01-19 LangBridge: Multilingual Reasoning Without Multilingual Supervision link Yoon, Dongkeun,..., Minjoon
11 2024-03-15 EXAMS-V: A Multi-Discipline Multilingual Multimodal Exam Benchmark for Evaluating
Vision Language Models
link Das, Rocktim,..., Preslav
11 2024-01-22 Text Embedding Inversion Security for Multilingual Language Models link Chen, Yiyi,..., Johannes
11 2023-04-05 Efficient OCR for Building a Diverse Digital History link Carlson, Jacob,..., Melissa
11 2024-03-08 Aligning Large Language Models for Controllable Recommendations link Lu, Wensheng,..., Xing
11 2024-03-19 Bypassing LLM Watermarks with Color-Aware Substitutions link Wu, Qilong,..., Varun
11 2024-02-28 ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training link Zhuo, Le,..., Wentao
11 2023-10-07 Chat Vector: A Simple Approach to Equip LLMs with
Instruction Following and Model Alignment in New Languages
link Huang, Shih-Cheng,..., Hung-yi
11 2023-09-16 Cross-Lingual Knowledge Editing in Large Language Models link Wang, Jiaan,..., Fandong
11 2024-02-18 Don't Go To Extremes: Revealing the Excessive Sensitivity and
Calibration Limitations of LLMs in Implicit Hate Speech Detection
link Zhang, Min,..., Chang-Tien
11 2024-01-12 ViSAGe: A Global-Scale Analysis of Visual Stereotypes in Text-to-Image
Generation
link Jha, Akshita,..., Sunipa
11 2024-06-05 Analyzing LLM Behavior in Dialogue Summarization: Unveiling Circumstantial Hallucination
Trends
link Ramprasad, Sanjana,..., Zachary
11 2024-02-19 PsychoGAT: A Novel Psychological Measurement Paradigm through Interactive Fiction
Games with LLM Agents
link Yang, Qisen,..., Gao
11 2024-03-06 IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code
Generators
link Paul, Indraneil,..., Iryna
11 2024-03-15 MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual
Language Modeling
link Limisiewicz, Tomasz,..., Luke
11 2023-11-16 DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding
Long and Specialized Documents
link Zhao, Yilun,..., Arman
10 None AoE: Angle-optimized Embeddings for Semantic Textual Similarity link Li, Xianming,..., Jing
10 2023-03-28 When Good and Reproducible Results are a Giant with
Feet of Clay: The Importance of Software Quality in NLP
link Papi, Sara,..., Matteo
10 2024-06-08 Planning Like Human: A Dual-process Framework for Dialogue Planning link He, Tao,..., Bing
10 2024-02-18 Stealthy Attack on Large Language Model based Recommendation link Zhang, Jinghao,..., Liang
10 2024-02-22 Unveiling Linguistic Regions in Large Language Models link Zhang, Zhihao,..., Xuanjing
10 2023-11-16 Where Do People Tell Stories Online? Story Detection Across
Online Communities
link Antoniak, Maria,..., Andrew
10 2024-02-19 Parallel Structures in Pre-training Data Yield In-Context Learning link Chen, Yanda,..., He
10 2023-12-13 Fine-Grained Image-Text Alignment in Medical Imaging Enables Explainable Cyclic
Image-Report Generation
link Chen, Wenting,..., Yixuan
10 2024-03-09 Diffusion Lens: Interpreting Text Encoders in Text-to-Image Pipelines link Toker, Michael,..., Yonatan
10 2024-04-25 Examining the robustness of LLM evaluation to the distributional
assumptions of benchmarks
link Siska, Charlotte,..., James
10 None Fundamental Capabilities of Large Language Models and their Applications
in Domain Scenarios: A Survey
link Li, Jiawei,..., Heyan
10 2024-07-02 Integrate the Essence and Eliminate the Dross: Fine-Grained Self-Consistency
for Free-Form Language Generation
link Wang, Xinglin,..., Kan
10 2024-06-04 Retaining Key Information under High Compression Ratios: Query-Guided Compressor
for LLMs
link Cao, Zhiwei,..., Jinsong
10 2024-03-01 Peacock: A Family of Arabic Multimodal Large Language Models
and Benchmarks
link Alwajih, Fakhraddin,..., Muhammad
10 2024-06-10 HOLMES: Hyper-Relational Knowledge Graphs for Multi-hop Question Answering using
LLMs
link Panda, Pranoy,..., Prathosh
10 2024-05-21 G-DIG: Towards Gradient-based DIverse and hiGh-quality Instruction Data Selection
for Machine Translation
link Pan, Xingyuan,..., Shanbo
10 2024-01-25 RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models
via Romanization
link J, Jaavid,..., Anoop
10 2023-12-31 BatchEval: Towards Human-like Text Evaluation link Yuan, Peiwen,..., Kan
9 2024-03-05 OPEx: A Component-Wise Analysis of LLM-Centric Agents in Embodied
Instruction Following
link Shi, Haochen,..., Bang
9 2024-02-16 AFaCTA: Assisting the Annotation of Factual Claim Detection with
Reliable LLM Annotators
link Ni, Jingwei,..., Markus
9 2024-02-24 ListT5: Listwise Reranking with Fusion-in-Decoder Improves Zero-shot Retrieval link Yoon, Soyoung,..., Seung-won
9 2023-10-16 On Context Utilization in Summarization with Large Language Models link Ravaut, Mathieu,..., Shafiq
9 2024-01-31 Navigating the OverKill in Large Language Models link Shi, Chenyu,..., Dahua
9 2024-02-18 Multi-Task Inference: Can Large Language Models Follow Multiple Instructions
at Once?
link Son, Guijin,..., Seungone
9 2024-02-27 Benchmarking Data Science Agents link Zhang, Yuge,..., Kan
9 2024-02-19 EmoBench: Evaluating the Emotional Intelligence of Large Language Models link Sabour, Sahand,..., Minlie
9 2024-01-22 Blinded by Generated Contexts: How Language Models Merge Generated
and Retrieved Contexts When Knowledge Conflicts?
link Tan, Hexiang,..., Xueqi
9 2023-12-05 Prompt Optimization via Adversarial In-Context Learning link Long, Xuan Do,..., Junxian
9 2024-02-24 HD-Eval: Aligning Large Language Model Evaluators Through Hierarchical Criteria
Decomposition
link Liu, Yuxuan,..., Qi
9 2024-05-30 Dataflow-Guided Retrieval Augmentation for Repository-Level Code Completion link Cheng, Wei,..., Wei
9 2024-02-18 FactPICO: Factuality Evaluation for Plain Language Summarization of Medical
Evidence
link Joseph, Sebastian,..., Junyi Jessy
9 2024-01-09 MERA: A Comprehensive LLM Evaluation in Russian link Fenogenova, Alena,..., Sergey
9 2024-02-28 Meta-Task Prompting Elicits Embeddings from Large Language Models link Lei, Yibin,..., Andrew
9 2024-02-20 HyperMoE: Towards Better Mixture of Experts via Transferring Among
Experts
link Zhao, Hao,..., Jie
9 2024-01-16 SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning
of Large Language Models
link Zhao, Weixiang,..., Wanxiang
9 2024-05-27 DoRA: Enhancing Parameter-Efficient Fine-Tuning with Dynamic Rank Distribution link Mao, Yulong,..., Jinan
9 2024-06-30 Investigating and Mitigating the Multimodal Hallucination Snowballing in Large
Vision-Language Models
link Zhong, Weihong,..., Bing
9 2024-06-04 mCoT: Multilingual Instruction Tuning for Reasoning Consistency in Language
Models
link Lai, Huiyuan,..., Malvina
9 2023-11-16 FinanceMATH: Knowledge-Intensive Math Reasoning in Finance Domains link Zhao, Yilun,..., Arman
9 2024-01-12 Effects of diversity incentives on sample diversity and downstream
model performance in LLM-based text augmentation
link Cegin, Jan,..., Peter
9 2024-02-19 Speech Translation with Speech Foundation Models and Large Language
Models: What is There and What is Missing?
link Gaido, Marco,..., Luisa
9 2023-12-13 Learn or Recall? Revisiting Incremental Learning with Pre-trained Language
Models
link Zheng, Junhao,..., Qianli
9 2023-10-30 M4LE: A Multi-Ability Multi-Range Multi-Task Multi-Domain Long-Context Evaluation Benchmark
for Large Language Models
link Kwan, Wai-Chung,..., Kam-Fai
9 2024-02-23 ToMBench: Benchmarking Theory of Mind in Large Language Models link Chen, Zhuang,..., Minlie
8 2024-02-28 Unsupervised Information Refinement Training of Large Language Models for
Retrieval-Augmented Generation
link Xu, Shicheng,..., Jie
8 2023-10-08 MinPrompt: Graph-based Minimal Prompt Data Augmentation for Few-shot Question
Answering
link Chen, Xiusi,..., Wei
8 2024-02-15 SportsMetrics: Blending Text and Numerical Data to Understand Information
Fusion in LLMs
link Hu, Yebowen,..., Fei
8 2023-10-02 Probing the Multi-turn Planning Capabilities of LLMs via 20
Question Games
link Zhang, Yizhe,..., Navdeep
8 2024-06-11 A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Any Translation link Ma, Zhengrui,..., Min
8 2024-06-03 Probing Language Models for Pre-training Data Detection link Liu, Zhenhua,..., Wenliang
8 2024-06-04 Analyzing Temporal Complex Events with Large Language Models? A
Benchmark towards Temporal, Long Context Understanding
link Zhang, Zhihan,..., Tat-Seng
8 None Reasoning in Flux: Enhancing Large Language Models Reasoning through
Uncertainty-aware Adaptive Guidance
link Yin, Zhangyue,..., Xipeng
8 2024-05-21 SirLLM: Streaming Infinite Retentive LLM link Yao, Yao,..., Hai
8 2024-03-09 ItD: Large Language Models Can Teach Themselves Induction through
Deduction
link Sun, Wangtao,..., Kang
8 None Rethinking Task-Oriented Dialogue Systems: From Complex Modularity to Zero-Shot
Autonomous Agent
link Xu, Heng-Da,..., Heyan
8 2024-03-18 Metaphor Understanding Challenge Dataset for LLMs link Tong, Xiaoyu,..., Ekaterina
8 2024-09-22 SAC-KG: Exploiting Large Language Models as Skilled Automatic Constructors
for Domain Knowledge Graph
link Chen, Hanzhu,..., Jieping
8 2024-06-19 Factual Confidence of LLMs: on Reliability and Robustness of
Current Estimators
link Mahaut, Matéo,..., Lluis
8 2024-02-14 DolphCoder: Echo-Locating Code Large Language Models with Diverse and
Multi-Objective Instruction Tuning
link Wang, Yejie,..., Xunliang
8 2024-03-18 QueryAgent: A Reliable and Efficient Reasoning Framework with Environmental
Feedback based Self-Correction
link Huang, Xiang,..., Yuzhong
8 2024-02-26 What Do Language Models Hear? Probing for Auditory Representations
in Language Models
link Ngo, Jerry,..., Yoon
8 2024-06-07 A Deep Dive into the Trade-Offs of Parameter-Efficient Preference
Alignment Techniques
link Thakkar, Megh,..., Sarath
8 2024-03-04 Masked Thought: Simply Masking Partial Reasoning Steps Can Improve
Mathematical Reasoning Learning of Language Models
link Chen, Changyu,..., Yongbin
8 2024-05-16 FinTextQA: A Dataset for Long-form Financial Question Answering link Chen, Jian,..., Junwei
8 2024-05-30 The Fine-Tuning Paradox: Boosting Translation Quality Without Sacrificing LLM
Abilities
link Stap, David,..., Ke
8 2024-02-19 MARS: Meaning-Aware Response Scoring for Uncertainty Estimation in Generative
LLMs
link Bakman, Yavuz Faruk,..., Salman
8 2024-05-28 Long Context is Not Long at All: A Prospector
of Long-Dependency Data for Large Language Models
link Chen, Longze,..., Min
8 2024-03-06 A Modular Approach for Multimodal Summarization of TV Shows link Mahon, Louis,..., Mirella
8 2024-05-17 Enhancing Dialogue State Tracking Models through LLM-backed User-Agents Simulation link Niu, Cheng,..., Tong
8 2024-03-28 NaijaHate: Evaluating Hate Speech Detection on Nigerian Twitter Using
Representative Data
link Tonneau, Manuel,..., Samuel
8 2024-02-28 Focus on Your Question! Interpreting and Mitigating Toxic CoT
Problems in Commonsense Reasoning
link Li, Jiachun,..., Jun
8 2024-02-23 Interactive-KBQA: Multi-Turn Interactions for Knowledge Base Question Answering with
Large Language Models
link Xiong, Guanming,..., Wen
8 2024-02-17 Aligning Large Language Models by On-Policy Self-Judgment link Lee, Sangkyu,..., Youngjae
8 2023-11-11 LLMs Learn Task Heuristics from Demonstrations: A Heuristic-Driven Prompting
Strategy for Document-Level Event Argument Extraction
link Zhou, Hanzhang,..., Kezhi
8 None TaPERA: Enhancing Faithfulness and Interpretability in Long-Form Table QA
by Content Planning and Execution-based Reasoning
link Zhao, Yilun,..., Chen
8 2024-03-21 XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech
Perception
link Han, HyoJung,..., Changhan
8 2024-04-06 Context versus Prior Knowledge in Language Models link Du, Kevin,..., Ryan
8 2024-02-16 Navigating the Dual Facets: A Comprehensive Evaluation of Sequential
Memory Editing in Large Language Models
link Lin, Zihao,..., Lifu
8 2024-05-24 GPT is Not an Annotator: The Necessity of Human
Annotation in Fairness Benchmark Construction
link Felkner, Virginia,..., Jonathan
8 2024-06-06 What Languages are Easy to Language-Model? A Perspective from
Learning Probabilistic Regular Languages
link Borenstein, Nadav,..., Ryan
7 2022-11-16 CSCD-NS: a Chinese Spelling Check Dataset for Native Speakers link Hu, Yong,..., Jie
7 2023-10-05 Expedited Training of Visual Conditioned Language Generation via Redundancy
Reduction
link Jian, Yiren,..., Hongxia
7 2023-12-20 Retrieval-Augmented Multilingual Knowledge Editing link Wang, Weixuan,..., Alexandra
7 2024-02-15 Answer is All You Need: Instruction-following Text Embedding via
Answering the Question
link Peng, Letian,..., Jingbo
7 2024-02-19 IMBUE: Improving Interpersonal Effectiveness through Simulation and Just-in-time Feedback
with Human-Language Model Interaction
link Lin, Inna,..., Tim
7 2024-05-28 ConSiDERS-The-Human Evaluation Framework: Rethinking Human Evaluation for Generative Large
Language Models
link Elangovan, Aparna,..., Dan
7 2023-11-29 TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in
Large Language Models
link Chu, Zheng,..., Bing
7 2024-01-09 Rewriting the Code: A Simple Method for Large Language
Model Augmented Code Search
link Li, Haochen,..., Zhiqi
7 2024-07-07 Multimodal Prompt Learning with Missing Modalities for Sentiment Analysis
and Emotion Recognition
link Guo, Zirun,..., Zhou
7 2024-03-04 VariErr NLI: Separating Annotation Error from Human Label Variation link Weber-Genzel, Leon,..., Barbara
7 2024-02-20 Modality-Aware Integration with Large Language Models for Knowledge-Based Visual
Question Answering
link Dong, Junnan,..., Xiao
7 2024-08-06 Making Long-Context Language Models Better Multi-Hop Reasoners link Li, Yanyang,..., Liwei
7 2024-03-05 Improving Event Definition Following For Zero-Shot Event Detection link Cai, Zefan,..., Nanyun
7 2024-02-20 Interpreting Conversational Dense Retrieval by Rewriting-Enhanced Inversion of Session
Embedding
link Cheng, Yiruo,..., Zhicheng
7 2023-12-27 Prompt Expansion for Adaptive Text-to-Image Generation link Datta, Siddhartha,..., Peter
7 2024-06-25 MPCoder: Multi-user Personalized Code Generator with Explicit and Implicit
Style Representation Learning
link Dai, Zhenlong,..., Jingyuan
7 2023-11-14 Forgetting before Learning: Utilizing Parametric Arithmetic for Knowledge Updating
in Large Language Models
link Ni, Shiwen,..., Min
7 2024-02-20 The Hidden Space of Transformer Language Adapters link Alabi, Jesujoba,..., Mor
7 2024-07-09 Automated Justification Production for Claim Veracity in Fact Checking:
A Survey on Architectures and Approaches
link Eldifrawi, Islam,..., Amine
7 2023-08-29 SwapMoE: Serving Off-the-shelf MoE-based Large Language Models with Tunable
Memory Budget
link Kong, Rui,..., Yunxin
7 None Chain-of-Exemplar: Enhancing Distractor Generation for Multimodal Educational Question Generation link Luo, Haohao,..., Tat-Seng
7 2024-02-20 Comparing Inferential Strategies of Humans and Large Language Models
in Deductive Reasoning
link Mondorf, Philipp,..., Barbara
7 2024-05-01 CofiPara: A Coarse-to-fine Paradigm for Multimodal Sarcasm Target Identification
with Large Multimodal Models
link Chen, Zixin,..., Guang
7 2023-11-14 Predicting Text Preference Via Structured Comparative Reasoning link Yan, Jing Nathan,..., Michael
7 2023-11-29 CLOMO: Counterfactual Logical Modification with Large Language Models link Huang, Yinya,..., Linqi
7 2024-02-19 Enhancing Multilingual Capabilities of Large Language Models through Self-Distillation
from Resource-Rich Languages
link Zhang, Yuanchi,..., Yang
7 2024-07-07 IL-TUR: Benchmark for Indian Legal Text Understanding and Reasoning link Joshi, Abhinav,..., Ashutosh
7 2024-05-16 Generating Coherent Sequences of Visual Illustrations for Real-World Manual
Tasks
link Bordalo, João,..., Joao
7 2024-02-26 LLMArena: Assessing Capabilities of Large Language Models in Dynamic
Multi-Agent Environments
link Chen, Junzhe,..., Lijie
7 2023-08-25 Chunk, Align, Select: A Simple Long-sequence Processing Method for
Transformers
link Xie, Jiawen,..., Nan
7 2024-02-14 MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot
Text-to-Speech
link Ji, Shengpeng,..., Zhou
7 2023-11-15 Disinformation Capabilities of Large Language Models link Vykopal, Ivan,..., Maria
7 2024-06-13 ECBD: Evidence-Centered Benchmark Design for NLP link Liu, Yu Lu,..., Ziang
6 2024-08-19 TaSL: Continual Dialog State Tracking via Task Skill Localization
and Consolidation
link Feng, Yujie,..., Xiao-Ming
6 2023-10-13 EasyGen: Easing Multimodal Generation with BiDiffuser and LLMs link Zhao, Xiangyu,..., Xiao-Ming
6 2024-05-28 Detection-Correction Structure via General Language Model for Grammatical Error
Correction
link Li, Wei,..., Houfeng
6 2024-02-19 A synthetic data approach for domain generalization of NLI
models
link Hosseini, Mohammad Javad,..., Annie
6 2024-03-05 Eliciting Better Multilingual Structured Reasoning from LLMs through Code link Li, Bryan,..., Saab
6 2024-02-20 GumbelSoft: Diversified Language Model Watermarking via the GumbelMax-trick link Fu, Jiayi,..., Yanghua
6 2024-04-14 Text-to-Song: Towards Controllable Music Generation Incorporating Vocal and Accompaniment link Hong, Zhiqing,..., Zhimeng
6 2024-01-26 Unlearning Traces the Influential Training Data of Language Models link Isonuma, Masaru,..., Ivan
6 2024-01-29 Muffin or Chihuahua? Challenging Multimodal Large Language Models with
Multipanel VQA
link Fan, Yue,..., Xin
6 2024-06-18 An Investigation of Neuron Activation as a Unified Lens
to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs
link Rai, Daking,..., Ziyu
6 2024-06-08 MemeGuard: An LLM and VLM-based Framework for Advancing Content
Moderation via Meme Intervention
link Jha, Prince,..., Pushpak
6 2024-03-05 InterrogateLLM: Zero-Resource Hallucination Detection in LLM-Generated Answers link Yehuda, Yakir,..., Noam
6 2023-10-03 Dodo: Dynamic Contextual Compression for Decoder-only LMs link Qin, Guanghui,..., Benjamin
6 2023-06-28 Pareto Optimal Learning for Estimating Large Language Model Errors link Zhao, Theodore,..., Hoifung
6 2024-03-07 LLMs in the Imaginarium: Tool Learning through Simulated Trial
and Error
link Wang, Boshi,..., Yu
6 2024-05-20 CLAMBER: A Benchmark of Identifying and Clarifying Ambiguous Information
Needs in Large Language Models
link Zhang, Tong,..., Tat-Seng
6 None Self-chats from Large Language Models Make Small Emotional Support
Chatbot Better
link Zheng, Zhonghua,..., Liqiang
6 2024-03-25 An Expert is Worth One Token: Synergizing Multiple Expert
LLMs as Generalist via Expert Token Routing
link Chai, Ziwei,..., Yang
6 2024-01-18 Beyond Traditional Benchmarks: Analyzing Behaviors of Open LLMs on
Data-to-Text Generation
link Kasner, Zdeněk,..., Ondrej
6 2024-02-18 One Prompt To Rule Them All: LLMs for Opinion
Summary Evaluation
link Siledar, Tejpalsingh,..., Nikesh
6 2024-07-31 Tree-of-Traversals: A Zero-Shot Reasoning Algorithm for Augmenting Black-box Language
Models with Knowledge Graphs
link Markowitz, Elan,..., Aram
6 2024-06-06 What Do Language Models Learn in Context? The Structured
Task Hypothesis.
link Li, Jiaoda,..., Ryan
6 2024-07-31 Maverick: Efficient and Accurate Coreference Resolution Defying Recent Trends link Martinelli, Giuliano,..., Roberto
6 2023-11-13 Explanation-aware Soft Ensemble Empowers Large Language Model In-context Learning link Yu, Yue,..., Michael
6 2023-10-10 Quality-Aware Translation Models: Efficient Generation and Quality Estimation in
a Single Model
link Tomani, Christian,..., Daniel
6 2024-06-12 Let's Go Real Talk: Spoken Dialogue Model for Face-to-Face
Conversation
link Park, Se,..., Yong
5 2024-05-20 A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI:
The First Romanian Natural Language Inference Corpus
link Poesina, Eduard,..., Radu
5 2024-04-10 Learn from Failure: Fine-tuning LLMs with Trial-and-Error Data for
Intuitionistic Propositional Logic Proving
link An, Chenyang,..., Jingbo
5 2024-06-05 Interactive Text-to-Image Retrieval with Large Language Models: A Plug-and-Play
Approach
link Lee, Saehyung,..., Sungroh
5 2023-11-13 WaterBench: Towards Holistic Evaluation of Watermarks for Large Language
Models
link Tu, Shangqing,..., Juanzi
5 None Evaluating Intention Detection Capability of Large Language Models in
Persuasive Dialogues
link Sakurai, Hiromasa,..., Yusuke
5 2024-01-15 Selene: Pioneering Automated Proof in Software Verification link Zhang, Lichen,..., Nan
5 None REANO: Optimising Retrieval-Augmented Reader Models through Knowledge Graph Generation link Fang, Jinyuan,..., Craig
5 2024-06-09 MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story
Generation
link Ma, Yan,..., Pengfei
5 2023-05-22 CopyNE: Better Contextual ASR by Copying Named Entities link Zhou, Shilin,..., Baoxing
5 2024-03-12 Beyond Memorization: The Challenge of Random Memory Access in
Language Models
link Zhu, Tongyao,..., Min
5 2023-12-25 Instruction Fusion: Advancing Prompt Evolution through Hybridization link Guo, Weidong,..., Di
5 2024-02-10 Instruct Once, Chat Consistently in Multiple Rounds: An Efficient
Tuning Framework for Dialogue
link Wang, Jian,..., Xiaoyong
5 2024-01-29 InfoLossQA: Characterizing and Recovering Information Loss in Text Simplification link Trienes, Jan,..., Junyi Jessy
5 2023-05-23 DAPR: A Benchmark on Document-Aware Passage Retrieval link Wang, Kexin,..., Iryna
5 2024-06-03 Strengthened Symbol Binding Makes Large Language Models Reliable Multiple-Choice
Selectors
link Xue, Mengge,..., Chengguo
5 2024-04-29 Analyzing Semantic Change through Lexical Replacements link Periti, Francesco,..., Nina
5 2023-06-14 Babel-ImageNet: Massively Multilingual Evaluation of Vision-and-Language Representations link Geigle, Gregor,..., Goran
5 None Soft Knowledge Prompt: Help External Knowledge Become a Better
Teacher to Instruct LLM in Knowledge-based VQA
link Wang, Qunbo,..., Jing
5 2023-11-16 PixT3: Pixel-based Table-To-Text Generation link Alonso, Iñigo,..., Mirella
5 2024-02-27 Enhancing EEG-to-Text Decoding through Transferable Representations from Pre-trained Contrastive
EEG-Text Masked Autoencoder
link Wang, Jiaqi,..., Zhiguo
5 2024-01-19 StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice
Conversion
link Wang, Zhichao,..., Yuping
5 2024-02-23 Unlocking the Power of Large Language Models for Entity
Alignment
link Jiang, Xuhui,..., Yuanzhuo
5 None Conundrums in Cross-Prompt Automated Essay Scoring: Making Sense of
the State of the Art
link Li, Shengjie,..., Vincent
5 2024-01-20 STICKERCONV: Generating Multimodal Empathetic Responses from Scratch link Zhang, Yiqun,..., Kaisong
5 2023-12-12 Safety Alignment in NLP Tasks: Weakly Aligned Summarization as
an In-Context Attack
link Fu, Yu,..., Yue
5 2023-11-08 Speech language models lack important brain-relevant semantics link Oota, Subba Reddy,..., Mariya
5 2024-01-10 I am a Strange Dataset: Metalinguistic Tests for Language
Models
link Thrush, Tristan,..., Douwe
5 2023-11-16 Mitigating Biases for Instruction-following Language Models via Bias Neurons
Elimination
link Yang, Nakyeong,..., Kyomin
5 2023-11-15 Few-shot Transfer Learning for Knowledge Base Question Answering: Fusing
Supervised Models with In-Context Learning
link Patidar, Mayur,..., Indrajit
5 2024-05-22 Synchronized Video Storytelling: Generating Video Narrations with Structured Storyline link Yang, Dingyi,..., Qin
5 2023-05-12 Synergistic Interplay between Search and Large Language Models for
Information Retrieval
link Feng, Jiazhan,..., Daxin
5 2023-10-11 Parrot: Enhancing Multi-Turn Instruction Following for Large Language Models link Sun, Yuchong,..., Kun
5 2024-05-16 Robust Singing Voice Transcription Serves Synthesis link Li, Ruiqi,..., Zhou
5 2024-06-04 Self-Modifying State Modeling for Simultaneous Machine Translation link Yu, Donglei,..., Chengqing
5 2024-02-21 CODIS: Benchmarking Context-dependent Visual Comprehension for Multimodal Large Language
Models
link Luo, Fuwen,..., Yang
5 None Make-A-Voice: Revisiting Voice Large Language Models as Scalable Multilingual
and Multitask Learners
link Huang, Rongjie,..., Dong
5 2025-04-03 Hide and Seek in Noise Labels: Noise-Robust Collaborative Active
Learning with LLMs-Powered Assistance
link Yuan, Bo,..., Wei
5 2024-03-21 Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to
Reasoning-Memorization Correlations
link Sun, Jiaxing,..., Conghui
5 2024-02-28 Small But Funny: A Feedback-Driven Approach to Humor Distillation link Ravi, Sahithya,..., Arash
5 2024-06-16 ESCoT: Towards Interpretable Emotional Support Dialogue Systems link Zhang, Tenggan,..., Qin
5 None REFINESUMM: Self-Refining MLLM for Generating a Multimodal Summarization Dataset link Patil, Vaidehi,..., Markus
5 None DeCoT: Debiasing Chain-of-Thought for Knowledge-Intensive Tasks in Large Language
Models via Causal Intervention
link Wu, Junda,..., Julian
5 2023-10-21 MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via
Visual Module Plugin
link Zhou, Tianshuo,..., Ge
5 2024-06-02 Deciphering Oracle Bone Language with Diffusion Models link Guan, Haisu,..., Yuliang
5 None EZ-STANCE: A Large Dataset for English Zero-Shot Stance Detection link Zhao, Chenye,..., Cornelia
4 2024-02-11 Through the Lens of Split Vote: Exploring Disagreement, Difficulty
and Calibration in Legal Case Outcome Classification
link Xu, Shanshan,..., Matthias
4 2024-05-25 Confidence Under the Hood: An Investigation into the Confidence-Probability
Alignment in Large Language Models
link Kumar, Abhishek,..., Ali
4 None DocLens: Multi-aspect Fine-grained Medical Text Evaluation link Xie, Yiqing,..., Carolyn
4 2024-06-06 ABEX: Data Augmentation for Low-Resource NLU via Expanding Abstract
Descriptions
link Ghosh, Sreyan,..., Dinesh
4 2024-06-03 Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer link Zhu, Yongxin,..., Dong
4 2024-02-17 Dissecting Human and LLM Preferences link Li, Junlong,..., Pengfei
4 2024-03-14 TaxoLLaMA: WordNet-based Model for Solving Multiple Lexical Semantic Tasks link Moskvoretskii, Viktor,..., Irina
4 2024-05-21 Unlocking Data-free Low-bit Quantization with Matrix Decomposition for KV
Cache Compression
link Liu, Peiyu,..., Ji-Rong
4 2024-03-13 Generative Pretrained Structured Transformers: Unsupervised Syntactic Language Models at
Scale
link Hu, Xiang,..., Kewei
4 2024-05-23 ChronosLex: Time-aware Incremental Training for Temporal Generalization of Legal
Classification Tasks
link T.y.s.s, Santosh,..., Matthias
4 2024-05-16 Timeline-based Sentence Decomposition with In Context Learning for Temporal
Fact Extraction
link Chen, Jianhao,..., Yuzhong
4 2024-05-26 MentalManip: A Dataset For Fine-grained Analysis of Mental Manipulation
in Conversations
link Wang, Yuxin,..., Soroush
4 2023-11-15 MAVEN-ARG: Completing the Puzzle of All-in-One Event Understanding Dataset
with Event Argument Annotation
link Wang, Xiaozhi,..., Juanzi
4 2024-03-21 Multi-Level Feedback Generation with Large Language Models for Empowering
Novice Peer Counselors
link Chaszczewicz, Alicja,..., Diyi
4 2024-06-12 Transferable Embedding Inversion Attack: Uncovering Privacy Risks in Text
Embeddings without Model Queries
link Huang, Yu-Hsiang,..., Shou-De
4 2024-01-15 Uncovering the Full Potential of Visual Grounding Methods in
VQA
link Reich, Daniel,..., Tanja
4 2024-06-10 Interpretability of Language Models via Task Spaces link Weber, Lucas,..., Dieuwke
4 2024-06-05 Using Synchronic Definitions and Semantic Relations to Classify Semantic
Change Types
link Cassotti, Pierluigi,..., Nina
4 None StepCoder: Improving Code Generation with Reinforcement Learning from Compiler
Feedback
link Dou, Shihan,..., Xuanjing
4 2023-12-15 Marathon: A Race Through the Realm of Long Context
with Large Language Models
link Zhang, Lei,..., Min
4 2024-02-16 Threads of Subtlety: Detecting Machine-Generated Texts Through Discourse Motifs link Kim, Zae Myung,..., Dongyeop
4 None PRP-Graph: Pairwise Ranking Prompting to LLMs with Graph Aggregation
for Effective Text Re-ranking
link Luo, Jian,..., Le
4 2024-01-24 UNIMO-G: Unified Image Generation through Multimodal Conditional Diffusion link Li, Wei,..., Xinyan
4 2024-04-05 Benchmarking and Improving Compositional Generalization of Multi-aspect Controllable Text
Generation
link Zhong, Tianqi,..., Zhendong
4 2023-07-11 Lightweight reranking for language model generations link Jain, Siddhartha,..., Bing
4 2023-08-21 PlatoLM: Teaching LLMs in Multi-Round Dialogue via a User
Simulator
link Kong, Chuyi,..., Benyou
4 2024-01-12 STRUCTSUM Generation for Faster Text Comprehension link Jain, Parag,..., Francesco
4 2024-02-19 Acquiring Clean Language Models from Backdoor Poisoned Datasets by
Downscaling Frequency Space
link Wu, Zongru,..., Gongshen
4 2023-11-11 BizBench: A Quantitative Reasoning Benchmark for Business and Finance link Krumdick, Michael,..., Chris
4 2024-02-12 Label-Efficient Model Selection for Text Generation link Ashury Tahan, Shir,..., Eyal
4 2024-07-01 Mobile-Bench: An Evaluation Benchmark for LLM-based Mobile Agents link Deng, Shihan,..., Shuo
4 2024-02-19 Investigating Multi-Hop Factual Shortcuts in Knowledge Editing of Large
Language Models
link Ju, Tianjie,..., Gongshen
4 2023-11-15 Temperature-scaling surprisal estimates improve fit to human reading times
-- but does it do so for the “right reasons”?
link Liu, Tong,..., Vera
4 2024-02-29 NewsBench: A Systematic Evaluation Framework for Assessing Editorial Capabilities
of Large Language Models in Chinese Journalism
link Li, Miao,..., Yi
4 2024-02-28 A Sentiment Consolidation Framework for Meta-Review Generation link Li, Miao,..., Eduard
4 2023-12-07 Simul-LLM: A Framework for Exploring High-Quality Simultaneous Translation with
Large Language Models
link Agostinelli, Victor,..., Lizhong
4 2024-03-19 Interpretable User Satisfaction Estimation for Conversational Systems with Large
Language Models
link Lin, Ying-Chun,..., Jaime
4 2024-02-19 Browse and Concentrate: Comprehending Multimodal Content via Prior-LLM Context
Fusion
link Wang, Ziyue,..., Yang
4 2024-07-03 Improving Conversational Abilities of Quantized Large Language Models via
Direct Preference Alignment
link Lee, Janghwan,..., Jungwook
4 2024-05-17 Language Models can Exploit Cross-Task In-context Learning for Data-Scarce
Novel Tasks
link Chatterjee, Anwoy,..., Tanmoy
4 None LANDeRMT: Dectecting and Routing Language-Aware Neurons for Selectively Finetuning
LLMs to Machine Translation
link Zhu, Shaolin,..., Deyi
4 None VisDiaHalBench: A Visual Dialogue Benchmark For Diagnosing Hallucination in
Large Vision-Language Models
link Cao, Qingxing,..., Liang
4 2024-06-18 AutoDSL: Automated domain-specific language design for structural representation of
procedures with constraints
link Shi, Yu-Zhe,..., Qining
4 2024-01-02 Cheetah: Natural Language Generation for 517 African Languages link Adebara, Ife,..., Muhammad
4 2024-05-07 Toward In-Context Teaching: Adapting Examples to Students' Misconceptions link Ross, Alexis,..., Jacob
4 2024-03-03 WARDEN: Multi-Directional Backdoor Watermarks for Embedding-as-a-Service Copyright Protection link Shetty, Anudeex,..., Qiongkai
4 None Fora: A corpus and framework for the study of
facilitated dialogue
link Schroeder, Hope,..., Jad
4 2024-06-05 What is the Best Way for ChatGPT to Translate
Poetry?
link Wang, Shanshan,..., Lidia
4 2024-06-14 EWEK-QA: Enhanced Web and Efficient Knowledge Graph Retrieval
for Citation-based Question Answering Systems
link Dehghan, Mohammad,..., Mehdi
4 2024-03-06 The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language
Models
link Bhaskar, Adithya,..., Danqi
4 2024-06-06 Causal Estimation of Memorisation Profiles link Lesci, Pietro,..., Tiago
3 2024-05-28 A Unified Temporal Knowledge Graph Reasoning Model Towards Interpolation
and Extrapolation
link Chen, Kai,..., Xin
3 2024-05-23 Subtle Biases Need Subtler Measures: Dual Metrics for Evaluating
Representative and Affinity Bias in Large Language Models
link Kumar, Abhishek,..., Ali
3 2024-05-20 Token-wise Influential Training Data Retrieval for Large Language Models link Lin, Huawei,..., Weijie
3 2024-06-28 Prompt Refinement with Image Pivot for Text-to-Image Generation link Zhan, Jingtao,..., Tao
3 2024-02-20 Reflect-RL: Two-Player Online RL Fine-Tuning for LMs link Zhou, Runlong,..., Beibin
3 2024-06-28 BeamAggR: Beam Aggregation Reasoning over Multi-source Knowledge for Multi-hop
Question Answering
link Chu, Zheng,..., Bing
3 2023-12-25 Advancing Abductive Reasoning in Knowledge Graphs through Complex Logical
Hypothesis Generation
link Bai, Jiaxin,..., Yangqiu
3 None Persuading across Diverse Domains: a Dataset and Persuasion Large
Language Model
link Jin, Chuhao,..., Huan
3 2024-02-13 Towards Faithful and Robust LLM Specialists for Evidence-Based Question-Answering link Schimanski, Tobias,..., Markus
3 2024-06-06 ValueBench: Towards Comprehensively Evaluating Value Orientations and Understanding of
Large Language Models
link Ren, Yuanyi,..., Guojie
3 None DM-BLI: Dynamic Multiple Subspaces Alignment for Unsupervised Bilingual Lexicon
Induction
link Hu, Ling,..., Yuemei
3 2024-02-28 VerifiNER: Verification-augmented NER via Knowledge-grounded Reasoning with Large Language
Models
link Kim, Seoyeon,..., Dongha
3 2023-11-15 MELA: Multilingual Evaluation of Linguistic Acceptability link Zhang, Ziyin,..., Hai
3 None Through the MUD: A Multi-Defendant Charge Prediction Benchmark with
Linked Crime Elements
link Wei, Xiao,..., Erik
3 2024-02-28 An Iterative Associative Memory Model for Empathetic Response Generation link Yang, Zhou,..., Xiangwen
3 2024-08-20 Dr.Academy: A Benchmark for Evaluating Questioning Capability in Education
for Large Language Models
link Chen, Yuyan,..., Yanghua
3 None Landmark Embedding: A Chunking-Free Embedding Method For Retrieval Augmented
Long-Context Large Language Models
link Luo, Kun,..., Kang
3 2024-06-09 GrowOVER: How Can LLMs Adapt to Growing Real-World Knowledge? link Ko, Dayoon,..., Gunhee
3 2024-06-05 BIPED: Pedagogically Informed Tutoring System for ESL Education link Kwon, Soonwoo,..., Kyuseok
3 None ARL2: Aligning Retrievers with Black-box Large Language Models via
Self-guided Adaptive Relevance Labeling
link Zhang, LingXi,..., Chao
3 2024-06-11 Crayon: Customized On-Device LLM via Instant Adapter Blending and
Edge-Server Hybrid Inference
link Bang, Jihwan,..., Simyung
3 2024-01-06 CaMML: Context-Aware Multimodal Learner for Large Models link Chen, Yixin,..., Bo
3 2023-09-29 Intuitive or Dependent? Investigating LLMs' Behavior Style to Conflicting
Prompts
link Ying, Jiahao,..., Yongbin
3 2023-09-15 CoCA: Fusing Position Embedding with Collinear Constrained Attention in
Transformers for Long Context Window Extending
link Zhu, Shiyi,..., Jianguo
3 2024-06-05 Missci: Reconstructing Fallacies in Misrepresented Science link Glockner, Max,..., Iryna
3 2024-06-05 LLM-based Rewriting of Inappropriate Argumentation using Reinforcement Learning from
Machine Feedback
link Ziegenbein, Timon,..., Henning
3 2024-01-13 Graph Language Models link Plenz, Moritz,..., Anette
3 2024-05-21 Limits of Theory of Mind Modelling in Dialogue-Based Collaborative
Plan Acquisition
link Bortoletto, Matteo,..., Andreas
3 2024-02-22 RelayAttention for Efficient Large Language Model Serving with Long
System Prompts
link Zhu, Lei,..., Rynson
3 2024-05-19 Your Transformer is Secretly Linear link Razzhigaev, Anton,..., Andrey
3 2024-06-07 Generative Explore-Exploit: Training-free Optimization of Generative Recommender Systems using
LLM Optimizers
link Senel, Lütfi Kerem,..., Shervin
3 2024-02-09 NICE: To Optimize In-Context Examples or Not? link Srivastava, Pragya,..., Amit
3 2024-07-31 Zero-Shot Cross-Domain Dialogue State Tracking via Dual Low-Rank Adaptation link Luo, Xiang,..., Xuejie
3 None Event-Radar: Event-driven Multi-View Learning for Multimodal Fake News Detection link Ma, Zihan,..., Xiang
3 2024-02-21 Fine-Grained Modeling of Narrative Context: A Coherence Perspective via
Retrospective Questions
link Xu, Liyan,..., Jie
3 2024-06-01 Multi-Dimensional Optimization for Text Summarization via Reinforcement Learning link Ryu, Sangwon,..., Jungseul
3 2024-01-24 SEER: Facilitating Structured Reasoning and Explanation via Reinforcement Learning link Chen, Guoxin,..., Yiming
3 2024-04-08 EFSA: Towards Event-Level Financial Sentiment Analysis link Chen, Tianyu,..., Xiang
3 2024-02-21 Cognitive Visual-Language Mapper: Advancing Multimodal Comprehension with Enhanced Visual
Knowledge Alignment
link Li, Yunxin,..., Min
3 None LEMON: Reviving Stronger and Smaller LMs from Larger LMs
with Linear Parameter Fusion
link Chen, Yilong,..., Hua
3 2024-04-29 Revealing the Parametric Knowledge of Language Models: A Unified
Framework for Attribution Methods
link Yu, Haeun,..., Isabelle
3 2024-07-20 Hard Prompts Made Interpretable: Sparse Entropy Regularization for Prompt
Tuning with RL
link Choi, Yunseon,..., Kee-Eung
3 2024-07-18 MetaSumPerceiver: Multimodal Multi-Document Evidence Summarization for Fact-Checking link Chen, Ting-Chih,..., Chris
3 2024-06-06 Decoder-only Streaming Transformer for Simultaneous Translation link Guo, Shoutao,..., Yang
3 2024-06-05 StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning link Zhang, Shaolei,..., Yang
3 2024-06-09 Why Don't Prompt-Based Fairness Metrics Correlate? link Zayed, Abdelrahman,..., Sarath
3 2023-11-16 WatME: Towards Lossless Watermarking Through Lexical Redundancy link Chen, Liang,..., Kam-Fai
3 2024-06-04 Understanding Retrieval Robustness for Retrieval-augmented Image Captioning link Li, Wenyan,..., Desmond
3 2024-02-16 Linear Transformers with Learnable Kernel Functions are Better In-Context
Models
link Aksenov, Yaroslav,..., Daniil
3 2023-08-09 VulLibGen: Generating Names of Vulnerability-Affected Packages via a Large
Language Model
link Chen, Tianyu,..., Tao
3 2024-03-03 SyllabusQA: A Course Logistics Question Answering Dataset link Fernandez, Nigel,..., Andrew
3 2024-02-16 Exploring Hybrid Question Answering via Program-based Prompting link Shi, Qi,..., Ting
3 2024-06-07 Uncertainty Aware Learning for Language Model Alignment link Wang, Yikun,..., Dacheng
3 2024-02-20 Model Composition for Multimodal Large Language Models link Chen, Chi,..., Yang
3 None Enhancing Explainable Rating Prediction through Annotated Macro Concepts link Zhou, Huachi,..., Xiao
3 None Can Large Language Models Interpret Noun-Noun Compounds? A Linguistically-Motivated
Study on Lexicalized and Novel Compounds
link Rambelli, Giulia,..., Marianna
3 2024-06-05 Document-level Claim Extraction and Decontextualisation for Fact-Checking link Deng, Zhenyun,..., Andreas
3 2024-06-06 To Distill or Not to Distill? On the Robustness
of Robust Knowledge Distillation
link Waheed, Abdul,..., Muhammad
3 2024-03-07 Classist Tools: Social Class Correlates with Performance in NLP link Cercas Curry, Amanda,..., Dirk
3 2024-02-16 Generalizability of Mixture of Domain-Specific Adapters from the Lens
of Signed Weight Directions and its Application to Effective Model Pruning
link Nguyen, Tuc,..., Thai
3 2024-06-21 Word Matters: What Influences Domain Adaptation in Summarization? link Li, Yinghao,..., Yang
3 2024-02-19 NEO-BENCH: Evaluating Robustness of Large Language Models with Neologisms link Zheng, Jonathan,..., Wei
3 2024-02-08 Transparent and Scrutable Recommendations Using Natural Language User Profiles link Ramos, Jerome,..., Aldo
3 2023-11-15 Multistage Collaborative Knowledge Distillation from a Large Language Model
for Semi-Supervised Sequence Generation
link Zhao, Jiachen,..., Andrew