399 |
2023-05-29 |
link |
Large Language Models are not Fair Evaluators |
Wang, Peiyi,..., Zhifang |
354 |
2023-06-08 |
link |
Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models |
Maaz, Muhammad,..., Fahad |
291 |
2023-08-28 |
link |
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding |
Bai, Yushi,..., Juanzi |
219 |
2024-02-01 |
link |
OLMo: Accelerating the Science of Language Models |
Groeneveld, Dirk,..., Hannaneh |
166 |
2024-01-12 |
link |
How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs |
Zeng, Yi,..., Weiyan |
147 |
2024-01-31 |
link |
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research |
Soldaini, Luca,..., Kyle |
135 |
2023-04-22 |
link |
LaMP: When Large Language Models Meet Personalization |
Salemi, Alireza,..., Hamed |
125 |
2024-02-12 |
link |
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model |
Üstün, Ahmet,..., Sara |
115 |
2024-01-11 |
link |
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models |
Dai, Damai,..., Wenfeng |
114 |
2023-10-10 |
link |
LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression |
Jiang, Huiqiang,..., Lili |
96 |
2023-09-18 |
link |
Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM |
Cao, Bochuan,..., Jinghui |
95 |
2023-07-20 |
link |
L-Eval: Instituting Standardized Evaluation for Long Context Language Models |
An, Chenxin,..., Xipeng |
92 |
2023-02-23 |
link |
Active Prompting with Chain-of-Thought for Large Language Models |
Diao, Shizhe,..., Tong |
91 |
2023-10-09 |
link |
How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition |
Dong, Guanting,..., Jingren |
88 |
2023-08-31 |
link |
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants |
Bandarkar, Lucas,..., Madian |
85 |
2023-09-27 |
link |
Navigate through Enigmatic Labyrinth A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future |
Chu, Zheng,..., Ting |
85 |
2023-06-16 |
link |
Full Parameter Fine-tuning for Large Language Models with Limited Resources |
Lv, Kai,..., Xipeng |
85 |
None |
link |
VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks |
Koh, Jing Yu,..., Daniel |
84 |
2023-12-31 |
link |
Improving Text Embeddings with Large Language Models |
Wang, Liang,..., Furu |
81 |
2023-12-14 |
link |
Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations |
Wang, Peiyi,..., Zhifang |
78 |
2024-02-09 |
link |
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning |
Singh, Shivalika,..., Sara |
77 |
2023-10-03 |
link |
Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View |
Zhang, Jintian,..., Shumin |
75 |
2023-11-15 |
link |
Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization |
Zhang, Zhexin,..., Minlie |
72 |
2023-12-09 |
link |
Steering Llama 2 via Contrastive Activation Addition |
Rimsky, Nina,..., Alexander |
69 |
2023-09-04 |
link |
Are Emergent Abilities in Large Language Models just In-Context Learning? |
Lu, Sheng,..., Iryna |
68 |
2024-01-25 |
link |
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models |
He, Hongliang,..., Dong |
67 |
2023-11-08 |
link |
LooGLE: Can Long-Context Language Models Understand Long Contexts? |
Li, Jiaqi,..., Muhan |
66 |
2023-12-12 |
link |
LLM in a flash: Efficient Large Language Model Inference with Limited Memory |
Alizadeh, Keivan,..., Mehrdad |
64 |
2024-02-19 |
link |
AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling |
Zhan, Jun,..., Xipeng |
62 |
2023-09-22 |
link |
ReConcile: Round-Table Conference Improves Reasoning via Consensus among Diverse LLMs |
Chen, Justin,..., Mohit |
62 |
2024-01-17 |
link |
SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents |
Cheng, Kanzhi,..., Zhiyong |
56 |
2023-05-24 |
link |
Who Wrote this Code? Watermarking for Code Generation |
Lee, Taehyun,..., Gunhee |
54 |
2023-07-16 |
link |
ChatDev: Communicative Agents for Software Development |
Qian, Chen,..., Maosong |
52 |
2023-10-27 |
link |
InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews |
Wang, Xintao,..., Yanghua |
51 |
2024-02-14 |
link |
SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding |
Xu, Zhangchen,..., Radha |
51 |
2024-02-16 |
link |
Do Llamas Work in English? On the Latent Language of Multilingual Transformers |
Wendler, Chris,..., Robert |
49 |
2023-11-07 |
link |
Black-Box Prompt Optimization: Aligning Large Language Models without Model Training |
Cheng, Jiale,..., Minlie |
49 |
2023-05-23 |
link |
Having Beer after Prayer? Measuring Cultural Bias in Large Language Models |
Naous, Tarek,..., Wei |
48 |
2023-06-10 |
link |
Boosting Language Models Reasoning with Chain-of-Knowledge Prompting |
Wang, Jianing,..., Ming |
48 |
2024-01-12 |
link |
Large Language Models Can Learn Temporal Reasoning |
Xiong, Siheng,..., Faramarz |
46 |
2024-02-19 |
link |
ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs |
Jiang, Fengqing,..., Radha |
46 |
2024-02-28 |
link |
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards |
Wang, Haoxiang,..., Tong |
45 |
2024-01-19 |
link |
Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences |
Wang, Xiyao,..., Furong |
44 |
2024-02-26 |
link |
Do Large Language Models Latently Perform Multi-Hop Reasoning? |
Yang, Sohee,..., Sebastian |
42 |
2024-02-01 |
link |
Don't Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration |
Feng, Shangbin,..., Yulia |
41 |
2024-04-25 |
link |
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding |
Elhoushi, Mostafa,..., Carole-Jean |
40 |
2024-02-19 |
link |
Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models |
Levy, Mosh,..., Yoav |
39 |
2023-12-31 |
link |
RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models |
Niu, Cheng,..., Tong |
38 |
2024-01-23 |
link |
Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment |
Lu, Keming,..., Jingren |
38 |
2024-01-29 |
link |
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling |
Maini, Pratyush,..., Navdeep |
38 |
2024-02-01 |
link |
When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards |
Alzahrani, Norah,..., Haidar |
38 |
2024-01-02 |
link |
CharacterEval: A Chinese Benchmark for Role-Playing Conversational Agent Evaluation |
Tu, Quan,..., Rui |
37 |
2024-01-04 |
link |
LLaMA Pro: Progressive LLaMA with Block Expansion |
Wu, Chengyue,..., Ping |
37 |
2024-02-01 |
link |
Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning |
Li, Ming,..., Tianyi |
36 |
2024-03-21 |
link |
Detoxifying Large Language Models via Knowledge Editing |
Wang, Mengru,..., Huajun |
36 |
2024-01-14 |
link |
CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges |
Zhang, Kechi,..., Zhi |
35 |
2023-08-17 |
link |
MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language Models |
Wen, Yilin,..., Jimeng |
35 |
2024-02-26 |
link |
Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models |
Tang, Tianyi,..., Ji-Rong |
35 |
2024-01-12 |
link |
Relying on the Unreliable: The Impact of Language Models' Reluctance to Express Uncertainty |
Zhou, Kaitlyn,..., Maarten |
34 |
2024-03-25 |
link |
VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild |
Peng, Puyuan,..., David |
32 |
2023-10-16 |
link |
EconAgent: Large Language Model-Empowered Agents for Simulating Macroeconomic Activities |
Li, Nian,..., Qingmin |
32 |
2024-02-27 |
link |
Evaluating Very Long-Term Conversational Memory of LLM Agents |
Maharana, Adyasha,..., Yuwei |
32 |
2024-02-28 |
link |
FOFO: A Benchmark to Evaluate LLMs' Format-Following Capability |
Xia, Congying,..., Caiming |
31 |
2024-03-29 |
link |
Can LLMs Learn from Previous Mistakes? Investigating LLMs' Errors to Boost for Reasoning |
Tong, Yongqi,..., Jingbo |
31 |
2023-12-28 |
link |
Experiential Co-Learning of Software-Developing Agents |
Qian, Chen,..., Maosong |
30 |
2024-03-04 |
link |
Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents |
Song, Yifan,..., Bill Yuchen |
30 |
2024-02-26 |
link |
Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models |
Röttger, Paul,..., Dirk |
30 |
2023-12-31 |
link |
DocLLM: A layout-aware generative language model for multimodal document understanding |
Wang, Dongsheng,..., Xiaomo |
29 |
2023-05-23 |
link |
SciMON: Scientific Inspiration Machines Optimized for Novelty |
Wang, Qingyun,..., Tom |
29 |
2024-02-05 |
link |
Unified Hallucination Detection for Multimodal Large Language Models |
Chen, Xiang,..., Huajun |
29 |
2024-03-01 |
link |
Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models |
Li, Lei,..., Qi |
29 |
2024-02-26 |
link |
Long-Context Language Modeling with Parallel Context Encoding |
Yen, Howard,..., Danqi |
29 |
2023-08-30 |
link |
Quantifying Uncertainty in Answers from any Language Model and Enhancing their Trustworthiness |
Chen, Jiuhai,..., Jonas |
29 |
2024-02-22 |
link |
MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues |
Bai, Ge,..., Wanli |
28 |
2024-01-04 |
link |
Self-Contrast: Better Reflection Through Inconsistent Solving Perspectives |
Zhang, Wenqi,..., Weiming |
27 |
2023-12-22 |
link |
VIEScore: Towards Explainable Metrics for Conditional Image Synthesis Evaluation |
Ku, Max,..., Wenhu |
27 |
2023-06-03 |
link |
MultiLegalPile: A 689GB Multilingual Legal Corpus |
Niklaus, Joel,..., Daniel |
27 |
2023-05-24 |
link |
Harnessing the Power of Large Language Models for Natural Language to First-Order Logic Translation |
Yang, Yuan,..., Faramarz |
26 |
2023-12-22 |
link |
NPHardEval: Dynamic Benchmark on Reasoning Ability of Large Language Models via Complexity Classes |
Fan, Lizhou,..., Yongfeng |
26 |
2023-11-30 |
link |
AlignBench: Benchmarking Chinese Alignment of Large Language Models |
Liu, Xiao,..., Jie |
25 |
2023-12-26 |
link |
Aligning Large Language Models with Human Preferences through Representation Engineering |
Liu, Wenhao,..., Xuanjing |
25 |
2023-12-14 |
link |
The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Persuasive Conversation |
Xu, Rongwu,..., Han |
24 |
2024-02-24 |
link |
PRP: Propagating Universal Perturbations to Attack Large Language Model Guard-Rails |
Mangaokar, Neal,..., Atul |
24 |
2024-02-22 |
link |
Unintended Impacts of LLM Alignment on Global Representation |
Ryan, Michael,..., Diyi |
24 |
2024-02-16 |
link |
Quantifying the Persona Effect in LLM Simulations |
Hu, Tiancheng,..., Nigel |
23 |
2024-02-18 |
link |
Pride and Prejudice: LLM Amplifies Self-Bias in Self-Refinement |
Xu, Wenda,..., William |
23 |
2024-02-28 |
link |
Rethinking the Bounds of LLM Reasoning: Are Multi-Agent Discussions the Key? |
Wang, Qineng,..., Yangqiu |
23 |
2024-02-27 |
link |
TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space |
Zhang, Shaolei,..., Yang |
23 |
2024-02-23 |
link |
Machine Unlearning of Pre-trained Large Language Models |
Yao, Jin,..., Xiang |
23 |
2024-02-12 |
link |
AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension |
Yang, Qian,..., Jingren |
23 |
2024-02-29 |
link |
GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers |
Li, Qintong,..., Wei |
23 |
2024-01-17 |
link |
ReFT: Reasoning with Reinforced Fine-Tuning |
Trung, Luong,..., Hang |
23 |
2024-02-21 |
link |
OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems |
He, Chaoqun,..., Maosong |
22 |
2024-02-26 |
link |
MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs |
Lu, Zimu,..., Hongsheng |
21 |
2024-02-14 |
link |
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation |
Zhang, Xiaoying,..., Helen |
21 |
2024-01-14 |
link |
CANDLE: Iterative Conceptualization and Instantiation Distillation from Large Language Models for Commonsense Reasoning |
Wang, Weiqi,..., Yangqiu |
21 |
2023-11-15 |
link |
Symbol-LLM: Towards Foundational Symbol-centric Interface For Large Language Models |
Xu, Fangzhi,..., Jun |
21 |
2023-10-03 |
link |
OceanGPT: A Large Language Model for Ocean Science Tasks |
Bi, Zhen,..., Huajun |
21 |
2024-01-12 |
link |
The Unreasonable Effectiveness of Easy Training Data for Hard Tasks |
Hase, Peter,..., Sarah |
21 |
2023-11-10 |
link |
ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences |
Tian, Yuanhe,..., Yongdong |
20 |
2024-01-06 |
link |
The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language Models |
Li, Junyi,..., Ji-Rong |
20 |
2023-06-20 |
link |
Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts |
Nguyen, Xuan-Phi,..., Lidong |
20 |
2024-02-27 |
link |
RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations |
Huang, Jing,..., Atticus |
20 |
2024-02-19 |
link |
Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic |
Bhardwaj, Rishabh,..., Soujanya |
19 |
2024-03-06 |
link |
Quantifying Contamination in Evaluating Code Generation Capabilities of Language Models |
Riddell, Martin,..., Arman |
19 |
2023-11-15 |
link |
Exploring the Potential of Large Language Models in Computational Argumentation |
Chen, Guizhen,..., Lidong |
19 |
2023-11-16 |
link |
Think Twice: Perspective-Taking Improves Large Language Models' Theory-of-Mind Capabilities |
Wilf, Alex,..., Louis-Philippe |
19 |
2024-02-21 |
link |
Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning |
Yang, Zhaorui,..., Qian |
18 |
2024-02-16 |
link |
When is Tree Search Useful for LLM Planning? It Depends on the Discriminator |
Chen, Ziru,..., Huan |
18 |
2024-02-20 |
link |
Instruction-tuned Language Models are Better Knowledge Learners |
Jiang, Zhengbao,..., Srini |
18 |
2024-03-20 |
link |
An Entropy-based Text Watermarking Detection Method |
Lu, Yijian,..., Irwin |
18 |
2024-02-19 |
link |
What Evidence Do Language Models Find Convincing? |
Wan, Alexander,..., Dan |
18 |
2023-11-15 |
link |
PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning |
Zhang, Zhihan,..., Francesco |
18 |
2024-05-18 |
link |
MapCoder: Multi-Agent Code Generation for Competitive Problem Solving |
Islam, Md. Ashraful,..., Md Rizwan |
18 |
2024-01-12 |
link |
Navigating the Metrics Maze: Reconciling Score Magnitudes and Accuracies |
Kocmi, Tom,..., Matt |
18 |
2023-09-29 |
link |
Enhancing Large Language Models in Coding Through Multi-Perspective Self-Consistency |
Huang, Baizhou,..., Nan |
17 |
2023-11-14 |
link |
CodeScope: An Execution-based Multilingual Multitask Multidimensional Benchmark for Evaluating LLMs on Code Understanding and Generation |
Yan, Weixiang,..., Shuiguang |
17 |
2024-02-27 |
link |
Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization |
Zhang, Wenqi,..., Weiming |
17 |
2024-02-20 |
link |
Investigating Cultural Alignment of Large Language Models |
AlKhamissi, Badr,..., Mona |
17 |
2024-01-13 |
link |
Bridging the Preference Gap between Retrievers and LLMs |
Ke, Zixuan,..., Michael |
17 |
2023-12-07 |
link |
Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use |
Chen, Yuhan,..., Rui |
17 |
2023-11-09 |
link |
Agent Lumos: Unified and Modular Training for Open-Source Language Agents |
Yin, Da,..., Bill Yuchen |
17 |
2023-11-07 |
link |
PrivLM-Bench: A Multi-level Privacy Evaluation Benchmark for Language Models |
Li, Haoran,..., Yangqiu |
16 |
2024-01-22 |
link |
PsySafe: A Comprehensive Framework for Psychological-based Attack, Defense, and Evaluation of Multi-agent System Safety |
Zhang, Zaibin,..., Feng |
16 |
2023-10-05 |
link |
InstructProtein: Aligning Human and Protein Language via Knowledge Instruction |
Wang, Zeyuan,..., Huajun |
16 |
2023-10-10 |
link |
Exploring Memorization in Fine-tuned Language Models |
Zeng, Shenglai,..., Dawei |
16 |
2024-02-17 |
link |
M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection |
Wang, Yuxia,..., Preslav |
16 |
2024-03-02 |
link |
Mitigating Catastrophic Forgetting in Large Language Models with Self-Synthesized Rehearsal |
Huang, Jianheng,..., Jinsong |
15 |
2023-12-23 |
link |
PokeMQA: Programmable knowledge editing for Multi-hop Question Answering |
Gu, Hengrui,..., Xin |
15 |
2024-01-10 |
link |
AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning |
Qiao, Shuofei,..., Huajun |
15 |
2024-03-05 |
link |
Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion Attribution |
Plaza-del-Arco, Flor,..., Dirk |
15 |
2023-05-10 |
link |
ANALOGYKB: Unlocking Analogical Reasoning of Language Models with A Million-scale Knowledge Base |
Yuan, Siyu,..., Deqing |
15 |
2024-02-16 |
link |
DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows |
Patel, Ajay,..., Chris |
15 |
2024-02-20 |
link |
Advancing Large Language Models to Capture Varied Speaking Styles and Respond Properly in Spoken Conversations |
Lin, Guan-Ting,..., Hung-yi |
15 |
2023-10-31 |
link |
FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models |
Jiang, Yuxin,..., Wei |
15 |
2024-01-16 |
link |
MMToM-QA: Multimodal Theory of Mind Question Answering |
Jin, Chuanyang,..., Tianmin |
15 |
2023-08-31 |
link |
RepCodec: A Speech Representation Codec for Speech Tokenization |
Huang, Zhichao,..., Tom |
15 |
2023-10-19 |
link |
Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models |
Wang, Wenxuan,..., Michael |
14 |
2023-11-15 |
link |
Explore Spurious Correlations at the Concept Level in Language Models for Text Classification |
Zhou, Yuhang,..., Furong |
14 |
2023-06-21 |
link |
ARIES: A Corpus of Scientific Paper Edits Made in Response to Peer Reviews |
D'Arcy, Mike,..., Doug |
14 |
2023-10-09 |
link |
MuggleMath: Assessing the Impact of Query and Response Augmentation on Math Reasoning |
Li, Chengpeng,..., Chang |
14 |
2024-02-19 |
link |
CausalGym: Benchmarking causal interpretability methods on linguistic tasks |
Arora, Aryaman,..., Christopher |
14 |
2024-02-23 |
link |
Advancing Parameter Efficiency in Fine-tuning via Representation Editing |
Wu, Muling,..., Xuanjing |
14 |
2023-11-26 |
link |
UHGEval: Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained Generation |
Liang, Xun,..., Haiying |
14 |
2023-05-22 |
link |
Word Embeddings Are Steers for Language Models |
Han, Chi,..., Heng |
13 |
2024-02-06 |
link |
Training Language Models to Generate Text with Citations via Fine-grained Rewards |
Huang, Chengyu,..., Wenya |
13 |
2024-02-18 |
link |
Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs |
Wang, Siyuan,..., Xiang |
13 |
2024-03-12 |
link |
KnowCoder: Coding Structured Knowledge into LLMs for Universal Information Extraction |
Li, Zixuan,..., Xueqi |
13 |
2024-03-27 |
link |
Measuring Political Bias in Large Language Models: What Is Said and How It Is Said |
Bang, Yejin,..., Pascale |
13 |
2024-05-17 |
link |
Layer-Condensed KV Cache for Efficient Inference of Large Language Models |
Wu, Haoyi,..., Kewei |
13 |
2024-02-19 |
link |
Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs |
Tan, Jiejun,..., Ji-Rong |
13 |
2024-02-11 |
link |
Generalizing Conversational Dense Retrieval via LLM-Cognition Data Augmentation |
Chen, Haonan,..., Ziliang |
13 |
2024-02-01 |
link |
A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains |
Jacovi, Alon,..., Mor |
13 |
2023-05-22 |
link |
MAGE: Machine-generated Text Detection in the Wild |
Li, Yafu,..., Yue |
13 |
2024-02-19 |
link |
Artifacts or Abduction: How Do LLMs Answer Multiple-Choice Questions Without the Question? |
Balepur, Nishant,..., Rachel |
13 |
2024-02-22 |
link |
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models |
Lu, Xudong,..., Hongsheng |
13 |
2024-02-25 |
link |
Citation-Enhanced Generation for LLM-based Chatbots |
Li, Weitao,..., Yang |
13 |
2024-02-23 |
link |
ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition |
Ye, Lu,..., Yang |
13 |
2024-06-24 |
link |
UniCoder: Scaling Code Large Language Model via Universal Code |
Sun, Tao,..., Zhoujun |
12 |
2024-01-12 |
link |
MAPO: Advancing Multilingual Reasoning through Multilingual Alignment-as-Preference Optimization |
She, Shuaijie,..., Jiajun |
12 |
2024-02-16 |
link |
BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation |
Du, DaYou,..., Ningyi |
12 |
2024-05-28 |
link |
Faithful Logical Reasoning via Symbolic Chain-of-Thought |
Xu, Jundong,..., Wynne |
12 |
2023-05-22 |
link |
Iterative Forward Tuning Boosts In-context Learning in Language Models |
Yang, Jiaxi,..., Yongbin |
12 |
2024-01-12 |
link |
INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning |
Zhu, Yutao,..., Zhicheng |
12 |
2024-02-14 |
link |
Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents |
Qian, Cheng,..., Maosong |
12 |
2024-02-18 |
link |
Stumbling Blocks: Stress Testing the Robustness of Machine-Generated Text Detectors Under Attacks |
Wang, Yichen,..., Tianxing |
11 |
2024-01-15 |
link |
MM-SAP: A Comprehensive Benchmark for Assessing Self-Awareness of Multimodal Large Language Models in Perception |
Wang, Yuhao,..., Yu |
11 |
2024-04-04 |
link |
Learning to Plan and Generate Text with Citations |
Fierro, Constanza,..., Mirella |
11 |
None |
link |
LoRAMoE: Alleviating World Knowledge Forgetting in Large Language Models via MoE-Style Plugin |
Dou, Shihan,..., Xuanjing |
11 |
2024-03-05 |
link |
CoGenesis: A Framework Collaborating Large and Small Language Models for Secure Context-Aware Instruction Following |
Zhang, Kaiyan,..., Bowen |
11 |
2024-05-13 |
link |
RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors |
Dugan, Liam,..., Chris |
11 |
2024-01-12 |
link |
AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters |
Lucy, Li,..., Jesse |
11 |
2023-11-13 |
link |
On Measuring Faithfulness or Self-consistency of Natural Language Explanations |
Parcalabescu, Letitia,..., Anette |
11 |
2023-11-14 |
link |
A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated Texts |
Tripto, Nafis Irtiza,..., Dongwon |
11 |
2024-02-15 |
link |
Why are Sensitive Functions Hard for Transformers? |
Hahn, Michael,..., Mark |
11 |
2024-02-21 |
link |
Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models |
He, Zhiwei,..., Rui |
11 |
2024-02-10 |
link |
GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators |
Hu, Yuchen,..., EngSiong |
11 |
2023-11-16 |
link |
Reducing Privacy Risks in Online Self-Disclosures with Language Models |
Dou, Yao,..., Wei |
11 |
2023-12-20 |
link |
Time is Encoded in the Weights of Finetuned Language Models |
Nylund, Kai,..., Noah |
11 |
2024-02-16 |
link |
AbsInstruct: Eliciting Abstraction Ability from LLMs through Explanation Tuning with Plausibility Estimation |
Wang, Zhaowei,..., Simon |
11 |
2024-02-18 |
link |
Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals |
Ortu, Francesco,..., Bernhard |
10 |
2023-07-03 |
link |
Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models |
Duan, Jinhao,..., Kaidi |
10 |
2024-03-19 |
link |
Bypassing LLM Watermarks with Color-Aware Substitutions |
Wu, Qilong,..., Varun |
10 |
2024-02-21 |
link |
GradSafe: Detecting Jailbreak Prompts for LLMs via Safety-Critical Gradient Analysis |
Xie, Yueqi,..., Neil |
10 |
2023-10-13 |
link |
Improving Large Language Models in Event Relation Logical Prediction |
Chen, Meiqi,..., Dongsheng |
10 |
2023-09-16 |
link |
Cross-Lingual Knowledge Editing in Large Language Models |
Wang, Jiaan,..., Fandong |
10 |
2024-02-08 |
link |
OpenToM: A Comprehensive Benchmark for Evaluating Theory-of-Mind Reasoning Capabilities of Large Language Models |
Xu, Hainiu,..., Yulan |
10 |
2024-03-09 |
link |
Calibrating Large Language Models Using Their Generations Only |
Ulmer, Dennis,..., Seong |
10 |
2024-06-21 |
link |
Generate-then-Ground in Retrieval-Augmented Generation for Multi-hop Question Answering |
Shi, Zhengliang,..., Zhaochun |
10 |
2024-02-19 |
link |
Learning to Edit: Aligning LLMs with Knowledge Editing |
Jiang, Yuxin,..., Wei |
10 |
2024-06-13 |
link |
Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning? |
Su, Zhaochen,..., Min |
10 |
2024-02-20 |
link |
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification |
Peng, Yifan,..., Shinji |
9 |
2024-02-19 |
link |
Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! |
Zhou, Zhanhui,..., Yu |
9 |
2024-03-15 |
link |
DRAGIN: Dynamic Retrieval Augmented Generation based on the Real-time Information Needs of Large Language Models |
Su, Weihang,..., Yiqun |
9 |
2024-02-08 |
link |
TimeArena: Shaping Efficient Multitasking Language Agents in a Time-Aware Simulation |
Zhang, Yikai,..., Jiangjie |
9 |
2024-04-25 |
link |
Examining the robustness of LLM evaluation to the distributional assumptions of benchmarks |
Siska, Charlotte,..., James |
9 |
2024-04-25 |
link |
IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages |
Singh, Harman,..., Partha |
9 |
2023-12-04 |
link |
A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia |
Monea, Giovanni,..., Robert |
9 |
2024-07-01 |
link |
FineSurE: Fine-grained Summarization Evaluation using LLMs |
Song, Hwanjun,..., Saab |
9 |
2024-01-09 |
link |
Narrowing the Knowledge Evaluation Gap: Open-Domain Question Answering with Multi-Granularity Answers |
Yona, Gal,..., Mor |
9 |
2024-03-25 |
link |
Attribute First, then Generate: Locally-attributable Grounded Text Generation |
Slobodkin, Aviv,..., Ido |
9 |
2024-01-12 |
link |
Mission: Impossible Language Models |
Kallini, Julie,..., Christopher |
9 |
2024-06-06 |
link |
Prototypical Reward Network for Data-Efficient RLHF |
Zhang, Jinghan,..., Kunpeng |
9 |
2023-03-28 |
link |
When Good and Reproducible Results are a Giant with Feet of Clay: The Importance of Software Quality in NLP |
Papi, Sara,..., Matteo |
9 |
2024-02-18 |
link |
LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative Tasks |
Wang, Hanqing,..., Maosong |
9 |
2024-02-26 |
link |
Leveraging Large Language Models for Learning Complex Legal Concepts through Storytelling |
Jiang, Hang,..., Jad |
9 |
2024-06-20 |
link |
On the Representational Capacity of Neural Language Models with Chain-of-Thought Reasoning |
Nowak, Franz,..., Ryan |
9 |
2024-01-22 |
link |
Revisiting Demonstration Selection Strategies in In-Context Learning |
Peng, Keqin,..., Dacheng |
9 |
2023-10-28 |
link |
DetermLR: Augmenting LLM-based Logical Reasoning from Indeterminacy to Determinacy |
Sun, Hongda,..., Rui |
9 |
2024-03-29 |
link |
Latxa: An Open Language Model and Evaluation Suite for Basque |
Etxaniz, Julen,..., Aitor |
9 |
2023-11-19 |
link |
Towards Real-World Writing Assistance: A Chinese Character Checking Benchmark with Faked and Misspelled Characters |
Li, Yinghui,..., Ying |
9 |
2024-02-16 |
link |
ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages |
Ye, Junjie,..., Xuanjing |
9 |
2024-02-23 |
link |
KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models |
Yu, Zhuohao,..., Shikun |
8 |
2024-03-01 |
link |
Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks |
Alwajih, Fakhraddin,..., Muhammad |
8 |
2024-01-12 |
link |
Effects of diversity incentives on sample diversity and downstream model performance in LLM-based text augmentation |
Cegin, Jan,..., Peter |
8 |
2024-01-31 |
link |
Navigating the OverKill in Large Language Models |
Shi, Chenyu,..., Dahua |
8 |
2024-05-21 |
link |
ProtT3: Protein-to-Text Generation for Text-based Protein Understanding |
Liu, Zhiyuan,..., Tat-Seng |
8 |
2023-11-15 |
link |
Temporal Knowledge Question Answering via Abstract Reasoning Induction |
Chen, Ziyang,..., Min |
8 |
2024-02-18 |
link |
Stealthy Attack on Large Language Model based Recommendation |
Zhang, Jinghao,..., Liang |
8 |
2024-03-08 |
link |
Aligning Large Language Models for Controllable Recommendations |
Lu, Wensheng,..., Xing |
8 |
2024-05-31 |
link |
Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training |
Fang, Feiteng,..., Ruifeng |
8 |
2024-01-14 |
link |
MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation |
Chen, Jiaqi,..., Kwan-Yee |
8 |
2024-02-23 |
link |
On the Multi-turn Instruction Following for Conversational Web Agents |
Deng, Yang,..., Tat-Seng |
8 |
2024-02-28 |
link |
ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training |
Zhuo, Le,..., Wentao |
8 |
2024-03-15 |
link |
MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling |
Limisiewicz, Tomasz,..., Luke |
8 |
2024-07-26 |
link |
AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents |
Trivedi, Harsh,..., Niranjan |
8 |
2023-04-05 |
link |
Efficient OCR for Building a Diverse Digital History |
Carlson, Jacob,..., Melissa |
8 |
2024-03-12 |
link |
Complex Reasoning over Logical Queries on Commonsense Knowledge Graphs |
Fang, Tianqing,..., Antoine |
8 |
2024-02-20 |
link |
HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts |
Zhao, Hao,..., Jie |
8 |
2024-06-05 |
link |
BadAgent: Inserting and Activating Backdoor Attacks in LLM Agents |
Wang, Yifei,..., Shengsheng |
8 |
2024-02-19 |
link |
PsychoGAT: A Novel Psychological Measurement Paradigm through Interactive Fiction Games with LLM Agents |
Yang, Qisen,..., Gao |
8 |
2024-02-17 |
link |
Aligning Large Language Models by On-Policy Self-Judgment |
Lee, Sangkyu,..., Youngjae |
8 |
2023-11-16 |
link |
On the Impact of Calibration Data in Post-training Quantization and Pruning |
Williams, Miles,..., Nikolaos |
8 |
2023-11-16 |
link |
Where Do People Tell Stories Online? Story Detection Across Online Communities |
Antoniak, Maria,..., Andrew |
8 |
2024-03-11 |
link |
ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis |
Liu, Yanming,..., Xuhong |
7 |
2024-02-19 |
link |
Are LLM-based Evaluators Confusing NLG Quality Criteria? |
Hu, Xinyu,..., Xiaojun |
7 |
2024-02-14 |
link |
Spectral Filters, Dark Signals, and Attention Sinks |
Cancedda, Nicola |
7 |
2024-02-20 |
link |
Can Large Language Models be Good Emotional Supporter? Mitigating Preference Bias on Emotional Support Conversation |
Kang, Dongjin,..., Jinyoung |
7 |
2024-02-26 |
link |
LLMArena: Assessing Capabilities of Large Language Models in Dynamic Multi-Agent Environments |
Chen, Junzhe,..., Lijie |
7 |
2024-05-27 |
link |
DoRA: Enhancing Parameter-Efficient Fine-Tuning with Dynamic Rank Distribution |
Mao, Yulong,..., Jinan |
7 |
2024-03-04 |
link |
Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models |
Chen, Changyu,..., Yongbin |
7 |
2024-03-04 |
link |
To Generate or to Retrieve? On the Effectiveness of Artificial Contexts for Medical Open-Domain Question Answering |
Frisoni, Giacomo,..., Zaiqiao |
7 |
2023-11-30 |
link |
CritiqueLLM: Towards an Informative Critique Generation Model for Evaluation of Large Language Model Generation |
Ke, Pei,..., Minlie |
7 |
2024-02-28 |
link |
Meta-Task Prompting Elicits Embeddings from Large Language Models |
Lei, Yibin,..., Andrew |
7 |
2023-12-31 |
link |
BatchEval: Towards Human-like Text Evaluation |
Yuan, Peiwen,..., Kan |
7 |
2024-02-16 |
link |
Generative Cross-Modal Retrieval: Memorizing Images in Multimodal Language Models for Retrieval and Beyond |
Li, Yongqi,..., Tat-Seng |
7 |
2024-02-24 |
link |
Multimodal Instruction Tuning with Conditional Mixture of LoRA |
Shen, Ying,..., Lifu |
7 |
2024-02-21 |
link |
Analysing The Impact of Sequence Composition on Language Model Pre-Training |
Zhao, Yu,..., Pasquale |
7 |
2024-06-06 |
link |
Confabulation: The Surprising Value of Large Language Model Hallucinations |
Sui, Peiqi,..., Richard |
7 |
2024-02-23 |
link |
API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs |
Basu, Kinjal,..., Luis |
7 |
2024-01-12 |
link |
Don't Rank, Combine! Combining Machine Translation Hypotheses Using Quality Estimation |
Vernikos, Giorgos,..., Andrei |
7 |
2024-08-06 |
link |
Synthesizing Text-to-SQL Data from Weak and Strong LLMs |
Yang, Jiaxi,..., Chang |
7 |
2024-01-22 |
link |
Text Embedding Inversion Security for Multilingual Language Models |
Chen, Yiyi,..., Johannes |
7 |
2024-02-18 |
link |
Don't Go To Extremes: Revealing the Excessive Sensitivity and Calibration Limitations of LLMs in Implicit Hate Speech Detection |
Zhang, Min,..., Chang-Tien |
7 |
2024-07-25 |
link |
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning |
Wang, Tianduo,..., Wei |
7 |
2024-03-06 |
link |
IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators |
Paul, Indraneil,..., Iryna |
7 |
2024-02-16 |
link |
Large Language Models as Zero-shot Dialogue State Tracker through Function Calling |
Li, Zekun,..., Paul |
7 |
2023-05-09 |
link |
COKE: A Cognitive Knowledge Graph for Machine Theory of Mind |
Wu, Jincenzi,..., Minlie |
6 |
2023-11-14 |
link |
Predicting Text Preference Via Structured Comparative Reasoning |
Yan, Jing Nathan,..., Michael |
6 |
2024-02-18 |
link |
Multi-Task Inference: Can Large Language Models Follow Multiple Instructions at Once? |
Son, Guijin,..., Seungone |
6 |
2024-02-19 |
link |
Revisiting Knowledge Distillation for Autoregressive Language Models |
Zhong, Qihuang,..., Dacheng |
6 |
2024-04-06 |
link |
Context versus Prior Knowledge in Language Models |
Du, Kevin,..., Ryan |
6 |
2024-06-04 |
link |
Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs |
Cao, Zhiwei,..., Jinsong |
6 |
2024-03-28 |
link |
NaijaHate: Evaluating Hate Speech Detection on Nigerian Twitter Using Representative Data |
Tonneau, Manuel,..., Samuel |
6 |
2024-02-16 |
link |
Exploring Precision and Recall to assess the quality and diversity of LLMs |
Le Bronnec, Florian,..., Alexandre |
6 |
2024-02-19 |
link |
Direct Large Language Model Alignment Through Self-Rewarding Contrastive Prompt Distillation |
Liu, Aiwei,..., Lijie |
6 |
2024-02-20 |
link |
GumbelSoft: Diversified Language Model Watermarking via the GumbelMax-trick |
Fu, Jiayi,..., Yanghua |
6 |
2024-06-05 |
link |
Text-like Encoding of Collaborative Information in Large Language Models for Recommendation |
Zhang, Yang,..., Xiangnan |
6 |
2024-01-26 |
link |
PROXYQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models |
Tan, Haochen,..., Linqi |
6 |
2024-05-17 |
link |
Enhancing Dialogue State Tracking Models through LLM-backed User-Agents Simulation |
Niu, Cheng,..., Tong |
6 |
2024-02-19 |
link |
EmoBench: Evaluating the Emotional Intelligence of Large Language Models |
Sabour, Sahand,..., Minlie |
6 |
2024-02-14 |
link |
Towards Privacy-Aware Sign Language Translation at Scale |
Rust, Phillip,..., Jean |
6 |
2024-02-22 |
link |
Unveiling Linguistic Regions in Large Language Models |
Zhang, Zhihao,..., Xuanjing |
6 |
2024-01-12 |
link |
ViSAGe: A Global-Scale Analysis of Visual Stereotypes in Text-to-Image Generation |
Jha, Akshita,..., Sunipa |
6 |
2024-06-05 |
link |
Analyzing LLM Behavior in Dialogue Summarization: Unveiling Circumstantial Hallucination Trends |
Ramprasad, Sanjana,..., Zachary |
6 |
2024-02-13 |
link |
PreFLMR: Scaling Up Fine-Grained Late-Interaction Multi-modal Retrievers |
Lin, Weizhe,..., Bill |
6 |
2023-10-07 |
link |
Chat Vector: A Simple Approach to Equip LLMs with Instruction Following and Model Alignment in New Languages |
Huang, Shih-Cheng,..., Hung-yi |
6 |
2024-02-16 |
link |
Navigating the Dual Facets: A Comprehensive Evaluation of Sequential Memory Editing in Large Language Models |
Lin, Zihao,..., Lifu |
6 |
2023-10-30 |
link |
M4LE: A Multi-Ability Multi-Range Multi-Task Multi-Domain Long-Context Evaluation Benchmark for Large Language Models |
Kwan, Wai-Chung,..., Kam-Fai |
6 |
2024-02-26 |
link |
HealMe: Harnessing Cognitive Reframing in Large Language Models for Psychotherapy |
Xiao, Mengxi,..., Jimin |
6 |
2024-02-14 |
link |
DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning |
Wang, Yejie,..., Xunliang |
6 |
2024-02-18 |
link |
Benchmarking Knowledge Boundary for Large Language Model: A Different Perspective on Model Evaluation |
Yin, Xunjian,..., Xiaojun |
6 |
2024-04-09 |
link |
Cendol: Open Instruction-tuned Generative Large Language Models for Indonesian Languages |
Cahyawijaya, Samuel,..., Pascale |
6 |
2023-11-16 |
link |
RLHFPoison: Reward Poisoning Attack for Reinforcement Learning with Human Feedback in Large Language Models |
Wang, Jiongxiao,..., Chaowei |
6 |
2024-02-14 |
link |
MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech |
Ji, Shengpeng,..., Zhou |
6 |
2024-06-11 |
link |
A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Any Translation |
Ma, Zhengrui,..., Min |
6 |
None |
link |
AoE: Angle-optimized Embeddings for Semantic Textual Similarity |
Li, Xianming,..., Jing |
6 |
2024-03-09 |
link |
ItD: Large Language Models Can Teach Themselves Induction through Deduction |
Sun, Wangtao,..., Kang |
6 |
2024-04-23 |
link |
LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models |
Parmar, Mihir,..., Chitta |
6 |
2024-05-31 |
link |
Open Ko-LLM Leaderboard: Evaluating Large Language Models in Korean with Ko-H5 Benchmark |
Park, Chanjun,..., Hwalsuk |
6 |
2024-03-18 |
link |
Metaphor Understanding Challenge Dataset for LLMs |
Tong, Xiaoyu,..., Ekaterina |
6 |
2023-10-08 |
link |
MinPrompt: Graph-based Minimal Prompt Data Augmentation for Few-shot Question Answering |
Chen, Xiusi,..., Wei |
6 |
2023-12-20 |
link |
WaveCoder: Widespread And Versatile Enhancement For Code Large Language Models By Instruction Tuning |
Yu, Zhaojian,..., Qiufeng |
6 |
2024-02-15 |
link |
SportsMetrics: Blending Text and Numerical Data to Understand Information Fusion in LLMs |
Hu, Yebowen,..., Fei |
5 |
2024-05-24 |
link |
GPT is Not an Annotator: The Necessity of Human Annotation in Fairness Benchmark Construction |
Felkner, Virginia,..., Jonathan |
5 |
2024-02-16 |
link |
AFaCTA: Assisting the Annotation of Factual Claim Detection with Reliable LLM Annotators |
Ni, Jingwei,..., Markus |
5 |
2024-05-26 |
link |
M-RAG: Reinforcing Large Language Model Performance through Retrieval-Augmented Generation with Multiple Partitions |
Wang, Zheng,..., Wei |
5 |
2024-02-01 |
link |
What Does the Bot Say? Opportunities and Risks of Large Language Models in Social Media Bot Detection |
Feng, Shangbin,..., Yulia |
5 |
2024-06-06 |
link |
What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages |
Borenstein, Nadav,..., Ryan |
5 |
2024-03-07 |
link |
LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error |
Wang, Boshi,..., Yu |
5 |
2024-02-19 |
link |
Parallel Structures in Pre-training Data Yield In-Context Learning |
Chen, Yanda,..., He |
5 |
2024-02-18 |
link |
FactPICO: Factuality Evaluation for Plain Language Summarization of Medical Evidence |
Joseph, Sebastian,..., Junyi Jessy |
5 |
2023-12-27 |
link |
Prompt Expansion for Adaptive Text-to-Image Generation |
Datta, Siddhartha,..., Peter |
5 |
2023-12-13 |
link |
Learn or Recall? Revisiting Incremental Learning with Pre-trained Language Models |
Zheng, Junhao,..., Qianli |
5 |
2024-01-19 |
link |
LangBridge: Multilingual Reasoning Without Multilingual Supervision |
Yoon, Dongkeun,..., Minjoon |
5 |
2023-10-05 |
link |
Expedited Training of Visual Conditioned Language Generation via Redundancy Reduction |
Jian, Yiren,..., Hongxia |
5 |
2024-02-20 |
link |
Interpreting Conversational Dense Retrieval by Rewriting-Enhanced Inversion of Session Embedding |
Cheng, Yiruo,..., Zhicheng |
5 |
2023-12-20 |
link |
Retrieval-augmented Multilingual Knowledge Editing |
Wang, Weixuan,..., Alexandra |
5 |
2024-05-28 |
link |
Long Context is Not Long at All: A Prospector of Long-Dependency Data for Large Language Models |
Chen, Longze,..., Min |
5 |
2023-11-16 |
link |
FinanceMath: Knowledge-Intensive Math Reasoning in Finance Domains |
Zhao, Yilun,..., Arman |
5 |
2024-03-13 |
link |
Generative Pretrained Structured Transformers: Unsupervised Syntactic Language Models at Scale |
Hu, Xiang,..., Kewei |
5 |
2024-05-30 |
link |
Dataflow-Guided Retrieval Augmentation for Repository-Level Code Completion |
Cheng, Wei,..., Wei |
5 |
2024-02-20 |
link |
The Hidden Space of Transformer Language Adapters |
Alabi, Jesujoba,..., Mor |
5 |
2024-08-07 |
link |
NACL: A General and Effective KV Cache Eviction Framework for LLM at Inference Time |
Chen, Yilong,..., Hua |
5 |
2023-10-16 |
link |
On Context Utilization in Summarization with Large Language Models |
Ravaut, Mathieu,..., Shafiq |
5 |
2024-02-24 |
link |
PRoLoRA: Partial Rotation Empowers More Parameter-Efficient LoRA |
Wang, Sheng,..., Chuan |
5 |
2024-03-15 |
link |
EXAMS-V: A Multi-Discipline Multilingual Multimodal Exam Benchmark for Evaluating Vision Language Models |
Das, Rocktim,..., Preslav |
5 |
2024-02-19 |
link |
Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing? |
Gaido, Marco,..., Luisa |
5 |
2023-05-22 |
link |
CopyNE: Better Contextual ASR by Copying Named Entities |
Zhou, Shilin,..., Baoxing |
5 |
2024-02-19 |
link |
MARS: Meaning-Aware Response Scoring for Uncertainty Estimation in Generative LLMs |
Bakman, Yavuz Faruk,..., Salman |
5 |
2024-07-02 |
link |
Integrate the Essence and Eliminate the Dross: Fine-Grained Self-Consistency for Free-Form Language Generation |
Wang, Xinglin,..., Kan |
5 |
2023-10-02 |
link |
Probing the Multi-turn Planning Capabilities of LLMs via 20 Question Games |
Zhang, Yizhe,..., Navdeep |
5 |
2024-03-05 |
link |
OPEx: A Component-Wise Analysis of LLM-Centric Agents in Embodied Instruction Following |
Shi, Haochen,..., Bang |
5 |
2024-02-06 |
link |
Tuning Large Multimodal Models for Videos using Reinforcement Learning from AI Feedback |
Ahn, Daechul,..., Jonghyun |
5 |
2024-01-25 |
link |
RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization |
J, Jaavid,..., Anoop |
5 |
2024-01-09 |
link |
MERA: A Comprehensive LLM Evaluation in Russian |
Fenogenova, Alena,..., Sergey |
5 |
2024-02-28 |
link |
Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation |
Xu, Shicheng,..., Jie |
5 |
2023-10-11 |
link |
Parrot: Enhancing Multi-Turn Instruction Following for Large Language Models |
Sun, Yuchong,..., Kun |
5 |
2024-02-15 |
link |
Grounding Language Model with Chunking-Free In-Context Retrieval |
Qian, Hongjin,..., Zhicheng |
5 |
2024-06-13 |
link |
ECBD: Evidence-Centered Benchmark Design for NLP |
Liu, Yu Lu,..., Ziang |
5 |
2023-10-13 |
link |
EasyGen: Easing Multimodal Generation with BiDiffuser and LLMs |
Zhao, Xiangyu,..., Xiao-Ming |
5 |
2023-11-16 |
link |
Mitigating Biases for Instruction-following Language Models via Bias Neurons Elimination |
Yang, Nakyeong,..., Kyomin |
5 |
2023-12-21 |
link |
T-Eval: Evaluating the Tool Utilization Capability of Large Language Models Step by Step |
Chen, Zehui,..., Feng |
5 |
2023-11-14 |
link |
Forgetting before Learning: Utilizing Parametric Arithmetic for Knowledge Updating in Large Language Models |
Ni, Shiwen,..., Min |
5 |
2023-11-29 |
link |
CLOMO: Counterfactual Logical Modification with Large Language Models |
Huang, Yinya,..., Linqi |
5 |
2024-06-18 |
link |
An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs |
Rai, Daking,..., Ziyu |
5 |
2024-01-16 |
link |
SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language Models |
Zhao, Weixiang,..., Wanxiang |
5 |
2024-02-28 |
link |
Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems in Commonsense Reasoning |
Li, Jiachun,..., Jun |
5 |
2024-02-15 |
link |
Answer is All You Need: Instruction-following Text Embedding via Answering the Question |
Peng, Letian,..., Jingbo |
5 |
2024-06-19 |
link |
Factual Confidence of LLMs: on Reliability and Robustness of Current Estimators |
Mahaut, Matéo,..., Lluis |
5 |
2024-06-08 |
link |
Planning Like Human: A Dual-process Framework for Dialogue Planning |
He, Tao,..., Bing |
4 |
2024-02-19 |
link |
IMBUE: Improving Interpersonal Effectiveness through Simulation and Just-in-time Feedback with Human-Language Model Interaction |
Lin, Inna,..., Tim |
4 |
2024-03-25 |
link |
An Expert is Worth One Token: Synergizing Multiple Expert LLMs as Generalist via Expert Token Routing |
Chai, Ziwei,..., Yang |
4 |
2023-12-05 |
link |
Prompt Optimization via Adversarial In-Context Learning |
Long, Do,..., Junxian |
4 |
2024-06-12 |
link |
Multimodal Table Understanding |
Zheng, Mingyu,..., Weiping |
4 |
2024-01-19 |
link |
StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion |
Wang, Zhichao,..., Yuping |
4 |
2024-02-23 |
link |
Interactive-KBQA: Multi-Turn Interactions for Knowledge Base Question Answering with Large Language Models |
Xiong, Guanming,..., Wen |
4 |
2023-10-10 |
link |
Quality-Aware Translation Models: Efficient Generation and Quality Estimation in a Single Model |
Tomani, Christian,..., Daniel |
4 |
2024-05-21 |
link |
G-DIG: Towards Gradient-based DIverse and hiGh-quality Instruction Data Selection for Machine Translation |
Pan, Xingyuan,..., Shanbo |
4 |
2024-06-06 |
link |
What Do Language Models Learn in Context? The Structured Task Hypothesis |
Li, Jiaoda,..., Ryan |
4 |
2024-02-23 |
link |
ToMBench: Benchmarking Theory of Mind in Large Language Models |
Chen, Zhuang,..., Minlie |
4 |
2024-02-19 |
link |
A synthetic data approach for domain generalization of NLI models |
Hosseini, Mohammad Javad,..., Annie |
4 |
2024-08-20 |
link |
Dr.Academy: A Benchmark for Evaluating Questioning Capability in Education for Large Language Models |
Chen, Yuyan,..., Yanghua |
4 |
2024-06-03 |
link |
Probing Language Models for Pre-training Data Detection |
Liu, Zhenhua,..., Wenliang |
4 |
2024-03-09 |
link |
Diffusion Lens: Interpreting Text Encoders in Text-to-Image Pipelines |
Toker, Michael,..., Yonatan |
4 |
2023-11-13 |
link |
WaterBench: Towards Holistic Evaluation of Watermarks for Large Language Models |
Tu, Shangqing,..., Juanzi |
4 |
2024-05-17 |
link |
Language Models can Exploit Cross-Task In-context Learning for Data-Scarce Novel Tasks |
Chatterjee, Anwoy,..., Tanmoy |
4 |
2023-06-28 |
link |
Pareto Optimal Learning for Estimating Large Language Model Errors |
Zhao, Theodore,..., Hoifung |
4 |
2024-07-01 |
link |
IBSEN: Director-Actor Agent Collaboration for Controllable and Interactive Drama Script Generation |
Han, Senyu,..., Kai |
4 |
2023-11-15 |
link |
Temperature-scaling surprisal estimates improve fit to human reading times - but does it do so for the "right reasons"? |
Liu, Tong,..., Vera |
4 |
2023-12-12 |
link |
Safety Alignment in NLP Tasks: Weakly Aligned Summarization as an In-Context Attack |
Fu, Yu,..., Yue |
4 |
2024-04-15 |
link |
Is Table Retrieval a Solved Problem? Exploring Join-Aware Multi-Table Retrieval |
Chen, Peter Baile,..., Dan |
4 |
2024-06-04 |
link |
Analyzing Temporal Complex Events with Large Language Models? A Benchmark towards Temporal, Long Context Understanding |
Zhang, Zhihan,..., Tat-Seng |
4 |
2024-02-19 |
link |
Enhancing Multilingual Capabilities of Large Language Models through Self-Distillation from Resource-Rich Languages |
Zhang, Yuanchi,..., Yang |
4 |
2023-12-13 |
link |
Fine-Grained Image-Text Alignment in Medical Imaging Enables Explainable Cyclic Image-Report Generation |
Chen, Wenting,..., Yixuan |
4 |
2023-11-11 |
link |
LLMs Learn Task Heuristics from Demonstrations: A Heuristic-Driven Prompting Strategy for Document-Level Event Argument Extraction |
Zhou, Hanzhang,..., Kezhi |
4 |
2023-11-15 |
link |
Disinformation Capabilities of Large Language Models |
Vykopal, Ivan,..., Maria |
4 |
2024-05-21 |
link |
SirLLM: Streaming Infinite Retentive LLM |
Yao, Yao,..., Hai |
4 |
2024-02-20 |
link |
Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning |
Mondorf, Philipp,..., Barbara |
4 |
2024-02-24 |
link |
ListT5: Listwise Reranking with Fusion-in-Decoder Improves Zero-shot Retrieval |
Yoon, Soyoung,..., Seung-won |
4 |
2023-11-16 |
link |
DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding Long and Specialized Documents |
Zhao, Yilun,..., Arman |
4 |
2024-03-18 |
link |
QueryAgent: A Reliable and Efficient Reasoning Framework with Environmental Feedback based Self-Correction |
Huang, Xiang,..., Yuzhong |
4 |
2024-03-21 |
link |
XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception |
Han, HyoJung,..., Changhan |
4 |
2024-06-03 |
link |
Strengthened Symbol Binding Makes Large Language Models Reliable Multiple-Choice Selectors |
Xue, Mengge,..., Chengguo |
4 |
2022-11-16 |
link |
CSCD-NS: a Chinese Spelling Check Dataset for Native Speakers |
Hu, Yong,..., Jie |
4 |
2024-02-26 |
link |
What Do Language Models Hear? Probing for Auditory Representations in Language Models |
Ngo, Jerry,..., Yoon |
4 |
2023-11-16 |
link |
PixT3: Pixel-based Table To Text generation |
Alonso, Iñigo,..., Mirella |
4 |
2023-08-29 |
link |
SwapMoE: Serving Off-the-shelf MoE-based Large Language Models with Tunable Memory Budget |
Kong, Rui,..., Yunxin |
4 |
2023-11-13 |
link |
Explanation-aware Soft Ensemble Empowers Large Language Model In-context Learning |
Yu, Yue,..., Michael |
4 |
2024-02-21 |
link |
CODIS: Benchmarking Context-Dependent Visual Comprehension for Multimodal Large Language Models |
Luo, Fuwen,..., Yang |
4 |
2024-03-03 |
link |
WARDEN: Multi-Directional Backdoor Watermarks for Embedding-as-a-Service Copyright Protection |
Shetty, Anudeex,..., Qiongkai |
4 |
None |
link |
Jailbreak Open-Sourced Large Language Models via Enforced Decoding |
Zhang, Hangfan,..., Dinghao |
4 |
None |
link |
DeCoT: Debiasing Chain-of-Thought for Knowledge-Intensive Tasks in Large Language Models via Causal Intervention |
Wu, Junda,..., Julian |
4 |
2024-01-24 |
link |
UNIMO-G: Unified Image Generation through Multimodal Conditional Diffusion |
Li, Wei,..., Xinyan |
4 |
2024-04-29 |
link |
Analyzing Semantic Change through Lexical Replacements |
Periti, Francesco,..., Nina |
4 |
2024-01-29 |
link |
InfoLossQA: Characterizing and Recovering Information Loss in Text Simplification |
Trienes, Jan,..., Junyi Jessy |
4 |
2024-02-28 |
link |
Small But Funny: A Feedback-Driven Approach to Humor Distillation |
Ravi, Sahithya,..., Arash |
4 |
2024-04-05 |
link |
Benchmarking and Improving Compositional Generalization of Multi-aspect Controllable Text Generation |
Zhong, Tianqi,..., Zhendong |
4 |
2024-03-19 |
link |
Interpretable User Satisfaction Estimation for Conversational Systems with Large Language Models |
Lin, Ying-Chun,..., Jaime |
4 |
2023-06-14 |
link |
Babel-ImageNet: Massively Multilingual Evaluation of Vision-and-Language Representations |
Geigle, Gregor,..., Goran |
4 |
2023-12-25 |
link |
Instruction Fusion: Advancing Prompt Evolution through Hybridization |
Guo, Weidong,..., Di |
4 |
2023-07-11 |
link |
Lightweight reranking for language model generations |
Jain, Siddhartha,..., Bing |
3 |
2024-06-07 |
link |
Uncertainty Aware Learning for Language Model Alignment |
Wang, Yikun,..., Dacheng |
3 |
2024-08-19 |
link |
TaSL: Continual Dialog State Tracking via Task Skill Localization and Consolidation |
Feng, Yujie,..., Xiao-Ming |
3 |
2024-03-06 |
link |
A Modular Approach for Multimodal Summarization of TV Shows |
Mahon, Louis,..., Mirella |
3 |
2024-02-19 |
link |
NEO-BENCH: Evaluating Robustness of Large Language Models with Neologisms |
Zheng, Jonathan,..., Wei |
3 |
2024-02-19 |
link |
Acquiring Clean Language Models from Backdoor Poisoned Datasets by Downscaling Frequency Space |
Wu, Zongru,..., Gongshen |
3 |
2024-05-01 |
link |
CofiPara: A Coarse-to-fine Paradigm for Multimodal Sarcasm Target Identification with Large Multimodal Models |
Chen, Zixin,..., Guang |
3 |
2024-02-12 |
link |
Label-Efficient Model Selection for Text Generation |
Ashury Tahan, Shir,..., Eyal |
3 |
2024-01-26 |
link |
Unlearning Traces the Influential Training Data of Language Models |
Isonuma, Masaru,..., Ivan |
3 |
2024-06-08 |
link |
MemeGuard: An LLM and VLM-based Framework for Advancing Content Moderation via Meme Intervention |
Jha, Prince,..., Pushpak |
3 |
2024-05-16 |
link |
Robust Singing Voice Transcription Serves Synthesis |
Li, Ruiqi,..., Zhou |
3 |
2024-06-10 |
link |
FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model |
Lee, Yebin,..., Myungjoo |
3 |
2024-02-16 |
link |
Generalizability of Mixture of Domain-Specific Adapters from the Lens of Signed Weight Directions and its Application to Effective Model Pruning |
Nguyen, Tuc,..., Thai |
3 |
None |
link |
REANO: Optimising Retrieval-Augmented Reader Models through Knowledge Graph Generation |
Fang, Jinyuan,..., Craig |
3 |
2023-11-15 |
link |
Few-shot Transfer Learning for Knowledge Base Question Answering: Fusing Supervised Models with In-Context Learning |
Patidar, Mayur,..., Indrajit |
3 |
2024-05-21 |
link |
Limits of Theory of Mind Modelling in Dialogue-Based Collaborative Plan Acquisition |
Bortoletto, Matteo,..., Andreas |
3 |
2024-02-19 |
link |
Investigating Multi-Hop Factual Shortcuts in Knowledge Editing of Large Language Models |
Ju, Tianjie,..., Gongshen |
3 |
2024-06-10 |
link |
Interpretability of Language Models via Task Spaces |
Weber, Lucas,..., Dieuwke |
3 |
2024-01-10 |
link |
I am a Strange Dataset: Metalinguistic Tests for Language Models |
Thrush, Tristan,..., Douwe |
3 |
2024-02-13 |
link |
Towards Faithful and Robust LLM Specialists for Evidence-Based Question-Answering |
Schimanski, Tobias,..., Markus |
3 |
2024-01-09 |
link |
Rewriting the Code: A Simple Method for Large Language Model Augmented Code Search |
Li, Haochen,..., Zhiqi |
3 |
2024-06-12 |
link |
TasTe: Teaching Large Language Models to Translate through Self-Reflection |
Wang, Yutong,..., Min |
3 |
2024-05-30 |
link |
The Fine-Tuning Paradox: Boosting Translation Quality Without Sacrificing LLM Abilities |
Stap, David,..., Ke |
3 |
2023-11-08 |
link |
Speech language models lack important brain-relevant semantics |
Oota, Subba Reddy,..., Mariya |
3 |
2024-01-22 |
link |
Blinded by Generated Contexts: How Language Models Merge Generated and Retrieved Contexts When Knowledge Conflicts? |
Tan, Hexiang,..., Xueqi |
3 |
2024-03-04 |
link |
VariErr NLI: Separating Annotation Error from Human Label Variation |
Weber-Genzel, Leon,..., Barbara |
3 |
2024-04-14 |
link |
Text-to-Song: Towards Controllable Music Generation Incorporating Vocals and Accompaniment |
Hong, Zhiqing,..., Zhimeng |
3 |
2023-11-15 |
link |
Multistage Collaborative Knowledge Distillation from a Large Language Model for Semi-Supervised Sequence Generation |
Zhao, Jiachen,..., Andrew |
3 |
2024-02-27 |
link |
Benchmarking Data Science Agents |
Zhang, Yuge,..., Kan |
3 |
2024-02-16 |
link |
Linear Transformers with Learnable Kernel Functions are Better In-Context Models |
Aksenov, Yaroslav,..., Daniil |
3 |
2023-11-29 |
link |
TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models |
Chu, Zheng,..., Bing |
3 |
2024-08-06 |
link |
Making Long-Context Language Models Better Multi-Hop Reasoners |
Li, Yanyang,..., Liwei |
3 |
2024-06-10 |
link |
HOLMES: Hyper-Relational Knowledge Graphs for Multi-hop Question Answering using LLMs |
Panda, Pranoy,..., Prathosh |
3 |
None |
link |
TaPERA: Enhancing Faithfulness and Interpretability in Long-Form Table QA by Content Planning and Execution-based Reasoning |
Zhao, Yilun,..., Chen |
3 |
2024-06-30 |
link |
Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models |
Zhong, Weihong,..., Bing |
3 |
2024-03-07 |
link |
Classist Tools: Social Class Correlates with Performance in NLP |
Curry, Amanda,..., Dirk |
3 |
2024-07-31 |
link |
Tree-of-Traversals: A Zero-Shot Reasoning Algorithm for Augmenting Black-box Language Models with Knowledge Graphs |
Markowitz, Elan,..., Aram |
3 |
None |
link |
EZ-STANCE: A Large Dataset for English Zero-Shot Stance Detection |
Zhao, Chenye,..., Cornelia |
3 |
2024-02-23 |
link |
Unlocking the Power of Large Language Models for Entity Alignment |
Jiang, Xuhui,..., Yuanzhuo |
3 |
2024-06-04 |
link |
mCoT: Multilingual Instruction Tuning for Reasoning Consistency in Language Models |
Lai, Huiyuan,..., Malvina |
3 |
2023-12-07 |
link |
Simul-LLM: A Framework for Exploring High-Quality Simultaneous Translation with Large Language Models |
Agostinelli, Victor,..., Lizhong |
3 |
2024-06-06 |
link |
VISTA: Visualized Text Embedding For Universal Multi-Modal Retrieval |
Zhou, Junjie,..., Yongping |
3 |
2024-01-29 |
link |
Muffin or Chihuahua? Challenging Multimodal Large Language Models with Multipanel VQA |
Fan, Yue,..., Xin |
3 |
2023-12-15 |
link |
Marathon: A Race Through the Realm of Long Context with Large Language Models |
Zhang, Lei,..., Min |
3 |
2024-06-18 |
link |
AutoDSL: Automated domain-specific language design for structural representation of procedures with constraints |
Shi, Yu-Zhe,..., Qining |
3 |
2024-03-06 |
link |
The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models |
Bhaskar, Adithya,..., Danqi |
3 |
2024-06-06 |
link |
ABEX: Data Augmentation for Low-Resource NLU via Expanding Abstract Descriptions |
Ghosh, Sreyan,..., Dinesh |
3 |
None |
link |
Llama2Vec: Unsupervised Adaptation of Large Language Models for Dense Retrieval |
Li, Chaofan,..., Defu |
3 |
None |
link |
DocLens: Multi-aspect Fine-grained Medical Text Evaluation |
Xie, Yiqing,..., Carolyn |
3 |
2024-06-21 |
link |
Word Matters: What Influences Domain Adaptation in Summarization? |
Li, Yinghao,..., Yang |
3 |
2024-06-25 |
link |
MPCODER: Multi-user Personalized Code Generator with Explicit and Implicit Style Representation Learning |
Dai, Zhenlong,..., Jingyuan |
3 |
2023-10-08 |
link |
DeVAn: Dense Video Annotation for Video-Language Models |
Liu, Tingkai,..., Hongxia |
3 |
2024-06-06 |
link |
Causal Estimation of Memorisation Profiles |
Lesci, Pietro,..., Tiago |
3 |
2023-03-06 |
link |
XCodeEval: An Execution-based Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval |
Khan, Mohammad Abdullah Matin,..., Shafiq |
3 |
2024-03-05 |
link |
Improving Event Definition Following For Zero-Shot Event Detection |
Cai, Zefan,..., Nanyun |
3 |
2024-01-12 |
link |
Structsum Generation for Faster Text Comprehension |
Jain, Parag,..., Francesco |
3 |
2024-05-16 |
link |
Generating Coherent Sequences of Visual Illustrations for Real-World Manual Tasks |
Bordalo, João,..., Joao |
3 |
2024-01-18 |
link |
Beyond Traditional Benchmarks: Analyzing Behaviors of Open LLMs on Data-to-Text Generation |
Kasner, Zdeněk,..., Ondrej |
3 |
2024-02-08 |
link |
Transparent and Scrutable Recommendations Using Natural Language User Profiles |
Ramos, Jerome,..., Aldo |
3 |
None |
link |
Fundamental Capabilities of Large Language Models and their Applications in Domain Scenarios: A Survey |
Li, Jiawei,..., Heyan |
3 |
2024-05-25 |
link |
Confidence Under the Hood: An Investigation into the Confidence-Probability Alignment in Large Language Models |
Kumar, Abhishek,..., Ali |
3 |
2024-06-02 |
link |
Deciphering Oracle Bone Language with Diffusion Models |
Guan, Haisu,..., Yuliang |
3 |
2024-05-28 |
link |
ConSiDERS-The-Human Evaluation Framework: Rethinking Human Evaluation for Generative Large Language Models |
Elangovan, Aparna,..., Dan |
3 |
2024-02-18 |
link |
One Prompt To Rule Them All: LLMs for Opinion Summary Evaluation |
Siledar, Tejpalsingh,..., Nikesh |
2 |
2024-05-07 |
link |
Toward In-Context Teaching: Adapting Examples to Students' Misconceptions |
Ross, Alexis,..., Jacob |
2 |
2023-08-25 |
link |
Chunk, Align, Select: A Simple Long-sequence Processing Method for Transformers |
Xie, Jiawen,..., Nan |
2 |
2023-09-15 |
link |
ICLEF: In-Context Learning with Expert Feedback for Explainable Style Transfer |
Saakyan, Arkadiy,..., Smaranda |
2 |
2024-02-11 |
link |
Through the Lens of Split Vote: Exploring Disagreement, Difficulty and Calibration in Legal Case Outcome Classification |
Xu, Shanshan,..., Matthias |
2 |
2024-06-01 |
link |
Multi-Dimensional Optimization for Text Summarization via Reinforcement Learning |
Ryu, Sangwon,..., Jungseul |
2 |
2024-06-04 |
link |
Multimodal Reasoning with Multimodal Knowledge Graph |
Lee, Junlin,..., Min |
2 |
2024-01-24 |
link |
SEER: Facilitating Structured Reasoning and Explanation via Reinforcement Learning |
Chen, Guoxin,..., Yiming |
2 |
2024-05-31 |
link |
Re3: A Holistic Framework and Dataset for Modeling Collaborative Document Revision |
Ruan, Qian,..., Iryna |
2 |
2024-05-30 |
link |
Multi-Aspect Controllable Text Generation with Disentangled Counterfactual Augmentation |
Liu, Yi,..., Wei |
2 |
None |
link |
Self-chats from Large Language Models Make Small Emotional Support Chatbot Better |
Zheng, Zhonghua,..., Liqiang |
2 |
2023-12-18 |
link |
Split and Rephrase with Large Language Models |
Ponce, David,..., Harritxu |
2 |
2024-06-28 |
link |
BeamAggR: Beam Aggregation Reasoning over Multi-source Knowledge for Multi-hop Question Answering |
Chu, Zheng,..., Bing |
2 |
2023-08-09 |
link |
VulLibGen: Generating Names of Vulnerability-Affected Packages via a Large Language Model |
Chen, Tianyu,..., Tao |
2 |
2024-06-07 |
link |
A Deep Dive into the Trade-Offs of Parameter-Efficient Preference Alignment Techniques |
Thakkar, Megh,..., Sarath |
2 |
2024-05-17 |
link |
Feature-Adaptive and Data-Scalable In-Context Learning |
Li, Jiahao,..., Zhendong |
2 |
2024-06-07 |
link |
Generative Explore-Exploit: Training-free Optimization of Generative Recommender Systems using LLM Optimizers |
Senel, Lütfi Kerem,..., Shervin |
2 |
2024-04-10 |
link |
Learn from Failure: Fine-Tuning LLMs with Trial-and-Error Data for Intuitionistic Propositional Logic Proving |
An, Chenyang,..., Jingbo |
2 |
2024-05-22 |
link |
Synchronized Video Storytelling: Generating Video Narrations with Structured Storyline |
Yang, Dingyi,..., Qin |
2 |
2024-03-05 |
link |
Eliciting Better Multilingual Structured Reasoning from LLMs through Code |
Li, Bryan,..., Saab |
2 |
2024-02-09 |
link |
NICE: To Optimize In-Context Examples or Not? |
Srivastava, Pragya,..., Amit |
2 |
2024-07-31 |
link |
Maverick: Efficient and Accurate Coreference Resolution Defying Recent Trends |
Martinelli, Giuliano,..., Roberto |
2 |
None |
link |
Make-A-Voice: Revisiting Voice Large Language Models as Scalable Multilingual and Multitask Learners |
Huang, Rongjie,..., Dong |
2 |
2023-05-12 |
link |
Synergistic Interplay between Search and Large Language Models for Information Retrieval |
Feng, Jiazhan,..., Daxin |
2 |
2024-02-19 |
link |
Browse and Concentrate: Comprehending Multimodal Content via prior-LLM Context Fusion |
Wang, Ziyue,..., Yang |
2 |
2024-02-20 |
link |
Model Composition for Multimodal Large Language Models |
Chen, Chi,..., Yang |
2 |
2024-07-07 |
link |
IL-TUR: Benchmark for Indian Legal Text Understanding and Reasoning |
Joshi, Abhinav,..., Ashutosh |
2 |
2024-02-29 |
link |
NewsBench: A Systematic Evaluation Framework for Assessing Editorial Capabilities of Large Language Models in Chinese Journalism |
Li, Miao,..., Yi |
2 |
2023-11-15 |
link |
Never Lost in the Middle: Mastering Long-Context Question Answering with Position-Agnostic Decompositional Training |
He, Junqing,..., Jiaxing |
2 |
2024-03-14 |
link |
TaxoLLaMA: WordNet-based Model for Solving Multiple Lexical Semantic Tasks |
Moskvoretskii, Viktor,..., Irina |
2 |
2024-06-05 |
link |
Using Synchronic Definitions and Semantic Relations to Classify Semantic Change Types |
Cassotti, Pierluigi,..., Nina |
2 |
2023-05-23 |
link |
DAPR: A Benchmark on Document-Aware Passage Retrieval |
Wang, Kexin,..., Iryna |
2 |
2023-09-15 |
link |
CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending |
Zhu, Shiyi,..., Jianguo |
2 |
2024-04-29 |
link |
Revealing the Parametric Knowledge of Language Models: A Unified Framework for Attribution Methods |
Yu, Haeun,..., Isabelle |
2 |
2024-06-12 |
link |
To be Continuous, or to be Discrete, Those are Bits of Questions |
Wang, Yiran,..., Masao |
2 |
2024-05-20 |
link |
A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI: The First Romanian Natural Language Inference Corpus |
Poesina, Eduard,..., Radu |
2 |
None |
link |
Rethinking Task-Oriented Dialogue Systems: From Complex Modularity to Zero-Shot Autonomous Agent |
Xu, Heng-Da,..., Heyan |
2 |
2023-08-21 |
link |
PlatoLM: Teaching LLMs in Multi-Round Dialogue via a User Simulator |
Kong, Chuyi,..., Benyou |
2 |
2024-03-01 |
link |
EUROPA: A Legal Multilingual Keyphrase Generation Dataset |
Salaün, Olivier,..., Philippe |
2 |
2024-01-20 |
link |
STICKERCONV: Generating Multimodal Empathetic Responses from Scratch |
Zhang, Yiqun,..., Kaisong |
2 |
None |
link |
VisDiaHalBench: A Visual Dialogue Benchmark For Diagnosing Hallucination in Large Vision-Language Models |
Cao, Qingxing,..., Liang |
2 |
2024-02-24 |
link |
HD-Eval: Aligning Large Language Model Evaluators Through Hierarchical Criteria Decomposition |
Liu, Yuxuan,..., Qi |
2 |
2024-03-21 |
link |
Multi-Level Feedback Generation with Large Language Models for Empowering Novice Peer Counselors |
Chaszczewicz, Alicja,..., Diyi |
2 |
2023-11-15 |
link |
MAVEN-Arg: Completing the Puzzle of All-in-One Event Understanding Dataset with Event Argument Annotation |
Wang, Xiaozhi,..., Juanzi |
2 |
2024-05-23 |
link |
ChronosLex: Time-aware Incremental Training for Temporal Generalization of Legal Classification Tasks |
T.y.s.s, Santosh,..., Matthias |
2 |
2024-02-28 |
link |
An Iterative Associative Memory Model for Empathetic Response Generation |
Yang, Zhou,..., Xiangwen |
2 |
2024-02-10 |
link |
Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for Dialogue |
Wang, Jian,..., Xiaoyong |
2 |
2024-05-16 |
link |
Timeline-based Sentence Decomposition with In-Context Learning for Temporal Fact Extraction |
Chen, Jianhao,..., Yuzhong |
2 |
None |
link |
Robust Frame-Semantic Models with Lexical Unit Trees and Negative Samples |
Devasier, Jacob,..., Chengkai |
2 |
2024-02-27 |
link |
Enhancing EEG-to-Text Decoding through Transferable Representations from Pre-trained Contrastive EEG-Text Masked Autoencoder |
Wang, Jiaqi,..., Zhiguo |
2 |
2024-06-06 |
link |
To Distill or Not to Distill? On the Robustness of Robust Knowledge Distillation |
Waheed, Abdul,..., Muhammad |
2 |
None |
link |
A Multi-Task Embedder For Retrieval Augmented LLMs |
Zhang, Peitian,..., Jian-Yun |
2 |
2024-02-17 |
link |
Language Models Don't Learn the Physical Manifestation of Language |
Lee, Bruce,..., Jaehyuk |
2 |
2024-05-26 |
link |
MentalManip: A Dataset For Fine-grained Analysis of Mental Manipulation in Conversations |
Wang, Yuxin,..., Soroush |
2 |
2023-11-16 |
link |
WatME: Towards Lossless Watermarking Through Lexical Redundancy |
Chen, Liang,..., Kam-Fai |
2 |
2024-07-20 |
link |
Hard Prompts Made Interpretable: Sparse Entropy Regularization for Prompt Tuning with RL |
Choi, Yunseon,..., Kee-Eung |
2 |
2024-05-19 |
link |
Your Transformer is Secretly Linear |
Razzhigaev, Anton,..., Andrey |
2 |
2024-01-15 |
link |
Uncovering the Full Potential of Visual Grounding Methods in VQA |
Reich, Daniel,..., Tanja |
2 |
None |
link |
Enhancing Explainable Rating Prediction through Annotated Macro Concepts |
Zhou, Huachi,..., Xiao |
2 |
2024-06-17 |
link |
AI "News" Content Farms Are Easy to Make and Hard to Detect: A Case Study in Italian |
Puccetti, Giovanni,..., Andrea |
2 |
None |
link |
Soft Knowledge Prompt: Help External Knowledge Become a Better Teacher to Instruct LLM in Knowledge-based VQA |
Wang, Qunbo,..., Jing |
2 |
2024-03-21 |
link |
Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations |
Sun, Jiaxing,..., Conghui |
2 |
2023-11-14 |
link |
MC²: Towards Transparent and Culturally-Aware NLP for Minority Languages in China |
Zhang, Chen,..., Yansong |
2 |
2024-07-01 |
link |
Mobile-Bench: An Evaluation Benchmark for LLM-based Mobile Agents |
Deng, Shihan,..., Shuo |
2 |
2024-03-16 |
link |
Deciphering Hate: Identifying Hateful Memes and Their Targets |
Hossain, Eftekhar,..., Sarah Masud |
2 |
None |
link |
DM-BLI: Dynamic Multiple Subspaces Alignment for Unsupervised Bilingual Lexicon Induction |
Hu, Ling,..., Yuemei |
2 |
2024-02-20 |
link |
Modality-Aware Integration with Large Language Models for Knowledge-based Visual Question Answering |
Dong, Junnan,..., Xiao |
2 |
2024-05-16 |
link |
FinTextQA: A Dataset for Long-form Financial Question Answering |
Chen, Jian,..., Junwei |
2 |
2024-05-28 |
link |
Detection-Correction Structure via General Language Model for Grammatical Error Correction |
Li, Wei,..., Houfeng |
2 |
2024-06-03 |
link |
Are AI-Generated Text Detectors Robust to Adversarial Perturbations? |
Huang, Guanhua,..., Zhouwang |
2 |
2024-05-21 |
link |
Unlocking Data-free Low-bit Quantization with Matrix Decomposition for KV Cache Compression |
Liu, Peiyu,..., Ji-Rong |
2 |
2024-09-22 |
link |
SAC-KG: Exploiting Large Language Models as Skilled Automatic Constructors for Domain Knowledge Graph |
Chen, Hanzhu,..., Jieping |
2 |
2024-02-28 |
link |
VerifiNER: Verification-augmented NER via Knowledge-grounded Reasoning with Large Language Models |
Kim, Seoyeon,..., Dongha |
2 |
2024-05-20 |
link |
CLAMBER: A Benchmark of Identifying and Clarifying Ambiguous Information Needs in Large Language Models |
Zhang, Tong,..., Tat-Seng |
2 |
2024-08-25 |
link |
Guardians of the Machine Translation Meta-Evaluation: Sentinel Metrics Fall In! |
Perrella, Stefano,..., Roberto |
2 |
2024-01-15 |
link |
Selene: Pioneering Automated Proof in Software Verification |
Zhang, Lichen,..., Nan |
1 |
None |
link |
NounAtlas: Filling the Gap in Nominal Semantic Role Labeling |
Navigli, Roberto,..., Alessandro |
1 |
2024-02-16 |
link |
Exploring Hybrid Question Answering via Program-based Prompting |
Shi, Qi,..., Ting |
1 |
None |
link |
Hide and Seek in Noise Labels: Noise-Robust Collaborative Active Learning with LLMs-Powered Assistance |
Yuan, Bo,..., Wei |
1 |
2024-03-07 |
link |
Measuring Meaning Composition in the Human Brain with Composition Scores from Large Language Models |
Gao, Changjiang,..., Shujian |
1 |
None |
link |
MAP's not dead yet: Uncovering true language model modes by conditioning away degeneracy |
Yoshida, Davis,..., Kevin |
1 |
2024-08-08 |
link |
Explicating the Implicit: Argument Detection Beyond Sentence Boundaries |
Roit, Paul,..., Ido |
1 |
None |
link |
Visualization Recommendation with Prompt-based Reprogramming of Large Language Models |
Li, Xinhang,..., Enhong |
1 |
None |
link |
MIST: Mutual Information Maximization for Short Text Clustering |
Kamthawee, Krissanee,..., Sarana |
1 |
2024-09-09 |
link |
Spatially-Aware Speaker for Vision-and-Language Navigation Instruction Generation |
Gopinathan, Muraleekrishna,..., David |
1 |
2024-06-11 |
link |
GLIMPSE: Pragmatically Informative Multi-Document Summarization for Scholarly Reviews |
Darrin, Maxime,..., Jackie |
1 |
None |
link |
Fora: A corpus and framework for the study of facilitated dialogue |
Schroeder, Hope,..., Jad |
1 |
None |
link |
The MERSA Dataset and a Transformer-Based Approach for Speech Emotion Recognition |
Zhang, Enshi,..., Christian |
1 |
None |
link |
Harder Task Needs More Experts: Dynamic Routing in MoE Models |
Huang, Quzhe,..., Yansong |
1 |
2024-04-23 |
link |
XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts |
Ding, Yifeng,..., Lingming |
1 |
2024-03-28 |
link |
HiRoPE: Length Extrapolation for Code Models Using Hierarchical Position |
Zhang, Kechi,..., Zhi |
1 |
None |
link |
From Sights to Insights: Towards Summarization of Multimodal Clinical Documents |
Ghosh, Akash,..., Setu |
1 |
2024-06-05 |
link |
What is the Best Way for ChatGPT to Translate Poetry? |
Wang, Shanshan,..., Lidia |
1 |
None |
link |
When Phrases Meet Probabilities: Enabling Open Relation Extraction with Cooperating Large Language Models |
Wang, Jiaxin,..., Jun |
1 |
2024-05-29 |
link |
PathReasoner: Modeling Reasoning Path with Equivalent Extension for Logical Question Answering |
Xu, Fangzhi,..., Jun |
1 |
2024-06-16 |
link |
ESCoT: Towards Interpretable Emotional Support Dialogue Systems |
Zhang, Tenggan,..., Qin |
1 |
2024-08-05 |
link |
Beyond Orthography: Automatic Recovery of Short Vowels and Dialectal Sounds in Arabic |
El Kheir, Yassine,..., Shammur |
1 |
None |
link |
Document-Level Machine Translation with Large-Scale Public Parallel Corpora |
Pal, Proyag,..., Kenneth |
1 |
2024-02-15 |
link |
Bridging the Empirical-Theoretical Gap in Neural Network Formal Language Learning Using Minimum Description Length |
Lan, Nur,..., Roni |
1 |
2024-02-29 |
link |
COSMIC: Mutual Information for Task-Agnostic Summarization Evaluation |
Darrin, Maxime,..., Pablo |
1 |
2023-11-16 |
link |
Tracking the Newsworthiness of Public Documents |
Spangher, Alexander,..., Jonathan |
1 |
2024-07-03 |
link |
Improving Conversational Abilities of Quantized Large Language Models via Direct Preference Alignment |
Lee, Janghwan,..., Jungwook |
1 |
2024-06-06 |
link |
American Sign Language Handshapes Reflect Pressures for Communicative Efficiency |
Yin, Kayo,..., Dan |
1 |
2024-01-15 |
link |
JumpCoder: Go Beyond Autoregressive Coder via Online Modification |
Chen, Mouxiang,..., Jianling |
1 |
2024-07-04 |
link |
Argument Mining in Data Scarce Settings: Cross-lingual Transfer and Few-shot Techniques |
Yeginbergen, Anar,..., Rodrigo |
1 |
None |
link |
Can Large Language Models Interpret Noun-Noun Compounds? A Linguistically-Motivated Study on Lexicalized and Novel Compounds |
Rambelli, Giulia,..., Marianna |
1 |
2024-06-26 |
link |
Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad Prediction |
Zhang, Yice,..., Ruifeng |
1 |
2024-04-16 |
link |
Spiral of Silence: How is Large Language Model Killing Information Retrieval? - A Case Study on Open Domain Question Answering |
Chen, Xiaoyang,..., Yingfei |
1 |
2023-09-15 |
link |
How to Handle Different Types of Out-of-Distribution Scenarios in Computational Argumentation? A Comprehensive and Fine-Grained Field Study |
Waldis, Andreas,..., Iryna |
1 |
2024-06-07 |
link |
More Victories, Less Cooperation: Assessing Cicero's Diplomacy Play |
Wongkamjan, Wichayaporn,..., Jordan |
1 |
2024-06-09 |
link |
PairCFR: Enhancing Model Training on Paired Counterfactually Augmented Data through Contrastive Learning |
Qiu, Xiaoqi,..., Chunyan |
1 |
None |
link |
GunStance: Stance Detection for Gun Control and Gun Regulation |
Gyawali, Nikesh,..., Cornelia |
1 |
2024-06-25 |
link |
D2LLM: Decomposed and Distilled Large Language Models for Semantic Search |
Liao, Zihan,..., Wei |
1 |
2024-04-10 |
link |
Transferable and Efficient Non-Factual Content Detection via Probe Training with Offline Consistency Checking |
Zhang, Xiaokang,..., Jie |
1 |
2024-02-19 |
link |
Emergent Word Order Universals from Cognitively-Motivated Language Models |
Kuribayashi, Tatsuki,..., Timothy |
1 |
2024-08-23 |
link |
Causal-Guided Active Learning for Debiasing Large Language Models |
Sun, Zhouhao,..., Bing |
1 |
2024-06-09 |
link |
Arabic Diacritics in the Wild: Exploiting Opportunities for Improved Diacritization |
Elgamal, Salman,..., Nizar |
1 |
2024-05-21 |
link |
Unsupervised Multimodal Clustering for Semantics Discovery in Multimodal Utterances |
Zhang, Hanlei,..., Kai |
1 |
2024-01-06 |
link |
CaMML: Context-Aware Multimodal Learner for Large Models |
Chen, Yixin,..., Bo |
1 |
2024-02-17 |
link |
Dissecting Human and LLM Preferences |
Li, Junlong,..., Pengfei |
1 |
2024-02-22 |
link |
RelayAttention for Efficient Large Language Model Serving with Long System Prompts |
Zhu, Lei,..., Rynson |
1 |
None |
link |
Open-Set Semi-Supervised Text Classification via Adversarial Disagreement Maximization |
Chen, Junfan,..., Chunming |
1 |
2024-06-09 |
link |
MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation |
Ma, Yan,..., Pengfei |
1 |
2024-02-20 |
link |
Handling Ambiguity in Emotion: From Out-of-Domain Detection to Distribution Estimation |
Wu, Wen,..., Phil |
1 |
None |
link |
End-to-end Learning of Logical Rules for Enhancing Document-level Relation Extraction |
Qi, Kunxun,..., Hai |
1 |
2024-06-11 |
link |
Can We Achieve High-quality Direct Speech-to-Speech Translation without Parallel Speech Data? |
Fang, Qingkai,..., Yang |
1 |
2024-06-13 |
link |
Navigating the Shadows: Unveiling Effective Disturbances for Modern AI Content Detectors |
Zhou, Ying,..., Le |
1 |
2024-06-06 |
link |
ValueBench: Towards Comprehensively Evaluating Value Orientations and Understanding of Large Language Models |
Ren, Yuanyi,..., Guojie |
1 |
2024-04-08 |
link |
EFSA: Towards Event-Level Financial Sentiment Analysis |
Chen, Tianyu,..., Xiang |
1 |
2024-07-04 |
link |
Systematic Task Exploration with LLMs: A Study in Citation Text Generation |
Şahinuç, Furkan,..., Iryna |
1 |
2024-02-21 |
link |
Cognitive Visual-Language Mapper: Advancing Multimodal Comprehension with Enhanced Visual Knowledge Alignment |
Li, Yunxin,..., Min |
1 |
2024-05-30 |
link |
ANAH: Analytical Annotation of Hallucinations in Large Language Models |
Ji, Ziwei,..., Kai |
1 |
2024-06-03 |
link |
Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer |
Zhu, Yongxin,..., Dong |