530 |
2023-06-08 |
Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models |
link |
Maaz, Muhammad,..., Fahad |
473 |
2023-05-29 |
Large Language Models are not Fair Evaluators |
link |
Wang, Peiyi,..., Zhifang |
448 |
2023-08-28 |
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding |
link |
Bai, Yushi,..., Juanzi |
322 |
2024-02-01 |
OLMo: Accelerating the Science of Language Models |
link |
Groeneveld, Dirk,..., Hannaneh |
237 |
2024-01-12 |
How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs |
link |
Zeng, Yi,..., Weiyan |
230 |
2023-12-14 |
Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations |
link |
Wang, Peiyi,..., Zhifang |
218 |
2024-01-31 |
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research |
link |
Soldaini, Luca,..., Kyle |
202 |
2024-01-11 |
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models |
link |
Dai, Damai,..., Wenfeng |
182 |
2023-04-22 |
LaMP: When Large Language Models Meet Personalization |
link |
Salemi, Alireza,..., Hamed |
176 |
2024-02-12 |
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model |
link |
Üstün, Ahmet,..., Sara |
163 |
2023-10-10 |
LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression |
link |
Jiang, Huiqiang,..., Lili |
143 |
None |
VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks |
link |
Koh, Jing Yu,..., Daniel |
141 |
2023-12-31 |
Improving Text Embeddings with Large Language Models |
link |
Wang, Liang,..., Furu |
137 |
2023-12-09 |
Steering Llama 2 via Contrastive Activation Addition |
link |
Rimsky, Nina,..., Alexander |
135 |
2023-09-27 |
Navigate through Enigmatic Labyrinth A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future |
link |
Chu, Zheng,..., Ting |
125 |
2023-09-18 |
Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM |
link |
Cao, Bochuan,..., Jinghui |
124 |
2023-07-20 |
L-Eval: Instituting Standardized Evaluation for Long Context Language Models |
link |
An, Chenxin,..., Xipeng |
124 |
2023-07-16 |
ChatDev: Communicative Agents for Software Development |
link |
Qian, Chen,..., Maosong |
117 |
2024-01-17 |
SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents |
link |
Cheng, Kanzhi,..., Zhiyong |
114 |
2023-06-16 |
Full Parameter Fine-tuning for Large Language Models with Limited Resources |
link |
Lv, Kai,..., Xipeng |
113 |
2023-08-31 |
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants |
link |
Bandarkar, Lucas,..., Madian |
110 |
2023-02-23 |
Active Prompting with Chain-of-Thought for Large Language Models |
link |
Diao, Shizhe,..., Tong |
110 |
2023-10-03 |
Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View |
link |
Zhang, Jintian,..., Shumin |
109 |
2023-10-09 |
How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition |
link |
Dong, Guanting,..., Jingren |
106 |
2024-01-25 |
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models |
link |
He, Hongliang,..., Dong |
105 |
2024-02-21 |
OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems |
link |
He, Chaoqun,..., Maosong |
105 |
2024-02-19 |
AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling |
link |
Zhan, Jun,..., Xipeng |
105 |
2024-02-09 |
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning |
link |
Singh, Shivalika,..., Sara |
103 |
2023-09-22 |
ReConcile: Round-Table Conference Improves Reasoning via Consensus among Diverse LLMs |
link |
Chen, Justin,..., Mohit |
103 |
2023-11-15 |
Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization |
link |
Zhang, Zhexin,..., Minlie |
99 |
2023-11-08 |
LooGLE: Can Long-Context Language Models Understand Long Contexts? |
link |
Li, Jiaqi,..., Muhan |
94 |
2023-12-12 |
LLM in a flash: Efficient Large Language Model Inference with Limited Memory |
link |
Alizadeh, Keivan,..., Mehrdad |
86 |
2023-09-04 |
Are Emergent Abilities in Large Language Models just In-Context Learning? |
link |
Lu, Sheng,..., Iryna |
84 |
2024-02-16 |
Do Llamas Work in English? On the Latent Language of Multilingual Transformers |
link |
Wendler, Chris,..., Robert |
79 |
2023-10-27 |
InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews |
link |
Wang, Xintao,..., Yanghua |
78 |
2023-05-24 |
Who Wrote this Code? Watermarking for Code Generation |
link |
Lee, Taehyun,..., Gunhee |
78 |
2024-02-14 |
SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding |
link |
Xu, Zhangchen,..., Radha |
78 |
2023-09-13 |
SafetyBench: Evaluating the Safety of Large Language Models |
link |
Zhang, Zhexin,..., Minlie |
76 |
2024-02-19 |
ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs |
link |
Jiang, Fengqing,..., Radha |
74 |
2024-04-25 |
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding |
link |
Elhoushi, Mostafa,..., Carole-Jean |
71 |
2023-05-23 |
Having Beer after Prayer? Measuring Cultural Bias in Large Language Models |
link |
Naous, Tarek,..., Wei |
70 |
2024-02-26 |
Do Large Language Models Latently Perform Multi-Hop Reasoning? |
link |
Yang, Sohee,..., Sebastian |
69 |
2023-11-07 |
Black-Box Prompt Optimization: Aligning Large Language Models without Model Training |
link |
Cheng, Jiale,..., Minlie |
69 |
2023-12-31 |
RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models |
link |
Niu, Cheng,..., Tong |
67 |
2024-01-17 |
ReFT: Reasoning with Reinforced Fine-Tuning |
link |
Trung, Luong,..., Hang |
67 |
2024-01-12 |
Large Language Models Can Learn Temporal Reasoning |
link |
Xiong, Siheng,..., Faramarz |
65 |
2024-02-01 |
Don't Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration |
link |
Feng, Shangbin,..., Yulia |
64 |
2024-02-28 |
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards |
link |
Wang, Haoxiang,..., Tong |
64 |
2024-01-14 |
CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges |
link |
Zhang, Kechi,..., Zhi |
64 |
2024-02-01 |
When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards |
link |
Alzahrani, Norah,..., Haidar |
62 |
2024-01-19 |
Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences |
link |
Wang, Xiyao,..., Furong |
62 |
2024-02-19 |
Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models |
link |
Levy, Mosh,..., Yoav |
60 |
2024-02-22 |
MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues |
link |
Bai, Ge,..., Wanli |
60 |
2024-02-27 |
Evaluating Very Long-Term Conversational Memory of LLM Agents |
link |
Maharana, Adyasha,..., Yuwei |
59 |
2023-06-10 |
Boosting Language Models Reasoning with Chain-of-Knowledge Prompting |
link |
Wang, Jianing,..., Ming |
57 |
2024-03-04 |
Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents |
link |
Song, Yifan,..., Bill Yuchen |
57 |
2023-08-17 |
MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language Models |
link |
Wen, Yilin,..., Jimeng |
56 |
2024-01-02 |
CharacterEval: A Chinese Benchmark for Role-Playing Conversational Agent Evaluation |
link |
Tu, Quan,..., Rui |
56 |
2024-02-01 |
Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning |
link |
Li, Ming,..., Tianyi |
55 |
2023-10-16 |
EconAgent: Large Language Model-Empowered Agents for Simulating Macroeconomic Activities |
link |
Li, Nian,..., Qingmin |
54 |
2024-02-12 |
AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension |
link |
Yang, Qian,..., Jingren |
54 |
2024-01-04 |
LLaMA Pro: Progressive LLaMA with Block Expansion |
link |
Wu, Chengyue,..., Ping |
52 |
2024-01-23 |
Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment |
link |
Lu, Keming,..., Jingren |
51 |
2023-08-30 |
Quantifying Uncertainty in Answers from any Language Model and Enhancing their Trustworthiness |
link |
Chen, Jiuhai,..., Jonas |
50 |
2024-03-25 |
VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild |
link |
Peng, Puyuan,..., David |
50 |
2024-02-26 |
Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models |
link |
Röttger, Paul,..., Dirk |
49 |
2023-05-23 |
SciMON: Scientific Inspiration Machines Optimized for Novelty |
link |
Wang, Qingyun,..., Tom |
49 |
2024-03-21 |
Detoxifying Large Language Models via Knowledge Editing |
link |
Wang, Mengru,..., Huajun |
49 |
2024-02-28 |
Rethinking the Bounds of LLM Reasoning: Are Multi-Agent Discussions the Key? |
link |
Wang, Qineng,..., Yangqiu |
49 |
2024-01-29 |
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling |
link |
Maini, Pratyush,..., Navdeep |
46 |
2024-03-01 |
Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models |
link |
Li, Lei,..., Qi |
45 |
2024-01-12 |
Relying on the Unreliable: The Impact of Language Models' Reluctance to Express Uncertainty |
link |
Zhou, Kaitlyn,..., Maarten |
45 |
2024-02-26 |
Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models |
link |
Tang, Tianyi,..., Ji-Rong |
45 |
2023-12-31 |
DocLLM: A Layout-Aware Generative Language Model for Multimodal Document Understanding |
link |
Wang, Dongsheng,..., Xiaomo |
45 |
2024-02-18 |
Pride and Prejudice: LLM Amplifies Self-Bias in Self-Refinement |
link |
Xu, Wenda,..., William |
42 |
2024-03-29 |
Can LLMs Learn from Previous Mistakes? Investigating LLMs' Errors to Boost for Reasoning |
link |
Tong, Yongqi,..., Jingbo |
42 |
2023-12-22 |
NPHardEval: Dynamic Benchmark on Reasoning Ability of Large Language Models via Complexity Classes |
link |
Fan, Lizhou,..., Yongfeng |
42 |
2024-05-18 |
MapCoder: Multi-Agent Code Generation for Competitive Problem Solving |
link |
Islam, Md. Ashraful,..., Md Rizwan |
41 |
2024-02-29 |
GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers |
link |
Li, Qintong,..., Wei |
41 |
2023-12-22 |
VIEScore: Towards Explainable Metrics for Conditional Image Synthesis Evaluation |
link |
Ku, Max,..., Wenhu |
41 |
2023-12-14 |
The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Persuasive Conversation |
link |
Xu, Rongwu,..., Han |
40 |
2024-02-26 |
Long-Context Language Modeling with Parallel Context Encoding |
link |
Yen, Howard,..., Danqi |
40 |
2024-02-16 |
Quantifying the Persona Effect in LLM Simulations |
link |
Hu, Tiancheng,..., Nigel |
40 |
2023-06-03 |
MultiLegalPile: A 689GB Multilingual Legal Corpus |
link |
Niklaus, Joel,..., Daniel |
39 |
2023-05-22 |
MAGE: Machine-generated Text Detection in the Wild |
link |
Li, Yafu,..., Yue |
39 |
2024-01-04 |
Self-Contrast: Better Reflection Through Inconsistent Solving Perspectives |
link |
Zhang, Wenqi,..., Weiming |
39 |
2023-11-30 |
AlignBench: Benchmarking Chinese Alignment of Large Language Models |
link |
Liu, Xiao,..., Jie |
38 |
2024-02-26 |
MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs |
link |
Lu, Zimu,..., Hongsheng |
38 |
2024-02-05 |
Unified Hallucination Detection for Multimodal Large Language Models |
link |
Chen, Xiang,..., Huajun |
38 |
2024-05-28 |
Faithful Logical Reasoning via Symbolic Chain-of-Thought |
link |
Xu, Jundong,..., Wynne |
37 |
2024-02-28 |
FOFO: A Benchmark to Evaluate LLMs' Format-Following Capability |
link |
Xia, Congying,..., Caiming |
37 |
2024-02-14 |
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation |
link |
Zhang, Xiaoying,..., Helen |
37 |
2023-12-28 |
Experiential Co-Learning of Software-Developing Agents |
link |
Qian, Chen,..., Maosong |
36 |
2024-02-27 |
TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space |
link |
Zhang, Shaolei,..., Yang |
36 |
2024-02-20 |
Investigating Cultural Alignment of Large Language Models |
link |
AlKhamissi, Badr,..., Mona |
35 |
2024-02-22 |
Unintended Impacts of LLM Alignment on Global Representation |
link |
Ryan, Michael J,..., Diyi |
34 |
2024-01-11 |
GroundingGPT: Language Enhanced Multi-modal Grounding Model |
link |
Li, Zhaowei,..., Tao |
34 |
2024-01-06 |
The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language Models |
link |
Li, Junyi,..., Ji-Rong |
32 |
2024-02-21 |
Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning |
link |
Yang, Zhaorui,..., Qian |
32 |
2024-05-26 |
M³CoT: A Novel Benchmark for Multi-Domain Multi-step Multi-modal Chain-of-Thought |
link |
Chen, Qiguang,..., Wanxiang |
32 |
2024-02-24 |
PRP: Propagating Universal Perturbations to Attack Large Language Model Guard-Rails |
link |
Mangaokar, Neal,..., Atul |
32 |
2024-05-13 |
RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors |
link |
Dugan, Liam,..., Chris |
31 |
2023-05-24 |
Harnessing the Power of Large Language Models for Natural Language to First-Order Logic Translation |
link |
Yang, Yuan,..., Faramarz |
31 |
2024-02-23 |
Machine Unlearning of Pre-trained Large Language Models |
link |
Yao, Jin,..., Xiang |
31 |
2023-11-09 |
Agent Lumos: Unified and Modular Training for Open-Source Language Agents |
link |
Yin, Da,..., Bill Yuchen |
31 |
2024-02-19 |
Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic |
link |
Bhardwaj, Rishabh,..., Soujanya |
30 |
2024-02-27 |
Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization |
link |
Zhang, Wenqi,..., Weiming |
30 |
2024-02-16 |
When is Tree Search Useful for LLM Planning? It Depends on the Discriminator |
link |
Chen, Ziru,..., Huan |
29 |
None |
LoRAMoE: Alleviating World Knowledge Forgetting in Large Language Models via MoE-Style Plugin |
link |
Dou, Shihan,..., Xuanjing |
29 |
2024-01-16 |
MMToM-QA: Multimodal Theory of Mind Question Answering |
link |
Jin, Chuanyang,..., Tianmin |
28 |
2023-06-20 |
Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts |
link |
Nguyen, Xuan-Phi,..., Lidong |
28 |
2024-02-17 |
M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection |
link |
Wang, Yuxia,..., Preslav |
28 |
2024-02-20 |
Instruction-tuned Language Models are Better Knowledge Learners |
link |
Jiang, Zhengbao,..., Srini |
28 |
2023-11-14 |
CodeScope: An Execution-based Multilingual Multitask Multidimensional Benchmark for Evaluating LLMs on Code Understanding and Generation |
link |
Yan, Weixiang,..., Shuiguang |
28 |
2023-11-16 |
Think Twice: Perspective-Taking Improves Large Language Models' Theory-of-Mind Capabilities |
link |
Wilf, Alex,..., Louis-Philippe |
28 |
2023-12-26 |
Aligning Large Language Models with Human Preferences through Representation Engineering |
link |
Liu, Wenhao,..., Xuanjing |
28 |
2024-03-06 |
Quantifying Contamination in Evaluating Code Generation Capabilities of Language Models |
link |
Riddell, Martin,..., Arman |
28 |
2024-01-22 |
PsySafe: A Comprehensive Framework for Psychological-based Attack, Defense, and Evaluation of Multi-agent System Safety |
link |
Zhang, Zaibin,..., Feng |
27 |
2024-01-10 |
AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning |
link |
Qiao, Shuofei,..., Huajun |
27 |
2023-11-15 |
Symbol-LLM: Towards Foundational Symbol-centric Interface For Large Language Models |
link |
Xu, Fangzhi,..., Jun |
26 |
2023-10-05 |
InstructProtein: Aligning Human and Protein Language via Knowledge Instruction |
link |
Wang, Zeyuan,..., Huajun |
26 |
2023-09-29 |
Enhancing Large Language Models in Coding Through Multi-Perspective Self-Consistency |
link |
Huang, Baizhou,..., Nan |
26 |
2023-10-03 |
OceanGPT: A Large Language Model for Ocean Science Tasks |
link |
Bi, Zhen,..., Huajun |
26 |
2023-11-15 |
PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning |
link |
Zhang, Zhihan,..., Francesco |
26 |
2023-11-10 |
ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences |
link |
Tian, Yuanhe,..., Yongdong |
26 |
2024-03-20 |
An Entropy-based Text Watermarking Detection Method |
link |
Lu, Yijian,..., Irwin |
25 |
2024-03-02 |
Mitigating Catastrophic Forgetting in Large Language Models with Self-Synthesized Rehearsal |
link |
Huang, Jianheng,..., Jinsong |
25 |
2024-01-14 |
CANDLE: Iterative Conceptualization and Instantiation Distillation from Large Language Models for Commonsense Reasoning |
link |
Wang, Weiqi,..., Yangqiu |
25 |
2023-07-03 |
Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models |
link |
Duan, Jinhao,..., Kaidi |
25 |
2023-10-19 |
Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models |
link |
Wang, Wenxuan,..., Michael |
25 |
2024-01-12 |
The Unreasonable Effectiveness of Easy Training Data for Hard Tasks |
link |
Hase, Peter,..., Sarah |
24 |
2024-02-19 |
What Evidence Do Language Models Find Convincing? |
link |
Wan, Alexander,..., Dan |
24 |
2024-02-27 |
RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations |
link |
Huang, Jing,..., Atticus |
24 |
2024-01-12 |
MAPO: Advancing Multilingual Reasoning through Multilingual-Alignment-as-Preference Optimization |
link |
She, Shuaijie,..., Jiajun |
24 |
2024-01-13 |
Bridging the Preference Gap between Retrievers and LLMs |
link |
Ke, Zixuan,..., Michael |
24 |
2024-03-16 |
DIALECTBENCH: An NLP Benchmark for Dialects, Varieties, and Closely-Related Languages |
link |
Faisal, Fahim,..., Antonios |
23 |
2024-02-16 |
BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation |
link |
Du, DaYou,..., Ningyi |
23 |
2024-03-12 |
KnowCoder: Coding Structured Knowledge into LLMs for Universal Information Extraction |
link |
Li, Zixuan,..., Xueqi |
23 |
2024-01-14 |
MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation |
link |
Chen, Jiaqi,..., Kwan-Yee |
23 |
2024-05-31 |
Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training |
link |
Fang, Feiteng,..., Ruifeng |
23 |
2024-03-27 |
Measuring Political Bias in Large Language Models: What Is Said and How It Is Said |
link |
Bang, Yejin,..., Pascale |
23 |
2023-12-07 |
Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use |
link |
Chen, Yuhan,..., Rui |
23 |
2024-03-11 |
IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages |
link |
Khan, Mohammed Safi Ur Rahman,..., Mitesh M. |
22 |
2023-11-07 |
PrivLM-Bench: A Multi-level Privacy Evaluation Benchmark for Language Models |
link |
Li, Haoran,..., Yangqiu |
22 |
2024-07-01 |
FineSurE: Fine-grained Summarization Evaluation using LLMs |
link |
Song, Hwanjun,..., Saab |
22 |
2024-02-06 |
Training Language Models to Generate Text with Citations via Fine-grained Rewards |
link |
Huang, Chengyu,..., Wenya |
22 |
2023-08-31 |
RepCodec: A Speech Representation Codec for Speech Tokenization |
link |
Huang, Zhichao,..., Tom |
22 |
2024-02-22 |
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models |
link |
Lu, Xudong,..., Hongsheng |
22 |
2024-02-20 |
Advancing Large Language Models to Capture Varied Speaking Styles and Respond Properly in Spoken Conversations |
link |
Lin, Guan-Ting,..., Hung-yi |
22 |
2024-03-05 |
Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion Attribution |
link |
Plaza-del-Arco, Flor Miriam,..., Dirk |
22 |
2023-05-22 |
Word Embeddings Are Steers for Language Models |
link |
Han, Chi,..., Heng |
21 |
2024-01-12 |
Navigating the Metrics Maze: Reconciling Score Magnitudes and Accuracies |
link |
Kocmi, Tom,..., Matt |
21 |
2024-02-16 |
DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows |
link |
Patel, Ajay,..., Chris |
21 |
2023-10-31 |
FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models |
link |
Jiang, Yuxin,..., Wei |
21 |
2024-06-21 |
Generate-then-Ground in Retrieval-Augmented Generation for Multi-hop Question Answering |
link |
Shi, Zhengliang,..., Zhaochun |
21 |
2023-12-23 |
PokeMQA: Programmable knowledge editing for Multi-hop Question Answering |
link |
Gu, Hengrui,..., Xin |
21 |
2024-02-23 |
ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition |
link |
Ye, Lu,..., Yang |
21 |
2024-07-26 |
AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents |
link |
Trivedi, Harsh,..., Niranjan |
20 |
2024-02-10 |
GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators |
link |
Hu, Yuchen,..., EngSiong |
20 |
2023-11-15 |
Explore Spurious Correlations at the Concept Level in Language Models for Text Classification |
link |
Zhou, Yuhang,..., Furong |
20 |
2023-11-15 |
Exploring the Potential of Large Language Models in Computational Argumentation |
link |
Chen, Guizhen,..., Lidong |
20 |
2023-10-10 |
Exploring Memorization in Fine-tuned Language Models |
link |
Zeng, Shenglai,..., Dawei |
20 |
2024-08-06 |
Synthesizing Text-to-SQL Data from Weak and Strong LLMs |
link |
Yang, Jiaxi,..., Chang |
20 |
2023-12-21 |
T-Eval: Evaluating the Tool Utilization Capability of Large Language Models Step by Step |
link |
Chen, Zehui,..., Feng |
20 |
2023-10-28 |
DetermLR: Augmenting LLM-based Logical Reasoning from Indeterminacy to Determinacy |
link |
Sun, Hongda,..., Rui |
20 |
2024-04-25 |
IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages |
link |
Singh, Harman,..., Partha |
20 |
2024-02-16 |
Multi-modal Preference Alignment Remedies Degradation of Visual Instruction Tuning on Language Models |
link |
Li, Shengzhi,..., Shichao |
19 |
2024-02-19 |
Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs |
link |
Tan, Jiejun,..., Ji-Rong |
19 |
2023-11-13 |
On Measuring Faithfulness or Self-consistency of Natural Language Explanations |
link |
Parcalabescu, Letitia,..., Anette |
19 |
2024-02-26 |
Leveraging Large Language Models for Learning Complex Legal Concepts through Storytelling |
link |
Jiang, Hang,..., Jad |
19 |
2024-02-19 |
Are LLM-based Evaluators Confusing NLG Quality Criteria? |
link |
Hu, Xinyu,..., Xiaojun |
19 |
2024-02-19 |
Artifacts or Abduction: How Do LLMs Answer Multiple-Choice Questions Without the Question? |
link |
Balepur, Nishant,..., Rachel |
19 |
2024-02-23 |
Advancing Parameter Efficiency in Fine-tuning via Representation Editing |
link |
Wu, Muling,..., Xuanjing |
19 |
2024-02-19 |
CausalGym: Benchmarking causal interpretability methods on linguistic tasks |
link |
Arora, Aryaman,..., Christopher |
19 |
2024-02-15 |
Why are Sensitive Functions Hard for Transformers? |
link |
Hahn, Michael,..., Mark |
18 |
2024-02-14 |
Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents |
link |
Qian, Cheng,..., Maosong |
18 |
2024-06-06 |
VISTA: Visualized Text Embedding For Universal Multi-Modal Retrieval |
link |
Zhou, Junjie,..., Yongping |
18 |
2024-03-25 |
Attribute First, then Generate: Locally-attributable Grounded Text Generation |
link |
Slobodkin, Aviv,..., Ido |
18 |
2024-02-01 |
A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains |
link |
Jacovi, Alon,..., Mor |
18 |
2023-12-20 |
WaveCoder: Widespread And Versatile Enhancement For Code Large Language Models By Instruction Tuning |
link |
Yu, Zhaojian,..., Qiufeng |
18 |
2024-02-23 |
KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models |
link |
Yu, Zhuohao,..., Shikun |
18 |
2023-06-21 |
ARIES: A Corpus of Scientific Paper Edits Made in Response to Peer Reviews |
link |
D'Arcy, Mike,..., Doug |
18 |
2024-01-22 |
Revisiting Demonstration Selection Strategies in In-Context Learning |
link |
Peng, Keqin,..., Dacheng |
18 |
2024-06-05 |
BadAgent: Inserting and Activating Backdoor Attacks in LLM Agents |
link |
Wang, Yifei,..., Shengsheng |
18 |
2024-03-15 |
DRAGIN: Dynamic Retrieval Augmented Generation based on the Real-time Information Needs of Large Language Models |
link |
Su, Weihang,..., Yiqun |
18 |
2023-11-30 |
CritiqueLLM: Towards an Informative Critique Generation Model for Evaluation of Large Language Model Generation |
link |
Ke, Pei,..., Minlie |
18 |
2024-03-29 |
Latxa: An Open Language Model and Evaluation Suite for Basque |
link |
Etxaniz, Julen,..., Aitor |
17 |
2024-02-25 |
Citation-Enhanced Generation for LLM-based Chatbots |
link |
Li, Weitao,..., Yang |
17 |
2024-02-19 |
Learning to Edit: Aligning LLMs with Knowledge Editing |
link |
Jiang, Yuxin,..., Wei |
17 |
2023-11-26 |
UHGEval: Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained Generation |
link |
Liang, Xun,..., Haiying |
17 |
2023-10-09 |
MuggleMath: Assessing the Impact of Query and Response Augmentation on Math Reasoning |
link |
Li, Chengpeng,..., Chang |
17 |
2024-05-17 |
Layer-Condensed KV Cache for Efficient Inference of Large Language Models |
link |
Wu, Haoyi,..., Kewei |
17 |
2024-04-23 |
LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models |
link |
Parmar, Mihir,..., Chitta |
16 |
2024-02-21 |
GradSafe: Detecting Jailbreak Prompts for LLMs via Safety-Critical Gradient Analysis |
link |
Xie, Yueqi,..., Neil |
16 |
2023-12-20 |
Time is Encoded in the Weights of Finetuned Language Models |
link |
Nylund, Kai,..., Noah |
16 |
2024-02-18 |
Stumbling Blocks: Stress Testing the Robustness of Machine-Generated Text Detectors Under Attacks |
link |
Wang, Yichen,..., Tianxing |
16 |
2024-02-21 |
Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models |
link |
He, Zhiwei,..., Rui |
16 |
2024-03-05 |
CoGenesis: A Framework Collaborating Large and Small Language Models for Secure Context-Aware Instruction Following |
link |
Zhang, Kaiyan,..., Bowen |
16 |
2024-02-14 |
Spectral Filters, Dark Signals, and Attention Sinks |
link |
Cancedda, Nicola |
16 |
2024-02-16 |
Large Language Models as Zero-shot Dialogue State Tracker through Function Calling |
link |
Li, Zekun,..., Paul |
16 |
2024-02-23 |
On the Multi-turn Instruction Following for Conversational Web Agents |
link |
Deng, Yang,..., Tat-Seng |
16 |
2024-06-13 |
Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning? |
link |
Su, Zhaochen,..., Min |
16 |
2023-11-16 |
Reducing Privacy Risks in Online Self-Disclosures with Language Models |
link |
Dou, Yao,..., Wei |
16 |
2024-06-06 |
Prototypical Reward Network for Data-Efficient RLHF |
link |
Zhang, Jinghan,..., Kunpeng |
16 |
2024-06-06 |
Confabulation: The Surprising Value of Large Language Model Hallucinations |
link |
Sui, Peiqi,..., Richard |
16 |
2024-01-12 |
Mission: Impossible Language Models |
link |
Kallini, Julie,..., Christopher |
15 |
2024-02-26 |
HealMe: Harnessing Cognitive Reframing in Large Language Models for Psychotherapy |
link |
Xiao, Mengxi,..., Jimin |
15 |
2024-06-24 |
UniCoder: Scaling Code Large Language Model via Universal Code |
link |
Sun, Tao,..., Zhoujun |
15 |
2024-02-16 |
ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages |
link |
Ye, Junjie,..., Xuanjing |
15 |
2024-02-13 |
PreFLMR: Scaling Up Fine-Grained Late-Interaction Multi-modal Retrievers |
link |
Lin, Weizhe,..., Bill |
15 |
2023-11-14 |
A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated Texts |
link |
Tripto, Nafis Irtiza,..., Dongwon |
15 |
2024-02-18 |
Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs |
link |
Wang, Siyuan,..., Xiang |
15 |
2023-11-16 |
On the Impact of Calibration Data in Post-training Quantization and Pruning |
link |
Williams, Miles,..., Nikolaos |
15 |
2024-04-04 |
Learning to Plan and Generate Text with Citations |
link |
Fierro, Constanza,..., Mirella |
15 |
2024-02-18 |
LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative Tasks |
link |
Wang, Hanqing,..., Maosong |
15 |
2024-03-09 |
Calibrating Large Language Models Using Their Generations Only |
link |
Ulmer, Dennis,..., Seong |
14 |
2023-05-10 |
ANALOGYKB: Unlocking Analogical Reasoning of Language Models with A Million-scale Knowledge Base |
link |
Yuan, Siyu,..., Deqing |
14 |
2024-02-11 |
Generalizing Conversational Dense Retrieval via LLM-Cognition Data Augmentation |
link |
Chen, Haonan,..., Ziliang |
14 |
2024-01-12 |
INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning |
link |
Zhu, Yutao,..., Zhicheng |
14 |
None |
Llama2Vec: Unsupervised Adaptation of Large Language Models for Dense Retrieval |
link |
Liu, Zheng,..., Defu |
14 |
2024-05-21 |
ProtT3: Protein-to-Text Generation for Text-based Protein Understanding |
link |
Liu, Zhiyuan,..., Tat-Seng |
14 |
2024-02-18 |
Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals |
link |
Ortu, Francesco,..., Bernhard |
14 |
2024-02-08 |
OpenToM: A Comprehensive Benchmark for Evaluating Theory-of-Mind Reasoning Capabilities of Large Language Models |
link |
Xu, Hainiu,..., Yulan |
14 |
2024-02-14 |
Towards Privacy-Aware Sign Language Translation at Scale |
link |
Rust, Phillip,..., Jean |
14 |
2024-03-11 |
ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis |
link |
Liu, Yanming,..., Xuhong |
14 |
2024-06-05 |
Text-like Encoding of Collaborative Information in Large Language Models for Recommendation |
link |
Zhang, Yang,..., Xiangnan |
14 |
2024-02-20 |
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification |
link |
Peng, Yifan,..., Shinji |
14 |
2024-02-16 |
Generative Cross-Modal Retrieval: Memorizing Images in Multimodal Language Models for Retrieval and Beyond |
link |
Li, Yongqi,..., Tat-Seng |
14 |
2023-05-22 |
Iterative Forward Tuning Boosts In-Context Learning in Language Models |
link |
Yang, Jiaxi,..., Yongbin |
13 |
2024-05-31 |
Open Ko-LLM Leaderboard: Evaluating Large Language Models in Korean with Ko-H5 Benchmark |
link |
Park, Chanjun,..., Hwalsuk |
13 |
2024-02-01 |
What Does the Bot Say? Opportunities and Risks of Large Language Models in Social Media Bot Detection |
link |
Feng, Shangbin,..., Yulia |
13 |
2024-01-12 |
AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters |
link |
Lucy, Li,..., Jesse |
13 |
2024-06-12 |
Multimodal Table Understanding |
link |
Zheng, Mingyu,..., Weiping |
13 |
2024-01-15 |
MM-SAP: A Comprehensive Benchmark for Assessing Self-Awareness of Multimodal Large Language Models in Perception |
link |
Wang, Yuhao,..., Yu |
13 |
2023-10-13 |
Improving Large Language Models in Event Relation Logical Prediction |
link |
Chen, Meiqi,..., Dongsheng |
13 |
2024-03-04 |
To Generate or to Retrieve? On the Effectiveness of Artificial Contexts for Medical Open-Domain Question Answering |
link |
Frisoni, Giacomo,..., Zaiqiao |
13 |
2024-02-19 |
Revisiting Knowledge Distillation for Autoregressive Language Models |
link |
Zhong, Qihuang,..., Dacheng |
13 |
2024-01-12 |
Don't Rank, Combine! Combining Machine Translation Hypotheses Using Quality Estimation |
link |
Vernikos, Giorgos,..., Andrei |
13 |
2024-02-23 |
API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs |
link |
Basu, Kinjal,..., Luis |
13 |
2023-11-15 |
Never Lost in the Middle: Mastering Long-Context Question Answering with Position-Agnostic Decompositional Training |
link |
He, Junqing,..., Jiaxing |
13 |
2024-04-09 |
Cendol: Open Instruction-tuned Generative Large Language Models for Indonesian Languages |
link |
Cahyawijaya, Samuel,..., Pascale |
12 |
2024-02-16 |
AbsInstruct: Eliciting Abstraction Ability from LLMs through Explanation Tuning with Plausibility Estimation |
link |
Wang, Zhaowei,..., Simon |
12 |
2024-02-18 |
Benchmarking Knowledge Boundary for Large Language Models: A Different Perspective on Model Evaluation |
link |
Yin, Xunjian,..., Xiaojun |
12 |
2024-04-15 |
Is Table Retrieval a Solved Problem? Exploring Join-Aware Multi-Table Retrieval |
link |
Chen, Peter Baile,..., Dan |
12 |
2024-02-24 |
PRoLoRA: Partial Rotation Empowers More Parameter-Efficient LoRA |
link |
Wang, Sheng,..., Chuan |
12 |
2024-02-08 |
TimeArena: Shaping Efficient Multitasking Language Agents in a Time-Aware Simulation |
link |
Zhang, Yikai,..., Jiangjie |
12 |
2023-03-06 |
XCodeEval: An Execution-based Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval |
link |
Khan, Mohammad Abdullah Matin,..., Shafiq |
12 |
2024-01-26 |
ProxyQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models |
link |
Tan, Haochen,..., Linqi |
12 |
2023-12-04 |
A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia |
link |
Monea, Giovanni,..., Robert |
12 |
2024-02-21 |
Analysing The Impact of Sequence Composition on Language Model Pre-Training |
link |
Zhao, Yu,..., Pasquale |
12 |
2024-08-07 |
NACL: A General and Effective KV Cache Eviction Framework for LLM at Inference Time |
link |
Chen, Yilong,..., Hua |
12 |
2023-11-19 |
Towards Real-World Writing Assistance: A Chinese Character Checking Benchmark with Faked and Misspelled Characters |
link |
Li, Yinghui,..., Ying |
12 |
2024-02-19 |
Direct Large Language Model Alignment Through Self-Rewarding Contrastive Prompt Distillation |
link |
Liu, Aiwei,..., Lijie |
12 |
2024-06-04 |
Multimodal Reasoning with Multimodal Knowledge Graph |
link |
Lee, Junlin,..., Min |
12 |
2024-03-12 |
Complex Reasoning over Logical Queries on Commonsense Knowledge Graphs |
link |
Fang, Tianqing,..., Antoine |
12 |
2024-02-16 |
Exploring Precision and Recall to assess the quality and diversity of LLMs |
link |
Le Bronnec, Florian,..., Alexandre |
12 |
2024-07-25 |
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning |
link |
Wang, Tianduo,..., Wei |
12 |
2024-06-20 |
On the Representational Capacity of Neural Language Models with Chain-of-Thought Reasoning |
link |
Nowak, Franz,..., Ryan |
12 |
2024-02-20 |
Can Large Language Models be Good Emotional Supporter? Mitigating Preference Bias on Emotional Support Conversation |
link |
Kang, Dongjin,..., Jinyoung |
12 |
2024-02-19 |
Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! |
link |
Zhou, Zhanhui,..., Yu |
12 |
2023-05-09 |
COKE: A Cognitive Knowledge Graph for Machine Theory of Mind |
link |
Wu, Jincenzi,..., Minlie |
11 |
2024-02-24 |
Multimodal Instruction Tuning with Conditional Mixture of LoRA |
link |
Shen, Ying,..., Lifu |
11 |
2024-02-06 |
Tuning Large Multimodal Models for Videos using Reinforcement Learning from AI Feedback |
link |
Ahn, Daechul,..., Jonghyun |
11 |
2024-02-15 |
Grounding Language Model with Chunking-Free In-Context Retrieval |
link |
Qian, Hongjin,..., Zhicheng |
11 |
2024-07-01 |
IBSEN: Director-Actor Agent Collaboration for Controllable and Interactive Drama Script Generation |
link |
Han, Senyu,..., Kai |
11 |
2024-05-26 |
M-RAG: Reinforcing Large Language Model Performance through Retrieval-Augmented Generation with Multiple Partitions |
link |
Wang, Zheng,..., Wei |
11 |
2023-11-16 |
RLHFPoison: Reward Poisoning Attack for Reinforcement Learning with Human Feedback in Large Language Models |
link |
Wang, Jiongxiao,..., Chaowei |
11 |
2024-06-10 |
FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model |
link |
Lee, Yebin,..., Myungjoo |
11 |
2023-11-15 |
Temporal Knowledge Question Answering via Abstract Reasoning Induction |
link |
Chen, Ziyang,..., Min |
11 |
None |
Jailbreak Open-Sourced Large Language Models via Enforced Decoding |
link |
Zhang, Hangfan,..., Dinghao |
11 |
2024-06-12 |
TasTe: Teaching Large Language Models to Translate through Self-Reflection |
link |
Wang, Yutong,..., Min |
11 |
2024-01-09 |
Narrowing the Knowledge Evaluation Gap: Open-Domain Question Answering with Multi-Granularity Answers |
link |
Yona, Gal,..., Mor |
11 |
2024-01-19 |
LangBridge: Multilingual Reasoning Without Multilingual Supervision |
link |
Yoon, Dongkeun,..., Minjoon |
11 |
2024-03-15 |
EXAMS-V: A Multi-Discipline Multilingual Multimodal Exam Benchmark for Evaluating Vision Language Models |
link |
Das, Rocktim,..., Preslav |
11 |
2024-01-22 |
Text Embedding Inversion Security for Multilingual Language Models |
link |
Chen, Yiyi,..., Johannes |
11 |
2023-04-05 |
Efficient OCR for Building a Diverse Digital History |
link |
Carlson, Jacob,..., Melissa |
11 |
2024-03-08 |
Aligning Large Language Models for Controllable Recommendations |
link |
Lu, Wensheng,..., Xing |
11 |
2024-03-19 |
Bypassing LLM Watermarks with Color-Aware Substitutions |
link |
Wu, Qilong,..., Varun |
11 |
2024-02-28 |
ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training |
link |
Zhuo, Le,..., Wentao |
11 |
2023-10-07 |
Chat Vector: A Simple Approach to Equip LLMs with Instruction Following and Model Alignment in New Languages |
link |
Huang, Shih-Cheng,..., Hung-yi |
11 |
2023-09-16 |
Cross-Lingual Knowledge Editing in Large Language Models |
link |
Wang, Jiaan,..., Fandong |
11 |
2024-02-18 |
Don't Go To Extremes: Revealing the Excessive Sensitivity and Calibration Limitations of LLMs in Implicit Hate Speech Detection |
link |
Zhang, Min,..., Chang-Tien |
11 |
2024-01-12 |
ViSAGe: A Global-Scale Analysis of Visual Stereotypes in Text-to-Image Generation |
link |
Jha, Akshita,..., Sunipa |
11 |
2024-06-05 |
Analyzing LLM Behavior in Dialogue Summarization: Unveiling Circumstantial Hallucination Trends |
link |
Ramprasad, Sanjana,..., Zachary |
11 |
2024-02-19 |
PsychoGAT: A Novel Psychological Measurement Paradigm through Interactive Fiction Games with LLM Agents |
link |
Yang, Qisen,..., Gao |
11 |
2024-03-06 |
IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators |
link |
Paul, Indraneil,..., Iryna |
11 |
2024-03-15 |
MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling |
link |
Limisiewicz, Tomasz,..., Luke |
11 |
2023-11-16 |
DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding Long and Specialized Documents |
link |
Zhao, Yilun,..., Arman |
10 |
None |
AoE: Angle-optimized Embeddings for Semantic Textual Similarity |
link |
Li, Xianming,..., Jing |
10 |
2023-03-28 |
When Good and Reproducible Results are a Giant with Feet of Clay: The Importance of Software Quality in NLP |
link |
Papi, Sara,..., Matteo |
10 |
2024-06-08 |
Planning Like Human: A Dual-process Framework for Dialogue Planning |
link |
He, Tao,..., Bing |
10 |
2024-02-18 |
Stealthy Attack on Large Language Model based Recommendation |
link |
Zhang, Jinghao,..., Liang |
10 |
2024-02-22 |
Unveiling Linguistic Regions in Large Language Models |
link |
Zhang, Zhihao,..., Xuanjing |
10 |
2023-11-16 |
Where Do People Tell Stories Online? Story Detection Across Online Communities |
link |
Antoniak, Maria,..., Andrew |
10 |
2024-02-19 |
Parallel Structures in Pre-training Data Yield In-Context Learning |
link |
Chen, Yanda,..., He |
10 |
2023-12-13 |
Fine-Grained Image-Text Alignment in Medical Imaging Enables Explainable Cyclic Image-Report Generation |
link |
Chen, Wenting,..., Yixuan |
10 |
2024-03-09 |
Diffusion Lens: Interpreting Text Encoders in Text-to-Image Pipelines |
link |
Toker, Michael,..., Yonatan |
10 |
2024-04-25 |
Examining the robustness of LLM evaluation to the distributional assumptions of benchmarks |
link |
Siska, Charlotte,..., James |
10 |
None |
Fundamental Capabilities of Large Language Models and their Applications in Domain Scenarios: A Survey |
link |
Li, Jiawei,..., Heyan |
10 |
2024-07-02 |
Integrate the Essence and Eliminate the Dross: Fine-Grained Self-Consistency for Free-Form Language Generation |
link |
Wang, Xinglin,..., Kan |
10 |
2024-06-04 |
Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs |
link |
Cao, Zhiwei,..., Jinsong |
10 |
2024-03-01 |
Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks |
link |
Alwajih, Fakhraddin,..., Muhammad |
10 |
2024-06-10 |
HOLMES: Hyper-Relational Knowledge Graphs for Multi-hop Question Answering using LLMs |
link |
Panda, Pranoy,..., Prathosh |
10 |
2024-05-21 |
G-DIG: Towards Gradient-based DIverse and hiGh-quality Instruction Data Selection for Machine Translation |
link |
Pan, Xingyuan,..., Shanbo |
10 |
2024-01-25 |
RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization |
link |
J, Jaavid,..., Anoop |
10 |
2023-12-31 |
BatchEval: Towards Human-like Text Evaluation |
link |
Yuan, Peiwen,..., Kan |
9 |
2024-03-05 |
OPEx: A Component-Wise Analysis of LLM-Centric Agents in Embodied Instruction Following |
link |
Shi, Haochen,..., Bang |
9 |
2024-02-16 |
AFaCTA: Assisting the Annotation of Factual Claim Detection with Reliable LLM Annotators |
link |
Ni, Jingwei,..., Markus |
9 |
2024-02-24 |
ListT5: Listwise Reranking with Fusion-in-Decoder Improves Zero-shot Retrieval |
link |
Yoon, Soyoung,..., Seung-won |
9 |
2023-10-16 |
On Context Utilization in Summarization with Large Language Models |
link |
Ravaut, Mathieu,..., Shafiq |
9 |
2024-01-31 |
Navigating the OverKill in Large Language Models |
link |
Shi, Chenyu,..., Dahua |
9 |
2024-02-18 |
Multi-Task Inference: Can Large Language Models Follow Multiple Instructions at Once? |
link |
Son, Guijin,..., Seungone |
9 |
2024-02-27 |
Benchmarking Data Science Agents |
link |
Zhang, Yuge,..., Kan |
9 |
2024-02-19 |
EmoBench: Evaluating the Emotional Intelligence of Large Language Models |
link |
Sabour, Sahand,..., Minlie |
9 |
2024-01-22 |
Blinded by Generated Contexts: How Language Models Merge Generated and Retrieved Contexts When Knowledge Conflicts? |
link |
Tan, Hexiang,..., Xueqi |
9 |
2023-12-05 |
Prompt Optimization via Adversarial In-Context Learning |
link |
Long, Xuan Do,..., Junxian |
9 |
2024-02-24 |
HD-Eval: Aligning Large Language Model Evaluators Through Hierarchical Criteria Decomposition |
link |
Liu, Yuxuan,..., Qi |
9 |
2024-05-30 |
Dataflow-Guided Retrieval Augmentation for Repository-Level Code Completion |
link |
Cheng, Wei,..., Wei |
9 |
2024-02-18 |
FactPICO: Factuality Evaluation for Plain Language Summarization of Medical Evidence |
link |
Joseph, Sebastian,..., Junyi Jessy |
9 |
2024-01-09 |
MERA: A Comprehensive LLM Evaluation in Russian |
link |
Fenogenova, Alena,..., Sergey |
9 |
2024-02-28 |
Meta-Task Prompting Elicits Embeddings from Large Language Models |
link |
Lei, Yibin,..., Andrew |
9 |
2024-02-20 |
HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts |
link |
Zhao, Hao,..., Jie |
9 |
2024-01-16 |
SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language Models |
link |
Zhao, Weixiang,..., Wanxiang |
9 |
2024-05-27 |
DoRA: Enhancing Parameter-Efficient Fine-Tuning with Dynamic Rank Distribution |
link |
Mao, Yulong,..., Jinan |
9 |
2024-06-30 |
Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models |
link |
Zhong, Weihong,..., Bing |
9 |
2024-06-04 |
mCoT: Multilingual Instruction Tuning for Reasoning Consistency in Language Models |
link |
Lai, Huiyuan,..., Malvina |
9 |
2023-11-16 |
FinanceMATH: Knowledge-Intensive Math Reasoning in Finance Domains |
link |
Zhao, Yilun,..., Arman |
9 |
2024-01-12 |
Effects of diversity incentives on sample diversity and downstream model performance in LLM-based text augmentation |
link |
Cegin, Jan,..., Peter |
9 |
2024-02-19 |
Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing? |
link |
Gaido, Marco,..., Luisa |
9 |
2023-12-13 |
Learn or Recall? Revisiting Incremental Learning with Pre-trained Language Models |
link |
Zheng, Junhao,..., Qianli |
9 |
2023-10-30 |
M4LE: A Multi-Ability Multi-Range Multi-Task Multi-Domain Long-Context Evaluation Benchmark for Large Language Models |
link |
Kwan, Wai-Chung,..., Kam-Fai |
9 |
2024-02-23 |
ToMBench: Benchmarking Theory of Mind in Large Language Models |
link |
Chen, Zhuang,..., Minlie |
8 |
2024-02-28 |
Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation |
link |
Xu, Shicheng,..., Jie |
8 |
2023-10-08 |
MinPrompt: Graph-based Minimal Prompt Data Augmentation for Few-shot Question Answering |
link |
Chen, Xiusi,..., Wei |
8 |
2024-02-15 |
SportsMetrics: Blending Text and Numerical Data to Understand Information Fusion in LLMs |
link |
Hu, Yebowen,..., Fei |
8 |
2023-10-02 |
Probing the Multi-turn Planning Capabilities of LLMs via 20 Question Games |
link |
Zhang, Yizhe,..., Navdeep |
8 |
2024-06-11 |
A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Any Translation |
link |
Ma, Zhengrui,..., Min |
8 |
2024-06-03 |
Probing Language Models for Pre-training Data Detection |
link |
Liu, Zhenhua,..., Wenliang |
8 |
2024-06-04 |
Analyzing Temporal Complex Events with Large Language Models? A Benchmark towards Temporal, Long Context Understanding |
link |
Zhang, Zhihan,..., Tat-Seng |
8 |
None |
Reasoning in Flux: Enhancing Large Language Models Reasoning through Uncertainty-aware Adaptive Guidance |
link |
Yin, Zhangyue,..., Xipeng |
8 |
2024-05-21 |
SirLLM: Streaming Infinite Retentive LLM |
link |
Yao, Yao,..., Hai |
8 |
2024-03-09 |
ItD: Large Language Models Can Teach Themselves Induction through Deduction |
link |
Sun, Wangtao,..., Kang |
8 |
None |
Rethinking Task-Oriented Dialogue Systems: From Complex Modularity to Zero-Shot Autonomous Agent |
link |
Xu, Heng-Da,..., Heyan |
8 |
2024-03-18 |
Metaphor Understanding Challenge Dataset for LLMs |
link |
Tong, Xiaoyu,..., Ekaterina |
8 |
2024-09-22 |
SAC-KG: Exploiting Large Language Models as Skilled Automatic Constructors for Domain Knowledge Graph |
link |
Chen, Hanzhu,..., Jieping |
8 |
2024-06-19 |
Factual Confidence of LLMs: on Reliability and Robustness of Current Estimators |
link |
Mahaut, Matéo,..., Lluis |
8 |
2024-02-14 |
DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning |
link |
Wang, Yejie,..., Xunliang |
8 |
2024-03-18 |
QueryAgent: A Reliable and Efficient Reasoning Framework with Environmental Feedback based Self-Correction |
link |
Huang, Xiang,..., Yuzhong |
8 |
2024-02-26 |
What Do Language Models Hear? Probing for Auditory Representations in Language Models |
link |
Ngo, Jerry,..., Yoon |
8 |
2024-06-07 |
A Deep Dive into the Trade-Offs of Parameter-Efficient Preference Alignment Techniques |
link |
Thakkar, Megh,..., Sarath |
8 |
2024-03-04 |
Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models |
link |
Chen, Changyu,..., Yongbin |
8 |
2024-05-16 |
FinTextQA: A Dataset for Long-form Financial Question Answering |
link |
Chen, Jian,..., Junwei |
8 |
2024-05-30 |
The Fine-Tuning Paradox: Boosting Translation Quality Without Sacrificing LLM Abilities |
link |
Stap, David,..., Ke |
8 |
2024-02-19 |
MARS: Meaning-Aware Response Scoring for Uncertainty Estimation in Generative LLMs |
link |
Bakman, Yavuz Faruk,..., Salman |
8 |
2024-05-28 |
Long Context is Not Long at All: A Prospector of Long-Dependency Data for Large Language Models |
link |
Chen, Longze,..., Min |
8 |
2024-03-06 |
A Modular Approach for Multimodal Summarization of TV Shows |
link |
Mahon, Louis,..., Mirella |
8 |
2024-05-17 |
Enhancing Dialogue State Tracking Models through LLM-backed User-Agents Simulation |
link |
Niu, Cheng,..., Tong |
8 |
2024-03-28 |
NaijaHate: Evaluating Hate Speech Detection on Nigerian Twitter Using Representative Data |
link |
Tonneau, Manuel,..., Samuel |
8 |
2024-02-28 |
Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems in Commonsense Reasoning |
link |
Li, Jiachun,..., Jun |
8 |
2024-02-23 |
Interactive-KBQA: Multi-Turn Interactions for Knowledge Base Question Answering with Large Language Models |
link |
Xiong, Guanming,..., Wen |
8 |
2024-02-17 |
Aligning Large Language Models by On-Policy Self-Judgment |
link |
Lee, Sangkyu,..., Youngjae |
8 |
2023-11-11 |
LLMs Learn Task Heuristics from Demonstrations: A Heuristic-Driven Prompting Strategy for Document-Level Event Argument Extraction |
link |
Zhou, Hanzhang,..., Kezhi |
8 |
None |
TaPERA: Enhancing Faithfulness and Interpretability in Long-Form Table QA by Content Planning and Execution-based Reasoning |
link |
Zhao, Yilun,..., Chen |
8 |
2024-03-21 |
XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception |
link |
Han, HyoJung,..., Changhan |
8 |
2024-04-06 |
Context versus Prior Knowledge in Language Models |
link |
Du, Kevin,..., Ryan |
8 |
2024-02-16 |
Navigating the Dual Facets: A Comprehensive Evaluation of Sequential Memory Editing in Large Language Models |
link |
Lin, Zihao,..., Lifu |
8 |
2024-05-24 |
GPT is Not an Annotator: The Necessity of Human Annotation in Fairness Benchmark Construction |
link |
Felkner, Virginia,..., Jonathan |
8 |
2024-06-06 |
What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages |
link |
Borenstein, Nadav,..., Ryan |
7 |
2022-11-16 |
CSCD-NS: a Chinese Spelling Check Dataset for Native Speakers |
link |
Hu, Yong,..., Jie |
7 |
2023-10-05 |
Expedited Training of Visual Conditioned Language Generation via Redundancy Reduction |
link |
Jian, Yiren,..., Hongxia |
7 |
2023-12-20 |
Retrieval-Augmented Multilingual Knowledge Editing |
link |
Wang, Weixuan,..., Alexandra |
7 |
2024-02-15 |
Answer is All You Need: Instruction-following Text Embedding via Answering the Question |
link |
Peng, Letian,..., Jingbo |
7 |
2024-02-19 |
IMBUE: Improving Interpersonal Effectiveness through Simulation and Just-in-time Feedback with Human-Language Model Interaction |
link |
Lin, Inna,..., Tim |
7 |
2024-05-28 |
ConSiDERS-The-Human Evaluation Framework: Rethinking Human Evaluation for Generative Large Language Models |
link |
Elangovan, Aparna,..., Dan |
7 |
2023-11-29 |
TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models |
link |
Chu, Zheng,..., Bing |
7 |
2024-01-09 |
Rewriting the Code: A Simple Method for Large Language Model Augmented Code Search |
link |
Li, Haochen,..., Zhiqi |
7 |
2024-07-07 |
Multimodal Prompt Learning with Missing Modalities for Sentiment Analysis and Emotion Recognition |
link |
Guo, Zirun,..., Zhou |
7 |
2024-03-04 |
VariErr NLI: Separating Annotation Error from Human Label Variation |
link |
Weber-Genzel, Leon,..., Barbara |
7 |
2024-02-20 |
Modality-Aware Integration with Large Language Models for Knowledge-Based Visual Question Answering |
link |
Dong, Junnan,..., Xiao |
7 |
2024-08-06 |
Making Long-Context Language Models Better Multi-Hop Reasoners |
link |
Li, Yanyang,..., Liwei |
7 |
2024-03-05 |
Improving Event Definition Following For Zero-Shot Event Detection |
link |
Cai, Zefan,..., Nanyun |
7 |
2024-02-20 |
Interpreting Conversational Dense Retrieval by Rewriting-Enhanced Inversion of Session Embedding |
link |
Cheng, Yiruo,..., Zhicheng |
7 |
2023-12-27 |
Prompt Expansion for Adaptive Text-to-Image Generation |
link |
Datta, Siddhartha,..., Peter |
7 |
2024-06-25 |
MPCoder: Multi-user Personalized Code Generator with Explicit and Implicit Style Representation Learning |
link |
Dai, Zhenlong,..., Jingyuan |
7 |
2023-11-14 |
Forgetting before Learning: Utilizing Parametric Arithmetic for Knowledge Updating in Large Language Models |
link |
Ni, Shiwen,..., Min |
7 |
2024-02-20 |
The Hidden Space of Transformer Language Adapters |
link |
Alabi, Jesujoba,..., Mor |
7 |
2024-07-09 |
Automated Justification Production for Claim Veracity in Fact Checking: A Survey on Architectures and Approaches |
link |
Eldifrawi, Islam,..., Amine |
7 |
2023-08-29 |
SwapMoE: Serving Off-the-shelf MoE-based Large Language Models with Tunable Memory Budget |
link |
Kong, Rui,..., Yunxin |
7 |
None |
Chain-of-Exemplar: Enhancing Distractor Generation for Multimodal Educational Question Generation |
link |
Luo, Haohao,..., Tat-Seng |
7 |
2024-02-20 |
Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning |
link |
Mondorf, Philipp,..., Barbara |
7 |
2024-05-01 |
CofiPara: A Coarse-to-fine Paradigm for Multimodal Sarcasm Target Identification with Large Multimodal Models |
link |
Chen, Zixin,..., Guang |
7 |
2023-11-14 |
Predicting Text Preference Via Structured Comparative Reasoning |
link |
Yan, Jing Nathan,..., Michael |
7 |
2023-11-29 |
CLOMO: Counterfactual Logical Modification with Large Language Models |
link |
Huang, Yinya,..., Linqi |
7 |
2024-02-19 |
Enhancing Multilingual Capabilities of Large Language Models through Self-Distillation from Resource-Rich Languages |
link |
Zhang, Yuanchi,..., Yang |
7 |
2024-07-07 |
IL-TUR: Benchmark for Indian Legal Text Understanding and Reasoning |
link |
Joshi, Abhinav,..., Ashutosh |
7 |
2024-05-16 |
Generating Coherent Sequences of Visual Illustrations for Real-World Manual Tasks |
link |
Bordalo, João,..., Joao |
7 |
2024-02-26 |
LLMArena: Assessing Capabilities of Large Language Models in Dynamic Multi-Agent Environments |
link |
Chen, Junzhe,..., Lijie |
7 |
2023-08-25 |
Chunk, Align, Select: A Simple Long-sequence Processing Method for Transformers |
link |
Xie, Jiawen,..., Nan |
7 |
2024-02-14 |
MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech |
link |
Ji, Shengpeng,..., Zhou |
7 |
2023-11-15 |
Disinformation Capabilities of Large Language Models |
link |
Vykopal, Ivan,..., Maria |
7 |
2024-06-13 |
ECBD: Evidence-Centered Benchmark Design for NLP |
link |
Liu, Yu Lu,..., Ziang |
6 |
2024-08-19 |
TaSL: Continual Dialog State Tracking via Task Skill Localization and Consolidation |
link |
Feng, Yujie,..., Xiao-Ming |
6 |
2023-10-13 |
EasyGen: Easing Multimodal Generation with BiDiffuser and LLMs |
link |
Zhao, Xiangyu,..., Xiao-Ming |
6 |
2024-05-28 |
Detection-Correction Structure via General Language Model for Grammatical Error Correction |
link |
Li, Wei,..., Houfeng |
6 |
2024-02-19 |
A synthetic data approach for domain generalization of NLI models |
link |
Hosseini, Mohammad Javad,..., Annie |
6 |
2024-03-05 |
Eliciting Better Multilingual Structured Reasoning from LLMs through Code |
link |
Li, Bryan,..., Saab |
6 |
2024-02-20 |
GumbelSoft: Diversified Language Model Watermarking via the GumbelMax-trick |
link |
Fu, Jiayi,..., Yanghua |
6 |
2024-04-14 |
Text-to-Song: Towards Controllable Music Generation Incorporating Vocal and Accompaniment |
link |
Hong, Zhiqing,..., Zhimeng |
6 |
2024-01-26 |
Unlearning Traces the Influential Training Data of Language Models |
link |
Isonuma, Masaru,..., Ivan |
6 |
2024-01-29 |
Muffin or Chihuahua? Challenging Multimodal Large Language Models with Multipanel VQA |
link |
Fan, Yue,..., Xin |
6 |
2024-06-18 |
An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs |
link |
Rai, Daking,..., Ziyu |
6 |
2024-06-08 |
MemeGuard: An LLM and VLM-based Framework for Advancing Content Moderation via Meme Intervention |
link |
Jha, Prince,..., Pushpak |
6 |
2024-03-05 |
InterrogateLLM: Zero-Resource Hallucination Detection in LLM-Generated Answers |
link |
Yehuda, Yakir,..., Noam |
6 |
2023-10-03 |
Dodo: Dynamic Contextual Compression for Decoder-only LMs |
link |
Qin, Guanghui,..., Benjamin |
6 |
2023-06-28 |
Pareto Optimal Learning for Estimating Large Language Model Errors |
link |
Zhao, Theodore,..., Hoifung |
6 |
2024-03-07 |
LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error |
link |
Wang, Boshi,..., Yu |
6 |
2024-05-20 |
CLAMBER: A Benchmark of Identifying and Clarifying Ambiguous Information Needs in Large Language Models |
link |
Zhang, Tong,..., Tat-Seng |
6 |
None |
Self-chats from Large Language Models Make Small Emotional Support Chatbot Better |
link |
Zheng, Zhonghua,..., Liqiang |
6 |
2024-03-25 |
An Expert is Worth One Token: Synergizing Multiple Expert LLMs as Generalist via Expert Token Routing |
link |
Chai, Ziwei,..., Yang |
6 |
2024-01-18 |
Beyond Traditional Benchmarks: Analyzing Behaviors of Open LLMs on Data-to-Text Generation |
link |
Kasner, Zdeněk,..., Ondrej |
6 |
2024-02-18 |
One Prompt To Rule Them All: LLMs for Opinion Summary Evaluation |
link |
Siledar, Tejpalsingh,..., Nikesh |
6 |
2024-07-31 |
Tree-of-Traversals: A Zero-Shot Reasoning Algorithm for Augmenting Black-box Language Models with Knowledge Graphs |
link |
Markowitz, Elan,..., Aram |
6 |
2024-06-06 |
What Do Language Models Learn in Context? The Structured Task Hypothesis. |
link |
Li, Jiaoda,..., Ryan |
6 |
2024-07-31 |
Maverick: Efficient and Accurate Coreference Resolution Defying Recent Trends |
link |
Martinelli, Giuliano,..., Roberto |
6 |
2023-11-13 |
Explanation-aware Soft Ensemble Empowers Large Language Model In-context Learning |
link |
Yu, Yue,..., Michael |
6 |
2023-10-10 |
Quality-Aware Translation Models: Efficient Generation and Quality Estimation in a Single Model |
link |
Tomani, Christian,..., Daniel |
6 |
2024-06-12 |
Let's Go Real Talk: Spoken Dialogue Model for Face-to-Face Conversation |
link |
Park, Se,..., Yong |
5 |
2024-05-20 |
A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI: The First Romanian Natural Language Inference Corpus |
link |
Poesina, Eduard,..., Radu |
5 |
2024-04-10 |
Learn from Failure: Fine-tuning LLMs with Trial-and-Error Data for Intuitionistic Propositional Logic Proving |
link |
An, Chenyang,..., Jingbo |
5 |
2024-06-05 |
Interactive Text-to-Image Retrieval with Large Language Models: A Plug-and-Play Approach |
link |
Lee, Saehyung,..., Sungroh |
5 |
2023-11-13 |
WaterBench: Towards Holistic Evaluation of Watermarks for Large Language Models |
link |
Tu, Shangqing,..., Juanzi |
5 |
None |
Evaluating Intention Detection Capability of Large Language Models in Persuasive Dialogues |
link |
Sakurai, Hiromasa,..., Yusuke |
5 |
2024-01-15 |
Selene: Pioneering Automated Proof in Software Verification |
link |
Zhang, Lichen,..., Nan |
5 |
None |
REANO: Optimising Retrieval-Augmented Reader Models through Knowledge Graph Generation |
link |
Fang, Jinyuan,..., Craig |
5 |
2024-06-09 |
MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation |
link |
Ma, Yan,..., Pengfei |
5 |
2023-05-22 |
CopyNE: Better Contextual ASR by Copying Named Entities |
link |
Zhou, Shilin,..., Baoxing |
5 |
2024-03-12 |
Beyond Memorization: The Challenge of Random Memory Access in Language Models |
link |
Zhu, Tongyao,..., Min |
5 |
2023-12-25 |
Instruction Fusion: Advancing Prompt Evolution through Hybridization |
link |
Guo, Weidong,..., Di |
5 |
2024-02-10 |
Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for Dialogue |
link |
Wang, Jian,..., Xiaoyong |
5 |
2024-01-29 |
InfoLossQA: Characterizing and Recovering Information Loss in Text Simplification |
link |
Trienes, Jan,..., Junyi Jessy |
5 |
2023-05-23 |
DAPR: A Benchmark on Document-Aware Passage Retrieval |
link |
Wang, Kexin,..., Iryna |
5 |
2024-06-03 |
Strengthened Symbol Binding Makes Large Language Models Reliable Multiple-Choice Selectors |
link |
Xue, Mengge,..., Chengguo |
5 |
2024-04-29 |
Analyzing Semantic Change through Lexical Replacements |
link |
Periti, Francesco,..., Nina |
5 |
2023-06-14 |
Babel-ImageNet: Massively Multilingual Evaluation of Vision-and-Language Representations |
link |
Geigle, Gregor,..., Goran |
5 |
None |
Soft Knowledge Prompt: Help External Knowledge Become a Better Teacher to Instruct LLM in Knowledge-based VQA |
link |
Wang, Qunbo,..., Jing |
5 |
2023-11-16 |
PixT3: Pixel-based Table-To-Text Generation |
link |
Alonso, Iñigo,..., Mirella |
5 |
2024-02-27 |
Enhancing EEG-to-Text Decoding through Transferable Representations from Pre-trained Contrastive EEG-Text Masked Autoencoder |
link |
Wang, Jiaqi,..., Zhiguo |
5 |
2024-01-19 |
StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion |
link |
Wang, Zhichao,..., Yuping |
5 |
2024-02-23 |
Unlocking the Power of Large Language Models for Entity Alignment |
link |
Jiang, Xuhui,..., Yuanzhuo |
5 |
None |
Conundrums in Cross-Prompt Automated Essay Scoring: Making Sense of the State of the Art |
link |
Li, Shengjie,..., Vincent |
5 |
2024-01-20 |
STICKERCONV: Generating Multimodal Empathetic Responses from Scratch |
link |
Zhang, Yiqun,..., Kaisong |
5 |
2023-12-12 |
Safety Alignment in NLP Tasks: Weakly Aligned Summarization as an In-Context Attack |
link |
Fu, Yu,..., Yue |
5 |
2023-11-08 |
Speech language models lack important brain-relevant semantics |
link |
Oota, Subba Reddy,..., Mariya |
5 |
2024-01-10 |
I am a Strange Dataset: Metalinguistic Tests for Language Models |
link |
Thrush, Tristan,..., Douwe |
5 |
2023-11-16 |
Mitigating Biases for Instruction-following Language Models via Bias Neurons Elimination |
link |
Yang, Nakyeong,..., Kyomin |
5 |
2023-11-15 |
Few-shot Transfer Learning for Knowledge Base Question Answering: Fusing Supervised Models with In-Context Learning |
link |
Patidar, Mayur,..., Indrajit |
5 |
2024-05-22 |
Synchronized Video Storytelling: Generating Video Narrations with Structured Storyline |
link |
Yang, Dingyi,..., Qin |
5 |
2023-05-12 |
Synergistic Interplay between Search and Large Language Models for Information Retrieval |
link |
Feng, Jiazhan,..., Daxin |
5 |
2023-10-11 |
Parrot: Enhancing Multi-Turn Instruction Following for Large Language Models |
link |
Sun, Yuchong,..., Kun |
5 |
2024-05-16 |
Robust Singing Voice Transcription Serves Synthesis |
link |
Li, Ruiqi,..., Zhou |
5 |
2024-06-04 |
Self-Modifying State Modeling for Simultaneous Machine Translation |
link |
Yu, Donglei,..., Chengqing |
5 |
2024-02-21 |
CODIS: Benchmarking Context-dependent Visual Comprehension for Multimodal Large Language Models |
link |
Luo, Fuwen,..., Yang |
5 |
None |
Make-A-Voice: Revisiting Voice Large Language Models as Scalable Multilingual and Multitask Learners |
link |
Huang, Rongjie,..., Dong |
5 |
2025-04-03 |
Hide and Seek in Noise Labels: Noise-Robust Collaborative Active Learning with LLMs-Powered Assistance |
link |
Yuan, Bo,..., Wei |
5 |
2024-03-21 |
Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations |
link |
Sun, Jiaxing,..., Conghui |
5 |
2024-02-28 |
Small But Funny: A Feedback-Driven Approach to Humor Distillation |
link |
Ravi, Sahithya,..., Arash |
5 |
2024-06-16 |
ESCoT: Towards Interpretable Emotional Support Dialogue Systems |
link |
Zhang, Tenggan,..., Qin |
5 |
None |
REFINESUMM: Self-Refining MLLM for Generating a Multimodal Summarization Dataset |
link |
Patil, Vaidehi,..., Markus |
5 |
None |
DeCoT: Debiasing Chain-of-Thought for Knowledge-Intensive Tasks in Large Language Models via Causal Intervention |
link |
Wu, Junda,..., Julian |
5 |
2023-10-21 |
MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via Visual Module Plugin |
link |
Zhou, Tianshuo,..., Ge |
5 |
2024-06-02 |
Deciphering Oracle Bone Language with Diffusion Models |
link |
Guan, Haisu,..., Yuliang |
5 |
None |
EZ-STANCE: A Large Dataset for English Zero-Shot Stance Detection |
link |
Zhao, Chenye,..., Cornelia |
4 |
2024-02-11 |
Through the Lens of Split Vote: Exploring Disagreement, Difficulty and Calibration in Legal Case Outcome Classification |
link |
Xu, Shanshan,..., Matthias |
4 |
2024-05-25 |
Confidence Under the Hood: An Investigation into the Confidence-Probability Alignment in Large Language Models |
link |
Kumar, Abhishek,..., Ali |
4 |
None |
DocLens: Multi-aspect Fine-grained Medical Text Evaluation |
link |
Xie, Yiqing,..., Carolyn |
4 |
2024-06-06 |
ABEX: Data Augmentation for Low-Resource NLU via Expanding Abstract Descriptions |
link |
Ghosh, Sreyan,..., Dinesh |
4 |
2024-06-03 |
Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer |
link |
Zhu, Yongxin,..., Dong |
4 |
2024-02-17 |
Dissecting Human and LLM Preferences |
link |
Li, Junlong,..., Pengfei |
4 |
2024-03-14 |
TaxoLLaMA: WordNet-based Model for Solving Multiple Lexical Semantic Tasks |
link |
Moskvoretskii, Viktor,..., Irina |
4 |
2024-05-21 |
Unlocking Data-free Low-bit Quantization with Matrix Decomposition for KV Cache Compression |
link |
Liu, Peiyu,..., Ji-Rong |
4 |
2024-03-13 |
Generative Pretrained Structured Transformers: Unsupervised Syntactic Language Models at Scale |
link |
Hu, Xiang,..., Kewei |
4 |
2024-05-23 |
ChronosLex: Time-aware Incremental Training for Temporal Generalization of Legal Classification Tasks |
link |
T.y.s.s, Santosh,..., Matthias |
4 |
2024-05-16 |
Timeline-based Sentence Decomposition with In Context Learning for Temporal Fact Extraction |
link |
Chen, Jianhao,..., Yuzhong |
4 |
2024-05-26 |
MentalManip: A Dataset For Fine-grained Analysis of Mental Manipulation in Conversations |
link |
Wang, Yuxin,..., Soroush |
4 |
2023-11-15 |
MAVEN-ARG: Completing the Puzzle of All-in-One Event Understanding Dataset with Event Argument Annotation |
link |
Wang, Xiaozhi,..., Juanzi |
4 |
2024-03-21 |
Multi-Level Feedback Generation with Large Language Models for Empowering Novice Peer Counselors |
link |
Chaszczewicz, Alicja,..., Diyi |
4 |
2024-06-12 |
Transferable Embedding Inversion Attack: Uncovering Privacy Risks in Text Embeddings without Model Queries |
link |
Huang, Yu-Hsiang,..., Shou-De |
4 |
2024-01-15 |
Uncovering the Full Potential of Visual Grounding Methods in VQA |
link |
Reich, Daniel,..., Tanja |
4 |
2024-06-10 |
Interpretability of Language Models via Task Spaces |
link |
Weber, Lucas,..., Dieuwke |
4 |
2024-06-05 |
Using Synchronic Definitions and Semantic Relations to Classify Semantic Change Types |
link |
Cassotti, Pierluigi,..., Nina |
4 |
None |
StepCoder: Improving Code Generation with Reinforcement Learning from Compiler Feedback |
link |
Dou, Shihan,..., Xuanjing |
4 |
2023-12-15 |
Marathon: A Race Through the Realm of Long Context with Large Language Models |
link |
Zhang, Lei,..., Min |
4 |
2024-02-16 |
Threads of Subtlety: Detecting Machine-Generated Texts Through Discourse Motifs |
link |
Kim, Zae Myung,..., Dongyeop |
4 |
None |
PRP-Graph: Pairwise Ranking Prompting to LLMs with Graph Aggregation for Effective Text Re-ranking |
link |
Luo, Jian,..., Le |
4 |
2024-01-24 |
UNIMO-G: Unified Image Generation through Multimodal Conditional Diffusion |
link |
Li, Wei,..., Xinyan |
4 |
2024-04-05 |
Benchmarking and Improving Compositional Generalization of Multi-aspect Controllable Text Generation |
link |
Zhong, Tianqi,..., Zhendong |
4 |
2023-07-11 |
Lightweight reranking for language model generations |
link |
Jain, Siddhartha,..., Bing |
4 |
2023-08-21 |
PlatoLM: Teaching LLMs in Multi-Round Dialogue via a User Simulator |
link |
Kong, Chuyi,..., Benyou |
4 |
2024-01-12 |
STRUCTSUM Generation for Faster Text Comprehension |
link |
Jain, Parag,..., Francesco |
4 |
2024-02-19 |
Acquiring Clean Language Models from Backdoor Poisoned Datasets by Downscaling Frequency Space |
link |
Wu, Zongru,..., Gongshen |
4 |
2023-11-11 |
BizBench: A Quantitative Reasoning Benchmark for Business and Finance |
link |
Krumdick, Michael,..., Chris |
4 |
2024-02-12 |
Label-Efficient Model Selection for Text Generation |
link |
Ashury Tahan, Shir,..., Eyal |
4 |
2024-07-01 |
Mobile-Bench: An Evaluation Benchmark for LLM-based Mobile Agents |
link |
Deng, Shihan,..., Shuo |
4 |
2024-02-19 |
Investigating Multi-Hop Factual Shortcuts in Knowledge Editing of Large Language Models |
link |
Ju, Tianjie,..., Gongshen |
4 |
2023-11-15 |
Temperature-scaling surprisal estimates improve fit to human reading times -- but does it do so for the "right reasons"? |
link |
Liu, Tong,..., Vera |
4 |
2024-02-29 |
NewsBench: A Systematic Evaluation Framework for Assessing Editorial Capabilities of Large Language Models in Chinese Journalism |
link |
Li, Miao,..., Yi |
4 |
2024-02-28 |
A Sentiment Consolidation Framework for Meta-Review Generation |
link |
Li, Miao,..., Eduard |
4 |
2023-12-07 |
Simul-LLM: A Framework for Exploring High-Quality Simultaneous Translation with Large Language Models |
link |
Agostinelli, Victor,..., Lizhong |
4 |
2024-03-19 |
Interpretable User Satisfaction Estimation for Conversational Systems with Large Language Models |
link |
Lin, Ying-Chun,..., Jaime |
4 |
2024-02-19 |
Browse and Concentrate: Comprehending Multimodal Content via Prior-LLM Context Fusion |
link |
Wang, Ziyue,..., Yang |
4 |
2024-07-03 |
Improving Conversational Abilities of Quantized Large Language Models via Direct Preference Alignment |
link |
Lee, Janghwan,..., Jungwook |
4 |
2024-05-17 |
Language Models can Exploit Cross-Task In-context Learning for Data-Scarce Novel Tasks |
link |
Chatterjee, Anwoy,..., Tanmoy |
4 |
None |
LANDeRMT: Detecting and Routing Language-Aware Neurons for Selectively Finetuning LLMs to Machine Translation |
link |
Zhu, Shaolin,..., Deyi |
4 |
None |
VisDiaHalBench: A Visual Dialogue Benchmark For Diagnosing Hallucination in Large Vision-Language Models |
link |
Cao, Qingxing,..., Liang |
4 |
2024-06-18 |
AutoDSL: Automated domain-specific language design for structural representation of procedures with constraints |
link |
Shi, Yu-Zhe,..., Qining |
4 |
2024-01-02 |
Cheetah: Natural Language Generation for 517 African Languages |
link |
Adebara, Ife,..., Muhammad |
4 |
2024-05-07 |
Toward In-Context Teaching: Adapting Examples to Students' Misconceptions |
link |
Ross, Alexis,..., Jacob |
4 |
2024-03-03 |
WARDEN: Multi-Directional Backdoor Watermarks for Embedding-as-a-Service Copyright Protection |
link |
Shetty, Anudeex,..., Qiongkai |
4 |
None |
Fora: A corpus and framework for the study of facilitated dialogue |
link |
Schroeder, Hope,..., Jad |
4 |
2024-06-05 |
What is the Best Way for ChatGPT to Translate Poetry? |
link |
Wang, Shanshan,..., Lidia |
4 |
2024-06-14 |
EWEK-QA: Enhanced Web and Efficient Knowledge Graph Retrieval for Citation-based Question Answering Systems |
link |
Dehghan, Mohammad,..., Mehdi |
4 |
2024-03-06 |
The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models |
link |
Bhaskar, Adithya,..., Danqi |
4 |
2024-06-06 |
Causal Estimation of Memorisation Profiles |
link |
Lesci, Pietro,..., Tiago |
3 |
2024-05-28 |
A Unified Temporal Knowledge Graph Reasoning Model Towards Interpolation and Extrapolation |
link |
Chen, Kai,..., Xin |
3 |
2024-05-23 |
Subtle Biases Need Subtler Measures: Dual Metrics for Evaluating Representative and Affinity Bias in Large Language Models |
link |
Kumar, Abhishek,..., Ali |
3 |
2024-05-20 |
Token-wise Influential Training Data Retrieval for Large Language Models |
link |
Lin, Huawei,..., Weijie |
3 |
2024-06-28 |
Prompt Refinement with Image Pivot for Text-to-Image Generation |
link |
Zhan, Jingtao,..., Tao |
3 |
2024-02-20 |
Reflect-RL: Two-Player Online RL Fine-Tuning for LMs |
link |
Zhou, Runlong,..., Beibin |
3 |
2024-06-28 |
BeamAggR: Beam Aggregation Reasoning over Multi-source Knowledge for Multi-hop Question Answering |
link |
Chu, Zheng,..., Bing |
3 |
2023-12-25 |
Advancing Abductive Reasoning in Knowledge Graphs through Complex Logical Hypothesis Generation |
link |
Bai, Jiaxin,..., Yangqiu |
3 |
None |
Persuading across Diverse Domains: a Dataset and Persuasion Large Language Model |
link |
Jin, Chuhao,..., Huan |
3 |
2024-02-13 |
Towards Faithful and Robust LLM Specialists for Evidence-Based Question-Answering |
link |
Schimanski, Tobias,..., Markus |
3 |
2024-06-06 |
ValueBench: Towards Comprehensively Evaluating Value Orientations and Understanding of Large Language Models |
link |
Ren, Yuanyi,..., Guojie |
3 |
None |
DM-BLI: Dynamic Multiple Subspaces Alignment for Unsupervised Bilingual Lexicon Induction |
link |
Hu, Ling,..., Yuemei |
3 |
2024-02-28 |
VerifiNER: Verification-augmented NER via Knowledge-grounded Reasoning with Large Language Models |
link |
Kim, Seoyeon,..., Dongha |
3 |
2023-11-15 |
MELA: Multilingual Evaluation of Linguistic Acceptability |
link |
Zhang, Ziyin,..., Hai |
3 |
None |
Through the MUD: A Multi-Defendant Charge Prediction Benchmark with Linked Crime Elements |
link |
Wei, Xiao,..., Erik |
3 |
2024-02-28 |
An Iterative Associative Memory Model for Empathetic Response Generation |
link |
Yang, Zhou,..., Xiangwen |
3 |
2024-08-20 |
Dr.Academy: A Benchmark for Evaluating Questioning Capability in Education for Large Language Models |
link |
Chen, Yuyan,..., Yanghua |
3 |
None |
Landmark Embedding: A Chunking-Free Embedding Method For Retrieval Augmented Long-Context Large Language Models |
link |
Luo, Kun,..., Kang |
3 |
2024-06-09 |
GrowOVER: How Can LLMs Adapt to Growing Real-World Knowledge? |
link |
Ko, Dayoon,..., Gunhee |
3 |
2024-06-05 |
BIPED: Pedagogically Informed Tutoring System for ESL Education |
link |
Kwon, Soonwoo,..., Kyuseok |
3 |
None |
ARL2: Aligning Retrievers with Black-box Large Language Models via Self-guided Adaptive Relevance Labeling |
link |
Zhang, LingXi,..., Chao |
3 |
2024-06-11 |
Crayon: Customized On-Device LLM via Instant Adapter Blending and Edge-Server Hybrid Inference |
link |
Bang, Jihwan,..., Simyung |
3 |
2024-01-06 |
CaMML: Context-Aware Multimodal Learner for Large Models |
link |
Chen, Yixin,..., Bo |
3 |
2023-09-29 |
Intuitive or Dependent? Investigating LLMs' Behavior Style to Conflicting Prompts |
link |
Ying, Jiahao,..., Yongbin |
3 |
2023-09-15 |
CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending |
link |
Zhu, Shiyi,..., Jianguo |
3 |
2024-06-05 |
Missci: Reconstructing Fallacies in Misrepresented Science |
link |
Glockner, Max,..., Iryna |
3 |
2024-06-05 |
LLM-based Rewriting of Inappropriate Argumentation using Reinforcement Learning from Machine Feedback |
link |
Ziegenbein, Timon,..., Henning |
3 |
2024-01-13 |
Graph Language Models |
link |
Plenz, Moritz,..., Anette |
3 |
2024-05-21 |
Limits of Theory of Mind Modelling in Dialogue-Based Collaborative Plan Acquisition |
link |
Bortoletto, Matteo,..., Andreas |
3 |
2024-02-22 |
RelayAttention for Efficient Large Language Model Serving with Long System Prompts |
link |
Zhu, Lei,..., Rynson |
3 |
2024-05-19 |
Your Transformer is Secretly Linear |
link |
Razzhigaev, Anton,..., Andrey |
3 |
2024-06-07 |
Generative Explore-Exploit: Training-free Optimization of Generative Recommender Systems using LLM Optimizers |
link |
Senel, Lütfi Kerem,..., Shervin |
3 |
2024-02-09 |
NICE: To Optimize In-Context Examples or Not? |
link |
Srivastava, Pragya,..., Amit |
3 |
2024-07-31 |
Zero-Shot Cross-Domain Dialogue State Tracking via Dual Low-Rank Adaptation |
link |
Luo, Xiang,..., Xuejie |
3 |
None |
Event-Radar: Event-driven Multi-View Learning for Multimodal Fake News Detection |
link |
Ma, Zihan,..., Xiang |
3 |
2024-02-21 |
Fine-Grained Modeling of Narrative Context: A Coherence Perspective via Retrospective Questions |
link |
Xu, Liyan,..., Jie |
3 |
2024-06-01 |
Multi-Dimensional Optimization for Text Summarization via Reinforcement Learning |
link |
Ryu, Sangwon,..., Jungseul |
3 |
2024-01-24 |
SEER: Facilitating Structured Reasoning and Explanation via Reinforcement Learning |
link |
Chen, Guoxin,..., Yiming |
3 |
2024-04-08 |
EFSA: Towards Event-Level Financial Sentiment Analysis |
link |
Chen, Tianyu,..., Xiang |
3 |
2024-02-21 |
Cognitive Visual-Language Mapper: Advancing Multimodal Comprehension with Enhanced Visual Knowledge Alignment |
link |
Li, Yunxin,..., Min |
3 |
None |
LEMON: Reviving Stronger and Smaller LMs from Larger LMs with Linear Parameter Fusion |
link |
Chen, Yilong,..., Hua |
3 |
2024-04-29 |
Revealing the Parametric Knowledge of Language Models: A Unified Framework for Attribution Methods |
link |
Yu, Haeun,..., Isabelle |
3 |
2024-07-20 |
Hard Prompts Made Interpretable: Sparse Entropy Regularization for Prompt Tuning with RL |
link |
Choi, Yunseon,..., Kee-Eung |
3 |
2024-07-18 |
MetaSumPerceiver: Multimodal Multi-Document Evidence Summarization for Fact-Checking |
link |
Chen, Ting-Chih,..., Chris |
3 |
2024-06-06 |
Decoder-only Streaming Transformer for Simultaneous Translation |
link |
Guo, Shoutao,..., Yang |
3 |
2024-06-05 |
StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning |
link |
Zhang, Shaolei,..., Yang |
3 |
2024-06-09 |
Why Don't Prompt-Based Fairness Metrics Correlate? |
link |
Zayed, Abdelrahman,..., Sarath |
3 |
2023-11-16 |
WatME: Towards Lossless Watermarking Through Lexical Redundancy |
link |
Chen, Liang,..., Kam-Fai |
3 |
2024-06-04 |
Understanding Retrieval Robustness for Retrieval-augmented Image Captioning |
link |
Li, Wenyan,..., Desmond |
3 |
2024-02-16 |
Linear Transformers with Learnable Kernel Functions are Better In-Context Models |
link |
Aksenov, Yaroslav,..., Daniil |
3 |
2023-08-09 |
VulLibGen: Generating Names of Vulnerability-Affected Packages via a Large Language Model |
link |
Chen, Tianyu,..., Tao |
3 |
2024-03-03 |
SyllabusQA: A Course Logistics Question Answering Dataset |
link |
Fernandez, Nigel,..., Andrew |
3 |
2024-02-16 |
Exploring Hybrid Question Answering via Program-based Prompting |
link |
Shi, Qi,..., Ting |
3 |
2024-06-07 |
Uncertainty Aware Learning for Language Model Alignment |
link |
Wang, Yikun,..., Dacheng |
3 |
2024-02-20 |
Model Composition for Multimodal Large Language Models |
link |
Chen, Chi,..., Yang |
3 |
None |
Enhancing Explainable Rating Prediction through Annotated Macro Concepts |
link |
Zhou, Huachi,..., Xiao |
3 |
None |
Can Large Language Models Interpret Noun-Noun Compounds? A Linguistically-Motivated Study on Lexicalized and Novel Compounds |
link |
Rambelli, Giulia,..., Marianna |
3 |
2024-06-05 |
Document-level Claim Extraction and Decontextualisation for Fact-Checking |
link |
Deng, Zhenyun,..., Andreas |
3 |
2024-06-06 |
To Distill or Not to Distill? On the Robustness of Robust Knowledge Distillation |
link |
Waheed, Abdul,..., Muhammad |
3 |
2024-03-07 |
Classist Tools: Social Class Correlates with Performance in NLP |
link |
Cercas Curry, Amanda,..., Dirk |
3 |
2024-02-16 |
Generalizability of Mixture of Domain-Specific Adapters from the Lens of Signed Weight Directions and its Application to Effective Model Pruning |
link |
Nguyen, Tuc,..., Thai |
3 |
2024-06-21 |
Word Matters: What Influences Domain Adaptation in Summarization? |
link |
Li, Yinghao,..., Yang |
3 |
2024-02-19 |
NEO-BENCH: Evaluating Robustness of Large Language Models with Neologisms |
link |
Zheng, Jonathan,..., Wei |
3 |
2024-02-08 |
Transparent and Scrutable Recommendations Using Natural Language User Profiles |
link |
Ramos, Jerome,..., Aldo |
3 |
2023-11-15 |
Multistage Collaborative Knowledge Distillation from a Large Language Model for Semi-Supervised Sequence Generation |
link |
Zhao, Jiachen,..., Andrew |