314 |
2023-08-18 |
link |
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct |
unknown |
305 |
2023-03-07 |
link |
Larger language models do in-context learning differently |
unknown |
194 |
2024-04-30 |
link |
KAN: Kolmogorov-Arnold Networks |
unknown |
179 |
2024-08-01 |
link |
SAM 2: Segment Anything in Images and Videos |
unknown |
141 |
2024-03-27 |
link |
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models |
unknown |
92 |
2023-08-20 |
link |
Steering Language Models With Activation Engineering |
unknown |
82 |
2024-04-02 |
link |
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks |
unknown |
78 |
2024-08-12 |
link |
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer |
unknown |
73 |
2024-03-12 |
link |
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code |
unknown |
66 |
2024-07-10 |
link |
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models |
unknown |
65 |
2024-05-01 |
link |
Self-Play Preference Optimization for Language Model Alignment |
unknown |
63 |
2024-03-28 |
link |
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models |
unknown |
63 |
2024-04-19 |
link |
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions |
unknown |
63 |
2023-10-26 |
link |
JudgeLM: Fine-tuned Large Language Models are Scalable Judges |
unknown |
57 |
2024-02-15 |
link |
Generative Representational Instruction Tuning |
unknown |
53 |
2024-07-31 |
link |
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling |
unknown |
52 |
2023-08-07 |
link |
Simple synthetic data reduces sycophancy in large language models |
unknown |
52 |
2024-04-02 |
link |
Advancing LLM Reasoning Generalists with Preference Trees |
unknown |
51 |
2024-03-06 |
link |
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect |
unknown |
50 |
2024-03-26 |
link |
The Unreasonable Ineffectiveness of the Deeper Layers |
unknown |
48 |
2024-03-22 |
link |
LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models |
unknown |
47 |
2023-12-18 |
link |
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model |
unknown |
45 |
2024-08-20 |
link |
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model |
unknown |
44 |
2024-05-27 |
link |
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models |
unknown |
43 |
2024-03-07 |
link |
Common 7B Language Models Already Possess Strong Math Capabilities |
unknown |
42 |
2024-04-18 |
link |
MeshLRM: Large Reconstruction Model for High-Quality Mesh |
unknown |
42 |
2001-09-01 |
link |
The Turing Game |
unknown |
41 |
2024-08-22 |
link |
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation |
unknown |
40 |
2024-01-29 |
link |
Corrective Retrieval Augmented Generation |
unknown |
40 |
2024-06-24 |
link |
Long Context Transfer from Language to Vision |
unknown |
39 |
2023-11-28 |
link |
Large Language Models Suffer From Their Own Output: An Analysis of the Self-Consuming Training Loop |
unknown |
39 |
2024-06-06 |
link |
Scaling and evaluating sparse autoencoders |
unknown |
39 |
2024-06-12 |
link |
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing |
unknown |
39 |
2024-04-02 |
link |
CameraCtrl: Enabling Camera Control for Text-to-Video Generation |
unknown |
38 |
2024-03-21 |
link |
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text |
unknown |
38 |
2024-06-07 |
link |
Mixture-of-Agents Enhances Large Language Model Capabilities |
unknown |
38 |
2024-06-22 |
link |
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions |
unknown |
37 |
2024-06-05 |
link |
Improve Mathematical Reasoning in Language Models by Automated Process Supervision |
unknown |
37 |
2024-07-24 |
link |
Gymnasium: A Standard Interface for Reinforcement Learning Environments |
unknown |
37 |
2024-07-05 |
link |
Learning to (Learn at Test Time): RNNs with Expressive Hidden States |
unknown |
36 |
2024-07-23 |
link |
OpenHands: An Open Platform for AI Software Developers as Generalist Agents |
unknown |
36 |
2024-04-21 |
link |
AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs |
unknown |
34 |
2024-02-13 |
link |
World Model on Million-Length Video And Language With Blockwise RingAttention |
unknown |
33 |
2023-10-17 |
link |
Eliciting Human Preferences with Language Models |
unknown |
32 |
2024-06-07 |
link |
WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild |
unknown |
32 |
2024-06-08 |
link |
VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers |
unknown |
32 |
2024-05-23 |
link |
Not All Language Model Features Are Linear |
unknown |
32 |
2024-01-30 |
link |
Weak-to-Strong Jailbreaking on Large Language Models |
unknown |
31 |
None |
link |
Battle of the Wordsmiths: Comparing ChatGPT, GPT-4, Claude, and Bard |
unknown |
31 |
2024-08-27 |
link |
Generative Verifiers: Reward Modeling as Next-Token Prediction |
unknown |
30 |
2024-02-28 |
link |
Data Interpreter: An LLM Agent For Data Science |
unknown |
29 |
2024-03-19 |
link |
GaussianFlow: Splatting Gaussian Dynamics for 4D Content Creation |
unknown |
28 |
2024-03-25 |
link |
Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance |
unknown |
28 |
2024-07-28 |
link |
Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge |
unknown |
27 |
2024-02-20 |
link |
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models |
unknown |
27 |
2022-09-11 |
link |
OpenMixup: Open Mixup Toolbox and Benchmark for Visual Representation Learning |
unknown |
27 |
2024-01-25 |
link |
Deconstructing Denoising Diffusion Models for Self-Supervised Learning |
unknown |
27 |
2024-04-24 |
link |
Retrieval Head Mechanistically Explains Long-Context Factuality |
unknown |
27 |
2024-06-10 |
link |
Safety Alignment Should Be Made More Than Just a Few Tokens Deep |
unknown |
26 |
2024-02-02 |
link |
SynthCLIP: Are We Ready for a Fully Synthetic CLIP Training? |
unknown |
26 |
2024-03-04 |
link |
Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures |
unknown |
26 |
2024-09-06 |
link |
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers |
unknown |
26 |
2024-06-11 |
link |
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling |
unknown |
25 |
2024-06-26 |
link |
Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs |
unknown |
25 |
2024-02-12 |
link |
On the Self-Verification Limitations of Large Language Models on Reasoning and Planning Tasks |
unknown |
25 |
2024-02-06 |
link |
LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K |
unknown |
25 |
2024-08-09 |
link |
mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models |
unknown |
25 |
2024-02-23 |
link |
Fine-Tuning of Continuous-Time Diffusion Models as Entropy-Regularized Control |
unknown |
24 |
2024-03-13 |
link |
Language models scale reliably with over-training and on downstream tasks |
unknown |
24 |
2024-05-26 |
link |
SpinQuant: LLM quantization with learned rotations |
unknown |
23 |
2022-08-09 |
link |
Training Overparametrized Neural Networks in Sublinear Time |
unknown |
23 |
2024-09-18 |
link |
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning |
unknown |
23 |
2023-12-10 |
link |
ASVD: Activation-aware Singular Value Decomposition for Compressing Large Language Models |
unknown |
23 |
2024-04-03 |
link |
Min-K%++: Improved Baseline for Detecting Pre-Training Data from Large Language Models |
unknown |
23 |
2024-06-20 |
link |
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors |
unknown |
23 |
2024-07-19 |
link |
Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders |
unknown |
23 |
2024-06-13 |
link |
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding |
unknown |
22 |
2024-06-18 |
link |
Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges |
unknown |
22 |
2024-06-04 |
link |
CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation |
unknown |
22 |
2023-12-26 |
link |
ChartBench: A Benchmark for Complex Visual Reasoning in Charts |
unknown |
22 |
2024-04-15 |
link |
HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing |
unknown |
22 |
2023-12-11 |
link |
HOI-Diff: Text-Driven Synthesis of 3D Human-Object Interactions using Diffusion Models |
unknown |
21 |
2024-03-12 |
link |
SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression |
unknown |
21 |
2024-07-02 |
link |
OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation |
unknown |
20 |
2024-09-19 |
link |
Training Language Models to Self-Correct via Reinforcement Learning |
unknown |
20 |
2024-03-29 |
link |
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want |
unknown |
20 |
2024-02-23 |
link |
Repetition Improves Language Model Embeddings |
unknown |
20 |
2024-05-23 |
link |
AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents |
unknown |
20 |
2023-05-31 |
link |
SafeDiffuser: Safe Planning with Diffusion Probabilistic Models |
unknown |
20 |
2024-03-11 |
link |
Multistep Consistency Models |
unknown |
20 |
2024-05-13 |
link |
AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments |
unknown |
20 |
2023-05-23 |
link |
MOTRv3: Release-Fetch Supervision for End-to-End Multi-Object Tracking |
unknown |
19 |
2024-02-06 |
link |
Personalized Language Modeling from Personalized Human Feedback |
unknown |
19 |
2024-05-29 |
link |
VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos |
unknown |
19 |
2024-05-14 |
link |
Towards Principled Evaluations of Sparse Autoencoders for Interpretability and Control |
unknown |
19 |
2023-11-30 |
link |
LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models |
unknown |
19 |
2023-11-29 |
link |
GeoDream: Disentangling 2D and Geometric Priors for High-Fidelity and Consistent 3D Generation |
unknown |
18 |
2024-06-27 |
link |
LiveBench: A Challenging, Contamination-Free LLM Benchmark |
unknown |
18 |
2023-06-29 |
link |
RL4CO: an Extensive Reinforcement Learning for Combinatorial Optimization Benchmark |
unknown |
18 |
2024-07-08 |
link |
MUSE: Machine Unlearning Six-Way Evaluation for Language Models |
unknown |
18 |
2024-10-07 |
link |
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models |
unknown |
18 |
2024-08-13 |
link |
Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents |
unknown |
18 |
2024-08-29 |
link |
Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling |
unknown |
18 |
2024-06-04 |
link |
PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling |
unknown |
18 |
2024-01-06 |
link |
Human-Instruction-Free LLM Self-Alignment with Limited Samples |
unknown |
18 |
2024-04-15 |
link |
Learn Your Reference Model for Real Good Alignment |
unknown |
17 |
2024-05-23 |
link |
From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step |
unknown |
17 |
2023-12-07 |
link |
OT-Attack: Enhancing Adversarial Transferability of Vision-Language Models via Optimal Transport Optimization |
unknown |
17 |
2024-02-28 |
link |
RNNs are not Transformers (Yet): The Key Bottleneck on In-context Retrieval |
unknown |
17 |
2024-07-09 |
link |
Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence |
unknown |
17 |
2024-03-31 |
link |
M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models |
unknown |
17 |
2021-07-07 |
link |
Deep Learning for Two-Sided Matching |
unknown |
17 |
2024-02-16 |
link |
Speculative Streaming: Fast LLM Inference without Auxiliary Models |
unknown |
17 |
2024-07-01 |
link |
Tree Search for Language Model Agents |
unknown |
17 |
2024-05-30 |
link |
GNN-RAG: Graph Neural Retrieval for Large Language Model Reasoning |
unknown |
16 |
2024-06-20 |
link |
Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning |
unknown |
16 |
2024-06-07 |
link |
Towards Semantic Equivalence of Tokenization in Multimodal LLM |
unknown |
16 |
2024-06-23 |
link |
Blind Baselines Beat Membership Inference Attacks for Foundation Models |
unknown |
16 |
2024-08-05 |
link |
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining |
unknown |
16 |
2024-05-29 |
link |
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment |
unknown |
16 |
2024-05-20 |
link |
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning |
unknown |
15 |
2024-03-20 |
link |
Mora: Enabling Generalist Video Generation via A Multi-Agent Framework |
unknown |
15 |
2023-11-20 |
link |
Reti-Diff: Illumination Degradation Image Restoration with Retinex-based Latent Diffusion Model |
unknown |
15 |
2024-06-06 |
link |
Vision-LSTM: xLSTM as Generic Vision Backbone |
unknown |
15 |
2023-10-06 |
link |
LLM4DV: Using Large Language Models for Hardware Test Stimuli Generation |
unknown |
15 |
2024-06-11 |
link |
Scaling Large-Language-Model-based Multi-Agent Collaboration |
unknown |
15 |
2024-06-03 |
link |
BadRAG: Identifying Vulnerabilities in Retrieval Augmented Generation of Large Language Models |
unknown |
15 |
2024-03-18 |
link |
VideoMV: Consistent Multi-View Generation Based on Large Video Generative Model |
unknown |
15 |
2024-04-24 |
link |
Uncertainty Estimation and Quantification for LLMs: A Simple Supervised Approach |
unknown |
15 |
2024-04-04 |
link |
RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis |
unknown |
15 |
2024-05-19 |
link |
Decoding by Contrasting Knowledge: Enhancing LLMs' Confidence on Edited Facts |
unknown |
15 |
2024-02-05 |
link |
Vision-Language Models Provide Promptable Representations for Reinforcement Learning |
unknown |
15 |
2024-07-01 |
link |
FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds |
unknown |
14 |
2024-06-26 |
link |
Symbolic Learning Enables Self-Evolving Agents |
unknown |
14 |
2024-08-19 |
link |
LongVILA: Scaling Long-Context Visual Language Models for Long Videos |
unknown |
14 |
2024-05-22 |
link |
TrojanRAG: Retrieval-Augmented Generation Can Be Backdoor Driver in Large Language Models |
unknown |
14 |
2024-08-15 |
link |
FuseChat: Knowledge Fusion of Chat Models |
unknown |
14 |
2024-07-22 |
link |
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models |
unknown |
14 |
2024-03-01 |
link |
Standardizing the Measurement of Text Diversity: A Tool and a Comparative Analysis of Scores |
unknown |
14 |
2024-06-12 |
link |
LVBench: An Extreme Long Video Understanding Benchmark |
unknown |
14 |
2024-04-23 |
link |
FMint: Bridging Human Designed and Data Pretrained Models for Differential Equation Foundation Model |
unknown |
14 |
2023-10-01 |
link |
Source Attribution for Large Language Model-Generated Data |
unknown |
14 |
2024-09-03 |
link |
OLMoE: Open Mixture-of-Experts Language Models |
unknown |
14 |
2024-07-22 |
link |
Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs |
unknown |
14 |
2024-05-23 |
link |
Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration |
unknown |
14 |
2023-07-02 |
link |
MissDiff: Training Diffusion Models on Tabular Data with Missing Values |
unknown |
14 |
2024-02-20 |
link |
UniEdit: A Unified Tuning-Free Framework for Video Motion and Appearance Editing |
unknown |
14 |
2024-08-12 |
link |
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers |
unknown |
14 |
2024-07-17 |
link |
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control |
unknown |
14 |
2024-07-01 |
link |
RegMix: Data Mixture as Regression for Language Model Pre-training |
unknown |
14 |
2024-05-29 |
link |
Contextual Position Encoding: Learning to Count What's Important |
unknown |
14 |
2024-08-15 |
link |
Automated Design of Agentic Systems |
unknown |
13 |
2024-06-14 |
link |
MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers |
unknown |
13 |
2024-02-27 |
link |
SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation |
unknown |
13 |
2024-09-04 |
link |
Diffusion Models Learn Low-Dimensional Distributions via Subspace Clustering |
unknown |
13 |
2024-04-09 |
link |
Asynchronous Federated Reinforcement Learning with Policy Gradient Updates: Algorithm Design and Convergence Analysis |
unknown |
13 |
2024-05-14 |
link |
CinePile: A Long Video Question Answering Dataset and Benchmark |
unknown |
13 |
2023-11-24 |
link |
Image Super-Resolution with Text Prompt Diffusion |
unknown |
13 |
2024-02-05 |
link |
Evading Data Contamination Detection for Language Models is (too) Easy |
unknown |
13 |
2024-05-29 |
link |
X-VILA: Cross-Modality Alignment for Large Language Model |
unknown |
13 |
2024-05-27 |
link |
Matryoshka Multimodal Models |
unknown |
13 |
2022-02-01 |
link |
MotifExplainer: a Motif-based Graph Neural Network Explainer |
unknown |
13 |
2024-03-13 |
link |
A Decade's Battle on Dataset Bias: Are We There Yet? |
unknown |
13 |
2024-05-16 |
link |
Many-Shot In-Context Learning in Multimodal Foundation Models |
unknown |
13 |
2024-06-08 |
link |
MotionClone: Training-Free Motion Cloning for Controllable Video Generation |
unknown |
13 |
2024-07-05 |
link |
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation? |
unknown |
12 |
2024-09-06 |
link |
VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation |
unknown |
12 |
2024-06-28 |
link |
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy |
unknown |
12 |
2024-10-02 |
link |
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second |
unknown |
12 |
2024-06-22 |
link |
Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs |
unknown |
12 |
2024-06-14 |
link |
Training-free Camera Control for Video Generation |
unknown |
12 |
2024-06-12 |
link |
What If We Recaption Billions of Web Images with LLaMA-3? |
unknown |
12 |
2023-12-16 |
link |
Shot2Story20K: A New Benchmark for Comprehensive Understanding of Multi-shot Videos |
unknown |
12 |
2024-05-31 |
link |
Improved Techniques for Optimization-Based Jailbreaking on Large Language Models |
unknown |
12 |
2024-07-19 |
link |
BOND: Aligning LLMs with Best-of-N Distillation |
unknown |
12 |
2024-06-13 |
link |
VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding |
unknown |
12 |
2024-05-30 |
link |
Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models |
unknown |
12 |
2024-03-21 |
link |
Language Repository for Long Video Understanding |
unknown |
12 |
2024-09-04 |
link |
LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture |
unknown |
12 |
2022-01-26 |
link |
Privacy-Preserving Logistic Regression Training with a Faster Gradient Variant |
unknown |
12 |
2024-06-16 |
link |
GUI-WORLD: A Dataset for GUI-oriented Multimodal LLM-based Agents |
unknown |
12 |
2024-09-25 |
link |
Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction |
unknown |
12 |
2024-05-22 |
link |
Towards Comprehensive and Efficient Post Safety Alignment of Large Language Models via Safety Patching |
unknown |
12 |
2024-06-26 |
link |
Kolmogorov-Arnold Graph Neural Networks |
unknown |
12 |
2024-04-19 |
link |
TextSquare: Scaling up Text-Centric Visual Instruction Tuning |
unknown |
11 |
2024-03-11 |
link |
Can LLMs Separate Instructions From Data? And What Do We Even Mean By That? |
unknown |
11 |
2024-03-05 |
link |
Cradle: Empowering Foundation Agents Towards General Computer Control |
unknown |
11 |
2024-02-21 |
link |
T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching |
unknown |
11 |
2024-07-31 |
link |
MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts |
unknown |
11 |
2024-02-12 |
link |
Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts |
unknown |
11 |
2024-06-06 |
link |
Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive? |
unknown |
11 |
2024-06-11 |
link |
MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance |
unknown |
11 |
2024-02-13 |
link |
Test-Time Backdoor Attacks on Multimodal Large Language Models |
unknown |
11 |
2024-05-31 |
link |
OR-Bench: An Over-Refusal Benchmark for Large Language Models |
unknown |
11 |
2024-06-20 |
link |
Consistency Models Made Easy |
unknown |
11 |
2023-11-27 |
link |
Regularization by Texts for Latent Diffusion Inverse Solvers |
unknown |
11 |
2024-06-18 |
link |
WebCanvas: Benchmarking Web Agents in Online Environments |
unknown |
11 |
2022-08-08 |
link |
On Rademacher Complexity-based Generalization Bounds for Deep Learning |
unknown |
11 |
2024-05-30 |
link |
Phantom: General Trigger Attacks on Retrieval Augmented Language Generation |
unknown |
11 |
2024-08-28 |
link |
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders |
unknown |
11 |
2024-06-17 |
link |
SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model |
unknown |
11 |
2024-03-14 |
link |
Recurrent Drafter for Fast Speculative Decoding in Large Language Models |
unknown |
11 |
2024-05-30 |
link |
OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving |
unknown |
11 |
2024-05-06 |
link |
Language-Image Models with 3D Understanding |
unknown |
11 |
2024-07-16 |
link |
Does Refusal Training in LLMs Generalize to the Past Tense? |
unknown |
11 |
2024-07-24 |
link |
SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency |
unknown |
11 |
2024-07-01 |
link |
CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents |
unknown |
10 |
2023-09-20 |
link |
Transformers versus LSTMs for electronic trading |
unknown |
10 |
2024-09-04 |
link |
Building Math Agents with Multi-Turn Iterative Preference Learning |
unknown |
10 |
2024-03-14 |
link |
Relaxing Accurate Initialization Constraint for 3D Gaussian Splatting |
unknown |
10 |
2024-08-29 |
link |
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling |
unknown |
10 |
2024-06-14 |
link |
From Pixels to Prose: A Large Dataset of Dense Image Captions |
unknown |
10 |
2024-06-14 |
link |
ControlVAR: Exploring Controllable Visual Autoregressive Modeling |
unknown |
10 |
2024-07-06 |
link |
LogicVista: Multimodal LLM Logical Reasoning Benchmark in Visual Contexts |
unknown |
10 |
2024-09-26 |
link |
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction |
unknown |
10 |
2024-07-11 |
link |
SEED-Story: Multimodal Long Story Generation with Large Language Model |
unknown |
10 |
2024-05-29 |
link |
Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF |
unknown |
9 |
2024-05-30 |
link |
Is In-Context Learning Sufficient for Instruction Following in LLMs? |
unknown |
9 |
2024-08-05 |
link |
Self-Taught Evaluators |
unknown |
9 |
2024-03-13 |
link |
Learning to Watermark LLM-generated Text via Reinforcement Learning |
unknown |
9 |
2024-10-04 |
link |
MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion |
unknown |
9 |
2024-06-06 |
link |
Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion |
unknown |
9 |
2024-07-16 |
link |
BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval |
unknown |
9 |
2024-01-08 |
link |
Why Solving Multi-agent Path Finding with Large Language Model has not Succeeded Yet |
unknown |
9 |
2023-12-05 |
link |
Scaling Laws for Adversarial Attacks on Language Model Activations |
unknown |
9 |
2024-05-26 |
link |
Splat-SLAM: Globally Optimized RGB-only SLAM with 3D Gaussians |
unknown |
9 |
2024-05-31 |
link |
DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models |
unknown |
9 |
2024-04-18 |
link |
Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Monocular Videos |
unknown |
9 |
2024-07-18 |
link |
Prover-Verifier Games improve legibility of LLM outputs |
unknown |
9 |
2024-07-12 |
link |
Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training |
unknown |
9 |
2024-02-14 |
link |
Leveraging the Context through Multi-Round Interactions for Jailbreaking Attacks |
unknown |
9 |
2024-04-25 |
link |
Don't Say No: Jailbreaking LLM by Suppressing Refusal |
unknown |
9 |
2024-07-29 |
link |
Can Editing LLMs Inject Harm? |
unknown |
9 |
2024-05-23 |
link |
SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models |
unknown |
9 |
2024-05-20 |
link |
MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering |
unknown |
9 |
2024-05-30 |
link |
Large Language Models Can Self-Improve At Web Agent Tasks |
unknown |
9 |
2024-03-26 |
link |
Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models |
unknown |
9 |
2024-07-12 |
link |
Human-like Episodic Memory for Infinite Context LLMs |
unknown |
9 |
2024-06-17 |
link |
Intrinsic Evaluation of Unlearning Using Parametric Knowledge Traces |
unknown |
9 |
2024-06-13 |
link |
Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning |
unknown |
9 |
2023-07-15 |
link |
EFOk-CQA: Towards Knowledge Graph Complex Query Answering beyond Set Operation |
unknown |
9 |
2024-06-14 |
link |
Detecting and Evaluating Medical Hallucinations in Large Vision Language Models |
unknown |
9 |
2023-05-24 |
link |
gRNAde: Geometric Deep Learning for 3D RNA inverse design |
unknown |
9 |
2022-02-01 |
link |
On the Limitations of General Purpose Domain Generalisation Methods |
unknown |
9 |
2023-11-17 |
link |
Point Cloud Self-supervised Learning via 3D to Multi-view Masked Autoencoder |
unknown |
9 |
2024-06-03 |
link |
Unlocking Guidance for Discrete State-Space Diffusion and Flow Models |
unknown |
9 |
2024-06-12 |
link |
CFG++: Manifold-constrained Classifier Free Guidance for Diffusion Models |
unknown |
9 |
2024-06-03 |
link |
Decoupled Alignment for Robust Plug-and-Play Adaptation |
unknown |
9 |
2024-04-09 |
link |
Hash3D: Training-free Acceleration for 3D Generation |
unknown |
9 |
2024-08-22 |
link |
Real-Time Video Generation with Pyramid Attention Broadcast |
unknown |
8 |
2024-06-13 |
link |
Understanding Jailbreak Success: A Study of Latent Space Dynamics in Large Language Models |
unknown |
8 |
2024-03-18 |
link |
LoRA-Composer: Leveraging Low-Rank Adaptation for Multi-Concept Customization in Training-Free Diffusion Models |
unknown |
8 |
2024-02-27 |
link |
Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems |
unknown |
8 |
2024-06-02 |
link |
Enhancing Zero-shot Text-to-Speech Synthesis with Human Feedback |
unknown |
8 |
2024-05-01 |
link |
Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment |
unknown |
8 |
2024-06-24 |
link |
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation |
unknown |
8 |
2024-08-06 |
link |
MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine |
unknown |
8 |
2024-05-27 |
link |
LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters |
unknown |
8 |
2024-07-10 |
link |
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models |
unknown |
8 |
2024-07-11 |
link |
Model Tells You Where to Merge: Adaptive KV Cache Merging for LLMs on Long-Context Tasks |
unknown |
8 |
2024-09-10 |
link |
LLaMA-Omni: Seamless Speech Interaction with Large Language Models |
unknown |
8 |
None |
link |
Beyond Model Collapse: Scaling Up with Synthesized Data Requires Reinforcement |
unknown |
8 |
2024-03-02 |
link |
LLaMoCo: Instruction Tuning of Large Language Models for Optimization Code Generation |
unknown |
8 |
2024-08-15 |
link |
Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risk of Language Models |
unknown |
8 |
2024-09-03 |
link |
Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation |
unknown |
8 |
2024-09-19 |
link |
Language Models Learn to Mislead Humans via RLHF |
unknown |
8 |
2024-06-19 |
link |
Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models |
unknown |
8 |
2024-05-30 |
link |
SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths |
unknown |
8 |
2024-07-22 |
link |
RazorAttention: Efficient KV Cache Compression Through Retrieval Heads |
unknown |
8 |
2024-02-05 |
link |
Markov Persuasion Processes: Learning to Persuade from Scratch |
unknown |
8 |
2024-05-23 |
link |
MagicDrive3D: Controllable 3D Generation for Any-View Rendering in Street Scenes |
unknown |
8 |
2024-04-04 |
link |
LeGrad: An Explainability Method for Vision Transformers via Feature Formation Sensitivity |
unknown |
8 |
2024-02-20 |
link |
Probabilities of Chat LLMs Are Miscalibrated but Still Predict Correctness on Multiple-Choice Q&A |
unknown |
8 |
2024-06-11 |
link |
AI Sandbagging: Language Models can Strategically Underperform on Evaluations |
unknown |
8 |
2024-06-24 |
link |
Adam-mini: Use Fewer Learning Rates To Gain More |
unknown |
8 |
2024-05-28 |
link |
Learning diverse attacks on large language models for robust red-teaming and safety tuning |
unknown |
8 |
2024-05-27 |
link |
LLM-Assisted Static Analysis for Detecting Security Vulnerabilities |
unknown |
8 |
2023-12-13 |
link |
CBQ: Cross-Block Quantization for Large Language Models |
unknown |
8 |
2024-07-29 |
link |
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher |
unknown |
8 |
2024-08-23 |
link |
MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans? |
unknown |
7 |
2024-02-21 |
link |
PolyNet: Learning Diverse Solution Strategies for Neural Combinatorial Optimization |
unknown |
7 |
2024-07-14 |
link |
Lean-STaR: Learning to Interleave Thinking and Proving |
unknown |
7 |
2024-06-20 |
link |
Fantastic Copyrighted Beasts and How (Not) to Generate Them |
unknown |
7 |
2024-03-26 |
link |
Bidirectional Consistency Models |
unknown |
7 |
2024-06-05 |
link |
FusionBench: A Comprehensive Benchmark of Deep Model Fusion |
unknown |
7 |
2024-09-04 |
link |
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency |
unknown |
7 |
2023-12-04 |
link |
StoryGPT-V: Large Language Models as Consistent Story Visualizers |
unknown |
7 |
2023-03-23 |
link |
Type-II Saddles and Probabilistic Stability of Stochastic Gradient Descent |
unknown |
7 |
2024-04-09 |
link |
MuPT: A Generative Symbolic Music Pretrained Transformer |
unknown |
7 |
2024-03-26 |
link |
AgentStudio: A Toolkit for Building General Virtual Agents |
unknown |
7 |
2023-12-01 |
link |
OpenStereo: A Comprehensive Benchmark for Stereo Matching and Strong Baseline |
unknown |
7 |
2024-03-17 |
link |
BrightDreamer: Generic 3D Gaussian Generative Framework for Fast Text-to-3D Synthesis |
unknown |
7 |
2024-06-13 |
link |
MMFakeBench: A Mixed-Source Multimodal Misinformation Detection Benchmark for LVLMs |
unknown |
7 |
2024-08-21 |
link |
Critique-out-Loud Reward Models |
unknown |
7 |
2024-07-20 |
link |
Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data |
unknown |
7 |
2024-06-08 |
link |
One Perturbation is Enough: On Generating Universal Adversarial Perturbations against Vision-Language Pre-training Models |
unknown |
7 |
2024-06-11 |
link |
McEval: Massively Multilingual Code Evaluation |
unknown |
7 |
2023-12-03 |
link |
Behind the Magic, MERLIM: Multi-modal Evaluation Benchmark for Large Image-Language Models |
unknown |
7 |
2024-08-20 |
link |
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding |
unknown |
7 |
2024-05-29 |
link |
FourierMamba: Fourier Learning Integration with State Space Models for Image Deraining |
unknown |
7 |
2024-04-16 |
link |
Uncertainty-Based Abstention in LLMs Improves Safety and Reduces Hallucinations |
unknown |
7 |
2024-09-09 |
link |
Improving Pretraining Data Using Perplexity Correlations |
unknown |
7 |
2024-06-06 |
link |
Interpreting the Second-Order Effects of Neurons in CLIP |
unknown |
7 |
2024-06-24 |
link |
WARP: On the Benefits of Weight Averaged Rewarded Policies |
unknown |
7 |
2021-10-15 |
link |
Almost Optimal Batch-Regret Tradeoff for Batch Linear Contextual Bandits |
unknown |
7 |
2024-07-17 |
link |
Agent-E: From Autonomous Web Navigation to Foundational Design Principles in Agentic Systems |
unknown |
7 |
2024-09-11 |
link |
Agent Workflow Memory |
unknown |
7 |
2024-03-20 |
link |
RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition |
unknown |
7 |
2024-07-08 |
link |
Variational Best-of-N Alignment |
unknown |
7 |
2024-07-01 |
link |
MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs |
unknown |
7 |
unknown |
link |
PoseCheck: Generative Models for 3D Structure-based Drug Design Produce Unrealistic Poses |
unknown |
7 |
2024-08-29 |
link |
ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model |
unknown |
7 |
2024-03-21 |
link |
AdaIR: Adaptive All-in-One Image Restoration via Frequency Mining and Modulation |
unknown |
7 |
2024-06-06 |
link |
Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data |
unknown |
7 |
2024-07-11 |
link |
Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting |
unknown |
7 |
2024-06-05 |
link |
A-Bench: Are LMMs Masters at Evaluating AI-generated Images? |
unknown |
7 |
2023-12-11 |
link |
GPTBIAS: A Comprehensive Framework for Evaluating Bias in Large Language Models |
unknown |
7 |
2024-05-24 |
link |
DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception |
unknown |
7 |
2024-06-04 |
link |
Process-Driven Autoformalization in Lean 4 |
unknown |
6 |
2023-11-21 |
link |
Multi-Session Budget Optimization for Forward Auction-based Federated Learning |
unknown |
6 |
2024-08-29 |
link |
OmniRe: Omni Urban Scene Reconstruction |
unknown |
6 |
2023-11-21 |
link |
Limitations of measure-first protocols in quantum machine learning |
unknown |
6 |
2024-06-25 |
link |
Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon |
unknown |
6 |
2024-06-05 |
link |
VideoPhy: Evaluating Physical Commonsense for Video Generation |
unknown |
6 |
2023-12-12 |
link |
ThinkBot: Embodied Instruction Following with Thought Chain Reasoning |
unknown |
6 |
2021-09-17 |
link |
On the Convergence of Tsetlin Machines for the AND and the OR Operators |
unknown |
6 |
2024-05-31 |
link |
Direct Alignment of Language Models via Quality-Aware Self-Refinement |
unknown |
6 |
2024-03-22 |
link |
A Transfer Attack to Image Watermarks |
unknown |
6 |
2024-02-22 |
link |
DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models |
unknown |
6 |
2024-07-10 |
link |
Video-to-Audio Generation with Hidden Alignment |
unknown |
6 |
2024-05-24 |
link |
Out of Many, One: Designing and Scaffolding Proteins at the Scale of the Structural Universe with Genie 2 |
unknown |
6 |
2024-02-20 |
link |
Bayesian Neural Networks with Domain Knowledge Priors |
unknown |
6 |
2024-06-25 |
link |
Point-SAM: Promptable 3D Segmentation Model for Point Clouds |
unknown |
6 |
2023-08-25 |
link |
Learn With Imagination: Safe Set Guided State-wise Constrained Policy Optimization |
unknown |
6 |
2024-06-21 |
link |
MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression |
unknown |
6 |
2024-10-03 |
link |
Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge |
unknown |
6 |
2024-02-28 |
link |
Signature Kernel Conditional Independence Tests in Causal Discovery for Stochastic Processes |
unknown |
6 |
2023-01-19 |
link |
Robust Gaussian Process Regression with Huber Likelihood |
unknown |
6 |
2024-06-06 |
link |
Empirical Guidelines for Deploying LLMs onto Resource-constrained Edge Devices |
unknown |
6 |
2024-06-12 |
link |
WMAdapter: Adding WaterMark Control to Latent Diffusion Models |
unknown |
6 |
2024-10-06 |
link |
SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference |
unknown |
6 |
2024-06-12 |
link |
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text |
unknown |
6 |
2024-06-27 |
link |
From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data |
unknown |
6 |
2024-06-12 |
link |
MobileAgentBench: An Efficient and User-Friendly Benchmark for Mobile LLM Agents |
unknown |
6 |
2024-08-16 |
link |
Classifier-Free Guidance is a Predictor-Corrector |
unknown |
6 |
2023-11-20 |
link |
Zero redundancy distributed learning with differential privacy |
unknown |
6 |
2024-07-23 |
link |
MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequence |
unknown |
6 |
2023-03-10 |
link |
Uncovering Challenges of Solving the Continuous Gromov-Wasserstein Problem |
unknown |
6 |
2024-08-01 |
link |
DriveArena: A Closed-loop Generative Simulation Platform for Autonomous Driving |
unknown |
6 |
2024-07-29 |
link |
Diffusion Feedback Helps CLIP See Better |
unknown |
6 |
2024-05-23 |
link |
Knowledge Localization: Mission Not Accomplished? Enter Query Localization! |
unknown |
6 |
2019-12-17 |
link |
Multi-Channel Graph Convolutional Networks |
unknown |
6 |
2024-07-11 |
link |
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients |
unknown |
6 |
2024-05-22 |
link |
Transformers Learn Temporal Difference Methods for In-Context Reinforcement Learning |
unknown |
6 |
2024-06-14 |
link |
ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation |
unknown |
6 |
2024-06-25 |
link |
Interpreting Attention Layer Outputs with Sparse Autoencoders |
unknown |
6 |
2024-09-23 |
link |
Direct Judgement Preference Optimization |
unknown |
6 |
2024-05-23 |
link |
Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model |
unknown |
6 |
2024-09-19 |
link |
MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines |
unknown |
6 |
2023-12-28 |
link |
MR-GSM8K: A Meta-Reasoning Benchmark for Large Language Model Evaluation |
unknown |
6 |
2024-10-03 |
link |
Video Instruction Tuning With Synthetic Data |
unknown |
6 |
2024-07-11 |
link |
Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist |
unknown |
6 |
2024-06-17 |
link |
Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI |
unknown |
6 |
2024-09-19 |
link |
Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution |
unknown |
6 |
2024-06-30 |
link |
Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning |
unknown |
6 |
2024-06-12 |
link |
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery |
unknown |
6 |
2024-03-05 |
link |
ActiveAD: Planning-Oriented Active Learning for End-to-End Autonomous Driving |
unknown |
6 |
2024-09-04 |
link |
Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling |
unknown |
6 |
2024-06-25 |
link |
PAFT: A Parallel Training Paradigm for Effective LLM Fine-Tuning |
unknown |
6 |
2024-06-11 |
link |
Towards Realistic Data Generation for Real-World Super-Resolution |
unknown |
6 |
2024-05-19 |
link |
MHPP: Exploring the Capabilities and Limitations of Language Models Beyond Basic Code Generation |
unknown |
6 |
2024-04-10 |
link |
GoodDrag: Towards Good Practices for Drag Editing with Diffusion Models |
unknown |
6 |
2024-04-15 |
link |
RankCLIP: Ranking-Consistent Language-Image Pretraining |
unknown |
6 |
2024-04-29 |
link |
FeDeRA: Efficient Fine-tuning of Language Models in Federated Learning Leveraging Weight Decomposition |
unknown |
6 |
2024-06-12 |
link |
MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos |
unknown |
6 |
2024-10-07 |
link |
Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents |
unknown |
6 |
2023-07-15 |
link |
Faster Algorithms for Structured Linear and Kernel Support Vector Machines |
unknown |
6 |
2024-06-03 |
link |
MAD: Multi-Alignment MEG-to-Text Decoding |
unknown |
6 |
2024-07-30 |
link |
AI-Assisted Generation of Difficult Math Questions |
unknown |
5 |
2024-02-21 |
link |
Avoiding barren plateaus via Gaussian Mixture Model |
unknown |
5 |
2024-05-24 |
link |
Emergence of a High-Dimensional Abstraction Phase in Language Transformers |
unknown |
5 |
2023-12-21 |
link |
DreamDistribution: Prompt Distribution Learning for Text-to-Image Diffusion Models |
unknown |
5 |
2023-06-07 |
link |
Efficient Alternating Minimization with Applications to Weighted Low Rank Approximation |
unknown |
5 |
2024-03-13 |
link |
Ambient Diffusion Posterior Sampling: Solving Inverse Problems with Diffusion Models trained on Corrupted Data |
unknown |
5 |
2024-05-06 |
link |
How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs |
unknown |
5 |
2024-03-15 |
link |
DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers |
unknown |
5 |
2024-06-09 |
link |
Certified Robustness to Data Poisoning in Gradient-Based Training |
unknown |
5 |
2024-10-03 |
link |
How to Train Long-Context Language Models (Effectively) |
unknown |
5 |
2024-06-11 |
link |
3D-Properties: Identifying Challenges in DPO and Charting a Path Forward |
unknown |
5 |
2024-08-27 |
link |
Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation |
unknown |
5 |
2024-03-21 |
link |
Physics-Informed Diffusion Models |
unknown |
5 |
2024-06-26 |
link |
On Scaling Up 3D Gaussian Splatting Training |
unknown |
5 |
2024-02-07 |
link |
On Provable Length and Compositional Generalization |
unknown |
5 |
2024-06-08 |
link |
M3GIA: A Cognition Inspired Multilingual and Multimodal General Intelligence Ability Benchmark |
unknown |
5 |
2024-05-27 |
link |
Perturbation-Restrained Sequential Model Editing |
unknown |
5 |
2024-07-30 |
link |
ThinK: Thinner Key Cache by Query-Driven Pruning |
unknown |
5 |
2023-03-01 |
link |
Competence-Based Analysis of Language Models |
unknown |
5 |
2023-07-20 |
link |
FigCaps-HF: A Figure-to-Caption Generative Framework and Benchmark with Human Feedback |
unknown |
5 |
2024-05-10 |
link |
Deep MMD Gradient Flow without adversarial training |
unknown |
5 |
2024-06-04 |
link |
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation |
unknown |
5 |
2024-06-24 |
link |
Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs |
unknown |
5 |
2024-04-02 |
link |
AddSR: Accelerating Diffusion-based Blind Super-Resolution with Adversarial Diffusion Distillation |
unknown |
5 |
2024-05-26 |
link |
Large Scale Knowledge Washing |
unknown |
5 |
2024-09-30 |
link |
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning |
unknown |
5 |
2024-09-16 |
link |
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval |
unknown |
5 |
2024-07-03 |
link |
Improved Noise Schedule for Diffusion Training |
unknown |
5 |
2024-05-19 |
link |
MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models |
unknown |
5 |
2024-09-01 |
link |
Diffusion Policy Policy Optimization |
unknown |
5 |
2024-06-21 |
link |
SAIL: Self-Improving Efficient Online Alignment of Large Language Models |
unknown |
5 |
2024-07-19 |
link |
LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference |
unknown |
5 |
2024-01-21 |
link |
Quantum Architecture Search with Unsupervised Representation Learning |
unknown |
5 |
2024-07-01 |
link |
Eliminating Position Bias of Language Models: A Mechanistic Approach |
unknown |
5 |
2024-06-17 |
link |
Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization |
unknown |
5 |
2024-06-12 |
link |
Real2Code: Reconstruct Articulated Objects via Code Generation |
unknown |
5 |
2024-06-21 |
link |
RouteFinder: Towards Foundation Models for Vehicle Routing Problems |
unknown |
5 |
2024-07-17 |
link |
E5-V: Universal Embeddings with Multimodal Large Language Models |
unknown |
5 |
2024-04-28 |
link |
Paint by Inpaint: Learning to Add Image Objects by Removing Them First |
unknown |
5 |
2024-07-19 |
link |
EVLM: An Efficient Vision-Language Model for Visual Understanding |
unknown |
5 |
2024-06-25 |
link |
Machine Unlearning Fails to Remove Data Poisoning Attacks |
unknown |
5 |
2024-04-12 |
link |
Inheritune: Training Smaller Yet More Attentive Language Models |
unknown |
5 |
2024-07-19 |
link |
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities |
unknown |
5 |
2024-09-04 |
link |
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA |
unknown |
5 |
2024-10-10 |
link |
Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation |
unknown |
5 |
2024-04-16 |
link |
COMBO: Compositional World Models for Embodied Multi-Agent Cooperation |
unknown |
5 |
2024-05-29 |
link |
ConceptPrune: Concept Editing in Diffusion Models via Skilled Neuron Pruning |
unknown |
5 |
2024-08-22 |
link |
ND-SDF: Learning Normal Deflection Fields for High-Fidelity Indoor Reconstruction |
unknown |
5 |
2024-08-20 |
link |
GSLoc: Efficient Camera Pose Refinement via 3D Gaussian Splatting |
unknown |
5 |
2024-10-01 |
link |
Uncertainty-aware Reward Model: Teaching Reward Models to Know What is Unknown |
unknown |
5 |
2024-04-02 |
link |
Diffusion$^2$: Dynamic 3D Content Generation via Score Composition of Video and Multi-view Diffusion Models |
unknown |
5 |
2024-08-22 |
link |
A Percolation Model of Emergence: Analyzing Transformers Trained on a Formal Language |
unknown |
5 |
2023-12-18 |
link |
Social Learning: Towards Collaborative Learning with Large Language Models |
unknown |
5 |
2024-05-08 |
link |
Preble: Efficient Distributed Prompt Scheduling for LLM Serving |
unknown |
5 |
2024-07-06 |
link |
Progress or Regress? Self-Improvement Reversal in Post-training |
unknown |
5 |
2024-02-24 |
link |
Sparse MeZO: Less Parameters for Better Performance in Zeroth-Order LLM Fine-Tuning |
unknown |
5 |
2024-04-11 |
link |
Gaga: Group Any Gaussians via 3D-aware Memory Bank |
unknown |
5 |
2024-07-25 |
link |
Trust or Escalate: LLM Judges with Provable Guarantees for Human Agreement |
unknown |
5 |
2024-04-29 |
link |
U-Nets as Belief Propagation: Efficient Classification, Denoising, and Diffusion in Generative Hierarchical Models |
unknown |
5 |
2024-06-11 |
link |
Autoregressive Pretraining with Mamba in Vision |
unknown |
5 |
2024-06-07 |
link |
A Manifold Perspective on the Statistical Generalization of Graph Neural Networks |
unknown |
5 |
2024-10-03 |
link |
MetaMetrics: Calibrating Metrics For Generation Tasks Using Human Preferences |
unknown |
5 |
2024-05-30 |
link |
Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA |
unknown |
5 |
2024-08-28 |
link |
LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation |
unknown |
5 |
2024-06-27 |
link |
ColPali: Efficient Document Retrieval with Vision Language Models |
unknown |
5 |
2024-07-21 |
link |
When Can Transformers Count to n? |
unknown |
5 |
2022-02-07 |
link |
Navigating Neural Space: Revisiting Concept Activation Vectors to Overcome Directional Divergence |
unknown |
5 |
2023-11-24 |
link |
Revisiting Quantum Algorithms for Linear Regressions: Quadratic Speedups without Data-Dependent Parameters |
unknown |
5 |
2024-09-25 |
link |
Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale |
unknown |
5 |
2024-04-15 |
link |
Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model |
unknown |
5 |
2024-06-17 |
link |
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts |
unknown |
5 |
2024-10-07 |
link |
Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation |
unknown |
5 |
2024-02-06 |
link |
Delving into temperature scaling for adaptive conformal prediction |
unknown |
5 |
2024-07-25 |
link |
LoRA-Pro: Are Low-Rank Adapters Properly Optimized? |
unknown |
4 |
2024-06-12 |
link |
MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases |
unknown |
4 |
2024-10-03 |
link |
LLaVA-Critic: Learning to Evaluate Multimodal Models |
unknown |
4 |
2024-02-02 |
link |
PiCO: Peer Review in LLMs based on the Consistency Optimization |
unknown |
4 |
2024-02-23 |
link |
Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer |
unknown |
4 |
2024-06-16 |
link |
Data Shapley in One Training Run |
unknown |
4 |
2024-05-30 |
link |
Auto-Arena: Automating LLM Evaluations with Agent Peer Battles and Committee Discussions |
unknown |
4 |
2024-06-02 |
link |
Inverse Constitutional AI: Compressing Preferences into Principles |
unknown |
4 |
2024-05-28 |
link |
Hardware-Aware Parallel Prompt Decoding for Memory-Efficient Acceleration of LLM Inference |
unknown |
4 |
2024-08-05 |
link |
MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models |
unknown |
4 |
2024-06-11 |
link |
VersiCode: Towards Version-controllable Code Generation |
unknown |
4 |
2024-08-01 |
link |
ReSi: A Comprehensive Benchmark for Representational Similarity Measures |
unknown |
4 |
2024-06-13 |
link |
GuardAgent: Safeguard LLM Agents by a Guard Agent via Knowledge-Enabled Reasoning |
unknown |
4 |
2023-07-14 |
link |
Benchmarks and Custom Package for Energy Forecasting |
unknown |
4 |
2024-02-08 |
link |
An Examination on the Effectiveness of Divide-and-Conquer Prompting in Large Language Models |
unknown |
4 |
2024-05-02 |
link |
CoS: Enhancing Personalization and Mitigating Bias with Context Steering |
unknown |
4 |
2024-06-24 |
link |
Video-Infinity: Distributed Long Video Generation |
unknown |
4 |
2024-06-11 |
link |
On the relation between trainability and dequantization of variational quantum learning models |
unknown |
4 |
2024-05-14 |
link |
Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory |
unknown |
4 |
2024-06-26 |
link |
MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data |
unknown |
4 |
2024-08-24 |
link |
Selective Preference Optimization via Token-Level Reward Function Estimation |
unknown |
4 |
2024-10-03 |
link |
ControlAR: Controllable Image Generation with Autoregressive Models |
unknown |
4 |
2024-03-31 |
link |
CodeBenchGen: Creating Scalable Execution-based Code Generation Benchmarks |
unknown |
4 |
2024-05-28 |
link |
EG4D: Explicit Generation of 4D Object without Score Distillation |
unknown |
4 |
2024-06-28 |
link |
EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model |
unknown |
4 |
2024-05-23 |
link |
AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct |
unknown |
4 |
2024-09-30 |
link |
The Perfect Blend: Redefining RLHF with Mixture of Judges |
unknown |
4 |
2024-06-18 |
link |
The Power of LLM-Generated Synthetic Data for Stance Detection in Online Political Discussions |
unknown |
4 |
2024-06-25 |
link |
Quantifying AI Psychology: A Psychometrics Benchmark for Large Language Models |
unknown |
4 |
2024-05-29 |
link |
Adaptive In-conversation Team Building for Language Model Agents |
unknown |
4 |
2024-08-15 |
link |
HAIR: Hypernetworks-based All-in-One Image Restoration |
unknown |
4 |
2024-02-21 |
link |
Broadening Target Distributions for Accelerated Diffusion Models via a Novel Analysis Approach |
unknown |
4 |
2024-07-01 |
link |
Benchmarking Predictive Coding Networks - Made Simple |
unknown |
4 |
2023-12-11 |
link |
Activation Gradient based Poisoned Sample Detection Against Backdoor Attacks |
unknown |
4 |
2024-07-03 |
link |
Universal Length Generalization with Turing Programs |
unknown |
4 |
2024-09-26 |
link |
MIO: A Foundation Model on Multimodal Tokens |
unknown |
4 |
2024-10-02 |
link |
HelpSteer2-Preference: Complementing Ratings with Preferences |
unknown |
4 |
2024-06-03 |
link |
ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec |
unknown |
4 |
2024-07-08 |
link |
BEVWorld: A Multimodal World Model for Autonomous Driving via Unified BEV Latent Space |
unknown |
4 |
2024-03-06 |
link |
GUIDE: Guidance-based Incremental Learning with Diffusion Models |
unknown |
4 |
2024-05-23 |
link |
Graph Sparsification via Mixture of Graphs |
unknown |
4 |
2024-06-15 |
link |
Task Facet Learning: A Structured Approach to Prompt Optimization |
unknown |
4 |
2024-07-15 |
link |
Q-Sparse: All Large Language Models can be Fully Sparsely-Activated |
unknown |
4 |
2024-03-25 |
link |
Continuous, Subject-Specific Attribute Control in T2I Models by Identifying Semantic Directions |
unknown |
4 |
2024-05-24 |
link |
OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code |
unknown |
4 |
2024-04-29 |
link |
LLM-SR: Scientific Equation Discovery via Programming with Large Language Models |
unknown |
4 |
2024-05-01 |
link |
MMTryon: Multi-Modal Multi-Reference Control for High-Quality Fashion Generation |
unknown |
4 |
2023-02-01 |
link |
QMP: Q-switch Mixture of Policies for Multi-Task Behavior Sharing |
unknown |
4 |
2024-05-28 |
link |
Is a 3D-Tokenized LLM the Key to Reliable Autonomous Driving? |
unknown |
4 |
2024-07-02 |
link |
Consistency Flow Matching: Defining Straight Flows with Velocity Consistency |
unknown |
4 |
2024-05-12 |
link |
Stable Signature is Unstable: Removing Image Watermark from Diffusion Models |
unknown |
4 |
2024-08-01 |
link |
OmniParser for Pure Vision Based GUI Agent |
unknown |
4 |
2024-06-03 |
link |
Looking Backward: Retrospective Backward Synthesis for Goal-Conditioned GFlowNets |
unknown |
4 |
2024-09-26 |
link |
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions |
unknown |
4 |
2024-05-30 |
link |
Typography Leads Semantic Diversifying: Amplifying Adversarial Transferability across Multimodal Large Language Models |
unknown |
4 |
2024-10-08 |
link |
Pyramidal Flow Matching for Efficient Video Generative Modeling |
unknown |
4 |
2024-05-23 |
link |
Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models |
unknown |
4 |
2024-05-24 |
link |
Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models |
unknown |
4 |
2024-06-15 |
link |
Emerging Safety Attack and Defense in Federated Instruction Tuning of Large Language Models |
unknown |
4 |
2024-10-02 |
link |
ImageFolder: Autoregressive Image Generation with Folded Tokens |
unknown |
4 |
2024-06-05 |
link |
Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion |
unknown |
4 |
2023-04-07 |
link |
A Block Coordinate Descent Method for Nonsmooth Composite Optimization under Orthogonality Constraints |
unknown |
4 |
2024-07-02 |
link |
CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models |
unknown |
4 |
2024-03-14 |
link |
FedComLoc: Communication-Efficient Distributed Training of Sparse and Quantized Models |
unknown |
4 |
2024-06-10 |
link |
Low-Rank Quantization-Aware Training for LLMs |
unknown |
4 |
2024-06-10 |
link |
How Efficient is LLM-Generated Code? A Rigorous & High-Standard Benchmark |
unknown |
4 |
2024-07-11 |
link |
HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models |
unknown |
4 |
2024-06-07 |
link |
Multi-Head RAG: Solving Multi-Aspect Problems with LLMs |
unknown |
4 |
2023-06-23 |
link |
Variance-Covariance Regularization Improves Representation Learning |
unknown |
4 |
2024-09-06 |
link |
Theory, Analysis, and Best Practices for Sigmoid Self-Attention |
unknown |
4 |
2024-06-05 |
link |
Text-to-Image Rectified Flow as Plug-and-Play Priors |
unknown |
4 |
2024-05-22 |
link |
FiDeLiS: Faithful Reasoning in Large Language Model for Knowledge Graph Question Answering |
unknown |