#1 A Systematic Comparison of Prompting and Multi-Agent Methods for LLM-based Stance Detection
Score: 34.3
Matched keywords: agent, llm, multi-agent, prompt, reasoning
Categories: cs.CL
Compressed abstract: Stance detection identifies the attitude of a text author toward a given target. Recent studies have explored various LLM-based strategies for this task, from zero-shot prompting to multi-agent debate.
#2 LLM Psychosis: A Theoretical and Diagnostic Framework for Reality-Boundary Failures in Large Language Models
Score: 20.2
Matched keywords: large language models, llm
Categories: cs.CY, cs.AI
Compressed abstract: The deployment of large language models (LLMs) as interactive agents has exposed a category of behavioral failure that prevailing terminology, principally hallucination, fails to adequately characterize. This paper introduces LLM Psychosis as a structured theoretical framework for pathological breakdowns in model cognition that exhibit functional resemblance to clinically recognized psychotic disorders.
#3 I Would If I Could: Reasoning about Dynamics of Actions in Multi-Agent Systems
Score: 18.5
Matched keywords: agent, multi-agent, reasoning
Categories: cs.LO, cs.MA
Compressed abstract: Autonomous agents acting in realistic Multi-Agent Systems (MAS) should be able to adapt during their execution. Standard strategic logics, such as Alternating-time Temporal Logic (ATL), model agents' state- or history-dependent behaviour.
#4 Shorthand for Thought: Compressing LLM Reasoning via Entropy-Guided Supertokens
Score: 24.4
Matched keywords: benchmark, fine-tuning, large language models, llm, reasoning, token
Categories: cs.CL
Compressed abstract: Reasoning in Large Language Models incurs significant inference-time compute, yet the token-level information structure of reasoning traces remains underexplored. We observe that reasoning tokens split into two functional types: low-entropy structural tokens (recurring phrases that scaffold the reasoning process) and higher-entropy organic tokens (problem-specific content that drives toward a solution).
#5 LLM-Flax: Generalizable Robotic Task Planning via Neuro-Symbolic Approaches with Large Language Models
Score: 16.0
Matched keywords: benchmark, large language models, llm
Categories: cs.RO
Compressed abstract: Deploying a neuro-symbolic task planner on a new domain today requires significant manual effort: a domain expert must author relaxation and complementary rules, and hundreds of training problems must be solved to supervise a Graph Neural Network (GNN) object scorer. We propose LLM-Flax, a three-stage framework that eliminates all three sources of manual effort using a locally hosted LLM given only a PDDL domain fil…
#6 Factorized Latent Reasoning for LLM-based Recommendation
Score: 24.0
Matched keywords: alignment, large language models, llm, reasoning
Categories: cs.IR
Compressed abstract: Large language models (LLMs) have recently been adopted for recommendation by framing user preference modeling as a language generation problem. However, existing latent reasoning approaches typically represent user intent with a single latent vector, which struggles to capture the inherently multi-faceted nature of user preferences.
#7 Turning the TIDE: Cross-Architecture Distillation for Diffusion Large Language Models
Score: 18.9
Matched keywords: code generation, diffusion, large language models
Categories: cs.CL, cs.AI, cs.LG
Compressed abstract: Diffusion large language models (dLLMs) offer parallel decoding and bidirectional context, but state-of-the-art dLLMs require billions of parameters for competitive performance. While existing distillation methods for dLLMs reduce inference steps within a single architecture, none address cross-architecture knowledge transfer, in which the teacher and student differ in architecture, attention mechanism, and tokenize…
#8 When to Retrieve During Reasoning: Adaptive Retrieval for Large Reasoning Models
Score: 20.0
Matched keywords: benchmark, rag, reasoning, retrieval-augmented, token
Categories: cs.IR, cs.AI, cs.CL
Compressed abstract: Large reasoning models such as DeepSeek-R1 and OpenAI o1 generate extended chains of thought spanning thousands of tokens, yet their integration with retrieval-augmented generation (RAG) remains fundamentally misaligned. Current RAG systems optimize for providing context before reasoning begins, while reasoning models require evidence injection during multi-step inference chains.
#9 Preserving Disagreement: Architectural Heterogeneity and Coherence Validation in Multi-Agent Policy Simulation
Score: 25.2
Matched keywords: agent, ai, large language models, multi-agent, reasoning
Categories: cs.MA, cs.AI
Compressed abstract: Multi-agent deliberation systems using large language models (LLMs) are increasingly proposed for policy simulation, yet they suffer from artificial consensus: evaluator agents converge on the same option regardless of their assigned value perspectives. We present the AI Council, a three-phase deliberation framework, and conduct 120 deliberations across two policy scenarios to test two interventions.
#10 SecMate: Multi-Agent Adaptive Cybersecurity Troubleshooting with Tri-Context Personalization
Score: 25.2
Matched keywords: agent, large language models, llm, multi-agent
Categories: cs.CR, cs.AI
Compressed abstract: Recent advances in large language models and agentic frameworks have enabled virtual customer assistants (VCAs) for complex support. We present SecMate, a multi-agent VCA for cybersecurity troubleshooting that integrates device, user, and service specificity from conversational and device-level signals.