arXiv daily keyword digest · 2026-06-03

#1 Cross-Lingual Token Arbitrage: Optimizing Code Agent Context Windows via Local LLM Preprocessing

Score: 34.7

Matched keywords: agent, ai, benchmark, code agent, llm, prompt, token

Categories: cs.AI

Compressed abstract: AI-assisted coding agents are bottlenecked by input-token cost. Two pathologies of raw human input drive much of this overhead: tokenization inefficiency for non-English text and structural entropy in conversational prompts.

#2 The Deliberative Illusion: Diagnosing Factual Attrition and Stance Homogenization in Multi-Agent LLM Deliberation

Score: 27.2

Matched keywords: agent, llm, multi-agent

Categories: cs.CL

Compressed abstract: Multi-agent LLM systems often treat consensus as evidence of successful interaction. For deliberative problems, however, reliability depends on whether agents preserve the facts and viewpoints needed to interpret an issue.

#3 The Ringelmann Effect in Multi-Agent LLM Systems: A Scaling Law for Effective Team Size

Score: 27.2

Matched keywords: agent, llm, multi-agent

Categories: physics.soc-ph, cs.AI, cs.MA

Compressed abstract: Inference-time multi-agent LLM scaling lacks a shared unit: counting nominal agents conflates cost with independent evidence. We derive a two-parameter scaling law R(N) = N_eff/N = 1/(1 + c(N-1)N^{-\alpha}), where the regime exponent \alpha classifies any configuration into one of three asymptotic regimes -- hard-ceiling at 1/c (\alpha = 0), sublinear at N^{\alpha}/c (0 < \alpha < 1), or linear (\alpha \geq 1), and a mean-field theorem predicts that peer count k…
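
The three regimes of the scaling law can be checked numerically. The following is a minimal sketch, not the paper's code: it writes the regime exponent as `alpha` (the symbol name is an assumption), and the function names and the value c = 0.5 are hypothetical choices for illustration only.

```python
def efficiency_ratio(n: int, c: float, alpha: float) -> float:
    """R(N) = N_eff / N = 1 / (1 + c * (N - 1) * N**(-alpha))."""
    return 1.0 / (1.0 + c * (n - 1) * n ** (-alpha))


def effective_team_size(n: int, c: float, alpha: float) -> float:
    """N_eff = N * R(N)."""
    return n * efficiency_ratio(n, c, alpha)


# Illustrative constant (an assumption, not a value reported in the abstract).
C = 0.5

# One exponent per regime: hard ceiling, sublinear, linear.
for alpha in (0.0, 0.5, 1.0):
    sizes = [round(effective_team_size(n, C, alpha), 2) for n in (1, 10, 100, 1000)]
    print(f"alpha={alpha}: N_eff for N=1,10,100,1000 -> {sizes}")
```

With alpha = 0 the effective size saturates near 1/c, with 0 < alpha < 1 it grows roughly like N^alpha / c, and with alpha = 1 it grows in proportion to N, matching the three asymptotic regimes named in the abstract.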

#4 Multi^2: Hierarchical Multi-Agent Decision-Making with LLM-Based Agents in Interactive Environments

Score: 39.5

Matched keywords: agent, benchmark, fine-tuning, large language model, llm, multi-agent, reasoning

Categories: cs.LG

Compressed abstract: A central goal of large language model (LLM) research is to build agentic systems that can plan, act, and adapt through sustained interaction with dynamic environments. While recent LLM-based agents exhibit impressive contextual reasoning, their long-horizon decision-making remains fragile, often suffering from objective drift, where goals and plans drift over extended interactions.

#5 The Geometry of LLM-as-Judge: Why Inter-LLM Consensus Is Not Human Alignment

Score: 20.2

Matched keywords: alignment, fine-tuning, llm

Categories: cs.CL

Compressed abstract: LLMs-as-judges are now standard, yet judges agree strongly with one another while agreeing only weakly with humans. We test whether this reflects shared signal or shared bias by measuring four geometric quantities on the standard LLM-as-judge stack across four community-built Indic datasets, eight Indic languages, and 41 LLM judges: score spread, effective rank, principal angle to the human subspace, and stacked corr…

#6 Toward a Modular Architecture for Embedded AI Agent Systems at the Edge

Score: 33.2

Matched keywords: agent, ai, ai agent, large language models, reasoning, tool use

Categories: cs.AI, cs.MA

Compressed abstract: The rise of Large Language Models (LLMs) has enabled agentic AI capable of complex reasoning and tool use; however, deploying such autonomy in pervasive computing environments remains challenging due to the strict memory and energy constraints of embedded microcontrollers. Existing frameworks typically assume server-class resources or continuous connectivity, leaving a gap for deeply embedded systems.

#7 E2 LLM: Towards Efficient LLM Serving in Heterogeneous Edge/Fog Environments

Score: 17.4

Matched keywords: large language models, llm, token

Categories: cs.DC, cs.AI

Compressed abstract: Large Language Models (LLMs) have become integral to modern applications, yet their deployment remains challenging. Beyond executing the models themselves, practical deployment must address cost efficiency, low latency, and optimal resource utilization.

#8 Diagnosing Knowledge Gaps in LLM Tool Use: An Agentic Benchmark for Novel API Acquisition

Score: 24.7

Matched keywords: benchmark, code generation, fine-tuning, large language models, llm, tool use

Categories: cs.AI

Compressed abstract: Large language models for code generation often need to use APIs that are absent from their pretraining data. This requires more than recalling a function name: models must coordinate signatures, module paths, input-output contracts, semantics, and executable usage patterns.

#9 Multi-Agent Framework Leveraging Knowledge Graphs for Virtual Commissioning Models

Score: 19.0

Matched keywords: agent, agent framework, multi-agent

Categories: cs.CE

Compressed abstract: Virtual commissioning models (VCMs) of discrete manufacturing systems are used to validate automation behavior before physical deployment, but creating and maintaining them remains labor-intensive. Relevant engineering information is distributed across programmable logic controller (PLC) engineering projects, such as Siemens TIA Portal, and kinematic simulation models, such as Siemens NX Mechatronics Concept Designe…

#10 Agentic Chain-of-Thought Steering for Efficient and Controllable LLM Reasoning

Score: 28.6

Matched keywords: agent, large language models, llm, reasoning, token

Categories: cs.CL, cs.AI

Compressed abstract: Large language models improve final-answer accuracy through extended chain-of-thought reasoning, but often spend tokens inefficiently and offer little inference-time control. Existing efficient reasoning methods control thinking length by shortening, early-stopping, or compressing traces, leaving how the model thinks implicit.

2026-06-03 · arXiv Daily Keyword Digest (Top 10 of 839)
