2026-04-16 · arXiv Daily Keyword Digest (Top 10 of 592)

Generated: 2026-04-17T08:02:19.410295+09:00

Target date (KST): 2026-04-16
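
A minimal sketch of how a KST target date could map onto a UTC time window, assuming the target date is treated as a full calendar day in KST (UTC+09:00, as in the Generated timestamp). The digest does not state how the day boundary is applied; the function name `kst_day_window_utc` is hypothetical.

```python
from datetime import date, datetime, time, timedelta, timezone

# KST is a fixed UTC+09:00 offset with no daylight saving time.
KST = timezone(timedelta(hours=9))


def kst_day_window_utc(target: date) -> tuple[datetime, datetime]:
    """Return the [start, end) UTC datetimes covering one KST calendar day."""
    start_kst = datetime.combine(target, time.min, tzinfo=KST)
    end_kst = start_kst + timedelta(days=1)
    return start_kst.astimezone(timezone.utc), end_kst.astimezone(timezone.utc)


start_utc, end_utc = kst_day_window_utc(date(2026, 4, 16))
print(start_utc.isoformat(), end_utc.isoformat())
```

Under this assumption, the KST day 2026-04-16 starts at 15:00 UTC on 2026-04-15 and ends 24 hours later.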

Selection: picked 10 from 592 papers published on the target date

Source: https://export.arxiv.org/api/query (`cat:cs.*`, sorted by submittedDate desc)
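
A sketch of how the query above might be assembled, using the arXiv API's documented `search_query`, `sortBy`, `sortOrder`, `start`, and `max_results` parameters. The helper name `build_query_url` and the page size are assumptions for illustration; the digest does not say how it pages through results.

```python
from urllib.parse import urlencode

ARXIV_API = "https://export.arxiv.org/api/query"


def build_query_url(start: int = 0, max_results: int = 100) -> str:
    """Build an arXiv API URL for all cs.* papers, newest submissions first."""
    params = {
        "search_query": "cat:cs.*",   # category filter quoted in the digest header
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "start": start,               # zero-based offset for paging
        "max_results": max_results,   # assumed page size
    }
    return f"{ARXIV_API}?{urlencode(params)}"


print(build_query_url())
```

The function only builds the URL; fetching and parsing the Atom response is left out to keep the sketch free of network access.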

Selection logic: keyword-weight score + subject boost
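
One way a keyword-weight score with a subject boost could be computed is sketched below. The weights, the boost values, the substring matching rule, and every name here (`KEYWORD_WEIGHTS`, `SUBJECT_BOOST`, `Paper`, `score`, `top_k`) are hypothetical; the digest does not publish its actual weights or formula, so this will not reproduce the scores listed for each entry.

```python
from dataclasses import dataclass

# Hypothetical keyword weights and per-category boosts (not the digest's real values).
KEYWORD_WEIGHTS = {"agent": 3.0, "multi-agent": 5.0, "llm": 4.0, "benchmark": 2.0}
SUBJECT_BOOST = {"cs.AI": 2.0, "cs.CL": 1.5}


@dataclass
class Paper:
    title: str
    abstract: str
    categories: list[str]


def matched_keywords(paper: Paper) -> list[str]:
    """Return the keywords found in the title or abstract (case-insensitive substring match)."""
    text = f"{paper.title} {paper.abstract}".lower()
    return sorted(kw for kw in KEYWORD_WEIGHTS if kw in text)


def score(paper: Paper) -> float:
    """Sum the weights of matched keywords, then add a boost for each listed category."""
    keyword_score = sum(KEYWORD_WEIGHTS[kw] for kw in matched_keywords(paper))
    boost = sum(SUBJECT_BOOST.get(cat, 0.0) for cat in paper.categories)
    return keyword_score + boost


def top_k(papers: list[Paper], k: int = 10) -> list[Paper]:
    """Return the k highest-scoring papers, best first."""
    return sorted(papers, key=score, reverse=True)[:k]


papers = [
    Paper("Multi-Agent LLM Planning", "", ["cs.AI"]),
    Paper("A Benchmark for Sorting", "", ["cs.DS"]),
]
for p in top_k(papers, k=2):
    print(f"{score(p):5.1f}  {matched_keywords(p)}  {p.title}")
```

Substring matching means a title containing "multi-agent" also matches "agent", which is consistent with the overlapping keywords listed in the entries below, though the real matcher may work differently.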

#1 TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration

Score: 35.4

Matched keywords: agent, ai, benchmark, fine-tuning, large language models, llm, multi-agent

Categories: cs.AI, cs.CL

Compressed abstract: While Large Language Models (LLMs) have empowered AI research agents to perform isolated scientific tasks, automating complex, real-world workflows, such as LLM training, remains a significant challenge. In this paper, we introduce TREX, a multi-agent system that automates the entire LLM training life-cycle.

Open summary page · arXiv · PDF

#2 AgentForge: Execution-Grounded Multi-Agent LLM Framework for Autonomous Software Engineering

Score: 31.9

Matched keywords: agent, agent framework, large language models, llm, multi-agent, token

Categories: cs.SE, cs.AI

Compressed abstract: Large language models generate plausible code but cannot verify correctness. Existing multi-agent systems simulate execution or leave verification optional.

Open summary page · arXiv · PDF

#3 When Less Latent Leads to Better Relay: Information-Preserving Compression for Latent Multi-Agent LLM Collaboration

Score: 34.5

Matched keywords: agent, large language model, llm, multi-agent, reasoning

Categories: cs.LG

Compressed abstract: Communication in Large Language Model (LLM)-based multi-agent systems is moving beyond discrete tokens to preserve richer context. Recent work such as LatentMAS enables agents to exchange latent messages through full key-value (KV) caches.

Open summary page · arXiv · PDF

#4 Lossless Prompt Compression via Dictionary-Encoding and In-Context Learning: Enabling Cost-Effective LLM Analysis of Repetitive Data

Score: 35.0

Matched keywords: benchmark, fine-tuning, in-context learning, large language models, llm, prompt, token

Categories: cs.CL, cs.AI, cs.LG

Compressed abstract: In-context learning has established itself as an important learning paradigm for Large Language Models (LLMs). In this paper, we demonstrate that LLMs can learn encoding keys in-context and perform analysis directly on encoded representations.

Open summary page · arXiv · PDF

#5 Adaptive Memory Crystallization for Autonomous AI Agent Learning in Dynamic Environments

Score: 24.2

Matched keywords: agent, ai, ai agent, ai agents

Categories: cs.LG, cs.AI

Compressed abstract: Autonomous AI agents operating in dynamic environments face a persistent challenge: acquiring new capabilities without erasing prior knowledge. We present Adaptive Memory Crystallization (AMC), a memory architecture for progressive experience consolidation in continual reinforcement learning.

Open summary page · arXiv · PDF

#6 Bridging MARL to SARL: An Order-Independent Multi-Agent Transformer via Latent Consensus

Score: 23.8

Matched keywords: agent, benchmark, multi-agent, transformer

Categories: cs.LG, cs.AI, cs.MA

Compressed abstract: Cooperative multi-agent reinforcement learning (MARL) is widely used to address large joint observation and action spaces by decomposing a centralized control problem into multiple interacting agents. However, such decomposition often introduces additional challenges, including non-stationarity, unstable training, weak coordination, and limited theoretical guarantees.

Open summary page · arXiv · PDF

#7 Correct Chains, Wrong Answers: Dissociating Reasoning from Output in LLM Logic

Score: 17.4

Matched keywords: benchmark, llm, reasoning

Categories: cs.CL, cs.AI, cs.LO

Compressed abstract: LLMs can execute every step of chain-of-thought reasoning correctly and still produce wrong final answers. We introduce the Novel Operator Test, a benchmark that separates operator logic from operator name, enabling rigorous distinction between genuine reasoning and pattern retrieval.

Open summary page · arXiv · PDF

#8 The cognitive companion: a lightweight parallel monitoring architecture for detecting and recovering from reasoning degradation in LLM agents

Score: 25.4

Matched keywords: large language model, llm, reasoning, token

Categories: cs.AI, cs.LG

Compressed abstract: Large language model (LLM) agents on multi-step tasks suffer reasoning degradation (looping, drift, stuck states) at rates up to 30% on hard tasks. Current solutions include hard step limits (abrupt) or LLM-as-judge monitoring (10-15% overhead per step).

Open summary page · arXiv · PDF

#9 Modality-Native Routing in Agent-to-Agent Networks: A Multimodal A2A Protocol Extension

Score: 35.1

Matched keywords: agent, benchmark, llm, multi-agent, multimodal, reasoning

Categories: cs.AI, cs.MA, cs.SE

Compressed abstract: Preserving multimodal signals across agent boundaries is necessary for accurate cross-modal reasoning, but it is not sufficient. We show that modality-native routing in Agent-to-Agent (A2A) networks improves task accuracy by 20 percentage points over text-bottleneck baselines, but only when the downstream reasoning agent can exploit the richer context that native routing preserves.

Open summary page · arXiv · PDF

#10 SafeHarness: Lifecycle-Integrated Security Architecture for LLM-based Agent Deployment

Score: 37.2

Matched keywords: agent, benchmark, harness, large language model, llm, tool use

Categories: cs.CR, cs.AI

Compressed abstract: The performance of large language model (LLM) agents depends critically on the execution harness, the system layer that orchestrates tool use, context management, and state persistence. Yet this same architectural centrality makes the harness a high-value attack surface: a single compromise at the harness level can cascade through the entire execution pipeline.

Open summary page · arXiv · PDF