Agent & LLM · July 2, 2026

Daily Paper Digest · Agent & LLM

2026-07-02 09:03:04 · 8 paper entries

arXiv:2607.01224 arXiv:2607.00692 arXiv:2607.00151 arXiv:2607.01120 arXiv:2606.31174 arXiv:2606.30546 arXiv:2607.00269 arXiv:2607.01213

Date: 2026-07-02

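1. AutoMem: training memory management into a cognitive skill for agents / AutoMem: Automated Learning of Memory as a Cognitive Skill

🔗 https://arxiv.org/abs/2607.01224

💡 In one sentence: It turns "what to write to memory, when to retrieve it, and how to organize it" from a prompt trick into a trainable capability, improving long-horizon task performance by 2-4x.

🎯 Relevance: Very high. Anna cannot build an agent platform without dealing with the memory lifecycle, and this paper speaks directly to the idea that memory is a behavioral capability, not storage.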

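2. Self-GC: self-governing context for long-horizon agents / Self-GC: Self-Governing Context for Long-Horizon LLM Agents

🔗 https://arxiv.org/abs/2607.00692

💡 In one sentence: It splits context into indexable, recoverable objects and lets a side-channel planner decide when to fold, mask, or prune them, instead of crudely summarizing everything.

🎯 Relevance: Very high. Both InternOS and the agent runtime need context GC, and this paper's "object lifecycle management" sits a full architectural level above ordinary summarization.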
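
To make the "context as addressable objects" idea concrete, here is a minimal Python sketch. It is not the paper's implementation: `ContextStore`, `ContextObject`, the state names, and the `restore` action are all hypothetical, chosen only to illustrate how fold/mask/prune differ from one-shot summarization. The planner itself is out of scope; its decisions are passed in as plain strings.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    LIVE = "live"      # full text is shown in the prompt
    FOLDED = "folded"  # a short stub is shown, full text stays in the store
    MASKED = "masked"  # hidden from the prompt, still recoverable


@dataclass
class ContextObject:
    obj_id: str
    text: str
    stub: str
    state: State = State.LIVE


class ContextStore:
    """Hypothetical store in which every context item is an addressable object."""

    def __init__(self):
        self._objects = {}

    def add(self, obj_id, text, stub):
        self._objects[obj_id] = ContextObject(obj_id, text, stub)

    def apply(self, obj_id, action):
        """Apply one planner decision: 'fold', 'mask', 'prune' or 'restore'."""
        obj = self._objects[obj_id]  # KeyError if the object was pruned
        if action == "prune":
            del self._objects[obj_id]  # the only irreversible action
        elif action == "restore":
            obj.state = State.LIVE
        elif action == "fold":
            obj.state = State.FOLDED
        elif action == "mask":
            obj.state = State.MASKED
        else:
            raise ValueError(f"unknown action: {action}")

    def render(self):
        """Build the prompt view: live text, stubs for folded, nothing for masked."""
        parts = []
        for obj in self._objects.values():
            if obj.state is State.LIVE:
                parts.append(obj.text)
            elif obj.state is State.FOLDED:
                parts.append(f"[{obj.obj_id}] {obj.stub}")
        return "\n".join(parts)
```

The point of the sketch is that folding and masking are reversible because the full text stays in the store under a stable id, whereas a summary that overwrites the original cannot be undone.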

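3. SmoothAgent: lookahead context serving for long-horizon agents / SmoothAgent: Efficient Long-Horizon LLM-Based Agent Serving with Lookahead Context Engineering

🔗 https://arxiv.org/abs/2607.00151

💡 In one sentence: It performs context transformation and KV cache preparation asynchronously ahead of time, cutting the TTFT overhead introduced by context engineering by up to 11.9x.

🎯 Relevance: Quite high. If Anna builds an agent platform, this is a serving/runtime scheduling problem rather than a model capability problem, and it deserves a separate read.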
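
The lookahead idea can be illustrated with a small asyncio sketch: while a tool call is still running, the context that is already known gets transformed in the background, so the next model call does not wait for it. This is only an illustration under stated assumptions, not SmoothAgent's design. `transform_context`, `run_tool`, and `agent_loop` are hypothetical stand-ins, the sleeps simulate real latency, and KV cache preparation is not modeled at all.

```python
import asyncio


async def transform_context(history):
    """Stand-in for an expensive context transformation (e.g. compaction)."""
    await asyncio.sleep(0.05)
    return " | ".join(history)


async def run_tool(name):
    """Stand-in for a slow tool call made by the agent."""
    await asyncio.sleep(0.05)
    return f"{name}:done"


async def agent_loop(tools):
    """Return the prompt that would be sent to the model after each tool call."""
    history, prompts = [], []
    for name in tools:
        # Start transforming the already-known history in the background.
        prep_task = asyncio.create_task(transform_context(list(history)))
        # The tool call overlaps with the background transformation.
        result = await run_tool(name)
        # By now the prepared prefix is usually ready, so this await is cheap.
        prefix = await prep_task
        prompts.append(f"{prefix} || {result}" if prefix else result)
        history.append(result)
    return prompts
```

A call such as `asyncio.run(agent_loop(["search", "read"]))` runs both steps. Because the two simulated delays overlap, each step costs roughly the longer of the two rather than their sum, which is the effect a lookahead scheduler is after.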

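4. Next-generation agentic RL systems: making self-evolving agents practical / Next-Generation Agentic Reinforcement Learning Systems Enable Self-Evolving Agents

🔗 https://arxiv.org/abs/2607.01120

💡 In one sentence: It argues that what enterprise-grade self-evolving agents lack is not RL algorithms but a trajectory protocol, a data proxy, and an evolution control plane.

🎯 Relevance: Very high. This paper reads like a roadmap for an agent platform and aligns closely with Anna's platform/control plane direction.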

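5. ClawArena-Team: evaluating how well LLMs manage subagents and dynamic workflows / ClawArena-Team: Benchmarking Subagent Orchestration and Dynamic Workflows in Language-Model Agents

🔗 https://arxiv.org/abs/2606.31174

💡 In one sentence: It specifically tests whether a lead agent can manage subagents, and finds that the bottleneck is not perception but permission allocation and least-privilege routing.

🎯 Relevance: Quite high. InternOS's organizational coordination system is, at its core, also delegation + permission + workflow, so this benchmark design is worth borrowing.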
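
For readers unfamiliar with the term, least-privilege routing can be sketched in a few lines: among the subagents that hold every permission a task needs, pick the one with the fewest permissions left over. The function below is a hypothetical illustration of that principle, not the benchmark's scoring logic. The name `route` and the permission sets are assumptions made up for the example.

```python
def route(task_needs, agents):
    """Pick the agent that covers task_needs with the fewest surplus permissions.

    task_needs: set of permission names the task requires.
    agents: dict mapping agent name to the set of permissions it holds.
    Returns the chosen agent name, or None if no agent is capable.
    """
    capable = {
        name: perms for name, perms in agents.items() if task_needs <= perms
    }
    if not capable:
        return None
    # Fewest unused permissions wins; the name breaks ties deterministically.
    return min(capable, key=lambda name: (len(capable[name] - task_needs), name))
```

With agents `{"admin": {"read", "write", "deploy"}, "editor": {"read", "write"}, "reader": {"read"}}`, a read-only task goes to `reader` even though all three agents could handle it.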

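6. MAS-Lab: a spec-driven validation framework for reliable multi-agent systems / MAS-Lab: A Specification-Driven Validation Framework for Reliable Multi-Agent Systems

🔗 https://arxiv.org/abs/2606.30546

💡 In one sentence: It argues that a MAS can no longer be assembled from scripts and instead needs a declarative spec, a MAS-OS, and observability/eval overlays.

🎯 Relevance: Very high. This matches Anna's systems view: an agent is not a demo workflow but a distributed system that must be evolvable, verifiable, and operable.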

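7. Mnemosyne: agentic transaction processing for AI workflows / Mnemosyne: Agentic Transaction Processing for Validating and Repairing AI-generated Workflows

🔗 https://arxiv.org/abs/2607.00269

💡 In one sentence: It treats actions generated by an LLM or agent as untrusted proposals that can be committed only after passing deterministic constraint validation, with support for bounded repair.

🎯 Relevance: Extremely high. For commitments, tasks, and state changes in InternOS, Anna should borrow this idea directly: proposal ≠ truth, and only the runtime can commit.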
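
The "proposal ≠ truth" pattern is easy to sketch. In the minimal Python example below, a proposal is committed only if deterministic checks pass, and a repair callback gets a bounded number of attempts before the runtime gives up. Everything here is an assumption for illustration: the account-balance domain, the names `validate` and `commit_with_repair`, and the `max_repairs` parameter are hypothetical and not taken from the paper.

```python
def validate(state, proposal):
    """Deterministic constraints; returns a list of violation messages."""
    errors = []
    if proposal["amount"] <= 0:
        errors.append("amount must be positive")
    if proposal["amount"] > state["balance"]:
        errors.append("amount exceeds balance")
    return errors


def commit_with_repair(state, proposal, repair, max_repairs=2):
    """Commit a proposal only after it validates.

    `repair` stands in for the LLM or agent: it receives the rejected
    proposal and the violations and returns a new proposal. It is called
    at most `max_repairs` times. Returns (state, committed_flag).
    """
    for attempt in range(max_repairs + 1):
        errors = validate(state, proposal)
        if not errors:
            new_state = dict(state)  # never mutate the trusted state in place
            new_state["balance"] -= proposal["amount"]
            return new_state, True
        if attempt < max_repairs:
            proposal = repair(proposal, errors)
    return state, False  # repair budget exhausted, nothing committed
```

The key design choice is that the generator never touches the state: it can only hand over proposals, while validation and the commit itself stay in deterministic runtime code.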

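8. RepoRescue: an empirical study of LLM agents repairing whole-repository compatibility / RepoRescue: An Empirical Study of LLM Agents on Whole-Repository Compatibility Rescue

🔗 https://arxiv.org/abs/2607.01213

💡 In one sentence: It builds 315 migration tasks from real legacy Python/Java repositories and tests whether agents can get the historical tests passing again in a modern environment.

🎯 Relevance: High. Very practical for code agents and software engineering AI, especially platform capabilities such as whole-repo coordination, cross-file edits, and test constraints.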