Posts tagged with "Agent"
Claude Code Skill Safety: From 'Please Stop' to 'You Can't Move'
38 Skills, three layers of defense, one hard lesson: natural language instructions are not a safety mechanism. How I systematically hardened 12 unprotected destructive Skills with PreToolUse Hooks, Skill splitting, and disallowed-tools.
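Claude Code Skill Safety: From 'Please Stop' to 'You Can't Move at All' (Traditional Chinese edition)
38 Skills, three layers of defense, one painfully learned lesson: natural language instructions are not a safety mechanism. This post documents how I systematically patched 12 destructive Skills that had no checkpoint at all, using PreToolUse Hooks, Skill splitting, and disallowed-tools.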
Safety Gates in Claude Code Skills: From Auditing 35 Skills to a Three-Layer Protection Model
I assumed writing 'Use AskUserQuestion' in a Skill was a hard constraint. After auditing 35 Skills, reading the official docs, and digging through GitHub Issues, I found out: the model uses the same mechanism to decide whether to obey your CHECKPOINT and whether to invoke your tool. There's only one gate that's truly 100%.
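Safety Gates in Claude Code Skills: From an Audit of 35 Skills to a Three-Layer Protection Model (Traditional Chinese edition)
I assumed that writing 'Use AskUserQuestion' in a Skill was a hard constraint. After auditing 35 Skills and going through the official docs and GitHub Issues, I found that the model uses the same mechanism to decide whether to heed your CHECKPOINT and whether to invoke your tool. Only one gate is truly 100%.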
Git as an External Brain for Claude Code: Beyond MEMORY.md
MEMORY.md isn't the end of the road for AI Agent memory. When project scale exceeds what a context window can hold, Git becomes the truly scalable external memory. This post breaks down the three layers of memory, Git's role among them, and which practices have research backing vs. which are just my own experiments.
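Git as an External Brain for Claude Code: A Memory Architecture Beyond MEMORY.md (Traditional Chinese edition)
MEMORY.md is not the end point of AI Agent memory. Once a project outgrows what the context window can hold, Git is the external memory that can truly scale without limit. This post breaks down the three layers of memory, Git's role among them, and which practices are backed by research and which are just my own experiments.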
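The Truth About 26%: The Mem0 Paper, the Benchmark Wars, and the Promise and Reality of Graph Memory (Traditional Chinese edition)
How was Mem0's 26% accuracy boost calculated? Why does Zep say Mem0 cheated? Is Graph Memory really better than pure Vector? This article dissects the arXiv paper layer by layer, reconstructs what actually happened in the benchmark dispute, and gives you a sound basis for choosing a system for production.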
The Truth About 26%: Mem0's Paper, Benchmark Wars, and the Promise vs Reality of Graph Memory
How was Mem0's 26% accuracy boost actually calculated? Why does Zep accuse Mem0 of cheating? Is Graph Memory really better than pure Vector? This article dissects the arXiv paper layer by layer, reveals the truth behind the benchmark controversy, and gives you a realistic basis for choosing a memory system in production.
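2026 AI Agent Memory Wars: A Technical Showdown Among Three Schools (Traditional Chinese edition)
AI Agent memory finally has serious solutions. Graph-based, OS-inspired, and Observational: three architectural schools are going head to head. This article helps you understand their design philosophies, their technical trade-offs, and which one to use in which scenario.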
2026 AI Agent Memory Wars: Three Architectures, Three Philosophies
AI Agent memory finally has serious solutions. Graph-based, OS-inspired, Observational—three architectural schools are going head-to-head. This article breaks down their design philosophies, technical trade-offs, and when to use which.
Cursor's $29B Secret: The Deleted Shadow Workspace, Reverse-Engineered
A deep dive into Cursor's Shadow Workspace architecture—the core innovation that once gave Cursor a massive edge over Copilot, why it quietly disappeared from settings, and what you can learn from it.
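Cursor's $29B Secret: The Deleted Shadow Workspace Technology, Decoded (Traditional Chinese edition)
An in-depth look at the technical architecture of Cursor's Shadow Workspace: why did the core innovation that let Cursor crush Copilot later disappear from settings, and what can you learn from it?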