BYOB: Build Your Own Benchmark
AI benchmarks are saturating, getting harder to verify, and increasingly irrelevant to how most people use models. The replacements are weirder - and…
Mar 1 • Charlie Guo
February 2026
The Emerging "Harness Engineering" Playbook
The converging best practices for building with coding agents, from OpenAI to Stripe to OpenClaw.
Feb 22 • Charlie Guo
GPT-5.3-Codex and Claude Opus 4.6: More System Card Shenanigans
Evaluation awareness, reward hacking, and the cybersecurity problem.
Feb 11 • Charlie Guo
The Codex App Has Upended My Daily Workflow
I haven't opened my IDE in days. I'm not sure I miss it.
Feb 2 • Charlie Guo
January 2026
Humans Welcome to Observe
The OpenClaw/Clawdbot explainer: personal AI agents, security nightmares, and robot religion.
Jan 31 • Charlie Guo
Skills, Tools and MCPs - What’s The Difference?
The primitives for AI systems are still being invented. Here's where we are today.
Jan 24 • Charlie Guo
The AI Manager's Schedule
The new skills you need when your reports are LLMs.
Jan 16 • Charlie Guo
On Joining OpenAI
And the next chapter of Artificial Ignorance.
Jan 9 • Charlie Guo
10 AI Stories That Shaped 2025
A look back at the year in AI news.
Jan 4 • Charlie Guo
December 2025
AI Roundup 150: Empire State Defiance
December 27, 2025.
Dec 27, 2025 • Charlie Guo
AI Roundup 149: Flash Forward
December 20, 2025.
Dec 20, 2025 • Charlie Guo
AI Trends for 2026
And a report card for last year’s predictions.
Dec 17, 2025 • Charlie Guo