and some change
#tech
2025
LLM benchmarks like SWE-bench are not trustworthy
Jan 8
2022
Debugging stories: the inconsistent database
Dec 5
Setting this blog up on GitHub Pages
Nov 3