About
I’m Naoki Sega, a software engineer focused on Kubernetes-native AI/LLM infrastructure — the cloud-native foundations that make AI and large models practical to run reliably, observably, and cost-effectively.
I come at it from a deep cloud-native background: nearly 20 years building and operating backend systems, specializing in Go, Kubernetes, and Google Cloud. Along the way, those services scaled to 20M+ monthly active users, and I led engineering teams. The scheduling, networking, observability, and reliability work that makes any large system hold up is exactly what production AI systems are built on, and bridging the two is what I’m working on.
I work toward this in the open, too: contributing to the Kubernetes ecosystem (including its AI-inference subprojects) and Go, and going deep on LLM and agent internals.
What I work on
- Kubernetes-native AI/LLM infrastructure — my growing focus: the cloud-native platforms that serve, scale, and operate models in production
- Backend & cloud foundations — microservices on Go, Kubernetes (GKE), and Google Cloud, built for scale and reliability
- Open source — contributions to Kubernetes (including AI-inference subprojects), Go, and the AI-native tooling ecosystem
- LLM & agent internals: continuously going deeper, so I can do the infrastructure and open-source work above well
About this blog
Posts here tend to be long-form and hands-on — the kind of write-up I wish I’d found before hitting a problem myself. If something I wrote saved you time, or you spotted a mistake, I’d genuinely like to hear about it.
You’ll find my links in the header.