Hello, world | nsega.dev

This is the first post on this blog, so it’s mostly a note to myself about why it exists.

I spend my days on Kubernetes-native AI/LLM infrastructure — the platform layer between “a model that works in a notebook” and “a service that stays up under real traffic.” A lot of what matters there isn’t written down anywhere obvious: it lives in incident retros, Slack threads, and the scar tissue of having run the thing in production.

This blog is where I’ll write that down.

What to expect

Long-form, hands-on posts. I’d rather go deep on one problem than survey ten.
The operational side. Scheduling and autoscaling GPU workloads, serving and routing inference, observability, cost — the parts that decide whether a system holds up.
Honest write-ups, including the approaches that didn’t work and why.

Why a blog at all

Writing forces me to actually understand a thing well enough to explain it. If a post happens to save someone else a few hours, that’s a bonus.

If something here was useful — or wrong — I’d like to know. My links are in the header.

More soon.