OOP Won. You Just Don't Work on Hard Problems.


Before you start typing about the Uncle Bob altar I must have in my bedroom, hear me out.

If you don't know me, I'm the CTO of Autonoma. We build QA agents that navigate websites and mobile apps to find bugs. Most of my working life, I've been building what I call "jobs": taking something and doing complex, stateful processing on it. Usually with AI involved in some part of the decision making. Real-time video engines that consume 200+ camera feeds. State machines that manage interactive browser sessions. AI agent orchestration systems.

This is very different from the reality of most developers out there who (being reductionist) build endpoints, frontend components, and database queries. And I think that's where the disconnect comes from. If your entire world is request-response, of course OOP feels like overkill. You don't need a Strategy pattern to validate a form.

I need to hold temporary state. Sometimes models in memory. Expensive connections that are hard to recreate. My entire process is stateful. And for that, OOP isn't just nice to have. It's the only thing that scales.

200 cameras, 2 servers, and a custom thread pool

A few years ago I built an engine that consumed video streams from multiple sources (mostly CCTV cameras), processed them, and took action based on the results. This was deployed in an airgapped, high-security environment.

The cameras were a mess. Some used custom protocols for higher quality feeds (30MP image streams at the time) with compression that skipped frames when the delta wasn't big enough. Others were standard RTSP. Others required pulling frames from the camera NVR (the recording server). Each type had a different transport layer, different buffers, different handshakes.

So we built a Stream trait (Scala's version of an interface). You could read images from it and check if it was down. That was it. Behind that trait we had multiple implementations, each with their own transport layer, connection state management, and quirks:

streams/
├── https/
│   ├── netty/
│   ├── ssl/
│   ├── FeedParameter.scala
│   └── JavaNetHttpStream.scala
├── udp/
│   └── opencv/
│       ├── pool/
│       │   └── StreamThreadPool.scala
│       ├── state/
│       └── OpenCVStream.scala
└── videostreams/
    └── VideoStreamImpl.scala

The OpenCV implementation consumed UDP feeds through OpenCV's VideoCapture. The HTTPS ones pulled frames over HTTP with Netty or Java's built-in HTTP client, with SSL support for secured cameras. Each one had its own internal state machine for managing the connection lifecycle (ConnectedState, ForcedDisconnectedState). When we found improvements or needed to support a new camera type with a slightly different handshake, we'd add a new implementation. The consuming code never changed.
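The original trait was Scala; here's a minimal TypeScript sketch of the same shape, with hypothetical names (`Frame`, `RtspStream`, `poll`) — the point is that the consumer only ever sees the interface:

```typescript
// Sketch of the Stream abstraction (hypothetical names; the real thing was a
// Scala trait with per-transport connection state machines behind it).
type Frame = { cameraId: string; timestamp: number }

interface Stream {
  readonly id: string
  read(): Frame | null // latest frame, or null if nothing is buffered
  isDown(): boolean
}

// One implementation per transport. Swapping in a new camera type means
// adding a class like this; the consuming code never changes.
class RtspStream implements Stream {
  private down = false
  constructor(readonly id: string) {}
  read(): Frame | null {
    return this.down ? null : { cameraId: this.id, timestamp: Date.now() }
  }
  isDown(): boolean {
    return this.down
  }
  markDown() {
    this.down = true
  }
}

// The consumer doesn't branch on camera type, ever.
function poll(streams: Stream[]): Frame[] {
  return streams
    .filter((s) => !s.isDown())
    .map((s) => s.read())
    .filter((f): f is Frame => f !== null)
}
```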

Was it overengineered? In earnest, a little bit. It didn't feel that way for long.

We were very machine constrained. Two servers with Xeon processors, and we needed to process 200+ cameras. We weren't doing any video processing on the GPUs because our models lived there, and Xeons are terrible at video decoding (lower clock speeds, and none of the dedicated decode hardware consumer chips have). Even consumer CPUs at the time were optimized for maybe 4 UHD streams. We needed 200.

So we built a custom thread pool (in Scala, because this was a few years ago) that was very aggressive at deallocating cameras that weren't being used. It wasn't a regular thread pool. It was a video-stream-aware pool that mapped threads to streams, dynamically scaled based on a configurable streams-per-thread ratio, and killed any stream that hadn't been queried within a timeout window:

// inside the worker thread loop
if (System.currentTimeMillis() - stream.lastRead > restartAfter) {
  stream.changeState(new ForcedDisconnectedState(stream))
  remove(stream.id)
}

If we didn't request an image from a camera for X milliseconds, the pool would force-disconnect the stream and free the resources. We fine-tuned those parameters until the system hummed along on two servers processing 200+ feeds.

The VideoStream interface made all of this possible. The thread pool didn't care what kind of camera it was managing. It just held VideoStream instances. New camera type? New implementation. Thread pool unchanged. Resource management unchanged. The abstraction earned its keep.
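The eviction sweep from that snippet can be sketched against the interface alone. This is a TypeScript rendering for illustration (the real pool was Scala and thread-based; `ManagedStream`, `StreamPool`, and `sweep` are hypothetical names):

```typescript
// Minimal sketch of the stream-aware eviction logic, assuming the pool only
// knows streams through a narrow interface. Names are illustrative.
type ManagedStream = { id: string; lastRead: number; disconnect(): void }

class StreamPool {
  private streams = new Map<string, ManagedStream>()
  constructor(private restartAfter: number) {}

  register(s: ManagedStream) {
    this.streams.set(s.id, s)
  }

  // Called periodically from the worker loop: any stream that hasn't been
  // queried within the timeout window gets force-disconnected and freed.
  sweep(now: number) {
    for (const s of this.streams.values()) {
      if (now - s.lastRead > this.restartAfter) {
        s.disconnect()
        this.streams.delete(s.id)
      }
    }
  }

  size() {
    return this.streams.size
  }
}
```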

"We had bugs on clients every hour"

At Autonoma, we built a product where users could interactively create tests. There's a canvas where you can click, type, write instructions to the LLM, or use helper buttons, in whatever order, at any time, while we're streaming video of the browser to the frontend.

Before the state machine, it was a nightmare. We had race conditions everywhere. A user would click while the AI was processing an instruction, and the test steps would get deleted, reordered, or just disappear. We were getting client bugs every hour.

Then we implemented the State pattern.

We built a BaseState abstract class that declares every possible action that could be performed on the system: click, scroll, typekey, giveInstruction, runAll, pauseReplay, saveTest, goBack, setCookies... over 30 abstract methods. It's verbose. I did it intentionally.

export abstract class BaseState extends EventEmitter {
  abstract click(metadata: ClickInputMetadata): Promise<Try<ClickOutputMetadata>>
  abstract scroll(metadata: ScrollInputMetadata): Promise<Try<ScrollOutputMetadata>>
  abstract typekey(metadata: TypekeyInputMetadata): Promise<Try<TypekeyOutputMetadata>>
  abstract giveInstruction(metadata: GiveInstructionInputMetadata): Promise<Try<GiveInstructionOutputMetadata>>
  abstract runAll(metadata: RunAllInputMetadata): Promise<Try<RunAllOutputMetadata>>
  abstract pauseReplay(metadata: PauseReplayInputMetadata): Promise<Try<PauseReplayOutputMetadata>>
  abstract saveTest(metadata: SaveTestInputMetadata): Promise<Try<SaveTestOutputMetadata>>
  // ... 25+ more abstract methods

  protected abstract getHeaderState(): HeaderState
  protected abstract getPromptState(): PromptState
  protected abstract getActionButtonsState(): ActionsButtonsState
  protected abstract getDeviceViewState(): DeviceViewState
}

Every state subclass is forced to handle every action. No "oh I forgot to handle clicks during processing." The compiler catches it.

We also created a single state object that held the entire UI state: which buttons are active, whether the canvas accepts clicks, whether we're "thinking," all of it:

export type State = {
  device_view: DeviceViewState   // click/scroll/drag disabled?
  steps: StepsState              // test steps + cursor position
  header: HeaderState            // save button, edit name, etc.
  prompt: PromptState            // instruction input enabled?
  action_buttons: ActionsButtonsState  // every button's state
}

The frontend became a pure view of this state object. Like MVC for a WebSocket-driven real-time system. Each state subclass controls what every part of the UI looks like by implementing getHeaderState(), getActionButtonsState(), etc. The frontend doesn't decide what to show. The state tells it.

This also solved a problem we had with partial state reconciliation. Because the connection was over WebSocket, we sometimes had mismatches between what the backend thought was happening and what the frontend showed. With a single source-of-truth state object that we sync wholesale, that problem went away. It's a little inefficient (it's a big JSON blob) but compared to the WebRTC video stream we were already pushing, it was nothing.

Here are the states we had:

  • IdleState: Ready to receive new actions
  • ProcessingState: with subclasses ProcessingInstructionState, ProcessingScrollState, ProcessingTypingState
  • FailedState: An error occurred during the test
  • HealthyState: The test completed a full pass and is ready to be saved

Some of these have exclusive behaviors. If you're in ProcessingTypingState and you click, we immediately commit what you typed and start processing the click. If you're in ProcessingInstructionState and you click, nothing happens. Literally a no-op. And because BaseState forces every state to implement click(), you can't forget to handle that case. The ProcessingInstructionState explicitly returns a no-op. That's a conscious decision, not a missing else branch.
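Condensed, the shape looks like this. It's a deliberately tiny, synchronous sketch with hypothetical names (the real `BaseState` has 30+ async methods returning `Promise<Try<...>>`):

```typescript
// Minimal State pattern sketch: every state is forced to say what click()
// does, so "forgot to handle clicks during processing" can't compile.
type Try<T> = { ok: true; value: T } | { ok: false; error: string }

abstract class BaseState {
  abstract click(target: string): Try<string>
}

class IdleState extends BaseState {
  click(target: string): Try<string> {
    return { ok: true, value: `clicked ${target}` }
  }
}

class ProcessingInstructionState extends BaseState {
  // Clicks while the AI is processing are an explicit, deliberate no-op.
  click(_target: string): Try<string> {
    return { ok: true, value: "noop" }
  }
}
```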

Try modeling that cleanly with a bunch of if statements and a status string. I've been there. It doesn't work. The state machine made an entire class of bugs disappear overnight.

The right kind of inheritance

I know what you're thinking. "Great, so you have a 7-level deep class hierarchy and everything is an AbstractSingletonProxyFactoryBean?"

No. I'm pragmatic about this.

I almost always prefer composition over inheritance. I almost always prefer interfaces and traits over abstract classes. I don't do inheritance for fun. I only reach for it when an abstract class defines shared behavior and I need to specialize, and especially when I need to constrain the order of operations.

Here's a real example. At Autonoma, we have webhooks that fire when tests complete. The sending logic is basically the same regardless of what triggered it (a test, a folder of tests, a tag). You fetch the webhooks, check notification rules, build a payload, send it, and record the delivery. That flow is identical.

What changes is how you build the payload. A folder webhook needs to aggregate results from multiple test runs. A test webhook just reports on one.

So we have a WebhookExecutor abstract class with an execute() method that handles the full flow, and a payload() abstract method that subclasses implement:

export abstract class WebhookExecutor {
  constructor(
    protected resourceType: string,
    protected resourceId: string,
    protected runGroupID: string,
    protected organizationId: string,
  ) {}

  protected abstract payload(): Promise<any>
  protected abstract getCurrentScope(): NotifyScope

  async execute() {
    const webhooks = await db.webhook.findMany({
      where: { organizationID: this.organizationId, active: true },
    })
    if (webhooks.length === 0) return

    const payload = await this.payload()
    if (payload == null) return

    // send to each webhook, check notification rules,
    // record delivery, handle failures...
  }
}

And the folder implementation:

export class FolderWebhookExecutor extends WebhookExecutor {
  constructor(resourceId: string, runGroupID: string, orgId: string) {
    super("folder", resourceId, runGroupID, orgId)
  }

  protected getCurrentScope(): NotifyScope {
    return NotifyScope.FOLDER_LEVEL
  }

  protected async payload() {
    // fetch folder runs, aggregate results,
    // build the folder-specific payload
  }
}

This is the Template Method pattern. The abstract class defines the algorithm skeleton (execute), and subclasses fill in the parts that vary (payload, getCurrentScope). The implementer only needs to know they need to return a JSON payload. That's it.

This isn't inheritance for the sake of inheritance. The abstract class is doing real work: it enforces the order of operations (check rules, build payload, send, record) so nobody can accidentally skip a step or do things out of order. That's a constraint I want.
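Stripped to its skeleton (no database, hypothetical names), the template method looks like this — `execute()` fixes the order of operations, and a subclass only supplies the part that varies:

```typescript
// Self-contained Template Method sketch: the base class owns the algorithm,
// subclasses fill in payload(). Names are illustrative, not the real code.
abstract class Notifier {
  readonly log: string[] = []

  protected abstract payload(): string | null

  execute() {
    const p = this.payload()
    if (p === null) return // nothing to send, skip silently
    this.log.push(`sent:${p}`) // the "send" step is identical for everyone
  }
}

class FolderNotifier extends Notifier {
  protected payload() {
    return "folder-summary" // a real one would aggregate folder run results
  }
}
```

A subclass can't reorder the steps or forget to record the delivery, because it never touches `execute()` at all.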

Interface segregation: when your engines share 80% but not the other 20%

At Autonoma we have two test execution engines: PlaywrightEngine for web and AppiumEngine for mobile. They share a ton of overlap (clicking, navigating, asserting), but they also have capabilities the other doesn't. On web, you can set cookies. On mobile, you can drag and long-press. You can't drag on web (well, you can, but we didn't support it) and you can't set cookies on a mobile device.

So we built it like this:

interface Clickable {
  click(selector: string): Promise<void>
}

interface Draggable {
  drag(from: string, to: string): Promise<void>
  longPress(selector: string): Promise<void>
}

interface Cookiable {
  setCookie(cookie: Cookie): Promise<void>
  getCookies(): Promise<Cookie[]>
}

class PlaywrightEngine extends Engine
  implements Clickable, Cookiable { ... }

class AppiumEngine extends Engine
  implements Clickable, Draggable { ... }

Now when the orchestrator needs to decide what to do, it can check capabilities:

const isDraggable = (e: Engine): e is Engine & Draggable => "drag" in e

if (isDraggable(engine)) {
  await engine.drag(from, to)
}

No if (platform === "mobile") scattered everywhere. The type system tells you what's possible. Add a new capability? Create a new interface. Implement it on the engines that support it. Everything else stays untouched.

This is the Interface Segregation Principle at work, and it's one of the most practical things OOP gives you. Small, focused interfaces that describe what something can do, not what it is.

The vi.mock problem

Let me show you why dependency injection matters with a real scenario. In one of my projects, I have Next.js server actions. They need to check authentication because I need the organization ID from WorkOS. Here's roughly what that looks like:

"use server"

import { auth } from "@/lib/auth"
import { db } from "@/lib/db"

export async function getTests() {
  const session = await auth()
  if (!session) throw new Error("Unauthorized")

  return db.test.findMany({
    where: { organizationId: session.orgId },
  })
}

Now I want to test this. Not the auth part, just the data fetching logic. What do I do?

vi.mock("@/lib/auth", () => ({
  auth: vi.fn().mockResolvedValue({ orgId: "org_123" }),
}))

This sucks. vi.mock is module-level. It's fragile. It breaks if I rename the import path. It doesn't compose. If I want different auth states in different tests, I'm fighting the test framework instead of writing tests.

With a class and dependency injection:

class TestService {
  constructor(
    private auth: AuthProvider,
    private db: Database,
  ) {}

  async getTests() {
    const session = await this.auth.getSession()
    if (!session) throw new Error("Unauthorized")

    return this.db.test.findMany({
      where: { organizationId: session.orgId },
    })
  }
}

Now testing is trivial:

const service = new TestService(
  { getSession: async () => ({ orgId: "org_123" }) },
  testDb,
)

const tests = await service.getTests()

No mocking framework. No module-level hacks. Just pass in what you want. Want to test the unauthorized case? Pass in an auth provider that returns null. Want to test with a real database? Pass in the real one. Want to test without auth at all? Pass in one that always returns a session.

This isn't even about OOP vs functional. You could do this with closures. The point is that the current "just use server actions" approach in the TypeScript ecosystem makes testing genuinely harder than it needs to be.

Command pattern and AI agents

Here's one that's very relevant right now. The Vercel AI SDK (and most agent frameworks) asks you to define your tools as a dictionary of functions at build time:

const result = await generateText({
  model: openai("gpt-4"),
  tools: {
    getWeather: tool({
      description: "Get the weather",
      parameters: z.object({ city: z.string() }),
      execute: async ({ city }) => getWeather(city),
    }),
    searchWeb: tool({
      description: "Search the web",
      parameters: z.object({ query: z.string() }),
      execute: async ({ query }) => search(query),
    }),
  },
})

This works. It's simple. It's also completely rigid. What if I want to turn tools on and off based on context? What if the agent should only be able to search the web after it's identified the user's intent? What if I want to add tools dynamically based on what plugins the user has installed?

With the Command pattern:

interface AgentTool {
  name: string
  description: string
  parameters: ZodSchema
  execute(params: unknown): Promise<unknown>
  isEnabled(context: AgentContext): boolean
}

class SearchWebTool implements AgentTool {
  name = "searchWeb"
  // ...
  isEnabled(context: AgentContext) {
    return context.intentIdentified === true
  }
}

Now your tools are objects. They can have state. They can decide whether they're available. You can compose them, filter them, sort them by priority. You can serialize them. You can test them individually. You can build a plugin system where third parties register tools without touching your core agent loop.
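One concrete payoff: the agent loop can filter the tool objects down to whatever's enabled right now and hand the SDK a flat record. This is a sketch with trimmed-down versions of the interfaces above (the actual SDK call is elided):

```typescript
// Sketch: turning AgentTool objects into the flat tools record an SDK
// expects, keeping only the ones enabled for the current context.
// AgentContext/AgentTool here are cut-down illustrations.
type AgentContext = { intentIdentified: boolean }

interface AgentTool {
  name: string
  isEnabled(context: AgentContext): boolean
}

function enabledTools(
  tools: AgentTool[],
  ctx: AgentContext,
): Record<string, AgentTool> {
  return Object.fromEntries(
    tools.filter((t) => t.isEnabled(ctx)).map((t) => [t.name, t]),
  )
}
```

Plugins register `AgentTool` objects; the core loop just calls `enabledTools` before each model turn and never needs to know what tools exist.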

Is this overkill for a chatbot that checks the weather? Absolutely. Is it necessary when you're building an agent system that needs to be extensible, configurable, and testable? I think so.

Kubernetes: the best technology for running jobs

I'll keep this short because it could be its own post. If you're running stateful, long-lived processing jobs, Kubernetes (with something like Argo Workflows) is the best technology for it. You pay the complexity tax upfront, but you can do anything.

We run complex stateful workflows at Autonoma. Browser sessions that need to stay alive for minutes. Mobile device connections that can't be interrupted. State that needs to persist across retries. Argo handles the orchestration, Kubernetes handles the resource management, and our OOP codebase handles the actual logic.

The reason this matters for the OOP discussion: when your runtime is a container that might get rescheduled, your code needs clean lifecycle management. Constructors, destructors, resource cleanup. Objects with well-defined lifetimes. Try managing that with a bag of functions and global state.

"So you're an OOP purist?"

No. Let me be clear about what I don't do:

  • I don't build 7-level deep class hierarchies
  • I don't use inheritance to share utility methods
  • I don't make everything a singleton
  • I don't reach for a pattern when a function would do
  • I don't write AbstractSingletonProxyFactoryBean

I understand why people got burned on OOP. Java in 2008 was a war crime. Enterprise patterns applied to CRUD apps are a form of suffering.

What I'm saying is simpler than all that: if you're building stateful systems with complex lifecycle management, multiple implementations of similar behavior, and code that needs to be genuinely testable, OOP gives you tools that nothing else does. State machines. Strategy pattern. Template methods. Interface segregation. Dependency injection. Command pattern.

These aren't academic exercises. They're the difference between "bugs on clients every hour" and "that entire class of bugs is gone."

Most TypeScript projects today have nothing. No structure, no patterns, no abstractions. Just functions, if statements, and vi.mock. That works for small projects. It doesn't scale.

And here's the irony: a lot of you are already doing OOP. You just won't admit it. You have a file with a few let variables at the top, a bunch of functions that read and mutate those variables, and you export some of them while keeping others private. Congratulations, that's a class. You have state and behavior, public and private methods, and implicit this through closure. You just wrote it without the keyword, and without any of the tooling that the keyword gives you: interfaces, type-checked contracts, proper encapsulation, composability, and the ability to have more than one instance.
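Side by side, the two versions of that file (illustrative, not from any real codebase):

```typescript
// The "module with lets" pattern: state and behavior, one implicit instance,
// shared by every caller.
let count = 0
function increment() {
  count += 1
}
function currentValue() {
  return count
}

// The class version: same state and behavior, but the instance is explicit,
// you can have more than one, and it can sit behind an interface.
class Counter {
  private count = 0
  increment() {
    this.count += 1
  }
  value() {
    return this.count
  }
}
```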

I'm not asking you to become an OOP purist. I'm asking you to stop pretending it's 2008 and the only alternative to a function is an AbstractFactoryFactory. There's a pragmatic middle ground, and it's where the hardest problems get solved.
