# Deep Review: 20260417-165817-pr-323

| | |
|---|---|
| **Date** | 2026-04-17 16:58 |
| **Repo** | [rancher-sandbox/rancher-desktop-daemon](https://github.com/rancher-sandbox/rancher-desktop-daemon) |
| **Round** | 1 (of PR #323) |
| **Author** | [@mook-as](https://github.com/mook-as) |
| **PR** | [#323](https://github.com/rancher-sandbox/rancher-desktop-daemon/pull/323) — implement log passthrough |
| **Commits** | `f1e7da8` implement log passthrough |
| **Reviewers** | Claude Opus 4.7 (effort: xhigh), Codex GPT 5.4 (effort: high), Gemini 3.1 Pro (effort: default) |
| **Verdict** | **Rework** — handler panics when the Docker watcher is absent (any non-moby or non-running state) and buffers the whole log session into a single WebSocket message, so live tailing does not work and the new endpoint crashes the daemon under normal lifecycle transitions. |
| **Wall-clock time** | `30 min 23 s` |

## Executive Summary

PR #323 adds a `PassthroughController` implementation on the engine controller that streams Docker container logs over a WebSocket at `/passthrough/engine/logs/{id}`. The structural scaffolding — `GetPassthroughEndpoints`, `GetPassthroughHandler`, the controller.go wiring — matches the existing mock controller, but the runtime behaviour in `HandleLogs` has two independent blockers: it dereferences `r.watcher` without taking `watcherMu` or checking for nil, so any request outside a steady "running on moby" window panics the daemon; and it opens a single `NextWriter(TextMessage)` and copies the entire `Follow` stream into that one message, so clients see nothing until the container exits. The underlying `io.ReadCloser` is also never closed, so every request leaks a Docker HTTP connection after the handler returns. Error-path logging is pinned at `V(5)`, which silences all diagnostics at the default verbosity. The PR is scoped to Docker only ("TODO: Figure out containerd") and ships no tests for the new endpoint.

---

## Critical Issues

C1. **`HandleLogs` nil-dereferences and races on `r.watcher`** — `pkg/controllers/app/engine/controllers/container_passthrough.go:36` [Claude Opus 4.7, Codex GPT 5.4, Gemini 3.1 Pro]

```go
// TODO: Figure out containerd
reader, err := r.watcher.cli.ContainerLogs(
    req.Context(),
    containerID,
    client.ContainerLogsOptions{...},
)
```

`EngineReconciler` documents `watcherMu` explicitly for external readers (`engine_reconciler.go:78-81`), and every other access to `r.watcher` — `startWatcherAndSync`, `stopWatcher`, the dead-watcher sweep in `Reconcile` — takes the lock. `HandleLogs` reads `r.watcher` without the lock and without a nil check, so two independent problems follow.

First, the nil dereference. `r.watcher` is nil at startup, whenever `App.status.Running != True`, whenever `App.spec.containerEngine.name != "moby"` (the reconciler's own `wantWatcher = running && engineIsDocker` gate at `engine_reconciler.go:136`), and after any Docker disconnect (line 121 clears it). Discovery advertises the passthrough endpoint unconditionally, and the guard in `GetPassthroughHandler` only covers the window before `c.reconciler` is installed. A request that arrives in any of those normal states panics the HTTP goroutine. `http.Server` recovers the panic for that one request, but the same panic path is reachable from RDD's own UI and from local CLI callers — a trivial crash trigger for any endpoint consumer.

Second, the data race. Even when `r.watcher` is non-nil at the moment of the read, `HandleLogs` races with `Reconcile`/`stopWatcher` writing `r.watcher`. `go test -race` will flag this immediately. Under the Go memory model the read is undefined behaviour: the handler may observe a torn pointer, or use a watcher whose `cli` was already closed by `run`'s deferred `w.cli.Close()` (`docker_watcher.go:158`).

Fix: snapshot the watcher under the mutex and return 503 when it is absent.

```go
r.watcherMu.Lock()
watcher := r.watcher
r.watcherMu.Unlock()
if watcher == nil {
    http.Error(w, "Container engine not ready", http.StatusServiceUnavailable)
    return
}
reader, err := watcher.cli.ContainerLogs(...)
```

A `withClient(func(*client.Client) error)` helper on `EngineReconciler` would lock, nil-check, and invoke the closure in one place — keeping the invariant enforced where the rest of the file already enforces it, and preventing the next passthrough endpoint (exec, stats, non-follow logs) from re-introducing the same bug.

C2. **Single WebSocket message prevents live log streaming** — `pkg/controllers/app/engine/controllers/container_passthrough.go:75-82` [Codex GPT 5.4, Gemini 3.1 Pro]

```go
writer, err := conn.NextWriter(websocket.TextMessage)
if err != nil {
    log.V(5).Info("Failed to get next writer", "error", err)
    return
}
defer writer.Close()

_, err = stdcopy.StdCopy(writer, writer, reader)
```

`conn.NextWriter` returns an `io.WriteCloser` that represents a *single* WebSocket message. `StdCopy` loops over the `Follow: true` Docker log stream, writing to that one writer for the entire session. Gorilla's internal frame buffer flushes when it fills (default 4 KiB), but the message's FIN bit is not set until `writer.Close()` runs — which does not happen until `StdCopy` returns, i.e. until the container exits. A strict WebSocket client (browser `WebSocket` API, anything that reassembles a complete message before delivery) will see nothing until then; a low-volume log trickles out in partial-frame bursts of up to 4 KiB, none of them terminated.

This also breaks the contract documented in `docs/design/api_containers.md:217-218` ("messages are one way, text only, with each line being one message") and diverges from the mock at `pkg/controllers/mock/logs_passthrough.go:65-67`, which sends one `WriteMessage` per line — so the mock-backed BATS coverage passes while the real implementation does not stream.

A secondary problem compounds it: `TextMessage` requires valid UTF-8, but Docker container stdout is routinely non-UTF-8 (binary output, `ls --color=always`, NUL bytes, non-UTF-8 locales). A strict client errors on the first non-UTF-8 payload.

Fix: emit each chunk as a discrete `BinaryMessage` instead of reusing one `NextWriter`.

```go
type wsWriter struct{ conn *websocket.Conn }

func (w *wsWriter) Write(p []byte) (int, error) {
    if err := w.conn.WriteMessage(websocket.BinaryMessage, p); err != nil {
        return 0, err
    }
    return len(p), nil
}

_, err = stdcopy.StdCopy(&wsWriter{conn}, &wsWriter{conn}, reader)
```

A line-buffering variant keeps the `TextMessage`+one-line-per-message shape the docs describe; the important property is one message per chunk, not per entire session.

---

## Important Issues

I1. **Docker log reader and WebSocket goroutine leak on client disconnect** — `pkg/controllers/app/engine/controllers/container_passthrough.go:36-86` [Claude Opus 4.7, Gemini 3.1 Pro]

```go
reader, err := r.watcher.cli.ContainerLogs(
    req.Context(),
    containerID,
    client.ContainerLogsOptions{...},
)
if err != nil { ... return }

upgrader := websocket.Upgrader{}
conn, err := upgrader.Upgrade(w, req, nil)
...
_, err = stdcopy.StdCopy(writer, writer, reader)
```

The moby client's `ContainerLogs` contract states the caller must close the returned `io.ReadCloser`, but `HandleLogs` never calls `reader.Close()` — not on the Upgrade-failure path (lines 61-64) and not on the normal-return path. Once `upgrader.Upgrade` hijacks the connection, `http.Server` no longer cancels `req.Context()` when the remote TCP connection drops, so a client that drops the socket does not cancel the Docker request either, and the response body stays open until the Docker daemon evicts it.

Gorilla WebSocket additionally requires the server to run a concurrent `conn.ReadMessage()` loop to observe client-sent close frames. `HandleLogs` only writes, so on a clean client close frame `StdCopy` blocks forever on `reader.Read` for an idle container — a permanent goroutine + Docker-connection leak per request.

Fix: derive a cancellable context, defer `reader.Close()` right after the error check, and run a reader goroutine to convert WebSocket closure into context cancellation.

```go
ctx, cancel := context.WithCancel(req.Context())
defer cancel()

reader, err := watcher.cli.ContainerLogs(ctx, ...)
if err != nil { ... return }
defer reader.Close()

conn, err := upgrader.Upgrade(w, req, nil)
if err != nil { ... return }
defer conn.Close()

go func() {
    defer cancel()
    for {
        if _, _, err := conn.ReadMessage(); err != nil {
            return
        }
    }
}()
```

I2. **`log.V(5)` silences every error diagnostic** — `pkg/controllers/app/engine/controllers/container_passthrough.go:48-85` [Claude Opus 4.7]

```go
case errdefs.IsNotFound(err):
    log.V(5).Info("Container not found", "container", containerID)
...
default:
    log.V(5).Info("Failed to get container logs", "error", err)
...
conn, err := upgrader.Upgrade(w, req, nil)
if err != nil {
    log.V(5).Info("Failed to upgrade to WebSocket", "error", err)
    return
}
...
log.V(5).Info("Failed to copy container logs", "error", err)
```

Every error branch logs at `V(5)`, which requires `-v=5`/`RDD_LOG_LEVEL=trace` to surface. RDD runs at `v=0` by default, so "container not found", Docker 5xx passthrough, WebSocket upgrade failures, and the post-upgrade copy failure are all invisible in `rdd.stderr.log`. The post-upgrade paths are the only channels where the client does not already see an HTTP body (once the connection is hijacked, `http.Error` cannot run), so a "logs stopped streaming" report has no server-side breadcrumb to investigate.

Keep `V(5)` for client-input validation ("Invalid container ID"), but raise the rest to `log.Error` / `log.Info`:

```go
log.Error(err, "Failed to get container logs", "container", containerID)
http.Error(w, "Failed to get container logs", http.StatusInternalServerError)
```

The mock controller uses the same `V(5)` pattern, but it exists to not spam the test fixture; the real engine path deserves diagnosable failures.

I3. **`GetPassthroughEndpoints` returns map keys in random order** — `pkg/controllers/app/engine/controller.go:76-78` [Gemini 3.1 Pro]

```go
func (c *controller) GetPassthroughEndpoints() []string {
    return slices.Collect(maps.Keys(c.passthroughs))
}
```

Go randomises map iteration, and `slices.Collect` preserves that order. The result feeds `EnabledPassthroughs` in the discovery ConfigMap (`manager.go:331-337`, `manager.go:346`). Today the map has one key so the order is stable by accident; as soon as a second endpoint (exec, stats) is added, the serialised JSON will shuffle on every restart and produce spurious SSA updates on the discovery ConfigMap, which in turn wakes every watcher downstream. This is a latent, one-line fix now; after the second endpoint it becomes a noisy observable.

```go
func (c *controller) GetPassthroughEndpoints() []string {
    keys := slices.Collect(maps.Keys(c.passthroughs))
    slices.Sort(keys)
    return keys
}
```

I4. **Design doc advertises the wrong URL and the wrong message shape** — `docs/design/api_containers.md:217-218` [Codex GPT 5.4]

The design doc says:

> An endpoint at `/passthrough/containers/logs` will speak WebSocket; messages are one way, text only, with each line being one message.

Two mismatches with the PR: (a) the controller registers under its name `engine` (`controller.go:64-65`), and the passthrough server mounts each endpoint at `/{controller}/{endpoint}` (`manager.go:475-478`), so the actual route is `/passthrough/engine/logs`, not `/passthrough/containers/logs`. Any consumer built from the doc will hit 404. (b) "one message per line" is the behaviour of the mock, not of the new Docker implementation (see C2). Once C2 is fixed, the message-shape sentence will be accurate again for `TextMessage`; the URL line needs updating regardless.

---

## Suggestions

S1. **Use `BinaryMessage` for Docker log frames** — `pkg/controllers/app/engine/controllers/container_passthrough.go:75-82` [Claude Opus 4.7]

```go
writer, err := conn.NextWriter(websocket.TextMessage)
```

`stdcopy.StdCopy` forwards the raw stdout/stderr bytes verbatim. Container stdout is frequently non-UTF-8 (binary output, `ls --color=always`, NUL bytes). The WebSocket spec requires `TextMessage` frames to be valid UTF-8, so a strict client will fail on the first non-UTF-8 payload. Covered implicitly by C2's proposed fix, but callable as a standalone fix if C2's framing is deferred.

S2. **`stdcopy.StdCopy(writer, writer, reader)` merges stdout and stderr** — `pkg/controllers/app/engine/controllers/container_passthrough.go:82` [Claude Opus 4.7, Gemini 3.1 Pro]

Passing the same writer for stdout and stderr strips the one structural signal Docker's multiplexed stream carries. Clients that colour stderr or tag it in the UI lose that affordance. Cheaper alternatives: forward the raw multiplexed reader to the client as `BinaryMessage` and let the client demux with `stdcopy`; or wrap each message in a small envelope (`{"stream":"stdout","data":"..."}` or a leading `STREAM_TYPE` byte). Worth deciding before a consumer depends on the merged shape.

S3. **Duplicate container-ID regex and path parser with the mock controller** — `pkg/controllers/app/engine/controllers/container_passthrough.go:21` [Claude Opus 4.7]

```go
var containerIDValidator = regexp.MustCompile(`^[0-9a-fA-F]+$`)
```

`pkg/controllers/mock/logs_passthrough.go:24` declares the identical regex, and both handlers use the same `strings.Cut(strings.TrimLeft(req.URL.Path, "/"), "/")` to extract the ID. A small helper in `pkg/controllers/base` (or a new `pkg/controllers/passthrough`) centralises the rule — important the first time we decide, e.g., to allow truncated IDs or names, so the decision is one edit.

S4. **No tests for the new endpoint** — `pkg/controllers/app/engine/controllers/container_passthrough.go` [Claude Opus 4.7, Codex GPT 5.4, Gemini 3.1 Pro]

No unit or BATS coverage lands with this PR. Minimum set:

1. Unit test against a bare `EngineReconciler{}` (watcher nil) asserting 503 once C1 is fixed — pins the nil-watcher contract.
2. Unit test with an invalid container ID asserting 400 without touching the watcher — pins the regex.
3. BATS integration in `bats/tests/32-app/` that boots a container, opens the endpoint, and asserts expected log bytes arrive on the WebSocket. This is the only coverage that catches C2 (session-buffering), I1 (goroutine growth under repeated open/close), and the mock/real divergence in frame shape.

S5. **`ContainerLogsOptions.Tail` unset replays full history on every open** — `pkg/controllers/app/engine/controllers/container_passthrough.go:39-43` [Claude Opus 4.7]

```go
client.ContainerLogsOptions{
    ShowStdout: true,
    ShowStderr: true,
    Follow:     true,
},
```

`Tail` defaulting to "all" means a container with a gigabyte of history dumps the whole log over the WebSocket before catching up. Accepting `tail`/`since`/`timestamps` from the query string with sensible defaults (e.g. `tail=1000`) matches Docker CLI conventions and costs ~15 lines. Fine to defer, but a TODO comment next to `// TODO: Figure out containerd` would pin it.

S6. **`HandleLogs` docstring omits the path shape and framing** — `pkg/controllers/app/engine/controllers/container_passthrough.go:23` [Claude Opus 4.7, Gemini 3.1 Pro]

```go
// HandleLogs implements the log endpoint to pass through container logs.
func (r *EngineReconciler) HandleLogs(w http.ResponseWriter, req *http.Request) {
```

The mock's handler docstring names the path shape (`/passthrough/mock/logs/{container}`) — the production handler should be at least as explicit about URL shape, expected container-ID form (hex only, names rejected), WebSocket upgrade, and the frame type clients must accept.

---

## Design Observations

### Concerns

- **Docker client ownership is shared but only the watcher's lifetime rules are documented** *(in-scope)* [Claude Opus 4.7]. `docker_watcher.go:141-156` and `engine_reconciler.go:83-92` are explicit that the watcher's `run` goroutine closes `cli` in its own deferred cleanup, and that `stopWatcher` relies on that. Adding a second consumer (`HandleLogs`) that does not go through `watcherMu` silently breaks the invariant "once a caller observes `alive()==false`, the client is released." The next passthrough endpoint (exec, stats, non-follow logs) will re-introduce the same hazard unless the contract is held in one place — a `withClient(func(*client.Client) error)` method on `EngineReconciler` would lock, nil-check, and invoke in one spot.

- **Stream-type information is discarded** *(in-scope)* [Gemini 3.1 Pro]. Using the same writer for `stdcopy.StdCopy`'s stdout/stderr outputs merges the streams. If a future UI wants to colour stderr or tag log lines, the demux signal is already gone. See S2 for concrete options.

- **Every passthrough controller repeats boilerplate** *(future)* [Claude Opus 4.7]. `engine/controller.go:44-87` and `mock/mock_controller.go:43-86` declare the same `passthroughs map[string]http.Handler`, the same `slices.Collect(maps.Keys(...))`, the same trivial `Get*` accessors, and the same "controller not ready" guard. A helper in `pkg/controllers/base` that takes a `map[string]http.Handler` and returns a `PassthroughController` would collapse both implementations to one call and centralise the readiness check (and fix I3 in one place).

### Strengths

- **Apigroup grouping preserved** [Claude Opus 4.7]. Keeping `apiGroup = "containers"` (per the existing comment at `controller.go:37-41`) correctly extends the discovery grouping to the new endpoint so `--controllers` selections still pull the CRDs and the passthrough together.

- **503 guard before reconciler install** [Claude Opus 4.7]. `GetPassthroughHandler` returning a 503 when `c.reconciler == nil` (`controller.go:81-85`) is a nice belt-and-braces against a caller that obtained a handler before `RegisterWithManager` ran. It does not cover C1, but it is the right shape for the watcher-level guard too.

## Testing Assessment

No unit or BATS tests are added. All three agents flagged this. Ranked by risk:

1. **Nil-watcher path (C1)** — a unit test on `EngineReconciler{}` asserting 503 pins the contract and blocks the regression from returning.
2. **WebSocket happy path (C2, I1, S1)** — a BATS integration that opens the endpoint against a real container is the only way to catch session-buffering, text/binary framing, and per-request goroutine leaks.
3. **Error branches** — recorder-backed fake asserting 400 on `InvalidArgument` and 404 on `NotFound`, and checking that nothing is logged at `v=0` today (i.e. the I2 regression guard once I2 is fixed).

## Documentation Assessment

- **URL path wrong in design doc** (I4) — `docs/design/api_containers.md:217-218` still says `/passthrough/containers/logs`. Actual path is `/passthrough/engine/logs`.
- **Package doc stale** [Claude Opus 4.7] — `pkg/controllers/app/engine/controller.go:5-9` still describes the engine package as mirroring engine state and forwarding deletions; it does not mention the new passthrough responsibility. One sentence ("…and serves `/passthrough/engine/logs` for tailing container logs") keeps it honest.
- **`HandleLogs` docstring** (S6) — one-liner, no path shape, no framing, no auth scope.

## Commit Structure

Single commit, single concept, message matches intent. No issues.

## Acknowledged Limitations

- **Containerd not handled** — `container_passthrough.go:35` "// TODO: Figure out containerd". The PR description scopes the work to Docker only. All three agents noted this is expected and does not affect the severity of the other findings.
- **Post-upgrade error path cannot respond** — `container_passthrough.go:85` "// At this point we already sent the data, so we can't send HTTP status." Correctly acknowledges the hijacked-connection constraint; relevant to I2's fix (must log, because the client cannot see the error in HTTP).

## Unresolved Feedback

None. No PR review comments on round 1.

---

## Agent Performance Retro

### [Claude]

Delivered the most complete single review: caught C1 with the sharpest framing (both nil panic *and* data race, with specific reference sites), owned I1 (reader leak), and was the only agent to flag I2 (`V(5)` silences diagnostics) — a finding that requires domain knowledge of RDD's default logger verbosity. Produced six suggestions spanning framing, duplication, ergonomics, history replay, and documentation. Missed C2 as a top-line critical; did note the `TextMessage` UTF-8 issue as S1 but did not identify the single-message buffering problem behind it, which is the dominant bug in the streaming path.

### [Codex]

Tight, verdict-focused review with only the two highest-impact findings. Landed C2 with the clearest mechanism (NextWriter's FIN bit, contract mismatch with the design doc and the mock implementation) and independently confirmed I1's nil-watcher hazard by tracing every site that clears `r.watcher`. Uniquely caught the design-doc URL mismatch (I4), which neither of the other agents surfaced. Rated the nil-watcher issue Important where Claude and Gemini rated it Critical — the consolidation keeps the higher severity because a panic on any non-moby or stopped state is a shipped crash path, not a theoretical race.

### [Gemini]

Three criticals, one important. Got C1 and C2 directly, and was the only agent to surface I3 (map-iteration-randomised passthrough list) — a latent one-line fix that would otherwise land as a flaky CI symptom after a second endpoint arrives. Its C3 (goroutine leak / missing `ReadMessage` loop) strengthened Claude's narrower reader-close finding; consolidated as I1 with Gemini's concurrent-reader framing. Fewest suggestions of the three; no S-level findings at all. Did not run `git blame` (per the known daily-quota gap), so pre-existing-vs-regression calls come from Claude and Codex.

### Summary

| | Claude Opus 4.7 | Codex GPT 5.4 | Gemini 3.1 Pro |
|---|---|---|---|
| Duration | 18m 27s | 5m 52s | 4m 07s |
| Findings | 1C 2I 6S | 2C 1I | 2C 2I |
| Tool calls | 22 (Read 11, Grep 6, Bash 5) | 47 (shell 44, stdin 3) | 26 (grep_search 10, read_file 7, run_shell_command 7) |
| Design observations | 4 | 0 | 1 |
| False positives | 0 | 0 | 0 |
| Unique insights | 2 | 1 | 1 |
| Files reviewed | 2 | 2 | 2 |
| Coverage misses | 0 | 0 | 0 |
| **Totals** | **1C 2I 6S** | **2C 1I** | **2C 2I** |
| Downgraded | 0 | 0 | 1 (I→S) |
| Dropped | 0 | 0 | 0 |


**Reconciliation.** Codex P1 nil-watcher panic: important → **critical** (consolidated into C1; a production-path nil panic is a shipped crash, not a theoretical race). Gemini P1 goroutine leak on client disconnect: critical → **important** (consolidated into I1; reader-close is a clear leak, but the full hang only occurs on a clean client-close with an idle container — the more common torn-TCP case still unwinds via failed writes, so "important" fits the observed blast radius).

Overall, Codex and Gemini converged on the two blockers (C1, C2) with independent reasoning, which made the consolidated verdict easy to defend. Claude carried most of the long tail — the reader leak, the logging level, the streaming-style suggestions, the shared-pattern cleanup — and caught the most design observations. Each agent contributed at least one finding the others missed (Claude: I2; Codex: I4; Gemini: I3).

## Review Process Notes

### Skill improvements

None this round — every finding was within existing dimensions.

### Repo context updates

None this round.

---

## Appendix: Original Reviews

The three agent reviews below are reproduced verbatim from the workdir (`review-{claude,codex,gemini}-pass-1.md`), preserving each agent's heading levels and code blocks. Consolidation-level attributions use the `[Agent]` pills; the original reviews use each agent's native formatting.

### [Claude]


Now I have enough context. Let me write the review.

---

#### Code Review — PR #323: implement log passthrough

| Field | Value |
|---|---|
| Verdict | Changes requested |
| Files reviewed | 2 of 2 |

##### Executive Summary

The PR wires a new `PassthroughController` implementation on the engine controller that proxies Docker container logs over a WebSocket at `/passthrough/engine/logs/{id}`. The core flow is a direct port of the mock controller's handler and works for the happy path, but it reads `r.watcher` without the watcher mutex and never nil-checks it, so the handler will data-race with the reconciler and can nil-panic whenever the App is not currently running on moby. The fetched `io.ReadCloser` is also never closed.

##### Findings

###### Critical Issues

**C1. `HandleLogs` nil-dereferences and races on `r.watcher`** — `pkg/controllers/app/engine/controllers/container_passthrough.go:36`

```go
// TODO: Figure out containerd
reader, err := r.watcher.cli.ContainerLogs(
    req.Context(),
    containerID,
    client.ContainerLogsOptions{...},
)
```

`r.watcher` is only assigned by `startWatcherAndSync` (engine_reconciler.go:234) and cleared by `stopWatcher` (engine_reconciler.go:242) or by `Reconcile` on a dead watcher (engine_reconciler.go:121), and every one of those sites is guarded by `r.watcherMu` (engine_reconciler.go:81, with a doc at 78–80 that the lock exists specifically for external readers). `HandleLogs` reads the same field without the lock and without checking for `nil`.

Two concrete problems fall out:

1. **Nil dereference.** `r.watcher` is nil at startup, whenever the App is not `Running`, whenever `App.spec.containerEngine.name != "moby"`, and after any Docker disconnect. The nil check at controller.go:81 only covers `c.reconciler` — once the reconciler is installed, `GetPassthroughHandler` returns the stored closure that unconditionally calls `r.watcher.cli.ContainerLogs(...)`. Any request that arrives during those windows panics the passthrough HTTP goroutine. `http.Server` recovers the panic for that one request, but the nil-panic is a trivial crash trigger for any local caller, including RDD's own UI.
2. **Data race.** Even when the watcher happens to be non-nil at the moment of the read, `HandleLogs` races with `Reconcile`/`stopWatcher` writing `r.watcher`. `-race` will flag this immediately. Under the Go memory model the read is undefined behavior — the handler could observe a torn pointer, or work off a watcher whose `cli` was just closed by `run`'s deferred `w.cli.Close()` (docker_watcher.go:158).

Fix: snapshot the watcher under the mutex and 503 when it is absent.

```diff
 // HandleLogs implements the log endpoint to pass through container logs.
 func (r *EngineReconciler) HandleLogs(w http.ResponseWriter, req *http.Request) {
 	log := ctrl.LoggerFrom(req.Context())
 	containerID, _, _ := strings.Cut(strings.TrimLeft(req.URL.Path, "/"), "/")
 	log.Info("Handling logs for container", "containerID", containerID)

 	if !containerIDValidator.MatchString(containerID) {
 		log.V(5).Info("Invalid container ID", "container", containerID)
 		http.Error(w, "Invalid container ID", http.StatusBadRequest)
 		return
 	}

+	r.watcherMu.Lock()
+	watcher := r.watcher
+	r.watcherMu.Unlock()
+	if watcher == nil {
+		http.Error(w, "Container engine not ready", http.StatusServiceUnavailable)
+		return
+	}
+
 	// TODO: Figure out containerd
-	reader, err := r.watcher.cli.ContainerLogs(
+	reader, err := watcher.cli.ContainerLogs(
 		req.Context(),
 		containerID,
 		client.ContainerLogsOptions{...},
 	)
```

`(critical, regression)`

---

###### Important Issues

**I1. Docker log stream `io.ReadCloser` is never closed** — `pkg/controllers/app/engine/controllers/container_passthrough.go:36-86`

```go
reader, err := r.watcher.cli.ContainerLogs(
    req.Context(),
    containerID,
    client.ContainerLogsOptions{...},
)
if err != nil {
    ...
    return
}

upgrader := websocket.Upgrader{}
conn, err := upgrader.Upgrade(w, req, nil)
...
_, err = stdcopy.StdCopy(writer, writer, reader)
```

`ContainerLogs` returns an `io.ReadCloser` whose godoc (moby client `container_logs.go:31`) states "It's up to the caller to close the stream." There is no `defer reader.Close()` in `HandleLogs`, on either the WebSocket-upgrade-failure path (line 61–64) or the normal-return path (line 82–86). The reader is only auto-closed when the request context cancels, but once `upgrader.Upgrade` hijacks the connection the standard `http.Server` no longer observes client socket close on the request context (Go 1.22+ explicitly documents that hijacked requests don't propagate remote close through `req.Context()`), so a client that just drops the WebSocket TCP connection will not cancel the context and the Docker-side HTTP body stays open until the daemon evicts it. This is a per-request goroutine + connection leak in a `Follow: true` endpoint.

Fix: add a deferred `Close` immediately after the error check.

```diff
 	reader, err := r.watcher.cli.ContainerLogs(...)
 	if err != nil {
 		switch {
 		...
 		}
 		return
 	}
+	defer reader.Close()
```

`(important, regression)`

---

**I2. `log.V(5)` silences all diagnostics on the error paths** — `pkg/controllers/app/engine/controllers/container_passthrough.go:48-85`

```go
case errdefs.IsNotFound(err):
    log.V(5).Info("Container not found", "container", containerID)
...
default:
    log.V(5).Info("Failed to get container logs", "error", err)
    http.Error(w, "Failed to get container logs", http.StatusInternalServerError)
...
conn, err := upgrader.Upgrade(w, req, nil)
if err != nil {
    log.V(5).Info("Failed to upgrade to WebSocket", "error", err)
    return
}
```

Per the memory on CLI logger conventions, external/embedded controllers use klog where V(5) requires an explicit `-v=5` or `RDD_LOG_LEVEL=trace`. The default RDD service runs at `v=0`. All the failure branches — invalid ID, container not found, 500 from Docker, failed WebSocket upgrade, failed WebSocket close, failed `stdcopy` — are only logged at `V(5)`, so when a user reports "`rdd set` log tail returned 500", there will be nothing in `rdd.stderr.log` to tell the operator which branch fired or what Docker actually returned. The WebSocket upgrade and the post-upgrade copy failure are the only channels where the client does not already see a body (once upgraded, `http.Error` can't run). These should surface at default verbosity.

Fix: keep `V(5)` for "client sent invalid ID" (line 30), raise the rest to `log.Info` or `log.Error`, e.g.:

```diff
-		default:
-			log.V(5).Info("Failed to get container logs", "error", err)
+		default:
+			log.Error(err, "Failed to get container logs", "container", containerID)
 			http.Error(w, "Failed to get container logs", http.StatusInternalServerError)
 		}
 		return
 	}
...
-		log.V(5).Info("Failed to upgrade to WebSocket", "error", err)
+		log.Error(err, "Failed to upgrade to WebSocket", "container", containerID)
...
-		log.V(5).Info("Failed to copy container logs", "error", err)
+		log.Error(err, "Failed to copy container logs", "container", containerID)
```

The mock controller uses the same `V(5)` pattern (`mock/logs_passthrough.go:34-69`) but it is a test/screenshot fixture — the real engine path deserves diagnosable failures.

`(important, gap)`

---

###### Suggestions

**S1. Use `BinaryMessage`, not `TextMessage`, for Docker log frames** — `pkg/controllers/app/engine/controllers/container_passthrough.go:75-82`

```go
writer, err := conn.NextWriter(websocket.TextMessage)
...
_, err = stdcopy.StdCopy(writer, writer, reader)
```

`stdcopy.StdCopy` forwards the raw stdout/stderr payload bytes verbatim. Container stdout is frequently not UTF-8 (any binary-emitting program, `ls --color=always`, anything emitting NUL bytes, non-UTF-8 locales). The WebSocket spec requires `TextMessage` frames to be valid UTF-8, so a strict client (gorilla's own Conn.ReadMessage, browser `WebSocket` API in some paths) will error on the first non-UTF-8 payload. Use `websocket.BinaryMessage` to match the underlying data. The mock controller uses `TextMessage` because it hand-writes ASCII; that justification does not carry over here.

```diff
-	writer, err := conn.NextWriter(websocket.TextMessage)
+	writer, err := conn.NextWriter(websocket.BinaryMessage)
```

`(suggestion, regression)`

---

**S2. `stdcopy.StdCopy(writer, writer, reader)` merges stdout and stderr** — `pkg/controllers/app/engine/controllers/container_passthrough.go:82`

```go
_, err = stdcopy.StdCopy(writer, writer, reader)
```

Passing the same writer for stdout and stderr strips the one piece of structural information Docker's multiplexed stream carries. Clients displaying logs typically want to differentiate (e.g., color stderr red). Forwarding the raw `reader` to the WebSocket as `BinaryMessage` and letting the client demux with `stdcopy` is a lower-cost option; alternatively, label frames by writing the `STREAM_TYPE` byte at the start of each message. This isn't required for initial "passthrough" but is worth locking in before consumers depend on the merged shape.

`(suggestion, enhancement)`

---

**S3. `containerIDValidator` is duplicated verbatim with mock** — `pkg/controllers/app/engine/controllers/container_passthrough.go:21`

```go
var containerIDValidator = regexp.MustCompile(`^[0-9a-fA-F]+$`)
```

`pkg/controllers/mock/logs_passthrough.go:24` defines the exact same regex. Both handlers share the identical parsing pattern (`strings.Cut(strings.TrimLeft(req.URL.Path, "/"), "/")`) too. Factor these into a small helper in `pkg/controllers/base` (or a new `pkg/controllers/passthrough`) so a future container-ID change (e.g., allowing truncated IDs or names) is a single edit.

`(suggestion, gap)`

---

**S4. Missing tests for the new endpoint** — `pkg/controllers/app/engine/controllers/container_passthrough.go` (entire file)

No unit or BATS coverage is added. The reconciler's existing tests don't exercise the HTTP handler. At minimum:

1. A unit test that hits `HandleLogs` with an invalid container ID and expects 400 without touching `r.watcher` (guards against the regex being loosened).
2. A unit test with `r.watcher == nil` that expects 503 (once C1 is addressed) — guards against the nil-panic regression returning.
3. A BATS integration in `bats/tests/32-app/` that starts a container via the engine controller, hits `/passthrough/engine/logs/<id>`, and asserts a chunk of the expected log text comes back over the WebSocket.

`(suggestion, gap)`

---

**S5. `ContainerLogsOptions.Tail` is unset, so every request replays full history then follows** — `pkg/controllers/app/engine/controllers/container_passthrough.go:39-43`

```go
client.ContainerLogsOptions{
    ShowStdout: true,
    ShowStderr: true,
    Follow:     true,
},
```

`Tail` defaulting to "all" means a container that has produced a gigabyte of log output will dump the entire history into the WebSocket before catching up. Accepting `tail`/`since`/`timestamps` from the query string (with sane defaults, e.g. `tail=1000`) is the conventional Docker CLI behavior and costs ~15 lines. Fine to defer, but worth a TODO next to the existing `// TODO: Figure out containerd` comment.

`(suggestion, enhancement)`

---

**S6. `HandleLogs` lacks a doc-comment describing path shape, query params, and auth scope** — `pkg/controllers/app/engine/controllers/container_passthrough.go:23`

```go
// HandleLogs implements the log endpoint to pass through container logs.
func (r *EngineReconciler) HandleLogs(w http.ResponseWriter, req *http.Request) {
```

The mock controller's handler has a much better docstring (`mock/logs_passthrough.go:26-27`) that names the path shape — `/passthrough/mock/logs/{container}`. The engine version — which is the production one — should be at least as explicit. Suggested:

```go
// HandleLogs serves the `/passthrough/engine/logs/{id}` endpoint. It
// upgrades the request to a WebSocket and streams Docker container
// logs (stdout+stderr demuxed) until the client disconnects. `id`
// must be a hex container ID; names are not accepted.
```

`(suggestion, gap)`

---

##### Design Observations

###### Concerns

- **Docker client ownership is now shared between the reconciler-owned watcher and the passthrough handler, but only the watcher's lifetime rules are documented** `(in-scope)`. `docker_watcher.go:141-156` and `engine_reconciler.go:83-92` are explicit that `run` closes `cli` after the goroutine exits, and that `stopWatcher` uses that. Adding a second consumer that does not go through the watcherMu (C1 above) silently breaks the invariant that "once a caller observes `alive()==false`, the client is released." Any alternative that reaches into `r.watcher.cli` — or worse, holds it across calls — needs the same shutdown handshake. Putting an explicit `withClient(func(*client.Client) error)` method on `EngineReconciler` that takes the lock, checks `alive()`, and calls the closure would keep the invariant where the rest of the file already enforces it and would prevent the next passthrough endpoint (exec? stats? `rdd container logs` non-follow?) from re-introducing the bug.

- **Every passthrough controller repeats identical boilerplate** `(future)`. `engine/controller.go:44-87` and `mock/mock_controller.go:43-86` declare the same struct field (`passthroughs map[string]http.Handler`), the same `slices.Collect(maps.Keys(...))`, the same two-line `Get*` methods, and the same "controller not ready" fallback. The only real per-controller state is the handler set. A tiny helper in `pkg/controllers/base` that takes a `map[string]http.Handler` and returns an object implementing `PassthroughController` would collapse both controllers' boilerplate to one line and move the "not ready" guard to one place.

###### Strengths

- Keeping `controllerName` in the `apiGroup = "containers"` (per the existing comment at `engine/controller.go:37-41`) correctly extends the grouping to the new endpoint — the passthrough URL `/passthrough/engine/logs` matches the controller's registered name and will be listed under the `containers` group in discovery.
- Returning a clearly-wired 503 from `GetPassthroughHandler` when `c.reconciler == nil` (`engine/controller.go:81-85`) is a nice belt-and-braces against a caller that somehow got a handler before `RegisterWithManager` installed the reconciler.

##### Testing Assessment

No tests are added. Ranked by risk:

1. **`r.watcher == nil` path** — highest risk given C1; a trivial unit test on a bare `EngineReconciler{}` would exercise the nil-path and pin the 503 contract once added.
2. **WebSocket happy path** — BATS integration that spins up a container, opens the endpoint, expects the container's log line on the socket. This is the only way to catch `TextMessage` vs `BinaryMessage` regressions (S1) and the reader-leak (I1, observable as goroutine growth under repeated requests).
3. **Error branches** — Docker returns `NotFound`/`InvalidArgument`; a recorder-backed fake should assert the 404/400 response body and status.

##### Documentation Assessment

- The new exported method `EngineReconciler.HandleLogs` has a one-liner that does not state the URL shape, the expected form of `containerID`, or that the endpoint upgrades to a WebSocket. See S6.
- The package-level doc comment on `engine` (`controller.go:5-9`) still describes the controller as mirroring engine state and forwarding deletions — it does not mention the new passthrough responsibility. Adding "and serves `/passthrough/engine/logs` for tailing container logs" keeps the package doc honest.
- `docs/design/` is not updated; the PR introduces a new user-facing HTTP surface and should at least get a short note in `docs/design/api_containers.md` or a new `docs/design/passthrough.md` describing the contract (path shape, auth scope = localhost, frame type, demux scheme).

##### Commit Structure

Single commit, single concept, message describes intent. No issues.

##### Acknowledged Limitations

- `container_passthrough.go:35` — `// TODO: Figure out containerd`. This PR is scoped to Docker ("for docker only" per the description), so containerd being unimplemented is expected. No impact on severity of other findings.
- `container_passthrough.go:85` — `// At this point we already sent the data, so we can't send HTTP status.` Correctly acknowledges the post-upgrade error path can only log. Consistent with I2's fix (bump log level).

##### Coverage Summary

| File | Coverage |
|---|---|
| `pkg/controllers/app/engine/controller.go` | Reviewed, no issues beyond those referenced in S3 (shared pattern) |
| `pkg/controllers/app/engine/controllers/container_passthrough.go` | C1, I1, I2, S1, S2, S3, S4, S5, S6 |
### [Codex]

###### Executive Summary

This change wires the engine controller into the shared passthrough server and adds a Docker-backed WebSocket log handler. Two regressions block merge: the handler sends the entire follow session as one WebSocket message, so live logs do not reach clients until the stream ends, and it dereferences `r.watcher` without synchronization or readiness checks, so requests during stopped or `containerd` states panic the passthrough handler.

###### Findings

####### Critical Issues

C1. **Single WebSocket message prevents live log streaming** — `pkg/controllers/app/engine/controllers/container_passthrough.go:75-82` (critical, regression)

```go
writer, err := conn.NextWriter(websocket.TextMessage)
if err != nil {
	log.V(5).Info("Failed to get next writer", "error", err)
	return
}
defer writer.Close()

_, err = stdcopy.StdCopy(writer, writer, reader)
```

Requests routed through `pkg/service/controllers/manager.go:478` and `pkg/controllers/app/engine/controller.go:52-53` land here, where line 75 opens one `TextMessage`, line 80 defers `Close` until the handler returns, and line 82 copies the entire `Follow` stream into that single message. `gorilla/websocket` only flushes a `NextWriter` message when `Close` is called, so a long-running container never produces a complete WebSocket message and clients do not receive live log updates. That breaks the contract documented in `docs/design/api_containers.md:217-218` and differs from the mock implementation at `pkg/controllers/mock/logs_passthrough.go:65-67`, which sends one message per line.

Fix: send discrete messages as logs arrive instead of holding one writer open for the whole session. A line-buffering writer passed into `stdcopy.StdCopy` keeps the existing Docker demuxing while restoring streaming semantics.

```diff
- writer, err := conn.NextWriter(websocket.TextMessage)
- if err != nil {
-     log.V(5).Info("Failed to get next writer", "error", err)
-     return
- }
- defer writer.Close()
-
- _, err = stdcopy.StdCopy(writer, writer, reader)
+ stdout := newWebSocketLineWriter(conn, log)
+ stderr := newWebSocketLineWriter(conn, log)
+ _, err = stdcopy.StdCopy(stdout, stderr, reader)
```

####### Important Issues

I1. **`HandleLogs` can panic when the Docker watcher is absent or stopping** — `pkg/controllers/app/engine/controllers/container_passthrough.go:36-44` (important, regression)

```go
reader, err := r.watcher.cli.ContainerLogs(
	req.Context(),
	containerID,
	client.ContainerLogsOptions{
		ShowStdout: true,
		ShowStderr: true,
		Follow:     true,
	},
)
```

Discovery advertises the passthrough endpoint unconditionally in `pkg/service/controllers/manager.go:331-337`, and `pkg/controllers/app/engine/controller.go:80-86` only returns `503` before `c.reconciler` is initialized. After startup, `EngineReconciler.Reconcile` intentionally leaves `r.watcher` nil whenever the App is stopped or the backend is `containerd` (`pkg/controllers/app/engine/controllers/engine_reconciler.go:136-165`), and `stopWatcher` clears it during shutdown (`pkg/controllers/app/engine/controllers/engine_reconciler.go:239-247`). Line 36 dereferences `r.watcher.cli` without taking `watcherMu` or checking for nil, so a request in any of those normal states panics the handler, and concurrent stop/start traffic violates the reconciler’s own locking assumptions from `pkg/controllers/app/engine/controllers/engine_reconciler.go:64-81`.

Fix: snapshot the current watcher under `watcherMu` before using it, return `503 Service Unavailable` when the engine is not ready, and keep all direct `r.watcher` access behind one helper so passthrough traffic follows the same concurrency rules as the reconcile paths.

```diff
- reader, err := r.watcher.cli.ContainerLogs(
+ watcher := r.currentWatcher()
+ if watcher == nil {
+     http.Error(w, "Container engine not ready", http.StatusServiceUnavailable)
+     return
+ }
+ reader, err := watcher.cli.ContainerLogs(
```

####### Suggestions

None.

###### Testing Assessment

Targeted package tests pass (`go test ./pkg/controllers/app/engine/... ./pkg/service/controllers/...`), but they do not exercise the new endpoint behavior.

1. No integration test covers `engine/logs` against a long-lived Docker container. The only passthrough test is the mock controller case in `bats/tests/34-containers-controllers/containers-mock.bats:42-80`, and that mock writes one `WriteMessage` per line, so it would not catch C1.
2. No test hits the new endpoint while `ContainerEngineReady` is `False` or `NotApplicable` (stopped VM, startup, shutdown, or `containerd` backend), so the nil-watcher path behind I1 can ship unnoticed.

###### Documentation Assessment

`docs/design/api_containers.md:217-221` still documents `/passthrough/containers/logs`, but this controller registers under its controller name, so the actual route is `/passthrough/engine/logs` (`pkg/controllers/app/engine/controller.go:64-65`, `pkg/service/controllers/manager.go:475-478`). The user-facing contract needs to be updated, or the controller/endpoint naming needs to be aligned before consumers are built against the wrong URL.

###### Acknowledged Limitations

- **Code comment**: `// TODO: Figure out containerd` — `pkg/controllers/app/engine/controllers/container_passthrough.go:35`. The PR explicitly scopes log passthrough to Docker only. This change makes that limitation user-visible through a new endpoint, but the unsupported-backend scope itself is already acknowledged in code.

###### Coverage Summary

- `pkg/controllers/app/engine/controller.go` — C1, I1
- `pkg/controllers/app/engine/controllers/container_passthrough.go` — C1, I1

### [Gemini]

###### Executive Summary

This PR introduces a log passthrough endpoint for Docker containers, exposing a WebSocket stream that proxies log output from the container engine. The implementation suffers from critical structural flaws: it dereferences the Docker watcher without safety guards (crashing the daemon if hit early), fundamentally misunderstands the Gorilla WebSocket streaming model by trying to buffer an infinite log stream into a single text frame, and leaks goroutines/connections permanently when clients disconnect. 

###### Findings

####### Critical Issues

**C1. Nil pointer dereference panics daemon on uninitialized watcher** — `pkg/controllers/app/engine/controllers/container_passthrough.go:31`

```go
	reader, err := r.watcher.cli.ContainerLogs(
		req.Context(),
```

The `HandleLogs` function accesses `r.watcher` directly without holding `r.watcherMu` and without verifying it is non-nil. Because the Docker watcher initializes asynchronously, can die and restart, and is skipped entirely when using `containerd`, `r.watcher` may be `nil`. An HTTP request to this endpoint while the watcher is nil will panic the process and crash the `rdd` daemon, bringing down the embedded kube-apiserver.

Fix: Safely acquire a reference to the watcher and check for `nil`:
```go
	r.watcherMu.Lock()
	w := r.watcher
	r.watcherMu.Unlock()
	if w == nil {
		http.Error(w, "Docker engine watcher is not running", http.StatusServiceUnavailable)
		return
	}
	
	reader, err := w.cli.ContainerLogs( // ...
```

**C2. Protocol violation and infinite buffering of logs** — `pkg/controllers/app/engine/controllers/container_passthrough.go:76-81`

```go
	writer, err := conn.NextWriter(websocket.TextMessage)
	// ...
	defer writer.Close()

	_, err = stdcopy.StdCopy(writer, writer, reader)
```

`conn.NextWriter` returns an `io.WriteCloser` representing a *single* WebSocket message. `stdcopy.StdCopy` loops over the Docker log stream, continuously writing to this one writer. Because `writer.Close()` is not called until `StdCopy` returns, the WebSocket frame is never completed and the client receives nothing until the container exits. Furthermore, Docker logs can contain arbitrary binary output, which violates the strict UTF-8 constraint of `websocket.TextMessage`.

Fix: Instead of creating a single `NextWriter`, pass a custom `io.Writer` to `StdCopy` that emits each chunk as a discrete WebSocket message.
```go
type wsWriter struct { conn *websocket.Conn }
func (w *wsWriter) Write(p []byte) (int, error) {
	err := w.conn.WriteMessage(websocket.BinaryMessage, p)
	return len(p), err
}

// Inside HandleLogs:
_, err = stdcopy.StdCopy(&wsWriter{conn: conn}, &wsWriter{conn: conn}, reader)
```

**C3. Permanent goroutine and Docker connection leak on client disconnect** — `pkg/controllers/app/engine/controllers/container_passthrough.go:31-81`

The HTTP request context `req.Context()` is passed to `ContainerLogs`, but once `upgrader.Upgrade` succeeds, Gorilla WebSocket hijacks the connection and stops propagating client disconnects to `req.Context()`. 
Furthermore, `HandleLogs` blocks on `StdCopy` but never concurrently reads from the WebSocket (`conn.ReadMessage()`), which is required by Gorilla to process client-initiated close frames. Finally, `reader.Close()` is never called. If a client disconnects while the container is idle, the server will never notice; `StdCopy` will wait forever for the next log line, leaking a goroutine and a Docker API connection permanently.

Fix: Derive a cancellable context, ensure `reader.Close()` is deferred, and spawn a concurrent loop to monitor for WebSocket disconnects:
```go
	ctx, cancel := context.WithCancel(req.Context())
	defer cancel()

	reader, err := w.cli.ContainerLogs(ctx, /* ... */)
	if err != nil { /* ... */ }
	defer reader.Close()

	conn, err := upgrader.Upgrade(w, req, nil)
	if err != nil { /* ... */ }
	defer conn.Close()

	// Gorilla requires a concurrent reader to process Close frames.
	go func() {
		defer cancel()
		for {
			if _, _, err := conn.ReadMessage(); err != nil {
				break
			}
		}
	}()
	
	// proceed with StdCopy...
```

####### Important Issues

**I1. Non-deterministic ConfigMap state causing flapping updates** — `pkg/controllers/app/engine/controller.go:76`

```go
func (c *controller) GetPassthroughEndpoints() []string {
	return slices.Collect(maps.Keys(c.passthroughs))
}
```

Iterating over a Go map produces randomly ordered keys. Because `GetPassthroughEndpoints` is collected to populate the `EnabledPassthroughs` payload inside the discovery JSON ConfigMap (`Manager.registerDiscovery`), adding a second endpoint to the `passthroughs` map will cause the JSON array to randomly shuffle its order on every daemon startup. This triggers an unnecessary SSA update and sends spurious watch events to downstream components watching the ConfigMap.

Fix: Sort the slice before returning it. 
```go
func (c *controller) GetPassthroughEndpoints() []string {
	keys := slices.Collect(maps.Keys(c.passthroughs))
	slices.Sort(keys)
	return keys
}
```

####### Suggestions

None.

###### Design Observations

**Concerns**

- **Stdout and Stderr multiplexing loss (in-scope)**: By passing the same WebSocket writer as both `dstout` and `dsterr` to `stdcopy.StdCopy`, the implementation strips Docker's stream headers but merges stdout and stderr into an indistinguishable byte stream. If the UI or CLI consumer needs to color-code or differentiate standard error from standard output, this design permanently destroys that context. Consider wrapping the output in a JSON payload (e.g. `{"stream": "stdout", "data": "..."}`) or sending different WebSocket message types if the client protocol allows.

###### Testing Assessment

1. **Passthrough Endpoint Lifecycle (High Risk)**: There are no unit or BATS integration tests added for the new `/engine/logs/` passthrough endpoint. 
2. **Websocket Client Disconnect (High Risk)**: There is no test ensuring the backend Docker connection is properly severed when the WebSocket client cleanly closes the connection or breaks the TCP pipe.

###### Documentation Assessment

The new `HandleLogs` and passthrough logic lack docstrings explaining the expected WebSocket client protocol (e.g. "Clients must accept Binary messages containing raw docker stream chunks").

###### Commit Structure

No issues. The single commit effectively encapsulates the log passthrough addition.

###### Acknowledged Limitations

- **Containerd support deferred**: 
  - Code comment: `// TODO: Figure out containerd` (`pkg/controllers/app/engine/controllers/container_passthrough.go:35`)
  - The passthrough is currently limited to Docker only. Containerd endpoints remain unhandled.

###### Coverage Summary

- `pkg/controllers/app/engine/controller.go` — **I1**
- `pkg/controllers/app/engine/controllers/container_passthrough.go` — **C1, C2, C3**
