# Deep Review: 20260412-203409-pr-295 | | | |---|---| | **Date** | 2026-04-12 20:34 | | **Repo** | [rancher-sandbox/rancher-desktop-daemon](https://github.com/rancher-sandbox/rancher-desktop-daemon) | | **Round** | 4 (of PR #295) | | **Author** | [@jandubois](https://github.com/jandubois) | | **PR** | [#295](https://github.com/rancher-sandbox/rancher-desktop-daemon/pull/295) — Container Engine watcher | | **Commits** | `33db27d` Address review round 3 findings for PR #295
`cf4b90e` ci: disable Spotlight indexing on macOS BATS runners
`c484dab` ci: capture memory pressure and API state in support bundle
`3196b24` bats: park spec.state=unknown before pausing test-state
`84c0736` engine: apply Container/Image/Volume capitalization convention
…plus 14 earlier commits | | **Reviewers** | Claude Opus 4.6, Codex GPT 5.4, Gemini 3.1 Pro | | **Verdict** | **Merge with fixes** — the `spec.state=created` convergence bug (I1) fires redundant `ContainerStop` calls on every unrelated watch event, and the App controller's blind `Status().Update` (I2) now races the new `setEngineCondition` writer. Neither is a functional break but both deserve a fix before merge. | | **Wall-clock time** | `57 min 34 s` | --- ## Executive Summary This round adds an engine controller that mirrors Docker state (`Container`, `Image`, `Volume`) into the rdd Kubernetes API, forwards user-initiated deletes back to Docker, and makes `rdd set` wait for the desired state by default. The core structure — `dockerWatcher` for Docker-side I/O, `EngineReconciler` for controller-runtime integration — is clean and carefully commented, and the BATS coverage is thorough. Review round 3's findings have been addressed. Structure: 0 critical, 4 important, 13 suggestions, 3 design observations. The dominant issue is a single line of state-matching logic in `reconcileContainerState`: once a `Container` with `spec.state=created` reaches Docker's `exited` state, `desired != actual` keeps dispatching `ContainerStop` on every reconcile. This was independently caught by all three agents. Three other Important findings surface real concurrency or drift bugs that happen to be structural rather than catastrophic: the App controller's pre-existing `Status().Update` is no longer safe now that the engine controller writes the same conditions, `rdd set`'s watch predicate ignores the `ObservedGeneration` the engine already stamps, and the `status.ports` list is built in non-deterministic map-iteration order which churns atomic-list SSA applies. --- ## Critical Issues None. --- ## Important Issues I1. **`reconcileContainerState` dispatches redundant `ContainerStop` every reconcile once a `created`-desired container reaches `exited`** — `pkg/controllers/app/engine/controllers/docker_watcher.go:334-370` [Claude Opus 4.6, Codex GPT 5.4, Gemini 3.1 Pro] ```go actual := c.Status.Status if desired == actual { return nil } // ... switch desired { case containersv1alpha1.ContainerStatusCreated: log.Info("Stopping container", "id", c.Name) _, err := w.cli.ContainerStop(ctx, c.Name, dockerclient.ContainerStopOptions{}) return err } ``` After the user patches `spec.state=created` on a running container, `ContainerStop` transitions the container to Docker's `exited` state (not `created`), and `applyContainer` writes `status.status=exited` at `sync_containers.go:157`. On the next reconcile — triggered by any `Container`/`Image`/`Volume` watch event via `SetupWithManager`'s unconditional enqueueApp mapping — `desired == "created"` and `actual == "exited"`, so the `desired != actual` check at line 341 falls through and dispatches `ContainerStop` again. Docker's `ContainerStop` on a non-running container returns `NotModified` which the client treats as success, so no error propagates — but the reconciler never reaches a quiescent state for this container. Cost scales with the rate of unrelated watch events times the number of `created`-desired stuck containers. The same issue applies to `actual == "dead"` or `actual == "created"` when `desired == "created"`. The new BATS test at `engine.bats:231-248` only verifies the one-shot transition to `exited` and does not observe the subsequent churn, so this isn't caught by CI. Fix: treat any stopped state (`created`, `exited`, `dead`) as satisfying `desired=created`, symmetric with the unpause handling in the `Running` branch at lines 354-363. ```diff func (w *dockerWatcher) reconcileContainerState(ctx context.Context, c *containersv1alpha1.Container) error { desired := c.Spec.State if desired == containersv1alpha1.ContainerStatusUnknown { return nil } actual := c.Status.Status - if desired == actual { - return nil - } + // Treat any stopped state as satisfying desired=created. + if desired == containersv1alpha1.ContainerStatusCreated { + switch actual { + case containersv1alpha1.ContainerStatusCreated, + containersv1alpha1.ContainerStatusExited, + containersv1alpha1.ContainerStatusDead: + return nil + } + } + if desired == actual { + return nil + } ``` (Gemini flagged this as Critical on the grounds of "Docker API spam loop"; Claude and Codex called it Important. Calibrated as Important here: the loop is bounded by unrelated event frequency, `NotModified` is a silent success, and no error propagates — the cost is wasted RPCs, not a hang or failure.) I2. **`App.Status.Conditions` now has two writers but the AppReconciler still uses a blind `Status().Update`** — `pkg/controllers/app/app/controllers/app_controller.go:229-251` [Codex GPT 5.4] ```go // Mirror LimaVM status conditions onto the App status. // The priority chain above returns after every other action, so the App's // resourceVersion from the initial Get is still current — no re-read needed. // ... if statusChanged { if err := r.Status().Update(ctx, &app); err != nil { log.Error(err, "Unable to update App status") return ctrl.Result{}, err } } ``` This PR adds a second writer to `App.Status.Conditions` via `setEngineCondition` (`engine_reconciler.go:223-240`), which correctly wraps its update in `retry.RetryOnConflict` with a re-Get. The existing `AppReconciler` still does not: the comment at line 230-231 claims the `resourceVersion` from the initial Get is still current, which was true when the App controller was the sole writer but is no longer true once the engine reconciler can land a `ContainerEngineReady` update in between the Get and the Update. The repo overview explicitly calls this out: *"`App.Status.Conditions` has multiple concurrent writers (`AppReconciler` mirrors LimaVM conditions; `EngineReconciler` writes `ContainerEngineReady`). Flag any non-SSA, non-retry-on-conflict `Status().Update` on the App object."* This isn't catastrophic — controller-runtime requeues on the returned conflict error and the next reconcile re-Gets — but it turns routine status churn into avoidable 409s and noisy reconcile-error logs. Fix: convert the App controller's status write to the same `retry.RetryOnConflict` + re-Get pattern used in `setEngineCondition`. ```go if statusChanged { if err := retry.RetryOnConflict(retry.DefaultRetry, func() error { latest := &v1alpha1.App{} if err := r.Get(ctx, client.ObjectKeyFromObject(&app), latest); err != nil { return err } for _, cond := range limaVM.Status.Conditions { apimeta.SetStatusCondition(&latest.Status.Conditions, metav1.Condition{ Type: cond.Type, Status: cond.Status, Reason: cond.Reason, Message: base.TruncateConditionMessage(cond.Message), ObservedGeneration: latest.Generation, LastTransitionTime: cond.LastTransitionTime, }) } return r.Status().Update(ctx, latest) }); err != nil { log.Error(err, "Unable to update App status") return ctrl.Result{}, err } } ``` I3. **`rdd set running=true` wait ignores the `ObservedGeneration` the engine already stamps** — `cmd/rdd/set.go:453-474`, `pkg/controllers/app/engine/controllers/engine_reconciler.go:223-240` [Claude Opus 4.6] ```go if runningVal == "true" { logrus.Info("Waiting for container engine to be ready") return watchCondition(ctx, config, func(obj *unstructured.Unstructured) bool { return conditionStatus(obj, "ContainerEngineReady") == metav1.ConditionTrue }) } ``` The predicate only checks `Status == True` and has no generation handoff from the preceding patch. The "Known hazard" comment at `cmd/rdd/set.go:445-452` acknowledges that `rdd set containerEngine.name= running=true` on an already-running VM can be satisfied by the stale `Connected` condition left over from the previous backend — but `setEngineCondition` already writes `ObservedGeneration: latest.Generation` at `engine_reconciler.go:234`, so the fix is cheap: capture the generation from the patch response in `setAction` and compare it in the predicate. The `waitForDesiredState` signature needs a `minGen int64` parameter, `patchApp`/`createAndPatchApp` need to return the post-patch generation, and `watchCondition`'s predicate gains `cond.ObservedGeneration >= minGen`. Without the fix, `rdd set containerEngine.name=containerd` (without changing `running`) is the worst case: no wait triggers at all because no `running` argument is set, even though a full VM restart follows. With `running=true` included, the wait happens but can be satisfied by a pre-patch snapshot. Fix sketch (conceptual, paired with I4 in `rdd set.go`): ```go // In setAction, after patchApp/createAndPatchApp: var app appv1alpha1.App if err := c.Get(ctx, client.ObjectKey{Name: "app"}, &app); err != nil { return fmt.Errorf("failed to get App after patch: %w", err) } minGen := app.Generation // In waitForDesiredState predicate: return conditionStatusForGeneration(obj, "ContainerEngineReady", metav1.ConditionTrue, minGen) ``` Claude flagged this as Important because the fix is already available (the stamp exists), not because the hazard itself is new. I4. **`status.ports` is built in non-deterministic map-iteration order and churns the atomic list on every sync** — `pkg/controllers/app/engine/controllers/sync_containers.go:176-190` [Claude Opus 4.6] ```go var applyPorts []*containersv1alpha1apply.ContainerPortApplyConfiguration if inspect.NetworkSettings != nil { for portName, ports := range inspect.NetworkSettings.Ports { var bindings []*containersv1alpha1apply.ContainerPortBindingApplyConfiguration for _, port := range ports { bindings = append(bindings, containersv1alpha1apply.ContainerPortBinding(). WithHostIP(port.HostIP.String()). WithHostPort(port.HostPort)) } applyPorts = append(applyPorts, containersv1alpha1apply.ContainerPort(). WithName(portName.String()). WithBindings(bindings...)) } } ``` `inspect.NetworkSettings.Ports` is a Go map, so iteration order is randomized per process. The Container CRD declares `status.ports` as a plain array with no `x-kubernetes-list-type` annotation (verified against `pkg/controllers/containers/container/crd.yaml`: only `status.conditions` carries `x-kubernetes-list-type: map`). Lists without a list-type default to atomic, which is position-sensitive: the apiserver treats `[A, B]` and `[B, A]` as different values and produces a new `resourceVersion` on every apply that reorders them. That fires a watch event, which the engine reconciler catches via `Watches(&containersv1alpha1.Container{}, enqueueApp)`, which re-enters `reconcileContainerSpecs + processFinalizers` — a four-list sweep per watch event. For a container with two or more exposed ports, every unrelated Docker event (image pull, volume create, etc.) can amplify into a status re-apply and a full reconcile sweep. Fix: sort `inspect.NetworkSettings.Ports` by port name before building the apply slice. The inner `bindings` loop is Docker-ordered (IPv4 then IPv6) and does not need sorting unless Docker's internal order changes. ```diff if inspect.NetworkSettings != nil { + portNames := make([]mobynat.Port, 0, len(inspect.NetworkSettings.Ports)) + for p := range inspect.NetworkSettings.Ports { + portNames = append(portNames, p) + } + sort.Slice(portNames, func(i, j int) bool { return portNames[i] < portNames[j] }) - for portName, ports := range inspect.NetworkSettings.Ports { + for _, portName := range portNames { + ports := inspect.NetworkSettings.Ports[portName] ``` Alternatively, annotate `Ports` with `+listType=map` + `+listMapKey=name` in `container_types.go` so the apiserver merger ignores ordering — this is more robust but requires a CRD regeneration and is a larger change. --- ## Suggestions S1. **`DockerSocket()` silently swallows `os.MkdirAll` error** — `pkg/instance/instance.go:136-144` [Claude Opus 4.6] ```go var DockerSocket = sync.OnceValue(func() string { home, err := os.UserHomeDir() if err != nil { panic(fmt.Errorf("could not get home directory: %w", err)) } dir := filepath.Join(home, ".rd2") _ = os.MkdirAll(dir, 0o700) return filepath.Join(dir, "docker.sock") }) ``` If `MkdirAll` fails (permission denied, read-only filesystem), the error is discarded and the returned path may point at a non-existent parent. `dockerclient.WithHost("unix://" + socketPath)` in `docker_watcher.go:47` then fails at connect time with a cryptic "no such file or directory". Sibling helpers (`LogDir`, `ShortDir`, etc.) don't do `MkdirAll` — the directory is really Lima's responsibility, and creating it here is inconsistent. Fix: drop the `MkdirAll` call entirely and let Lima own the directory lifecycle. S2. **`setEngineCondition` does not tolerate App deletion during retry** — `pkg/controllers/app/engine/controllers/engine_reconciler.go:223-241` [Claude Opus 4.6] ```go return retry.RetryOnConflict(retry.DefaultRetry, func() error { latest := &appv1alpha1.App{} if err := r.Get(ctx, client.ObjectKey{Name: app.Name}, latest); err != nil { return err } ``` `Reconcile` wraps its initial Get in `client.IgnoreNotFound`, but if the App is deleted between that Get and a later `setEngineCondition` call (e.g., `rdd svc delete` runs concurrently), the inner Get returns `NotFound` and `retry.RetryOnConflict` does not treat `NotFound` as retriable — the error propagates up and the reconciler returns it, causing a gratuitous requeue-then-bail. Fix: treat `NotFound` as a no-op inside the retry closure. ```diff return retry.RetryOnConflict(retry.DefaultRetry, func() error { latest := &appv1alpha1.App{} if err := r.Get(ctx, client.ObjectKey{Name: app.Name}, latest); err != nil { + if apierrors.IsNotFound(err) { + return nil + } return err } ``` S3. **`deleteImage` may untag the wrong image in a tag-race** — `pkg/controllers/app/engine/controllers/docker_watcher.go:393-404` [Claude Opus 4.6] ```go ref := img.Status.ID if img.Status.RepoTag != "" { ref = img.Status.RepoTag } _, err := w.cli.ImageRemove(ctx, ref, dockerclient.ImageRemoveOptions{}) ``` Consider: (1) Image A is tagged `busybox:latest`, mirror A holds `Status.RepoTag=busybox:latest`. (2) `docker pull busybox:latest` brings in image B, moves the tag, emits untag(A)+tag(B). (3) Before the watcher processes those events, the user deletes mirror A. `processImageFinalizers` reads the stale mirror A, calls `ImageRemove("busybox:latest")`, which now hits image B and untags it. The ID-based path would have untagged A correctly. The intent of using the tag is presumably to untag-only rather than remove — so that a multi-tagged image keeps its other tags. A safer approach re-inspects the image and verifies `Inspect().RepoTags` still contains the tag before calling `ImageRemove`, or removes by `status.ID` + a `ForceOwnership: false` flag. S4. **Image finalizer path has no empty-status guard, asymmetric with the Volume path** — `pkg/controllers/app/engine/controllers/engine_reconciler.go:402-424` [Claude Opus 4.6] `processVolumeFinalizers` at lines 426-455 guards `w.deleteVolume(ctx, v.Status.Name)` with `if v.Status.Name != ""` because a bare-skeleton mirror has no engine name to forward to. The Image path has no equivalent guard: if a user directly creates an `Image` mirror via kubectl (with the mirror finalizer) and then deletes it before status is populated, `deleteImage` passes an empty string to `ImageRemove`, which the daemon rejects. Since the Container webhook only intercepts Update not Create, such a bare mirror is technically reachable. Fix: symmetrically guard the Image path with `img.Status.ID != "" || img.Status.RepoTag != ""`. S5. **`parseContainerName("/")` returns `("moby", "")` which downstream CRD validation would reject** — `pkg/controllers/app/engine/controllers/sync_containers.go:207-214` [Claude Opus 4.6] The table test at `sync_containers_test.go:42-51` pins the `"/"` result as `("moby", "")`. That flows into `ContainerStatus().WithName("")` at `sync_containers.go:151`, and `status.name` is `+required` in `container_types.go:73-75`, so the apiserver would reject the apply with a validation error. In practice `inspect.Name` from Docker is never `/` because Docker always generates a name, so this is defensive only. But the current test pins a result that in the hot path would produce an invalid mirror. Fix: return an explicit error or fall back to a synthetic name (e.g., the container ID). A test assertion + a clear error is better than silently producing an invalid apply. S6. **`applyContainer` reasserts the finalizer even on a terminating mirror** — `pkg/controllers/app/engine/controllers/sync_containers.go:135-147` [Claude Opus 4.6] If a user has initiated `kubectl delete container/...` (mirror has `DeletionTimestamp`) and Docker then emits a container event (the container dies on its own), `applyContainer` reaches the `else` branch and tries to re-assert the mirror finalizer via SSA. Adding a finalizer to a deleting object is typically a no-op under SSA (the finalizer is still present), but if any other field manager stripped it the reassertion fails with a cryptic error and the next Docker event for the same container repeats the failure. Fix: skip the finalizer reassertion when `existing.DeletionTimestamp != nil`. S7. **Docker state mapping assumes the moby enum is a strict subset of the CRD enum** — `pkg/controllers/app/engine/controllers/sync_containers.go:157` [Claude Opus 4.6] ```go WithStatus(containersv1alpha1.ContainerStatusValue(inspect.State.Status)). ``` `inspect.State.Status` is a free-form Docker string, directly cast into the CRD enum without validation. The CRD enum at `container_types.go:11` currently covers every documented Docker state, but a future Docker release that introduces a new state string (`initializing`, `error`, etc.) would cause SSA applies to fail with CRD validation errors, the per-container error would be logged and skipped by `syncAllContainers:57-59`, and that container's mirror would silently fall behind. Fix: whitelist the known states and fall back to `unknown` on mismatch, so forward-compat with Docker is explicit rather than implicit. S8. **`setAction` does not hand off the post-patch generation to `waitForDesiredState`** — `cmd/rdd/set.go:293-311` [Claude Opus 4.6] Pairs with I3: neither `createAndPatchApp` nor `patchApp` returns the `metadata.generation` of the post-write App object, so `waitForDesiredState` has no way to filter stale condition snapshots in the informer's initial list. The fix is a small plumbing change — return `app.Generation` from the patch path and thread it into `watchCondition`'s predicate. S9. **Gemini: Missing channel closure check in Docker watcher loop** — `pkg/controllers/app/engine/controllers/docker_watcher.go:183-188` [Gemini 3.1 Pro] ```go case msg := <-result.Messages: if err := w.handleEvent(ctx, msg); err != nil { log.Error(err, "Failed to handle Docker event", ...) } ``` If the messages channel closes (daemon disconnect), the select continuously yields a zero-value `events.Message`. `handleEvent` safely ignores unknown event types (default → return nil), so this isn't a hard spin — the loop will also see `result.Err` close and terminate — but explicit `msg, ok := <-result.Messages` with a closed-channel exit is safer and matches the Go idiom. S10. **`processContainerFinalizers` uses a blind `r.Update` for finalizer removal** — `pkg/controllers/app/engine/controllers/engine_reconciler.go:390-397` [Gemini 3.1 Pro] `removeMirrorResource` wraps its finalizer-strip in `retry.RetryOnConflict`; `processContainerFinalizers` does not. On a 409, the whole reconcile requeues and re-runs `deleteContainer`, which is idempotent but wastes an API round-trip. Consistency across finalizer paths would simplify reasoning. S11. **`bats-with-timeout.sh` gives no context when the rdd binary is missing** — `scripts/bats-with-timeout.sh:30-42` [Claude Opus 4.6] ```bash rdd_bin="${script_dir}/../bin/rdd" if [ ! -x "$rdd_bin" ] && [ -x "${rdd_bin}.exe" ]; then rdd_bin="${rdd_bin}.exe" fi log_dir=$("$rdd_bin" svc paths log_dir) ``` If neither `bin/rdd` nor `bin/rdd.exe` exists (a developer running the script before `make`), the `$()` subshell fails with a generic shell error and `set -o errexit` aborts without explaining what went wrong. Fix: `[ -x "$rdd_bin" ] || { echo "bats-with-timeout: rdd binary not found at $rdd_bin; run 'make' first" >&2; exit 2; }` before the command. S12. **`pgid` log field asserts Unix-only invariant in prose, not in code** — `pkg/controllers/lima/limavm/controllers/limavm_lifecycle.go:479-491` [Claude Opus 4.6] The log field hardcodes `pgid` to `haCmd.Process.Pid` on the assumption that `process.SetGroup` makes the child a process-group leader. The comment explains the assumption but doesn't cross-reference `SetGroup`, so a future change that drops `Setpgid` would silently break downstream tooling (the new `bats-with-timeout.sh` pgid matching). Fix: query the real pgid via `syscall.Getpgid(haCmd.Process.Pid)` after `Start()` and log the result. One extra syscall for a defensible invariant. S13. **`DOCKER_HOST` is a file-level export in `engine.bats`** — `bats/tests/32-app-controllers/engine.bats:12-16` [Claude Opus 4.6] Top-level assignments run at file load and export `DOCKER_HOST` globally to every test in the file (and any subshell). This is harmless today but will leak through if another BATS suite ever sources this file. The documented `.rd2` TODO replacement will be cleaner if `DOCKER_HOST` is scoped to `local_setup_file` instead of being module-level. --- ## Design Observations ### Concerns 1. **Engine reconciler iterates all containers on every unrelated watch event** — `pkg/controllers/app/engine/controllers/engine_reconciler.go:170-182` [Claude Opus 4.6, Codex GPT 5.4] `(future)`. The author's TODO at lines 170-176 acknowledges this. Concrete cost: every `Container`/`Image`/`Volume` event triggers a full `reconcileContainerSpecs + processFinalizers` sweep (4 List calls + O(N) Docker RPCs in the stuck-state case — see I1). Splitting into per-kind reconcilers with watch predicates (`deletion timestamp set`, `spec.state changed`) would bound cost to O(1) per event. Not blocking, but I1 is a direct consequence of this design. 2. **Engine namespace and `containernamespace/moby` are not repaired after user-initiated delete** — `pkg/controllers/app/engine/controllers/engine_reconciler.go:457-475`, `pkg/controllers/app/engine/controllers/sync_volumes.go:56-66` [Codex GPT 5.4] `(in-scope)`. `ensureNamespace` and `syncContainerNamespace` run only in `fullSync` at watcher startup; `SetupWithManager` does not watch `corev1.Namespace` or `ContainerNamespace`. A user delete of either resource is therefore not repaired until the Docker watcher is torn down and restarted. For the moby `ContainerNamespace` this is documented at `sync_volumes.go:56-61` as a deliberate design choice (to avoid a finalizer trap), and the BATS test at `engine.bats:355-361` codifies the "stays deleted" behavior. For the `rancher-desktop` Namespace there is no such deliberation — if deleted, subsequent mirror applies would fail with NotFound until the next watcher restart. Users are unlikely to delete these resources in an embedded single-user context, so Codex's Important severity is demoted to a design observation; but a `Watches(&corev1.Namespace{}, enqueueAppIfName("rancher-desktop"))` would close the gap cleanly. 3. **`Container.Status` has concurrent SSA and non-SSA writers** — `pkg/controllers/containers/container/container_controller.go` (pre-PR) and `pkg/controllers/app/engine/controllers/sync_containers.go` (new) [Claude Opus 4.6] `(future)`. The pre-existing container reconciler writes `status.Conditions` via a plain `r.Status().Update()` without a field manager, while the new engine controller writes `status.Name`, `status.Status`, `status.Ports`, etc. via SSA with field manager `engine-controller`. Each `Status().Update` sends the full status object, so the container reconciler's client implicitly takes ownership of engine-written fields, which the engine controller then forcibly reclaims via `ForceOwnership` on the next sync. Functional today, but the repo-context rule about concurrent writers on `App.Status.Conditions` applies symmetrically here. Migrating the container reconciler to SSA with a distinct field manager would eliminate the ping-pong. ### Strengths - **Clean separation between `dockerWatcher` and `EngineReconciler`** [Claude Opus 4.6]. Docker-side I/O and controller-runtime integration are split cleanly, making ownership of state obvious. - **Daemon-clock `Since` filter is the right design choice** [Claude Opus 4.6, Codex GPT 5.4, Gemini 3.1 Pro]. `dockerEventsSince()` uses Docker's `Info().SystemTime` to anchor the event stream's `Since` filter rather than host `time.Now()`, avoiding the host/guest clock-skew bug that would otherwise drop startup-window events. The host-clock fallback with a 2-minute bias is pragmatic for the rare case where `Info` is unavailable. - **Race/retry decisions are commented with reasoning, not just mechanics** [Claude Opus 4.6]. `sync_containers.go:102-122` on why `create` omits `ForceOwnership` is a textbook example — it captures the "fail loudly on concurrent user patch" intent that a terse SSA snippet would hide. - **`errors.Join` throughout cleanup/sync paths** [Claude Opus 4.6]. `cleanupMirrorResources`, `processFinalizers`, `syncAllContainers`, and `syncAllImages` consistently avoid the "one stuck object blocks the rest" footgun. - **`NotApplicable=True` for containerd backend** [Gemini 3.1 Pro]. Reporting `ContainerEngineReady=True` with Reason `NotApplicable` is a clean way to let `rdd set running=true containerEngine.name=containerd` complete its wait even though mirroring is a no-op — with the caveat about Reason discrimination captured in I3. --- ## Testing Assessment BATS coverage is thorough for happy-path transitions and the non-obvious cases (untag-leaves-dangling, tagging-dangling-removes-mirror, in-use image finalizer, paused-to-created, paused-to-running). Untested scenarios ranked by risk: 1. **Steady-state reconcile cadence after `spec.state=created` (I1)**. No test verifies that once `status.status=exited` is observed, subsequent unrelated events stop dispatching `ContainerStop`. A unit test on `reconcileContainerState` with a counting fake Docker client would catch this without introducing flakiness. 2. **Watcher death and reconnect while the VM is still running**. The `watcherDied` path in `Reconcile` at `engine_reconciler.go:88-106` handles it, but nothing exercises the "Docker is still up but the event stream closed" scenario (proxy timeout, API version compaction). Simulatable by killing the Docker client's underlying TCP socket. 3. **Containers in `dead`/`removing` states with `desired=created`**. Claude's I1 plus S7 flag this region as ambiguous; no test covers it. 4. **Concurrent create-then-patch race**. `sync_containers.go:117-122`'s comment explicitly relies on a "fail loudly on conflict" contract, but no test verifies the 409 is retried on the next reconcile. 5. **Full sync after `rdd svc delete` followed by `rdd svc start` with prior Docker state still present**. Verifies the stale-mirror prune path actually runs. 6. **`--timeout=0` preserves legacy no-wait behavior**. Covered implicitly by `rdd set --timeout=10s running=false` at `engine.bats:346` but not explicitly tested for `=0`. 7. **Windows moby startup with `rdd set running=true`**. Currently skipped via `skip_on_windows` in `local_setup_file`; when WSL2 socket forwarding lands this becomes exercisable. --- ## Documentation Assessment - `docs/design/api_containers.md` is correctly updated for the new engine mirroring behavior and the `spec.state=unknown` default — well done. - `docs/design/api_app.md` still describes `App.Status.Conditions` as mirrored from LimaVM; it should mention the new `ContainerEngineReady` condition and its writer ownership [Codex GPT 5.4]. - `docs/design/cmd_app.md` still describes `rdd set` as a pure patch command and does not document the new default wait behavior or the `--timeout=0` escape hatch [Codex GPT 5.4]. - `reconcileContainerState` has good per-branch comments but no top-of-function "desired vs actual → action" table. A three-row table would have surfaced I1 during authoring. - The CRD enum restriction on `spec.state` (`created;running;unknown`) lives only in the kubebuilder annotation; the doc comment at `container_types.go:55-66` could inline the valid values in prose. --- ## Commit Structure Commit history is unusually clean. Related chunks — the Docker client dependency, the engine controller, the webhook, docs, BATS tests, `rdd set` wait — each get their own commit. CI-only changes are separated from code changes. The four "Address review round N findings" commits are preserved to make each round's response visible against the baseline, which is acceptable under the project's git-workflow guidance. The diff is large (~132KB, 27 files) but internally consistent. Splitting the engine controller from the CI infrastructure would be tempting, but the BATS support-bundle work is specifically motivated by debugging the engine controller's hangs in CI, so bundling them is defensible. --- ## Acknowledged Limitations - **`DockerSocket()` and `lima-template.yaml` hardcode `.rd2`** — `pkg/instance/instance.go:127-135`, `bats/tests/32-app-controllers/engine.bats:13-14`. Documented TODO; fix requires parameterizing the Lima template on `RDD_INSTANCE` and moving both hardcodes together. Not blocking. - **`rdd set containerEngine.name=` on an already-running VM does not wait on the restart** — `cmd/rdd/set.go:440-452`. Documented; promoted here to I3 because the `ObservedGeneration` machinery to fix it already exists in `setEngineCondition`. - **`reconcileContainerSpecs` and `processFinalizers` are O(N) per child event** — `pkg/controllers/app/engine/controllers/engine_reconciler.go:170-176`. Documented TODO; acceptable for current workloads. - **Sequential container/image Inspect during initial sync** — `sync_containers.go:30-36`, `sync_images.go:36-40`. Documented; acceptable for typical dev workloads. - **Windows signal/pgid semantics for the `pgid` log field** — `limavm_lifecycle.go:479-486`. Documented; S12 proposes a runtime assertion. - **`containernamespace/moby` can be deleted without recreation** — `pkg/controllers/app/engine/controllers/sync_volumes.go:56-61`. Documented inline as a deliberate design choice to avoid a finalizer trap; `engine.bats:355-361` codifies it. - **Lima template rename from `.rd/docker.sock` → `.rd2/docker.sock`**. Not an upgrade concern — no public release yet. - **Windows Docker socket is still Unix-prefixed** — `docker_watcher.go:45-47` passes `"unix://" + path` on all platforms, but Lima (and therefore the engine reconciler's `wantWatcher` gate) is not yet exercised on Windows. Latent bug rather than active regression; worth a platform check once WSL2 lands. --- ## Agent Performance Retro ### [Claude] Claude produced the longest, most detailed review (614 lines, 3 important + 12 suggestions + 3 design observations). Most unique value came from tracing pre-existing code: verifying `moby` daemon's `stop.go` to confirm `ContainerStop` on a non-running container returns `NotModified`, connecting the ports-ordering churn (I4) to the missing `x-kubernetes-list-type` annotation in `crd.yaml`, and spotting that the `ObservedGeneration` handoff for I3 is cheap because `setEngineCondition` already stamps the field. Notable miss: didn't flag the App controller's blind `Status().Update` even though the repo-context rule calls it out explicitly. Claude's S3 (`deleteImage` tag race) is a good find but the proposed diff is the wrong direction — Claude recovers in prose. ### [Codex] Codex produced the tightest, most-calibrated review: zero criticals, four importants, zero suggestions. Every finding was grounded in a concrete cross-file interaction — I2 (App controller blind Update) is the clearest example of reading the repo-context rule and applying it. Codex was the only agent to notice the Namespace repair gap (demoted to design observation here) and the latent Windows unix-socket issue (demoted to acknowledged limitation). Codex's severity calibration was more aggressive than Claude's and more accurate than Gemini's — four importants all landed as legitimate concerns after verification. ### [Gemini] Gemini's review was the shortest and the most Critical-happy: one critical (C1 spec.state convergence), one important (CrashLoopBackOff bypass), two suggestions. The Critical framing overstated the severity — the loop is bounded by unrelated event rate and `NotModified` is silent success — but the finding itself matched Claude and Codex. Gemini's S1 (channel closure check) and S2 (`processContainerFinalizers` missing `RetryOnConflict`) are legitimate minor cleanups. Gemini's CrashLoopBackOff observation is a real design concern worth surfacing even if not action-required today. Known stderr noise: Gemini's sandbox blocked one path outside the worktree (`~/go/pkg/mod/...`) and hit an `ENAMETOOLONG` stat error on what looks like an embedded bash snippet from the `bats-with-timeout.sh` diff — neither affected the final review. ### Summary | | Claude Opus 4.6 | Codex GPT 5.4 | Gemini 3.1 Pro | |---|---|---|---| | Duration | 25m 38s | 6m 34s | 4m 58s | | Findings | 3I 11S | 2I | 1I 2S | | Tool calls | 76 (Bash 41, Read 29, TodoWrite 5) | 21 (shell 21) | 9 (run_shell_command 5, grep_search 2, read_file 2) | | Design observations | 8 | 3 | 2 | | False positives | 0 | 0 | 0 | | Unique insights | 13 | 2 | 2 | | Files reviewed | 27 | 27 | 27 | | Coverage misses | 0 | 0 | 0 | | **Totals** | **3I 11S** | **2I** | **1I 2S** | | Downgraded | 0 | 2 (I→S) | 2 (I→S) | | Dropped | 0 | 0 | 0 | Claude provided the broadest coverage and deepest pre-existing-code tracing. Codex provided the best severity calibration and directly actionable findings grounded in the repo-context rules. Gemini's coverage was the narrowest but it independently caught the dominant issue (I1). For this PR, Codex was the highest-signal agent: every finding it raised survived consolidation at the same severity it proposed. ### Reconciliation - Gemini C1 spec.state convergence: critical → important I1 (NotModified is silent success; no hang or error; cost is wasted RPCs not functional break). - Gemini I1 CrashLoopBackOff bypass: important → demoted to Design Observation (user explicitly opted into declarative state; Docker restart policy handles the containerized case). - Codex I2 Namespace repair gap: important → demoted to Design Observation (moby path is deliberate per comment; rancher-desktop path is a corner case users don't hit in single-user embedded context). - Codex I4 Windows unix-socket: important → demoted to Acknowledged Limitation (latent bug; Lima stack is not yet exercised on Windows, engine reconciler's `wantWatcher` gate is currently unreachable on that platform). --- ## Review Process Notes ### Skill improvements None. The repo-context rules covered every dimension needed; every Important finding traces back to an existing rule in `deep-review-context.md`. ### Repo context updates - **Add a rule** about Go map iteration order feeding into SSA atomic-list fields. Claude's I4 (non-deterministic `status.ports` ordering) is the kind of finding that will recur across any controller that mirrors map-keyed upstream data into a K8s list field without an `x-kubernetes-list-type` annotation. A one-line rule like *"When building a `status.X` list from a Go map iteration, sort by key before constructing the apply. Unannotated CRD lists default to atomic, and atomic-list SSA is position-sensitive — random iteration order churns resourceVersion on every sync"* would make this a default check for future syncs. --- ## Appendix: Original Reviews ### Claude Opus 4.6 ``` Now I have enough understanding to write the findings report. # Code Review — PR #295 Container Engine Watcher ## Executive Summary This PR adds an engine controller that mirrors Docker state (containers, images, volumes) into the rdd Kubernetes API, forwards deletions back to Docker, extends `rdd set` to wait for desired state, and introduces CI/BATS infrastructure to diagnose hangs. The implementation is thorough, well-commented, and shows multiple rounds of review feedback already applied; the main functional issue is a loop in `reconcileContainerState` that keeps dispatching `ContainerStop` after the container has already transitioned to `exited`. ## Findings ### Critical Issues None. ### Important Issues **I1. `reconcileContainerState` dispatches redundant `ContainerStop` every reconcile once a container reaches `exited`** — `pkg/controllers/app/engine/controllers/docker_watcher.go:334-370` ```go actual := c.Status.Status if desired == actual { return nil } // ... switch desired { // ... case containersv1alpha1.ContainerStatusCreated: log.Info("Stopping container", "id", c.Name) _, err := w.cli.ContainerStop(ctx, c.Name, dockerclient.ContainerStopOptions{}) return err } ``` After the user sets `spec.state=created` on a running Container, the reconciler at line 366 calls `ContainerStop`. Docker transitions the container to `exited` (not `created`), and `syncContainer` writes `status.status=exited` via SSA at `sync_containers.go:157`. Every subsequent watch event that triggers `reconcileContainerSpecs` (line 323) will re-enter `reconcileContainerState` for this container: `desired == "created"`, `actual == "exited"`, `desired != actual` at line 341 is true, so it dispatches `ContainerStop` again. I confirmed in moby source (`daemon/stop.go` line 30-37) that `ContainerStop` on a non-running container returns `errdefs.NotModified`, which the client at `request.go:225` treats as success (2xx-3xx). So the repeat calls are wasted but not error-producing — no propagated error, no requeue, just one extra Docker RPC on every unrelated watch event that goes through `reconcileContainerSpecs` (see the engine reconciler's `.Watches(&containersv1alpha1.Container{}, enqueueApp)` in `engine_reconciler.go:472-474`). Cost is `O(stuck-containers × external-event-rate)` Docker calls per second. The BATS test at `engine.bats:231-248` only verifies the one-shot transition to `exited` and does not catch the churn. The same problem would apply to `actual == "dead"` or `actual == "created"` with `desired == "created"`. Fix: treat all "not currently running-or-transitioning" states as satisfying `desired == created`, symmetric with the unpause path in the `Running` branch at lines 354-363. ```diff func (w *dockerWatcher) reconcileContainerState(ctx context.Context, c *containersv1alpha1.Container) error { desired := c.Spec.State if desired == containersv1alpha1.ContainerStatusUnknown { return nil } actual := c.Status.Status - if desired == actual { - return nil - } + // Treat any stopped state as matching desired=created. + isStopped := func(s containersv1alpha1.ContainerStatusValue) bool { + switch s { + case containersv1alpha1.ContainerStatusCreated, + containersv1alpha1.ContainerStatusExited, + containersv1alpha1.ContainerStatusDead: + return true + } + return false + } + switch desired { + case containersv1alpha1.ContainerStatusRunning: + if actual == containersv1alpha1.ContainerStatusRunning { + return nil + } + case containersv1alpha1.ContainerStatusCreated: + if isStopped(actual) { + return nil + } + } log := logf.FromContext(ctx).WithName("docker-watcher") ``` `(important, regression)` --- **I2. `rdd set running=true` with `containerEngine.name=containerd` is reported as acknowledged, but the `NotApplicable` terminal state also trips the cleanup path for a freshly-provisioned containerd VM** — `pkg/controllers/app/engine/controllers/engine_reconciler.go:116-148` ```go wantWatcher := running && engineIsDocker if !wantWatcher { // ... if running && !engineIsDocker { terminalReason = "NotApplicable" terminalStatus = metav1.ConditionTrue terminalMessage = "Engine mirroring is only supported with the moby backend" } current := meta.FindStatusCondition(app.Status.Conditions, conditionContainerEngineReady) alreadyClean := !watcherDied && current != nil && current.Reason == terminalReason && current.Status == terminalStatus if !alreadyClean { if err := r.cleanupMirrorResources(ctx); err != nil { // ... ``` On the very first reconcile after a user brings up a containerd VM, `current == nil` (the condition has never been set), so `alreadyClean` is false and `cleanupMirrorResources` runs at line 142. That helper at line 248-266 lists Containers, Volumes, Images, and ContainerNamespaces in the `rancher-desktop` namespace. This is not a correctness bug (the lists are empty, the function returns quickly), but the comment at line 108-115 claims "once the condition reflects the terminal state, cleanup would be a no-op against an empty namespace, so we short-circuit" — yet the short-circuit only kicks in on the **second** reconcile, not the first, regardless of whether cleanup is actually needed. More importantly, in the containerd-on-existing-moby-state case (user switches backend), the mirrors from the moby session exist and must be cleaned up. The `NotApplicable` → Status=True message says "Engine mirroring is only supported with the moby backend" — but on the client side, `rdd set running=true containerEngine.name=containerd` then satisfies `ContainerEngineReady=True` immediately without waiting for the cleanup to finish. The `waitForDesiredState` predicate at `cmd/rdd/set.go:464-466` only checks `Status == ConditionTrue` and has no `reason != "NotApplicable"` guard. The author called this out in the comment block at `cmd/rdd/set.go:445-452` but framed it as a `rdd set` limitation; the actual hazard is broader because setting the condition to True happens **before** `cleanupMirrorResources` actually returns (step 141-148: cleanup first, then set condition). So in the common case this works, but if the backend switch and stop/start fire in rapid succession the CLI can observe a moby `ContainerEngineReady=True` still in the condition on a racy reconcile read. Fix: have `waitForDesiredState` also check that the Reason is either `Connected` (moby ready) or `NotApplicable` (containerd), not just Status=True, and distinguish these from the stale `Connected` that pre-dates the spec change. A simple mitigation: generation-based observation via `ObservedGeneration` on the condition — which `setEngineCondition` already stamps at line 234 — by filtering in the CLI watch predicate. ```diff if runningVal == "true" { logrus.Info("Waiting for container engine to be ready") + // Capture the App generation we wrote, so we ignore stale condition + // snapshots whose ObservedGeneration predates our patch. + var minGen int64 = 0 // set by caller from the Patch response return watchCondition(ctx, config, func(obj *unstructured.Unstructured) bool { - return conditionStatus(obj, "ContainerEngineReady") == metav1.ConditionTrue + return conditionIsCurrent(obj, "ContainerEngineReady", metav1.ConditionTrue, minGen) }) } ``` `(important, gap)` — note this aligns with the author's own TODO; I'm promoting it because the acknowledged hazard already has an observable escape (ObservedGeneration is stamped but unused) that the CLI could consume today. --- **I3. Container port application order is non-deterministic, causing spurious status churn** — `pkg/controllers/app/engine/controllers/sync_containers.go:176-190` ```go var applyPorts []*containersv1alpha1apply.ContainerPortApplyConfiguration if inspect.NetworkSettings != nil { for portName, ports := range inspect.NetworkSettings.Ports { var bindings []*containersv1alpha1apply.ContainerPortBindingApplyConfiguration for _, port := range ports { bindings = append(bindings, containersv1alpha1apply.ContainerPortBinding(). WithHostIP(port.HostIP.String()). WithHostPort(port.HostPort)) } applyPorts = append(applyPorts, containersv1alpha1apply.ContainerPort(). WithName(portName.String()). WithBindings(bindings...)) } } statusApply.WithPorts(applyPorts...) ``` Go map iteration at line 178 produces `applyPorts` in random order. The Container CRD defines `status.ports` as `type: array` with no `x-kubernetes-list-type: map` or `set` annotation (see `pkg/controllers/containers/container/crd.yaml:420-454`), so Kubernetes treats it as atomic. An atomic list diff is position-based: rearranging entries is a change. Every sync of a container with multiple exposed ports may therefore produce a no-op status update with a new resourceVersion, which fires a watch event, which enqueues another reconcile, which re-reads containers, which… The engine reconciler watches Container via `enqueueApp` (line 472 of engine_reconciler.go), so each spurious update triggers another `reconcileContainerSpecs` + `processFinalizers` pass. Fix: sort by port name before building the apply slice. Same considerations apply to the inner `bindings` loop if any port has multiple host bindings; in practice Docker keeps IPv4 first and IPv6 second in deterministic order. ```diff if inspect.NetworkSettings != nil { + portNames := make([]nat.Port, 0, len(inspect.NetworkSettings.Ports)) + for p := range inspect.NetworkSettings.Ports { + portNames = append(portNames, p) + } + sort.Slice(portNames, func(i, j int) bool { return portNames[i] < portNames[j] }) - for portName, ports := range inspect.NetworkSettings.Ports { + for _, portName := range portNames { + ports := inspect.NetworkSettings.Ports[portName] var bindings []*containersv1alpha1apply.ContainerPortBindingApplyConfiguration ``` `(important, regression)` — demoting to IMPORTANT because there is no functional break, only event-traffic amplification. ### Suggestions **S1. `DockerSocket()` silently swallows `os.MkdirAll` error** — `pkg/instance/instance.go:136-144` ```go var DockerSocket = sync.OnceValue(func() string { home, err := os.UserHomeDir() if err != nil { panic(fmt.Errorf("could not get home directory: %w", err)) } dir := filepath.Join(home, ".rd2") _ = os.MkdirAll(dir, 0o700) return filepath.Join(dir, "docker.sock") }) ``` If `MkdirAll` fails (permission error, read-only filesystem), the error is swallowed at line 142 and the function returns a path whose parent may not exist. The subsequent `dockerclient.WithHost("unix://" + socketPath)` call in `docker_watcher.go:47` will fail at connect time with a cryptic "no such file or directory" rather than reporting the actual setup problem. The surrounding functions (`LogDir`, `ShortDir`, etc.) also don't do `MkdirAll`, so adding one here is inconsistent — the directory is really the responsibility of Lima (which creates `.rd2` when the VM starts). Fix: remove the `MkdirAll` call entirely and let Lima own the directory. If you need to keep it as a safety net, at least `panic(err)` on failure for consistency with the `UserHomeDir` error at line 139. ```diff var DockerSocket = sync.OnceValue(func() string { home, err := os.UserHomeDir() if err != nil { panic(fmt.Errorf("could not get home directory: %w", err)) } - dir := filepath.Join(home, ".rd2") - _ = os.MkdirAll(dir, 0o700) - return filepath.Join(dir, "docker.sock") + return filepath.Join(home, ".rd2", "docker.sock") }) ``` `(suggestion, regression)` --- **S2. `setEngineCondition` does not tolerate App deletion during retry** — `pkg/controllers/app/engine/controllers/engine_reconciler.go:223-241` ```go func (r *EngineReconciler) setEngineCondition(ctx context.Context, app *appv1alpha1.App, status metav1.ConditionStatus, reason, message string) error { return retry.RetryOnConflict(retry.DefaultRetry, func() error { latest := &appv1alpha1.App{} if err := r.Get(ctx, client.ObjectKey{Name: app.Name}, latest); err != nil { return err } ``` The Reconcile function at line 79-81 uses `client.IgnoreNotFound` when the App is missing. But if the App is deleted between Reconcile's initial Get and a later `setEngineCondition` call (e.g., the user runs `rdd svc delete` concurrently with the reconciler), line 227 returns the NotFound error. `retry.RetryOnConflict` does not treat NotFound as retriable, so it returns the NotFound through. The caller at line 147 wraps it in the Reconcile return, which controller-runtime interprets as a transient error and requeues — only for the next reconcile to also observe NotFound and finally bail out via `IgnoreNotFound`. Fix: skip the update when the App is gone. ```diff func (r *EngineReconciler) setEngineCondition(ctx context.Context, app *appv1alpha1.App, status metav1.ConditionStatus, reason, message string) error { return retry.RetryOnConflict(retry.DefaultRetry, func() error { latest := &appv1alpha1.App{} if err := r.Get(ctx, client.ObjectKey{Name: app.Name}, latest); err != nil { + if apierrors.IsNotFound(err) { + return nil + } return err } ``` `(suggestion, regression)` --- **S3. `deleteImage` may untag the wrong image when RepoTag points to a newly-pulled image** — `pkg/controllers/app/engine/controllers/docker_watcher.go:393-404` ```go func (w *dockerWatcher) deleteImage(ctx context.Context, img *containersv1alpha1.Image) error { ref := img.Status.ID if img.Status.RepoTag != "" { ref = img.Status.RepoTag } _, err := w.cli.ImageRemove(ctx, ref, dockerclient.ImageRemoveOptions{}) ``` Consider this sequence: 1. Image A is tagged `busybox:latest`; engine creates a mirror with `Status.ID=sha256:A`, `Status.RepoTag=busybox:latest`. 2. User runs `docker pull busybox:latest` (registry has a newer version). Docker pulls image B, moves the `busybox:latest` tag from A to B, emits untag(A) then tag(B) events. 3. In the narrow window before the engine processes those events, the user runs `kubectl delete image/`. 4. `processImageFinalizers` reads the stale mirror A which still has `Status.RepoTag=busybox:latest`, calls `ImageRemove("busybox:latest")`, and untags image B instead of A. The ID-based fallback (`ref = img.Status.ID`) would have untagged A correctly. Using RepoTag is fine when the tag still points to the same ID, but it's an unchecked race. Fix: call `ImageRemove` with the ID, not the tag, or pass the digest form (`sha256:...`) so Docker disambiguates. Removing by ID with `PruneChildren=false` leaves other tags intact as a side effect (Docker's behavior). ```diff func (w *dockerWatcher) deleteImage(ctx context.Context, img *containersv1alpha1.Image) error { - // Use the tag if available, otherwise the raw image ID. - ref := img.Status.ID - if img.Status.RepoTag != "" { - ref = img.Status.RepoTag + // Remove by ID when possible; the tag mirror covers only one tag + // out of potentially many, and Docker's tag namespace is mutable. + ref := img.Status.RepoTag + if ref == "" { + ref = img.Status.ID } - _, err := w.cli.ImageRemove(ctx, ref, dockerclient.ImageRemoveOptions{}) ``` Actually — the author's intent here is that deleting a tag mirror should untag (not remove) the image, so the fix is the opposite of what I wrote. Recommend the author add a comment pinning down the semantic ("we want to untag, and we tolerate the race because a stale mirror's tag reference is itself suspect") or, better, re-fetch the image and verify `Inspect().RepoTags` still contains the tag before calling ImageRemove. `(suggestion, gap)` --- **S4. `processVolumeFinalizers` depends on `v.Status.Name` being populated, but there is a documented race during sync** — `pkg/controllers/app/engine/controllers/engine_reconciler.go:426-455` ```go for i := range volumes.Items { v := &volumes.Items[i] if v.DeletionTimestamp == nil || !hasFinalizer(v, mirrorFinalizer) { continue } // A Volume mirror with an empty Status.Name has no // engine-populated status yet ... if v.Status.Name != "" { if err := w.deleteVolume(ctx, v.Status.Name); err != nil { ``` The comment at lines 437-441 explicitly states the reason: a user could create a bare Volume skeleton without status, or the sync could land here before `applyVolume` populates status. But the same reasoning applies to `processContainerFinalizers` (lines 375-400) and `processImageFinalizers` (lines 402-424), where `c.Name`/`img.Status.RepoTag` is used unconditionally — with no empty-string guard. Containers use the K8s name as the Docker ID, so that path is safe, but the Image finalizer path at line 413 calls `w.deleteImage` which dereferences `img.Status.RepoTag` and `img.Status.ID`. If a user creates a bare `Image` mirror without status, both fields are empty and `deleteImage` passes empty string as the `ref` to `ImageRemove`, which the daemon rejects with a generic error. This is only reachable if a user directly creates a mirror via kubectl (the Container webhook only intercepts Update, not Create, so this is possible — see `container_controller.go:88-90`). Fix: symmetrically guard the Image path with `img.Status.ID != ""`. ```diff if err := w.deleteImage(ctx, img); err != nil { - errs = append(errs, fmt.Errorf("failed to delete image %s from Docker: %w", img.Name, err)) - continue + if img.Status.ID == "" && img.Status.RepoTag == "" { + // Bare skeleton: nothing in Docker to forward the delete to. + } else { + errs = append(errs, fmt.Errorf("failed to delete image %s from Docker: %w", img.Name, err)) + continue + } } ``` `(suggestion, gap)` --- **S5. `parseContainerName` produces misleading result for the "/" input** — `pkg/controllers/app/engine/controllers/sync_containers.go:207-214` ```go func parseContainerName(fullName string) (namespace, name string) { fullName = strings.TrimPrefix(fullName, "/") namespace, name, found := strings.Cut(fullName, "/") if !found { return containerNamespace, namespace } return namespace, name } ``` `parseContainerName("/")` returns `("moby", "")` per the table test at `sync_containers_test.go:42-51`. This is fine for the "/" edge case but has no corresponding defensive handling downstream: `applyContainer` will then call `containersv1alpha1apply.ContainerStatus().WithName("")` at sync_containers.go:151, producing a Container mirror with `status.name=""`. Since `status.name` is marked `+required` in `container_types.go:71-74`, the CRD will reject the apply with a validation error. In the hot path, `inspect.Name` is never `/` because Docker always populates a generated name if the user didn't supply one, so this is defensive only. Fix: return an explicit error or fall back to a synthetic name. Since this is defensive, a test assertion + a panic would be clearer than silently producing an invalid mirror. `(suggestion, gap)` --- **S6. `applyContainer` reassertion of finalizer runs even when the mirror is terminating** — `pkg/controllers/app/engine/controllers/sync_containers.go:123-147` ```go var existing containersv1alpha1.Container err := w.k8s.Get(ctx, client.ObjectKey{Name: inspect.ID, Namespace: apiNamespace}, &existing) if apierrors.IsNotFound(err) { // ...create... } else if err != nil { // ... } else { // Re-assert the mirror finalizer on every sync of an existing Container finalizerOnly := containersv1alpha1apply.Container(inspect.ID, apiNamespace). WithFinalizers(mirrorFinalizer) if err := w.k8s.Apply(ctx, finalizerOnly, client.ForceOwnership, client.FieldOwner(controllerName)); err != nil { return fmt.Errorf("failed to reassert container finalizer %s: %w", inspect.ID, err) } } ``` If the user has called `kubectl delete container/...` (mirror has `DeletionTimestamp` set) and Docker then emits a container event (e.g., the container dies on its own), `syncContainer` → `applyContainer` will reach this branch and try to reassert the finalizer. Kubernetes forbids **adding** a finalizer to a resource that is already being deleted (`422 Invalid`). Since SSA will interpret "finalizer still present in the object" as not-a-change (the finalizer was added on create), the API server should be a no-op — but if a separate field manager somehow removed it, the reassertion fails with a cryptic error and `syncContainer` propagates the failure up, the event handler logs an error, and the next Docker event for the same container repeats the failure. Fix: skip the reassertion when the existing mirror has a non-zero `DeletionTimestamp`. ```diff } else { + if existing.DeletionTimestamp != nil { + // The mirror is terminating; finalizer reassertion is forbidden and + // unnecessary — the finalizer is about to be stripped by + // processContainerFinalizers. + return w.updateContainerStatusOnly(ctx, inspect) + } finalizerOnly := ... ``` `(suggestion, gap)` --- **S7. Inline Docker state mapping assumes the Docker enum is a strict subset of the CRD enum** — `pkg/controllers/app/engine/controllers/sync_containers.go:157` ```go WithStatus(containersv1alpha1.ContainerStatusValue(inspect.State.Status)). ``` `inspect.State.Status` is a free-form Docker string. The current CRD enum at `container_types.go:11` covers every documented Docker state (`created, running, pausing, paused, restarting, removing, exited, dead`). If a future Docker release introduces a new state string (`initializing`, `error`, etc.), the SSA apply will fail with a CRD validation error, the per-container error will be logged by `syncAllContainers:57-59`, and that container's mirror will silently fall behind. Fix: validate the conversion against the known enum list and fall back to `unknown` on mismatch. ```diff -WithStatus(containersv1alpha1.ContainerStatusValue(inspect.State.Status)). +WithStatus(mapDockerState(inspect.State.Status)). ``` ```go func mapDockerState(s string) containersv1alpha1.ContainerStatusValue { switch containersv1alpha1.ContainerStatusValue(s) { case containersv1alpha1.ContainerStatusCreated, ...: return containersv1alpha1.ContainerStatusValue(s) } return containersv1alpha1.ContainerStatusUnknown } ``` `(suggestion, gap)` --- **S8. `cmd/rdd/set.go` wait logic also applies on an `App create` path without a generation handoff** — `cmd/rdd/set.go:293-310` ```go err = c.Get(ctx, client.ObjectKey{Name: "app"}, &app) switch { case apierrors.IsNotFound(err): if err := createAndPatchApp(ctx, c, restConfig, specMap, specSchema, dryRun); err != nil { return err } // ... default: if err := patchApp(ctx, c, &app, specMap, dryRun); err != nil { return err } } if dryRun || timeout == 0 { return nil } return waitForDesiredState(ctx, restConfig, properties, timeout) ``` The wait is called after either `createAndPatchApp` or `patchApp`, but neither function returns the post-write generation number. The wait's watch predicate at `waitForDesiredState:464-466` reads any `ContainerEngineReady=True` regardless of `observedGeneration`, so it could satisfy on a pre-change condition snapshot from the informer's initial list (see `watchCondition:498-505`, the precondition path). This is the "known hazard" at lines 445-452. Mitigating it here has concrete benefits: the `setEngineCondition` helper already stamps `ObservedGeneration: latest.Generation` at `engine_reconciler.go:234`, so a few extra lines in the CLI would make the race go away. See I2. `(suggestion, gap)` — pairs with I2. --- **S9. `bats-with-timeout.sh` line 40 calls `rdd svc paths log_dir` before `set -o errexit` handling can recover** — `scripts/bats-with-timeout.sh:30-42` ```bash script_dir=$(cd "$(dirname "$0")" && pwd) rdd_bin="${script_dir}/../bin/rdd" if [ ! -x "$rdd_bin" ] && [ -x "${rdd_bin}.exe" ]; then rdd_bin="${rdd_bin}.exe" fi log_dir=$("$rdd_bin" svc paths log_dir) mkdir -p "${log_dir}" ``` If neither `bin/rdd` nor `bin/rdd.exe` exists (a build failure or a developer invoking the script before `make`), the `rdd_bin` path points at a non-executable file, and the `$()` subshell at line 40 exits with an error. `set -o errexit` aborts the script, which is correct — but the error message is just a shell "not found" without any context pointing at the missing binary. This is a minor UX gap. Fix: short-circuit with a clear error if neither binary exists. ```diff +if [ ! -x "$rdd_bin" ]; then + echo "bats-with-timeout: rdd binary not found at $rdd_bin; run 'make' first" >&2 + exit 2 +fi log_dir=$("$rdd_bin" svc paths log_dir) ``` `(suggestion, gap)` --- **S10. `pgid` log field semantics documented but not validated** — `pkg/controllers/lima/limavm/controllers/limavm_lifecycle.go:479-491` ```go // On Unix, SetGroup() sets Setpgid=true which makes the hostagent a // new process group leader, so pgid == pid. The "pgid" log field // reflects that Unix-only semantics ... logger.Info("Hostagent started, watcher active", "instance", limaVM.Name, "pid", haCmd.Process.Pid, "pgid", haCmd.Process.Pid) ``` The log field `pgid` is set to `haCmd.Process.Pid` on the assumption that `SetGroup` makes the process its own process group leader on Unix. This is correct for Linux/Darwin today, but the invariant is implicit — a future change to `process.SetGroup` that drops the `Setpgid` flag would silently break downstream tooling (`bats-with-timeout.sh` matching). Two options: (a) Query the real pgid via `syscall.Getpgid(haCmd.Process.Pid)` right after `Start()`. (b) Add a one-line comment asserting the invariant next to `process.SetGroup(haCmd)` at line 452 to cross-reference this log field. Option (a) is more defensive and costs one syscall. ```diff + pgid, pgidErr := syscall.Getpgid(haCmd.Process.Pid) + if pgidErr != nil { + pgid = haCmd.Process.Pid + } logger.Info("Hostagent started, watcher active", "instance", limaVM.Name, "pid", haCmd.Process.Pid, - "pgid", haCmd.Process.Pid) + "pgid", pgid) ``` `(suggestion, gap)` --- **S11. `engine.bats` `DOCKER_HOST` is a module-level export with a file-level `readonly` semantic intended** — `bats/tests/32-app-controllers/engine.bats:12-16` ```bash NAMESPACE="rancher-desktop" # TODO: use .rd${RDD_INSTANCE} once the Lima template derives the socket path # from the instance suffix instead of hardcoding ".rd2". DOCKER_HOST="unix://${HOME}/.rd2/docker.sock" export DOCKER_HOST ``` Top-level assignments in BATS files execute when the file is loaded, and `DOCKER_HOST` is exported globally to every test in the file (and any subshell). This is fine for now, but if another test suite later sources this same helper, its `DOCKER_HOST` leaks through. Declaring it in `local_setup_file` would scope it correctly. This is especially relevant because the acknowledged TODO will eventually replace this with a per-instance path; the fix will be cleaner if the variable is not already global. `(suggestion, gap)` — low priority. --- **S12. `syncContainerNamespace` uses SSA without touching any field other than metadata** — `pkg/controllers/app/engine/controllers/sync_volumes.go:62-67` ```go func (w *dockerWatcher) syncContainerNamespace(ctx context.Context) error { applyConfig := containersv1alpha1apply.ContainerNamespace(containerNamespace, apiNamespace) return w.k8s.Apply(ctx, applyConfig, client.ForceOwnership, client.FieldOwner(controllerName)) } ``` The apply config has no spec, no status, no labels — only the `metadata.name` and `metadata.namespace` set by `containersv1alpha1apply.ContainerNamespace(...)`. This is fine as a create, but on subsequent reconciles it's a no-op that still issues an API call. Since `syncContainerNamespace` is called only once per watcher lifetime (from `fullSync`), this is harmless — but it's worth noting the pattern is fragile: if someone later moves this call into an event handler, every container/volume event would generate a redundant ContainerNamespace apply. No fix required today; consider adding a comment at line 56-61 that says "called only on fullSync". `(suggestion, enhancement)` ## Design Observations ### Concerns **Cross-controller status writes on Container are not coordinated.** The existing `container_reconciler.go` (pre-PR) writes `status.Conditions` via `r.Status().Update()` without a field manager, and the new engine controller writes `status.Name`, `status.Status`, `status.Ports`, etc. via SSA with field manager `engine-controller`. Both controllers reconcile the same object on every watch event. A regular `Status().Update()` sends the full status object, which means the container reconciler's client can implicitly take ownership of the engine-written fields (field manager is the client name, not `engine-controller`). The subsequent engine-controller apply then has to forcibly reclaim those fields via `ForceOwnership` at `sync_containers.go:195-196`. This works, but each Container update bounces between the two field managers and can generate 2× the reconciles. Per the repo overview: "`App.Status.Conditions` has multiple concurrent writers... Flag any non-SSA, non-retry-on-conflict `Status().Update` on the App object" — the same logic applies to `Container.Status.Conditions`. Migrating `container_reconciler.go` to SSA with a distinct field manager (`container-condition-reconciler` or similar) would eliminate the churn. `(future)` **Engine reconciler iterates all containers on every unrelated watch event.** `engine_reconciler.go:170-182` acknowledges this with a TODO. The concrete amplification depends on how many containers have a non-`unknown` spec.state. For the current test fixtures (a handful of containers), it's fine; in a developer's real environment with 30-50 containers, every Container/Image/Volume event fires a full `reconcileContainerSpecs + processFinalizers` sweep, which is `4 × O(N)` Lists followed by `O(N)` Docker calls in the stuck-state case (see I1). A per-type reconciler with predicates would bound cost to `O(1)` per event. `(future)` **Event `Since` filter uses second-precision, host-clock fallback trades correctness for robustness.** `docker_watcher.go:78-90` fetches Docker's `SystemTime` and falls back to `time.Now().Add(-2*time.Minute)` on failure. The 2-minute bias absorbs arbitrary clock skew, which is pragmatic. One edge case: if the host clock is *ahead* of the guest clock by more than 2 minutes, the fallback window misses real events. Since Lima's clock is synced via chrony on boot, this is unlikely — but is worth a sanity check on startup. `(future)` ### Strengths - The split between `docker_watcher` (Docker client + event handling) and `engine_reconciler` (controller-runtime integration + mirror lifecycle) is clean and makes the ownership of state obvious. Good separation of concerns. - Every non-trivial race/retry decision is commented in place with the reasoning, not just the mechanic. `sync_containers.go:102-122`'s explanation of why create uses no `ForceOwnership` is a textbook example. - Error handling via `errors.Join` in `cleanupMirrorResources`, `processFinalizers`, and `syncAllContainers` consistently avoids the "one stuck object blocks the rest" footgun. - The `imageStatusFromInspect` helper explicitly documents the slice-aliasing trap at `sync_images.go:146-150` — this is the kind of subtle Go pitfall that bites later, and catching it upfront is valuable. - The BATS test suite covers the non-obvious cases (untag-leaves-dangling, tagging-dangling-removes-mirror, in-use image finalizer, spec.state paused-to-created, uppercase volume names) that a less thoughtful test would miss. ## Testing Assessment BATS coverage is thorough for the happy path. Specific untested scenarios, ranked by risk: 1. **Repeated reconcile after `spec.state=created` with Docker in `exited` state (I1)** — no test verifies that the reconciler doesn't call ContainerStop on every subsequent event. A BATS test that pauses the reconciler, does N unrelated Docker operations, then inspects reconcile cadence would catch this. Alternative: a unit test on `reconcileContainerState` with a mock Docker client counting calls. 2. **Watcher death and reconnect while App is still running.** The `watcherDied` path at `engine_reconciler.go:88-106` is exercised only on Docker going down; the "Docker is still up but the event stream closed" scenario (proxy timeout, API compaction) is covered in code but not in tests. 3. **Container spec.state transitions through `dead` and `removing` states.** The paused→created and paused→running tests cover the pause branch, but the `dead`/`removing` edge cases are not tested and behavior under those states is ambiguous (see I1 and S7). 4. **Concurrent create-then-patch race.** The comment at `sync_containers.go:117-122` says the create Apply deliberately omits `ForceOwnership` to fail loudly on a concurrent user patch. No test verifies that the resulting 409 is retried on the next reconcile. 5. **Engine reconciler during a partial CRD delete of stale mirrors.** If a mirror has a finalizer added by a third party, `cleanupMirrorResources` strips ours but the mirror stays in Terminating. The reconciler's `alreadyClean` check on the App condition eventually succeeds, but the mirror is not surfaced in any assertion. 6. **Webhook reassertion during terminating state (S6).** If the engine controller reasserts the finalizer on a deleting container, does the Apply succeed or return an error? Not tested. 7. **Full sync after `rdd svc delete` followed by `rdd svc start` with the same VM/Docker state still present.** Verifies the stale-mirror prune path actually runs. 8. **`--timeout=0` preserves legacy no-wait behavior.** Covered implicitly by `rdd set --timeout=10s` at line 346 but not explicitly tested for `=0`. ## Documentation Assessment - `docs/design/api_containers.md` is correctly updated for the new engine mirroring behavior and the `spec.state: unknown` default. Well-done. - `engine_reconciler.go`, `docker_watcher.go`, and the sync_*.go files have consistently good package and function doc comments. The package doc comment in `engine/controller.go:5-9` accurately describes what the package does (not what it imports). - Gap: the `ContainerStop` / `ContainerStart` / `ContainerUnpause` dispatch logic in `reconcileContainerState` has a detailed comment about why each dispatch is correct but no cross-reference to Docker's state model. Adding a one-line table-style comment listing "desired vs actual → action" at the top of the function would make the correctness argument reviewable at a glance — and would have surfaced I1 during authoring. - `cmd/rdd/set.go:440-452` — the "Known hazard" comment is candid but understates the problem. See I2. - The CRD enum at `container_types.go:55-66` restricts spec.state to 3 values but the doc comment does not list them; a reader has to cross-reference the kubebuilder annotation. Consider inlining the valid values in prose. ## Commit Structure Commit history is cleaner than most — related chunks (Docker client dep, engine controller, webhook, docs, BATS tests, `rdd set` wait) are each in their own commit, CI-only changes are separate from code changes, and review-round fixups are clearly labeled. The four "Address review round N findings" commits are preserved intentionally to make each round's response visible against the baseline; this is acceptable under the project's guidance. No reordering suggested. The PR scope is large (~132KB diff, 27 files) but internally consistent — splitting the engine controller from the CI infrastructure is tempting, but the BATS support bundle work is specifically motivated by debugging the engine controller's hangs in CI, so bundling them is defensible. ## Acknowledged Limitations - **`DockerSocket()` and `lima-template.yaml` hardcode `.rd2`.** Documented as a TODO at `pkg/instance/instance.go:127-135` and cross-referenced at `engine.bats:13-14`. The fix requires parameterizing the Lima template on `RDD_INSTANCE` and both sides must move together. I concur this is not blocking. - **`rdd set containerEngine.name= running=true` on an already-running VM.** Documented at `cmd/rdd/set.go:445-452`. I do promote this to an Important finding (I2) because the `ObservedGeneration` machinery to fix it already exists. - **`reconcileContainerSpecs` and `processFinalizers` are O(N) per child event.** Documented at `engine_reconciler.go:170-176`. Consistent with current codebase performance; deferred. - **Sequential container/image inspect during initial sync.** Documented at `sync_containers.go:30-36` and `sync_images.go:36-40`. Consistent with current typical workload. - **Windows signal/pgid semantics for the `pgid` log field.** Documented at `limavm_lifecycle.go:479-486`. I promote the invariant check to S10 but the deferral is acknowledged. - **Shared `App.Status.Conditions` writers.** Noted in the repo overview; the engine reconciler's `setEngineCondition` uses `retry.RetryOnConflict` which is the expected mitigation pattern. - **Container reconciler's `Status().Update()` is not SSA.** Pre-existing, not regressed by this PR. I surface it as a Design Observation. - **`containerNamespace/moby` can be deleted by the user without recreation.** Documented inline at `sync_volumes.go:56-61`; the test `engine.bats:355-361` relies on this to avoid a re-create race. Acceptable. - **Lima template change from `.rd/docker.sock` → `.rd2/docker.sock`.** This is strictly a rename to align with `ShortDir()` conventions (`.rd` → `.rd2` for RDD_INSTANCE="2"), and there is no public release to upgrade. Not a rollback concern per the repo overview. ## Coverage Summary | File | Status | | --- | --- | | `.github/workflows/bats.yaml` | Reviewed, no issues (Spotlight disable is well-motivated) | | `bats/Makefile` | Reviewed, no issues | | `bats/helpers/commands.bash` | Reviewed, no issues (fd-close is a targeted fix) | | `bats/tests/32-app-controllers/engine.bats` | S11, partially I1 | | `cmd/app-controller/main.go` | Trivial | | `cmd/rdd/set.go` | I2, S8 | | `docs/design/api_containers.md` | Reviewed, no issues | | `go.mod` | Reviewed, no issues | | `go.sum` | Trivial | | `pkg/apis/containers/v1alpha1/applyconfiguration/containers/v1alpha1/containerspec.go` | Trivial (generated) | | `pkg/apis/containers/v1alpha1/container_types.go` | Reviewed, no issues | | `pkg/apis/containers/v1alpha1/volume_types.go` | Trivial (selectable field) | | `pkg/controllers/app/app/lima-template.yaml` | Reviewed, no issues | | `pkg/controllers/app/engine/controller.go` | Reviewed, no issues | | `pkg/controllers/app/engine/controllers/docker_watcher.go` | I1, S3 | | `pkg/controllers/app/engine/controllers/engine_reconciler.go` | I2, S2, S4 | | `pkg/controllers/app/engine/controllers/sync_containers.go` | I3, S5, S6, S7 | | `pkg/controllers/app/engine/controllers/sync_containers_test.go` | Reviewed, no issues | | `pkg/controllers/app/engine/controllers/sync_images.go` | Reviewed, no issues | | `pkg/controllers/app/engine/controllers/sync_volumes.go` | S12 | | `pkg/controllers/containers/container/container_webhooks.go` | Reviewed, no issues | | `pkg/controllers/containers/container/crd.yaml` | Reviewed, no issues (generated from types) | | `pkg/controllers/containers/volume/crd.yaml` | Trivial (generated) | | `pkg/controllers/lima/limavm/controllers/limavm_lifecycle.go` | S10 | | `pkg/instance/instance.go` | S1 | | `pkg/service/cmd/service.go` | Trivial | | `scripts/bats-with-timeout.sh` | S9 | All 27 changed files examined.``` ### Codex GPT 5.4 ``` ### Executive Summary This PR adds a new engine controller that mirrors Docker state into `Container` / `Image` / `Volume` resources, makes `rdd set` wait for readiness by default, and adds substantial BATS coverage around the new behavior. The main regressions are in state convergence and repair paths: `spec.state=created` never stabilizes once Docker reports `exited`, the synthetic engine namespace resources are not recreated after deletion, `App.Status.Conditions` now has two concurrent writers but only the new one retries conflicts, and the new wait path is wired up on Windows even though the engine watcher still hardcodes a Unix socket. ### Findings #### Critical Issues None. #### Important Issues I1. **`spec.state=created` never converges once Docker reports `exited`** — `pkg/controllers/app/engine/controllers/docker_watcher.go:340-367` (important, regression) ```go actual := c.Status.Status if desired == actual { return nil } switch desired { case containersv1alpha1.ContainerStatusCreated: log.Info("Stopping container", "id", c.Name) _, err := w.cli.ContainerStop(ctx, c.Name, dockerclient.ContainerStopOptions{}) return err } ``` The new API contract says `Container.spec.state=created` means “the Docker container should be stopped” (`pkg/apis/containers/v1alpha1/container_types.go:56-66`), and the new integration test explicitly waits for `status.status=exited` after patching `spec.state=created` (`bats/tests/32-app-controllers/engine.bats:231-247`). But `reconcileContainerState()` only treats the exact string equality at `docker_watcher.go:340-343` as converged, so the next reconcile sees `desired=created` and `actual=exited`, calls `ContainerStop()` again at `docker_watcher.go:364-367`, and requeues on the resulting error forever. In practice every successful stop transitions into a permanent retry loop. Fix: treat `created` as the broader “stopped” intent rather than as a literal Docker status. A small helper such as `isStopped(actual)` covering at least `created`, `exited`, and `dead` would let `reconcileContainerState()` return success once the container is no longer running. ```diff - actual := c.Status.Status - if desired == actual { + actual := c.Status.Status + if desired == containersv1alpha1.ContainerStatusCreated && + (actual == containersv1alpha1.ContainerStatusCreated || + actual == containersv1alpha1.ContainerStatusExited || + actual == containersv1alpha1.ContainerStatusDead) { + return nil + } + if desired == actual { return nil } ``` I2. **Engine namespace drift is permanent until the watcher restarts** — `pkg/controllers/app/engine/controllers/engine_reconciler.go:457-475` (important, regression) ```go return ctrl.NewControllerManagedBy(mgr). For(&appv1alpha1.App{}). Named("engine-reconciler"). WatchesRawSource(source.Channel(r.reconcileChan, enqueueApp)). Watches(&containersv1alpha1.Container{}, enqueueApp). Watches(&containersv1alpha1.Image{}, enqueueApp). Watches(&containersv1alpha1.Volume{}, enqueueApp). Complete(r) ``` The controller creates the backing Kubernetes namespace and the synthetic `containernamespace/moby` only during `fullSync()` (`pkg/controllers/app/engine/controllers/docker_watcher.go:422-430`, `pkg/controllers/app/engine/controllers/sync_volumes.go:40-66`). After startup, the normal reconcile path at `pkg/controllers/app/engine/controllers/engine_reconciler.go:170-182` only processes `Container` specs and finalizers, and `SetupWithManager()` at `pkg/controllers/app/engine/controllers/engine_reconciler.go:468-475` does not watch `Namespace` or `ContainerNamespace`. That means a user delete of `namespace/rancher-desktop` or `containernamespace/moby` is never repaired until the Docker watcher is torn down and started again. The new BATS test even codifies the broken steady state by deleting `containernamespace/moby` and asserting it stays gone (`bats/tests/32-app-controllers/engine.bats:355-360`). Fix: make namespace repair part of the steady-state reconcile, not just watcher bootstrap. The simplest fix is to call `ensureNamespace()` and `syncContainerNamespace()` before processing child events whenever the watcher is running; the cleaner fix is to also watch `corev1.Namespace` and `ContainerNamespace` and enqueue the App singleton when either disappears. I3. **`App.Status.Conditions` now has two writers, but the old writer still uses a blind `Status().Update`** — `pkg/controllers/app/engine/controllers/engine_reconciler.go:216-239` (important, regression) ```go changed := meta.SetStatusCondition(&latest.Status.Conditions, metav1.Condition{ Type: conditionContainerEngineReady, Status: status, Reason: reason, Message: message, ObservedGeneration: latest.Generation, }) ... return r.Status().Update(ctx, latest) ``` This PR adds a second status writer for `App.Status.Conditions` at `engine_reconciler.go:223-239`, but the existing App controller still mirrors LimaVM conditions with a single stale-object `r.Status().Update(ctx, &app)` at `pkg/controllers/app/app/controllers/app_controller.go:235-247`. The new engine code correctly retries conflicts, but the old writer still assumes “the App's resourceVersion from the initial Get is still current” (`app_controller.go:230-231`), which is no longer true once `ContainerEngineReady` is being written concurrently. That turns ordinary LimaVM and engine status churn into avoidable 409s and noisy reconcile failures. Fix: convert the App controller’s status write to the same retry-on-conflict pattern used in `setEngineCondition()`, or switch both writers to SSA / patching with disjoint ownership. As-is, only one side of the new concurrent-writer relationship was hardened. I4. **Windows moby startup now waits on a Unix-socket watcher that cannot connect** — `pkg/controllers/app/engine/controllers/docker_watcher.go:45-58` (important, regression) ```go socketPath := instance.DockerSocket() cli, err := dockerclient.New( dockerclient.WithHost("unix://" + socketPath), ) ... if _, err := cli.Ping(pingCtx, dockerclient.PingOptions{}); err != nil { return nil, fmt.Errorf("failed to ping Docker: %w", err) } ``` The engine controller is now imported unconditionally in embedded mode (`pkg/service/cmd/service.go:56-60`) and in the standalone app-controller (`cmd/app-controller/main.go:12-16`), and `rdd set` now waits for `ContainerEngineReady=True` whenever `running=true` is set (`cmd/rdd/set.go:462-466`). But the watcher still hardcodes a Unix Docker endpoint at `docker_watcher.go:45-47`, while `instance.DockerSocket()` returns a normal filesystem path at `pkg/instance/instance.go:136-143`. On Windows that becomes a `unix://C:\...` endpoint, which matches the new test comment that the Docker socket access pattern is “not yet wired up for Windows/WSL2” (`bats/tests/32-app-controllers/engine.bats:19-23`). The result is that the new readiness wait path is enabled on a platform where the watcher cannot satisfy it. Fix: gate the engine watcher and the `running=true` wait on platform support. Until WSL2 socket forwarding exists, Windows should either skip registering the engine controller for `moby`, or set `ContainerEngineReady=False/NotSupported` without making `rdd set running=true` block on it. #### Suggestions None. ### Design Observations **Concerns** 1. `(future)` The engine controller is currently an App-scoped “god reconciler” that scans every mirrored child on each child event (`pkg/controllers/app/engine/controllers/engine_reconciler.go:170-176`). Splitting spec reconciliation, finalizer handling, and synthetic-resource repair into per-kind reconcilers would remove the O(N)-per-event sweep and make bugs like I2 much harder to introduce. **Strengths** 1. The daemon-clock `Since` handling in `dockerEventsSince()` / `newDockerWatcher()` is a solid design choice. Using Docker’s `Info().SystemTime` avoids the host/guest skew bug that would otherwise drop startup-window events. 2. The new finalizer-strip helpers consistently use retry-on-conflict and tolerate `NotFound`, which is the right pattern for this controller-runtime cache model. ### Testing Assessment 1. Windows `moby` startup with `rdd set running=true` is still untested, even though the new code now waits on `ContainerEngineReady` there (I4). 2. There is no test that deletes `namespace/rancher-desktop` or `containernamespace/moby` while the VM stays running and verifies the controller recreates them (I2). The new test currently asserts the opposite steady state. 3. There is no test that patches `spec.state=created` onto an already-`exited` or `dead` container and verifies the reconciler stops retrying (I1). ### Documentation Assessment `docs/design/api_app.md:48-68` still says App conditions are mirrored from LimaVM, but this PR adds a new `ContainerEngineReady` condition written by a different controller. `docs/design/cmd_app.md:5-22` also still describes `rdd set` as a pure patch command and does not document the new default wait behavior or the `--timeout=0` escape hatch. ### Acknowledged Limitations - **Code comment**: “TODO: use ShortDir() once the Lima template derives the socket path from the instance suffix instead of hardcoding ".rd2".” — `pkg/instance/instance.go:127-135`. This remains an acknowledged multi-instance limitation; the new engine tests still depend on the same hardcoded socket path. - **Code comment**: “TODO: changes that trigger a VM restart without changing "running" ... are not waited on.” — `cmd/rdd/set.go:440-452`. The PR keeps that limitation and works around it in `bats/tests/32-app-controllers/app.bats:207-228` with explicit `rdd ctl wait` calls. - **Code comment**: “The long-term fix is to split these into per-resource reconcilers...” — `pkg/controllers/app/engine/controllers/engine_reconciler.go:170-176`. This remains accurate; the current implementation is probably acceptable for small dev workloads, but it leaves the controller with a broad repair surface. ### Coverage Summary - `.github/workflows/bats.yaml` — Reviewed, no issues - `bats/Makefile` — Reviewed, no issues - `bats/helpers/commands.bash` — Reviewed, no issues - `bats/tests/32-app-controllers/engine.bats` — I1, I2, I4 - `cmd/app-controller/main.go` — I4 - `cmd/rdd/set.go` — I4 - `docs/design/api_containers.md` — Reviewed, no issues - `go.mod` — Reviewed, no issues - `go.sum` — Reviewed, no issues - `pkg/apis/containers/v1alpha1/applyconfiguration/containers/v1alpha1/containerspec.go` — I1 - `pkg/apis/containers/v1alpha1/container_types.go` — I1 - `pkg/apis/containers/v1alpha1/volume_types.go` — Reviewed, no issues - `pkg/controllers/app/app/lima-template.yaml` — Reviewed, no issues - `pkg/controllers/app/engine/controller.go` — Reviewed, no issues - `pkg/controllers/app/engine/controllers/docker_watcher.go` — I1, I2, I4 - `pkg/controllers/app/engine/controllers/engine_reconciler.go` — I2, I3 - `pkg/controllers/app/engine/controllers/sync_containers.go` — Reviewed, no issues - `pkg/controllers/app/engine/controllers/sync_containers_test.go` — Reviewed, no issues - `pkg/controllers/app/engine/controllers/sync_images.go` — Reviewed, no issues - `pkg/controllers/app/engine/controllers/sync_volumes.go` — I2 - `pkg/controllers/containers/container/container_webhooks.go` — Reviewed, no issues - `pkg/controllers/containers/container/crd.yaml` — I1 - `pkg/controllers/containers/volume/crd.yaml` — Reviewed, no issues - `pkg/controllers/lima/limavm/controllers/limavm_lifecycle.go` — Reviewed, no issues - `pkg/instance/instance.go` — I4 - `pkg/service/cmd/service.go` — I4 - `scripts/bats-with-timeout.sh` — Reviewed, no issues ``` ### Gemini 3.1 Pro ``` ### Executive Summary This PR implements the engine controller that connects to the Docker socket, mirrors container engine state (Containers, Images, Volumes) into Kubernetes resources, and supports basic bidirectional state management. The core sync and event loop are well-structured, but there is a critical logical flaw where the controller fails to recognize that a container is already stopped, leading to infinite API spam loops. ### Findings #### Critical Issues **C1. Docker API spam loop on stopped containers** — `pkg/controllers/app/engine/controllers/docker_watcher.go:274-301` (critical, regression) ```go actual := c.Status.Status if desired == actual { return nil } log := logf.FromContext(ctx).WithName("docker-watcher") switch desired { // ... case containersv1alpha1.ContainerStatusCreated: log.Info("Stopping container", "id", c.Name) _, err := w.cli.ContainerStop(ctx, c.Name, dockerclient.ContainerStopOptions{}) return err } ``` The `reconcileContainerState` function compares `desired != actual` to decide whether to stop a container. If `spec.state` is `"created"` (which expresses user intent for the container to be stopped), and the actual state is `"exited"` or `"dead"`, the condition `desired != actual` evaluates to true. The function then unconditionally calls `w.cli.ContainerStop`. Since `ContainerStop` on an already-stopped container natively succeeds and returns `nil`, the reconciler completes successfully without updating any K8s status. Consequently, every subsequent watch event across the entire system (images, volumes, other containers) triggers `reconcileContainerSpecs`, which iterates over all containers and re-issues `ContainerStop` for every stopped container, resulting in an endless O(N) Docker API spam loop. Fix: Consider the `"created"` desired state satisfied if the container is already in a stopped state. ```diff if desired == actual { return nil } + + // A desired state of "created" expresses intent for the container to be + // stopped. If the container is already stopped, this intent is satisfied. + if desired == containersv1alpha1.ContainerStatusCreated { + switch actual { + case containersv1alpha1.ContainerStatusCreated, + containersv1alpha1.ContainerStatusExited, + containersv1alpha1.ContainerStatusDead: + return nil + } + } log := logf.FromContext(ctx).WithName("docker-watcher") ``` #### Important Issues **I1. CrashLoopBackOff bypassed for `spec.state=running`** — `pkg/controllers/app/engine/controllers/docker_watcher.go:288-296` (important, design gap) When the user requests `spec.state=running` and the container continuously crashes, its status rapidly changes to `exited`. `desired (running) != actual (exited)` triggers an immediate `ContainerStart` call. The container runs, crashes, emits a `die` event, K8s state updates to `exited`, and the resulting watch event triggers another immediate `ContainerStart`. This circumvents Docker's native restart policies and creates an unthrottled restart loop entirely driven by K8s watch events without any backoff mechanism. Fix: Rely on Docker's native restart policies for robust container lifecycles, and explicitly document that enforcing `spec.state=running` does not provide native `CrashLoopBackOff`. If strict declarative restarts are required, implement an exponential backoff state tracking in the controller before invoking `ContainerStart`. #### Suggestions **S1. Missing channel closure check in Docker watcher loop** — `pkg/controllers/app/engine/controllers/docker_watcher.go:178-183` (suggestion, robustness) ```go case msg := <-result.Messages: if err := w.handleEvent(ctx, msg); err != nil { log.Error(err, "Failed to handle Docker event", "type", msg.Type, "action", msg.Action, "actor", msg.Actor.ID) } ``` The `<-result.Messages` select case does not check the `ok` boolean. If the channel is closed (e.g. daemon disconnect), it continuously yields a zero-value message. While `handleEvent` safely ignores empty messages and the loop will eventually terminate when `select` randomly picks the closed `result.Err` case, it's safer to explicitly check for closure to avoid any CPU spin cycles. Fix: Check `ok` and exit gracefully: ```diff - case msg := <-result.Messages: + case msg, ok := <-result.Messages: + if !ok { + log.Info("Docker event stream closed") + w.enqueueReconcile() + return + } ``` **S2. Missing `RetryOnConflict` when stripping finalizers** — `pkg/controllers/app/engine/controllers/engine_reconciler.go:375-382` (suggestion, consistency) ```go if removeFinalizer(c, mirrorFinalizer) { if err := client.IgnoreNotFound(r.Update(ctx, c)); err != nil { errs = append(errs, fmt.Errorf("failed to remove finalizer from Container %s: %w", c.Name, err)) } } ``` `processContainerFinalizers` uses a direct `r.Update(ctx, c)` to remove the mirror finalizer. If the object was updated concurrently, this returns a conflict error and relies on the controller-runtime to requeue and retry the entire `deleteContainer` operation. While `deleteContainer` is idempotent, handling conflicts locally using `retry.RetryOnConflict` avoids unnecessary API latency and matches the pattern established in `removeMirrorResource`. Fix: Wrap the finalizer removal in a `retry.RetryOnConflict` block that re-fetches the container before updating. ### Design Observations **Strengths** - The fallback logic for handling the lack of `SystemTime` in the Docker API (biasing `time.Now()` into the past) is an excellent way to guarantee no event drops occur during the initial VM full sync gap. - Utilizing `NotApplicable` as `True` for the Containerd backend nicely circumvents the blocking `waitForDesiredState` issue on `rdd set`, allowing the CLI to proceed while gracefully opting out of unimplemented syncing logic. ### Testing Assessment 1. **Container state transition edge cases**: The BATS tests adequately cover pausing and unpausing containers, but do not specifically test `desired=created` vs `actual=exited`. Adding a test that validates stable reconciliation loops (i.e. ensuring the controller goes quiet when the container successfully hits `exited`) would have caught the C1 bug. ### Acknowledged Limitations - **Code comment**: `cmd/rdd/set.go:440-449` "Known hazard until the "Reconciled" condition lands: `rdd set containerEngine.name= running=true` on an already-running VM returns immediately because the stale ContainerEngineReady=True from the previous backend satisfies the precondition check..." The code correctly outlines this hazard and explicitly defers fixing it until the Reconciled condition lands. - **Code comment**: `pkg/instance/instance.go:127-133` and `bats/tests/32-app-controllers/engine.bats:13-14` "TODO: use ShortDir() once the Lima template derives the socket path from the instance suffix instead of hardcoding `.rd2`." This limitation requires upstream templating changes before it can be effectively resolved. - **Code comment**: `cmd/rdd/set.go:437-439` "TODO: changes that trigger a VM restart without changing "running" (e.g. containerEngine.name) are not waited on. This requires a "Reconciled" condition..." The comment explicitly accepts that pure backend swaps aren't currently blockable operations. ### Coverage Summary - `.github/workflows/bats.yaml`: Trivial - `bats/Makefile`: Reviewed, no issues - `bats/helpers/commands.bash`: Reviewed, no issues - `bats/tests/32-app-controllers/engine.bats`: Reviewed, no issues - `cmd/app-controller/main.go`: Trivial - `cmd/rdd/set.go`: Reviewed, no issues - `docs/design/api_containers.md`: Reviewed, no issues - `go.mod`: Trivial - `go.sum`: Trivial - `pkg/apis/containers/v1alpha1/applyconfiguration/containers/v1alpha1/containerspec.go`: Trivial - `pkg/apis/containers/v1alpha1/container_types.go`: Reviewed, no issues - `pkg/apis/containers/v1alpha1/volume_types.go`: Trivial - `pkg/controllers/app/app/lima-template.yaml`: Reviewed, no issues - `pkg/controllers/app/engine/controller.go`: Reviewed, no issues - `pkg/controllers/app/engine/controllers/docker_watcher.go`: C1, I1, S1 - `pkg/controllers/app/engine/controllers/engine_reconciler.go`: S2 - `pkg/controllers/app/engine/controllers/sync_containers.go`: Reviewed, no issues - `pkg/controllers/app/engine/controllers/sync_containers_test.go`: Reviewed, no issues - `pkg/controllers/app/engine/controllers/sync_images.go`: Reviewed, no issues - `pkg/controllers/app/engine/controllers/sync_volumes.go`: Reviewed, no issues - `pkg/controllers/containers/container/container_webhooks.go`: Reviewed, no issues - `pkg/controllers/containers/container/crd.yaml`: Trivial - `pkg/controllers/containers/volume/crd.yaml`: Trivial - `pkg/controllers/lima/limavm/controllers/limavm_lifecycle.go`: Reviewed, no issues - `pkg/instance/instance.go`: Reviewed, no issues - `pkg/service/cmd/service.go`: Trivial - `scripts/bats-with-timeout.sh`: Reviewed, no issues ``` ## Resolution | | | |---|---| | **Addressed** | 2026-04-12 | | **Commit** | `e2df2f9 Address review round 4 findings for PR #295` | | # | Finding | Action | |---|---------|--------| | 1 | Important #1: reconcileContainerState dispatches redundant ContainerStop once desired=created reaches exited | Fixed (state matrix + isStopped helper) | | 2 | Important #2: App.Status.Conditions has two writers but AppReconciler still uses a blind Status().Update | Fixed (retry.RetryOnConflict + re-Get) | | 3 | Important #3: rdd set running=true wait ignores the ObservedGeneration that setEngineCondition stamps | Fixed (generation handoff via targetGen + conditionInfo) | | 4 | Important #4: status.ports is built in non-deterministic map-iteration order and churns the atomic SSA list | Fixed (sorted ports + +listType=map annotation + CRD regen) | | 5 | Suggestion #1: DockerSocket() silently swallows os.MkdirAll error | Skipped — MkdirAll is a workaround for the hardcoded `.rd2` path; reverted initial attempt per project owner guidance | | 6 | Suggestion #2: setEngineCondition does not tolerate App deletion during retry | Fixed (IsNotFound inside retry closure) | | 7 | Suggestion #3: deleteImage may untag the wrong image in a tag race | Skipped — deliberate untag-vs-delete semantics; review's proposed fix direction was wrong | | 8 | Suggestion #4: Image finalizer path has no empty-status guard | Fixed (symmetric with processVolumeFinalizers guard) | | 9 | Suggestion #5: parseContainerName("/") returns ("moby", "") | Skipped — defensive only, hot path never hits it | | 10 | Suggestion #6: applyContainer reasserts the finalizer even on a terminating mirror | Fixed (skip reassert when DeletionTimestamp != nil) | | 11 | Suggestion #7: Docker state mapping assumes moby enum is a strict subset of the CRD enum | Fixed (mapDockerContainerState whitelist + fallback to unknown) | | 12 | Suggestion #8: setAction does not hand off the post-patch generation to waitForDesiredState | Fixed as part of Important #3 | | 13 | Suggestion #9: Missing channel closure check in Docker watcher loop | Fixed (ok check + enqueueReconcile on close) | | 14 | Suggestion #10: processContainerFinalizers uses a blind r.Update for finalizer removal | Fixed (retry.RetryOnConflict + re-Get) | | 15 | Suggestion #11: bats-with-timeout.sh gives no context when the rdd binary is missing | Fixed (explicit existence check + clear error) | | 16 | Suggestion #12: pgid log field asserts Unix-only invariant in prose, not in code | Commented (documented SetGroup invariant at call site) | | 17 | Suggestion #13: DOCKER_HOST is a file-level export in engine.bats | Skipped — blocked on the `.rd2` TODO | | 18 | Testing Gap #1: Steady-state reconcile cadence after spec.state=created | Skipped — would need fault injection or a mock Docker client; I1 fix addresses the root cause | | 19 | Testing Gap #2: Watcher death and reconnect while the VM is still running | Skipped — requires fault injection | | 20 | Testing Gap #3: Containers in dead/removing states with desired=created | Skipped — addressed by isStopped helper | | 21 | Testing Gap #4: Concurrent create-then-patch race | Skipped — requires fault injection | | 22 | Testing Gap #5: Full sync after rdd svc delete/start with prior state | Skipped — manual testing territory | | 23 | Testing Gap #6: --timeout=0 preserves legacy no-wait behavior | Test written (engine.bats regression test) | | 24 | Testing Gap #7: Windows moby startup with rdd set running=true | Skipped — blocked on WSL2 socket forwarding | | 25 | Documentation Gap #1: api_app.md missing ContainerEngineReady note | Documentation updated | | 26 | Documentation Gap #2: cmd_app.md missing new wait behavior and --timeout=0 | Documentation updated | | 27 | Documentation Gap #3: container_types.go spec.state doc should inline enum values | Documentation updated (container_types.go + regenerated applyconfiguration) |