Deep Review: 20260412-221509-pr-295

Date2026-04-12 22:15
Reporancher-sandbox/rancher-desktop-daemon
Round5 (of PR #295)
Author@jandubois
PR#295 — Container Engine watcher
Branchdocker
Commitse2df2f9 Address review round 4 findings for PR #295
33db27d Address review round 3 findings for PR #295
cf4b90e ci: disable Spotlight indexing on macOS BATS runners
c484dab ci: capture memory pressure and API state in support bundle
3196b24 bats: park spec.state=unknown before pausing test-state
84c0736 engine: apply Container/Image/Volume capitalization convention
c8fa377 generate: refresh containerspec.go and crd.yaml for spec.state doc
62d222a scripts: resolve rdd binary relative to bats-with-timeout.sh
66eb50e Address review round 2 findings for PR #295
7c8f356 ci: always capture support bundle and log hostagent pgid
9669b4b bats: close fds 3 and 4 when invoking rdd
50bdc15 ci: capture support bundle when BATS targets time out
762c1e8 Make rdd set wait for the desired state by default
65943da Add BATS integration tests for the engine controller
ba6caf7 Document engine mirroring and container state transitions
4db3f20 Allow container spec.state transitions via webhook
93359f7 Add engine controller that mirrors Docker state into K8s
87f1df1 Add DockerSocket() helper and Docker client dependency
32a0109 Work around distro proxy bug; update docker socket path
536ccb2 Rosetta is now a vz driver option, not a global option
ReviewersClaude Opus 4.6, Codex GPT 5.4, Gemini 3.1 Pro
VerdictMerge with fixes — one critical regression in the new rdd set default wait and six important engine-watcher resilience gaps, none of which require redesign.
Wall-clock time37 min 44 s


Executive Summary

Round 5 of PR #295 lands the engine controller that mirrors Docker Container/Image/Volume state into the existing containers.rancherdesktop.io API group, plus a new default-wait contract for rdd set, CI support-bundle improvements, and related Lima template fixes. The watcher/reconciler split is well-structured, the SSA field-manager workaround is carefully documented, and the BATS coverage for the happy paths is solid.

The review found one critical regression and six important issues, all correctability without architectural change:

Nine suggestions cover SSA list-ordering churn, lock-while-closing asymmetry in Reconcile, DockerSocket() side effects, error-message specificity, and a few dead/unreachable branches.

Structure: 1 critical, 6 important, 9 suggestions.


Critical Issues

C1. rdd set hangs for the full timeout when the App is already running Gemini
}

// Reconcile handles App condition changes, Docker watcher lifecycle,
// Container spec.state transitions, and finalizer processing for mirror
// resources.
func (r *EngineReconciler) Reconcile(ctx context.Context, _ ctrl.Request) (ctrl.Result, error) {
	log := logf.FromContext(ctx)

	var app appv1alpha1.App
	if err := r.Get(ctx, client.ObjectKey{Name: appName}, &app); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	running := meta.IsStatusConditionTrue(app.Status.Conditions, "Running")
	engineIsDocker := app.Spec.ContainerEngine.Name == engineMoby

	// Detach a dead watcher from the reconciler under watcherMu, then
	// stop it outside the lock. Holding the mutex across w.stop() —
	// which blocks on <-w.done and cli.Close() — would stall any
	// concurrent reconcile the moment MaxConcurrentReconciles > 1.
	var diedWatcher *dockerWatcher
	r.watcherMu.Lock()
	watcherRunning := r.watcher != nil
	if watcherRunning && !r.watcher.alive() {
		diedWatcher = r.watcher
		r.watcher = nil
		watcherRunning = false
	}
	r.watcherMu.Unlock()
	watcherDied := diedWatcher != nil
	if watcherDied {
		diedWatcher.stop()
	}

	// A watcher that dies while the App is still running is treated as a
	// transient disconnect: log it, clear the watcher pointer, and fall
	// through to the normal reconcile flow. If wantWatcher is still true
	// below, the reconciler starts a fresh watcher whose fullSync
	// reconciles any drift in place — downstream clients see no churn
	// because the existing mirror resources keep their identity. An
	// actual stop or backend change is handled by the !wantWatcher
	// branch below, which does sweep the mirrors.
	if watcherDied {
		log.Info("Docker watcher died, will attempt to reconnect")
	}

	// The watcher should only run when the App is Running with the moby
	// backend. In every other state (stopped, containerd, both) we stop
	// the watcher and sweep mirror resources. The sweep is gated on the
	// current ContainerEngineReady reason: once the condition reflects
	// the terminal state, cleanup would be a no-op against an empty
	// namespace, so we short-circuit to avoid four empty List calls per
	// unrelated reconcile. On failure, the condition stays pending and
	// the next requeue re-tries the sweep.
	wantWatcher := running && engineIsDocker
	if !wantWatcher {
		if watcherRunning {
			log.Info("Stopping Docker watcher",
				"running", running, "engine", app.Spec.ContainerEngine.Name)
			r.stopWatcher()
		}
		terminalReason := "Stopped"
		terminalStatus := metav1.ConditionFalse
		terminalMessage := "Container engine stopped"
		if running && !engineIsDocker {
			// NotApplicable is reported as Status=True so that
			// `rdd set running=true containerEngine.name=containerd`
			// stops waiting on ContainerEngineReady — even though the
			// engine controller is not mirroring anything in this
			// backend. UI consumers that expect Container/Image/Volume
			// resources should gate on the Reason as well, not on
			// Status alone. The condition will be renamed when
			// containerd mirroring lands.
			terminalReason = "NotApplicable"
			terminalStatus = metav1.ConditionTrue
			terminalMessage = "Engine mirroring is only supported with the moby backend"
		}
		current := meta.FindStatusCondition(app.Status.Conditions, conditionContainerEngineReady)
		alreadyClean := !watcherDied && current != nil && current.Reason == terminalReason && current.Status == terminalStatus
		if !alreadyClean {
			if err := r.cleanupMirrorResources(ctx); err != nil {
				log.Error(err, "Failed to clean up mirror resources")
				return ctrl.Result{}, err
			}
		}
		return ctrl.Result{}, r.setEngineCondition(ctx, &app, terminalStatus, terminalReason, terminalMessage)
	}

	if !watcherRunning {
		log.Info("App is running, starting Docker watcher")
		if err := r.startWatcher(ctx); err != nil {
			log.Error(err, "Failed to start Docker watcher")
			if condErr := r.setEngineCondition(ctx, &app, metav1.ConditionFalse, "ConnectFailed", err.Error()); condErr != nil {
				log.Error(condErr, "Failed to update ContainerEngineReady to ConnectFailed")
			}
			return ctrl.Result{}, err
		}
	}

	// Re-assert the Connected condition on every steady-state reconcile
	// so its observedGeneration tracks the App's current generation. When
	// the spec is patched without cycling the watcher (e.g. `rdd set
	// running=true kubernetes.enabled=true` from an already-running App),
	// this is the only place that stamps the new generation into
	// ContainerEngineReady, and rdd set's wait predicate depends on it.
	// setEngineCondition no-ops when nothing changed, so the extra call
	// is free on stable reconciles.
	if err := r.setEngineCondition(ctx, &app, metav1.ConditionTrue, "Connected", "Container engine synced"); err != nil {
		return ctrl.Result{}, err
	}

	// reconcileContainerSpecs + processFinalizers each issue one or more
	// List() calls across every mirrored object on every reconcile, and
	// every Container/Image/Volume watch event triggers a reconcile via
	// SetupWithManager below. Cost is therefore O(N) per child event.
	// The long-term fix is to split these into per-resource reconcilers
	// with watch predicates ("deletion timestamp set", "spec.state
	// changed"); until then the sweep runs on every event.
	if err := r.reconcileContainerSpecs(ctx); err != nil {
		return ctrl.Result{}, fmt.Errorf("failed to reconcile container specs: %w", err)
	}

rdd set now waits by default (commit 762c1e8). The wait predicate at cmd/rdd/set.go:478-481 looks for ContainerEngineReady.status == True && observedGeneration >= minGen, where minGen is app.Generation right after patchApp (cmd/rdd/set.go:425-447). That generation filter is the exact fix for stale-snapshot races and is correct by itself.

// patchApp applies a merge patch with the given spec properties. In dry-run
// mode, the API server validates the patch but does not persist it. The
// returned generation is the App's metadata.generation after the patch
// completes; the caller uses it to filter stale condition snapshots in
// the post-write wait.
func patchApp(ctx context.Context, c client.Client, app *appv1alpha1.App, specMap map[string]any, dryRun bool) (int64, error) {
	patchObj := map[string]any{"spec": specMap}
	patchBytes, err := json.Marshal(patchObj)
	if err != nil {
		return 0, fmt.Errorf("failed to marshal patch: %w", err)
	}

	var opts []client.PatchOption
	if dryRun {
		opts = append(opts, client.DryRunAll)
	}

	if err := c.Patch(ctx, app, client.RawPatch(types.MergePatchType, patchBytes), opts...); err != nil {
		return 0, fmt.Errorf("failed to update App: %w", err)
	}

	if dryRun {
		logrus.Info("App validated (dry run)")
	} else {
		logrus.Info("App updated")
	}
	return app.Generation, nil
}

// waitForDesiredState waits for the App to reach the state implied by the
// properties that were set. If running=true was set, it waits for the
// container engine to be ready. If running=false was set, it waits for the
// App's Running condition to go to False — i.e. the VM has actually
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	if runningVal == "true" {
		logrus.Info("Waiting for container engine to be ready")
		return watchCondition(ctx, config, func(obj *unstructured.Unstructured) bool {
			status, gen, present := conditionInfo(obj, "ContainerEngineReady")
			return present && gen >= minGen && status == metav1.ConditionTrue
		})
	}
	logrus.Info("Waiting for the VM to stop")
	return watchCondition(ctx, config, func(obj *unstructured.Unstructured) bool {
		// Running absent means the VM was never started (nothing to
		// wait for). Otherwise wait until the condition is refreshed

The engine reconciler, however, only calls setEngineCondition in two branches: the !wantWatcher terminal-state branch (lines 124-155) and the !watcherRunning fresh-start branch (lines 157-175). In the steady-state path — App running, watcher already running — the reconciler falls through directly to reconcileContainerSpecs/processFinalizers and returns without touching App.Status.Conditions. Because meta.SetStatusCondition is also only called in those two branches, ContainerEngineReady.observedGeneration keeps its previous value and never advances past whatever generation was in effect when the watcher last started.

Concretely, any command that bumps app.Generation without also cycling the watcher hangs for the full 5 minute default:

$ rdd set running=true kubernetes.enabled=true   # when already running=true
# patchApp → spec changes → generation N → N+1
# waitForDesiredState → minGen=N+1
# Reconciler reruns but takes the steady-state path → observedGen stays N
# watchCondition loops until timeout → "timed out waiting for App state"

The hang is not limited to that one permutation — any property change that leaves running=true and doesn't restart the watcher hits it (e.g., toggling kubernetes.*, containerEngine.mirrors, etc., along with an idempotent running=true). The simple start/stop BATS flow in engine.bats doesn't exercise it because it only toggles running through terminal states, which is why the regression slipped through.

Fix: move the setEngineCondition(Connected) call out of the !watcherRunning block so every wantWatcher=true reconcile re-asserts the condition. setEngineCondition already no-ops when nothing changed (meta.SetStatusCondition compares all fields) and wraps the write in retry.RetryOnConflict, so an unconditional call is cheap on stable reconciles but stamps the current latest.Generation into observedGeneration whenever the App spec advances.

 	if !watcherRunning {
 		log.Info("App is running, starting Docker watcher")
 		if err := r.startWatcher(ctx); err != nil {
 			log.Error(err, "Failed to start Docker watcher")
 			if condErr := r.setEngineCondition(ctx, &app, metav1.ConditionFalse, "ConnectFailed", err.Error()); condErr != nil {
 				log.Error(condErr, "Failed to update ContainerEngineReady to ConnectFailed")
 			}
 			return ctrl.Result{}, err
 		}
-		if err := r.setEngineCondition(ctx, &app, metav1.ConditionTrue, "Connected", "Container engine synced"); err != nil {
-			return ctrl.Result{}, err
-		}
 		// Fall through so pending spec.state patches and finalizers ...
 	}
 
+	if err := r.setEngineCondition(ctx, &app, metav1.ConditionTrue, "Connected", "Container engine synced"); err != nil {
+		return ctrl.Result{}, err
+	}
+
 	if err := r.reconcileContainerSpecs(ctx); err != nil {

A matching BATS test should run rdd set running=true <some-unrelated-prop>=<value> from the running state and assert that it returns well before the 5-minute timeout. Classification: regression, critical.


Important Issues

I1. ContainerNamespace/moby is not re-mirrored after user delete Codex Claude
func (r *EngineReconciler) processVolumeFinalizers(ctx context.Context, w *dockerWatcher) error {
	var volumes containersv1alpha1.VolumeList
	if err := r.List(ctx, &volumes, client.InNamespace(apiNamespace)); err != nil {
		return err
	}
	var errs []error
	for i := range volumes.Items {
		v := &volumes.Items[i]
		if v.DeletionTimestamp == nil || !hasFinalizer(v, mirrorFinalizer) {
			continue
		}
		// A Volume mirror with an empty Status.Name has no
		// engine-populated status yet — either a user created it as a
		// bare skeleton or it landed in the startup race window before
		// applyVolume ran. There is no Docker-side name to forward a
		// delete to, so strip the finalizer and let the Delete proceed.
		if v.Status.Name != "" {
			if err := w.deleteVolume(ctx, v.Status.Name); err != nil {
				errs = append(errs, fmt.Errorf("failed to delete volume %s from Docker: %w", v.Name, err))
				continue
			}
		}
		key := client.ObjectKeyFromObject(v)
		retryErr := retry.RetryOnConflict(retry.DefaultRetry, func() error {
			latest := &containersv1alpha1.Volume{}
			if err := r.Get(ctx, key, latest); err != nil {
				if apierrors.IsNotFound(err) {
					return nil
				}

syncContainerNamespace() at pkg/controllers/app/engine/controllers/sync_volumes.go:62-66 creates containernamespace/moby exactly once from fullSync on watcher startup. The reconciler watches Container, Image, and Volume but not ContainerNamespace, and ContainerNamespace has no mirror finalizer, so a user who runs rdd ctl delete containernamespace/moby succeeds and produces no event that wakes the engine reconciler back up. Subsequent Container/Image/Volume Docker events trigger reconciles, but reconcileContainerSpecs/processFinalizers never touch the namespace — it stays gone until the VM is restarted (or the watcher dies and fullSyncs).

// Unlike `Container` / `Image` / `Volume` mirrors, this resource has no
// mirror finalizer: Docker has no corresponding engine object to delete
// on the reverse path, and cleanupMirrorResources sweeps it
// unconditionally on VM stop, so a finalizer with no handler would only
// trap user-initiated deletes in Terminating until the next bounce.
func (w *dockerWatcher) syncContainerNamespace(ctx context.Context) error {
	applyConfig := containersv1alpha1apply.ContainerNamespace(containerNamespace, apiNamespace)

	return w.k8s.Apply(ctx, applyConfig,
		client.ForceOwnership, client.FieldOwner(controllerName))
}

// syncAllVolumes lists all Docker volumes and creates/updates `Volume`
// mirrors, then removes stale ones.
func (w *dockerWatcher) syncAllVolumes(ctx context.Context) error {

The design doc at docs/design/api_containers.md:88-96 says the dockerd namespace "cannot be modified in any way," but nothing currently enforces that. Worse, the new BATS test at bats/tests/32-app-controllers/engine.bats:364-370 asserts that deleting containernamespace/moby leaves it absent. That locks in the broken invariant instead of protecting it, and any future fix will also need to update this test.


When the container engine deletes an object (the user ran `docker rm` or
`nerdctl rm`, for example), the engine controller strips the finalizer
and deletes the mirror directly, so no reverse engine call occurs.

## Namespaces

`ContainerNamespace` objects reflect the container engine namespaces.  This is
only useful when using the `containerd` backend; when using `dockerd`, the only
valid instance will have a name of `moby`, and it cannot be modified in any way.

`ContainerNamespace` objects only have the default metadata, since they
currently do not need anything else.

```yaml
apiVersion: containers.rancherdesktop.io/v1alpha1
kind: ContainerNamespace
metadata:
  name: k8s.io # `moby` when using the dockerd engine.
    rdd set running=true
    rdd ctl wait --for=create --namespace="${NAMESPACE}" \
        containernamespace/moby --timeout=10s
}

@test "deleting containernamespace/moby completes without a finalizer hang" {
    # moby ContainerNamespace has no mirror finalizer, so a user delete
    # must return promptly rather than get trapped in Terminating.
    rdd ctl delete containernamespace/moby --namespace="${NAMESPACE}" --timeout=10s
    run -0 rdd ctl get containernamespaces --namespace="${NAMESPACE}" --output=name
    refute_output
}

# --- containerd backend ---

@test "containerd backend reports ContainerEngineReady=NotApplicable and skips mirroring" {
    # Stop first so there is no stale True/Connected from moby to

Fix: either add Watches(&containersv1alpha1.ContainerNamespace{}, enqueueApp) to SetupWithManager and idempotently re-run ensureNamespace + syncContainerNamespace on every reconcile (both are single Applies and already idempotent), or reject deletion of the moby namespace via an admission webhook while the engine controller is active. The watch-and-restore variant is the smaller change and consistent with the existing mirror pattern. Either way, update the BATS test to assert the namespace returns — not that it stays gone. Classification: regression, important.

I2. processImageFinalizers/processVolumeFinalizers skip retry.RetryOnConflict Gemini Claude
		retryErr := retry.RetryOnConflict(retry.DefaultRetry, func() error {
			latest := &containersv1alpha1.Container{}
			if err := r.Get(ctx, key, latest); err != nil {
				if apierrors.IsNotFound(err) {
					return nil
				}
				return err
			}
			if !removeFinalizer(latest, mirrorFinalizer) {
				return nil
			}
			return r.Update(ctx, latest)
		})
		if retryErr != nil {
			errs = append(errs, fmt.Errorf("failed to remove finalizer from Container %s: %w", c.Name, retryErr))
		}
	}
	return errors.Join(errs...)
}

func (r *EngineReconciler) processImageFinalizers(ctx context.Context, w *dockerWatcher) error {
	var images containersv1alpha1.ImageList
	if err := r.List(ctx, &images, client.InNamespace(apiNamespace)); err != nil {
		return err
	}
	var errs []error
	for i := range images.Items {
		img := &images.Items[i]
		if img.DeletionTimestamp == nil || !hasFinalizer(img, mirrorFinalizer) {
			continue
		}
		// An Image mirror with no engine-populated status has no Docker
		// reference to forward the delete to — either a user created it
		// as a bare skeleton or it landed in the startup race window
		// before applyImage ran. Symmetric with processVolumeFinalizers'
		// empty-Status.Name guard: strip the finalizer and let the
		// Delete proceed.
		if img.Status.ID != "" || img.Status.RepoTag != "" {
			if err := w.deleteImage(ctx, img); err != nil {
				errs = append(errs, fmt.Errorf("failed to delete image %s from Docker: %w", img.Name, err))
				continue
			}
		}
		key := client.ObjectKeyFromObject(img)
		retryErr := retry.RetryOnConflict(retry.DefaultRetry, func() error {
			latest := &containersv1alpha1.Image{}
			if err := r.Get(ctx, key, latest); err != nil {
				if apierrors.IsNotFound(err) {
					return nil
				}
				return err
			}
			if !removeFinalizer(latest, mirrorFinalizer) {
				return nil
			}
			return r.Update(ctx, latest)
		})
		if retryErr != nil {
			errs = append(errs, fmt.Errorf("failed to remove finalizer from Image %s: %w", img.Name, retryErr))
		}
	}
	return errors.Join(errs...)
}

func (r *EngineReconciler) processVolumeFinalizers(ctx context.Context, w *dockerWatcher) error {
	var volumes containersv1alpha1.VolumeList
	if err := r.List(ctx, &volumes, client.InNamespace(apiNamespace)); err != nil {
		return err
	}
	var errs []error
	for i := range volumes.Items {
		v := &volumes.Items[i]

processContainerFinalizers (lines 410-423) correctly re-Gets latest under retry.RetryOnConflict before stripping the finalizer. The Image and Volume helpers mutate the stale img/v from the outer List and call Update directly. On a 409 — which can fire any time a concurrent watcher sync just re-applied status — the Update fails and the finalizer isn't stripped. The Docker-side delete has already succeeded by that point, so the reconciler requeues, re-Lists, re-inspects, and re-issues the engine-side delete against an already-gone object before finally stripping the finalizer. It's self-healing but burns a Docker round-trip per conflict and breaks the invariant "once we tell Docker to delete, we strip the finalizer on the first attempt," which the adjacent Container path upholds.

func (r *EngineReconciler) processContainerFinalizers(ctx context.Context, w *dockerWatcher) error {
	var containers containersv1alpha1.ContainerList
	if err := r.List(ctx, &containers, client.InNamespace(apiNamespace)); err != nil {
		return err
	}
	var errs []error
	for i := range containers.Items {
		c := &containers.Items[i]
		if c.DeletionTimestamp == nil || !hasFinalizer(c, mirrorFinalizer) {
			continue
		}
		if err := w.deleteContainer(ctx, c.Name); err != nil {
			errs = append(errs, fmt.Errorf("failed to delete container %s from Docker: %w", c.Name, err))
			continue
		}
		// Strip the finalizer with retry-on-conflict so a stale cache
		// does not force a whole-reconcile requeue just to reach the
		// same idempotent deleteContainer call. NotFound is benign: a
		// concurrent Docker destroy event may already have stripped
		// the finalizer and deleted the mirror.
		key := client.ObjectKeyFromObject(c)
		retryErr := retry.RetryOnConflict(retry.DefaultRetry, func() error {
			latest := &containersv1alpha1.Container{}
			if err := r.Get(ctx, key, latest); err != nil {

Fix: mirror the processContainerFinalizers pattern in both helpers. Put the finalizer strip inside a retry.RetryOnConflict loop that re-Gets latest and treats NotFound as benign:

-        if removeFinalizer(img, mirrorFinalizer) {
-            if err := client.IgnoreNotFound(r.Update(ctx, img)); err != nil {
-                errs = append(errs, fmt.Errorf("failed to remove finalizer from Image %s: %w", img.Name, err))
-            }
-        }
+        key := client.ObjectKeyFromObject(img)
+        retryErr := retry.RetryOnConflict(retry.DefaultRetry, func() error {
+            latest := &containersv1alpha1.Image{}
+            if err := r.Get(ctx, key, latest); err != nil {
+                if apierrors.IsNotFound(err) {
+                    return nil
+                }
+                return err
+            }
+            if !removeFinalizer(latest, mirrorFinalizer) {
+                return nil
+            }
+            return r.Update(ctx, latest)
+        })
+        if retryErr != nil {
+            errs = append(errs, fmt.Errorf("failed to remove finalizer from Image %s: %w", img.Name, retryErr))
+        }

Apply the same change to processVolumeFinalizers (lines 485-489). Classification: regression, important.

I3. rdd set wait misinterprets StartFailed/ConnectFailed/StopFailed conditions Codex
	}

	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	if runningVal == "true" {
		logrus.Info("Waiting for container engine to be ready")
		return watchCondition(ctx, config, func(obj *unstructured.Unstructured) bool {
			status, gen, present := conditionInfo(obj, "ContainerEngineReady")
			return present && gen >= minGen && status == metav1.ConditionTrue
		})
	}
	logrus.Info("Waiting for the VM to stop")
	return watchCondition(ctx, config, func(obj *unstructured.Unstructured) bool {
		// Running absent means the VM was never started (nothing to
		// wait for). Otherwise wait until the condition is refreshed
		// with our generation and reports a non-True status.
		status, gen, present := conditionInfo(obj, "Running")
		if !present {
			return true
		}
		return gen >= minGen && status != metav1.ConditionTrue
	})
}

// watchCondition watches the App singleton until the supplied predicate
// returns true or the context expires. It uses watchtools.UntilWithSync
// so benign watch-channel closures (API server timeouts, proxy
// disconnects, resource-version compaction) re-list transparently

The running=true branch only accepts ContainerEngineReady=True. If startup fails — LimaVM publishes Running=False/StartFailed at pkg/controllers/lima/limavm/controllers/limavm_lifecycle.go:382-385,427-430,463-476, or the engine reconciler publishes ContainerEngineReady=False/ConnectFailed at engine_reconciler.go:161-163 — the wait keeps looping until the 5-minute timeout and returns the generic "timed out waiting for App state" error instead of the actual failure reason. Users see a timeout when they should have seen "start failed" immediately.

		if !alreadyClean {
			if err := r.cleanupMirrorResources(ctx); err != nil {
				log.Error(err, "Failed to clean up mirror resources")
				return ctrl.Result{}, err
			}
		}
		return ctrl.Result{}, r.setEngineCondition(ctx, &app, terminalStatus, terminalReason, terminalMessage)
	}

	if !watcherRunning {
		log.Info("App is running, starting Docker watcher")
		if err := r.startWatcher(ctx); err != nil {
			log.Error(err, "Failed to start Docker watcher")

The running=false branch has the opposite problem: it treats any refreshed non-True Running condition as success. When Lima publishes Running=False/StopFailed at limavm_lifecycle.go:527-539, the wait returns success even though the VM didn't stop.


	// Verify the instance stopped
	inst, err := store.Inspect(ctx, limaVM.Name)
	if err != nil {
		logger.Error(err, "Failed to inspect instance after stop")
		if updateErr := r.updateCondition(ctx, limaVM, ConditionRunning, metav1.ConditionFalse, ReasonStopFailed, err.Error()); updateErr != nil {
			logger.Error(updateErr, "Failed to update status after stop failure")
		}
		return ctrl.Result{}, err
	}

	if inst != nil && inst.Status == limatype.StatusRunning {
		err := errors.New("instance still running after stop attempt")
		logger.Error(err, "Failed to stop Lima instance")
		if updateErr := r.updateCondition(ctx, limaVM, ConditionRunning, metav1.ConditionFalse, ReasonStopFailed, err.Error()); updateErr != nil {
			logger.Error(updateErr, "Failed to update status after stop failure")
		}
		return ctrl.Result{}, err
	}

	logger.Info("Stopped Lima instance", "instance", limaVM.Name)
	if err := r.updateCondition(ctx, limaVM, ConditionRunning, metav1.ConditionFalse, ReasonStopped, "Lima instance stopped successfully"); err != nil {
		logger.Error(err, "Failed to update status after stop")

Fix: inspect reason as well as status. The predicates should look like:

running=true success   : ContainerEngineReady.status==True
running=true failure   : Running.reason in {StartFailed} OR ContainerEngineReady.reason in {ConnectFailed}
running=false success  : Running absent OR Running.reason in {Stopped}
running=false failure  : Running.reason in {StopFailed}

On failure the wait should return an error whose message carries the condition's message field so the user sees the underlying cause. Extending conditionInfo to also return reason/message is a small change that keeps watchCondition generic. Classification: regression, important.

I4. docker rename events are dropped Codex

// handleEvent processes a single Docker event.
func (w *dockerWatcher) handleEvent(ctx context.Context, msg events.Message) error {
	switch msg.Type {
	case events.ContainerEventType:
		return w.handleContainerEvent(ctx, msg)
	case events.ImageEventType:
		return w.handleImageEvent(ctx, msg)
	case events.VolumeEventType:
		return w.handleVolumeEvent(ctx, msg)
	default:
		return nil
	}
}

// handleContainerEvent processes container events.
func (w *dockerWatcher) handleContainerEvent(ctx context.Context, msg events.Message) error {
	log := logf.FromContext(ctx).WithName("docker-watcher")

	switch msg.Action {
	case events.ActionCreate,
		events.ActionStart,
		events.ActionStop,
		events.ActionDie,
		events.ActionPause,
		events.ActionUnPause,
		events.ActionRestart,
		events.ActionRename:
		log.V(1).Info("Container event", "action", msg.Action, "id", msg.Actor.ID)
		return w.syncContainer(ctx, msg.Actor.ID)

	case events.ActionDestroy:
		log.V(1).Info("Container destroyed", "id", msg.Actor.ID)

applyContainer derives the mirrored name from inspect.Name at sync_containers.go:111 and writes it into status.name/status.namespace. Docker emits events.ActionRename when docker rename is used, but the switch lets it fall through to default: return nil. The mirror's status.name therefore stays stale indefinitely — until some unrelated start/stop/destroy event happens to trigger another syncContainer, which re-inspects and rewrites the name. Consumers of Container.status.name (including the rdd ctl UX and any future label selectors) see the old name until that happens.

// WithSpec(unknown), the engine-controller field owner would keep
// reasserting spec.state=unknown, clobbering any user patch to
// running/created. Gating the WithSpec call behind a Get — so spec is
// written exactly once, on creation — keeps the invariant "subsequent
// syncs never touch spec" without needing a second field owner.
func (w *dockerWatcher) applyContainer(ctx context.Context, inspect mobycontainer.InspectResponse) error {
	namespace, name := parseContainerName(inspect.Name)

	// Create the `Container` mirror if it doesn't exist. spec.state is
	// always set to "unknown" on creation — meaning the engine mirrors
	// Docker container state without expressing intent. The user can later

Fix: add events.ActionRename to the same case block as the other state-refreshing events. Also add a BATS test that renames a running container and waits for .status.name to reflect the new name:

 	case events.ActionCreate,
 		events.ActionStart,
 		events.ActionStop,
 		events.ActionDie,
 		events.ActionPause,
 		events.ActionUnPause,
-		events.ActionRestart:
+		events.ActionRestart,
+		events.ActionRename:
 		log.V(1).Info("Container event", "action", msg.Action, "id", msg.Actor.ID)
 		return w.syncContainer(ctx, msg.Actor.ID)

Classification: gap, important.

I5. Transient Docker Inspect failures during syncAllImages delete valid Image mirrors Gemini
// correspond to a Docker image ID and its RepoTags. A tagged image gets
// one mirror per tag (name = sanitized-id + "-" + sha256(tag)); a
// dangling image gets a single mirror named after the sanitized ID.
// Sharing this helper between applyImageFromInspect and syncAllImages
// lets the stale-mirror sweep seed activeNames from the list response
// alone, so a transient Inspect failure cannot classify existing
// mirrors as stale.
func imageMirrorNames(id string, repoTags []string) []string {
	if len(repoTags) == 0 {
		return []string{sanitizeKubernetesObjectName(id)}
	}
	names := make([]string, 0, len(repoTags))
	for _, tag := range repoTags {
		names = append(names, fmt.Sprintf("%s-%x",
			sanitizeKubernetesObjectName(id),
			sha256.Sum256([]byte(tag))))
	}
	return names
}

// syncAllImages lists all Docker images and creates/updates `Image`
// mirrors, then removes stale ones.
//
// Images are Inspected sequentially for the same reason as
// syncAllContainers: one-shot at startup, typical dev machines have
// tens of images, and we need fields that only Inspect exposes
// (RepoDigests, full Labels, detailed metadata). Parallelise here if
// startup latency becomes a concern.
func (w *dockerWatcher) syncAllImages(ctx context.Context) error {
	log := logf.FromContext(ctx).WithName("docker-watcher")

	listResult, err := w.cli.ImageList(ctx, dockerclient.ImageListOptions{All: true})
	if err != nil {
		return fmt.Errorf("failed to list images: %w", err)
	}

	// Track which `Image` mirror names we create so we can prune stale ones.
	activeNames := make(map[string]bool)

	// Per-item inspect failures are logged and skipped rather than
	// failing the whole startup (see the matching note in
	// syncAllContainers): a single broken image must not block the
	// watcher from coming up. Structural errors below (k8s list,
	// stale-mirror cleanup) are still fatal.
	//
	// Seed activeNames from the list response before calling Inspect.
	// If Inspect fails transiently the mirror stays in activeNames and
	// is not pruned below — otherwise a flaky Docker daemon would
	// classify a perfectly good Image mirror as stale and delete it.
	var errs []error
	for _, summary := range listResult.Items {
		for _, n := range imageMirrorNames(summary.ID, summary.RepoTags) {
			activeNames[n] = true
		}
		if err := w.syncImageFromSummary(ctx, summary); err != nil {

syncImageFromSummary returns the mirror names after a successful ImageInspect. If Docker returns a transient 500/503 for a single image — which happens on loaded or recently-restarted daemons — the names are never added to activeNames, and the stale-mirror sweep at the bottom then deletes the valid K8s Image mirror that already existed for that image. The user sees their Image blink out and reappear on the next event or fullSync.

The applyImageFromInspect helper already computes the deterministic mirror name from inspect.ID + RepoTags (or the ID alone for dangling images). The same name can be computed from the ImageListItem summary — it has both ID and RepoTags.

Fix: pre-compute the expected names from the summary and seed activeNames before calling syncImageFromSummary, so transient inspect failures can't trigger mirror deletion:

 	var errs []error
 	for _, summary := range listResult.Items {
+		// Seed active names before Inspect so a transient Inspect failure
+		// does not classify the existing mirror as stale.
+		if len(summary.RepoTags) > 0 {
+			for _, tag := range summary.RepoTags {
+				name := fmt.Sprintf("%s-%x",
+					sanitizeKubernetesObjectName(summary.ID),
+					sha256.Sum256([]byte(tag)))
+				activeNames[name] = true
+			}
+		} else {
+			activeNames[sanitizeKubernetesObjectName(summary.ID)] = true
+		}
 		names, err := w.syncImageFromSummary(ctx, summary)
 		if err != nil {
 			log.Error(err, "Skipping image during full sync", "id", summary.ID)
 			continue
 		}
-		for _, n := range names {
-			activeNames[n] = true
-		}
 	}

Refactor the name derivation into a helper so applyImageFromInspect and syncAllImages share it and can't drift. syncAllContainers already uses dc.ID from the list response for the same purpose, so it doesn't need this fix. Classification: bug, important.

I6. Docker watcher drops events on transient handleEvent errors Gemini
	result := w.cli.Events(ctx, dockerclient.EventsListOptions{
		Since:   since,
		Filters: eventFilter,
	})

	for {
		select {
		case <-ctx.Done():
			log.Info("Docker watcher stopping")
			return
		case err := <-result.Err:
			if ctx.Err() != nil {
				return
			}
			log.Error(err, "Docker event stream error")
			w.enqueueReconcile()
			return
		case msg, ok := <-result.Messages:
			if !ok {
				// The daemon closed the event stream without writing to
				// result.Err. Treat the same way as an error: wake the
				// reconciler so it observes the dead watcher and starts
				// a fresh one.
				log.Info("Docker event stream closed")
				w.enqueueReconcile()
				return
			}
			// A transient handleEvent error (apiserver 503, SSA conflict
			// past its internal retry) is logged and the event is
			// dropped. Container events tend to self-heal because the
			// next state change (start/stop/die) triggers another sync
			// on the same ID; image pull and volume create events fire
			// only once, so a dropped apply leaves the mirror missing
			// until the next full resync (watcher restart or engine
			// reconnect). The bounded-window fix is a periodic
			// fullSync tick, deferred until the rest of the mirroring
			// work lands.
			if err := w.handleEvent(ctx, msg); err != nil {
				log.Error(err, "Failed to handle Docker event",

A transient failure inside handleEvent — apiserver 503, SSA conflict that survives its internal retry, temporary network partition — is logged and dropped. The watcher never periodically resyncs (there's no time-based fullSync tick), so whether the mirror eventually catches up depends on whether Docker emits another event for the same object:

Gemini's proposed fix — return from run on any error and let the dead watcher be detected by the reconciler — is correct but aggressive: every transient 503 would tear down the event stream, trigger a fresh fullSync, and churn every mirror. A gentler fix is enough in practice:

  1. On handleEvent error, enqueue an event-specific retry (e.g., store msg.Actor.ID in a small retry set and call syncContainer/reconcileImageByID/syncVolume on a short backoff), or
  2. Run fullSync on a periodic tick (e.g., every 5 minutes) as a belt-and-suspenders resync, so any missed event is reconciled within a bounded window.

Option 2 is the smaller change, matches how controller-runtime's SyncPeriod works, and is how kubelet's image manager handles the same class of bug. Classification: gap, important.


Suggestions

S1. Reconcile holds watcherMu across the blocking w.stop() call Claude
	}

	running := meta.IsStatusConditionTrue(app.Status.Conditions, "Running")
	engineIsDocker := app.Spec.ContainerEngine.Name == engineMoby

	// Detach a dead watcher from the reconciler under watcherMu, then
	// stop it outside the lock. Holding the mutex across w.stop() —
	// which blocks on <-w.done and cli.Close() — would stall any
	// concurrent reconcile the moment MaxConcurrentReconciles > 1.
	var diedWatcher *dockerWatcher
	r.watcherMu.Lock()
	watcherRunning := r.watcher != nil
	if watcherRunning && !r.watcher.alive() {
		diedWatcher = r.watcher
		r.watcher = nil
		watcherRunning = false
	}
	r.watcherMu.Unlock()
	watcherDied := diedWatcher != nil

The standalone stopWatcher() helper at lines 212-221 correctly captures the watcher, releases watcherMu, then calls w.stop(). The inline version above holds the lock across a blocking cli.Close(). It's only a latent bug at MaxConcurrentReconciles=1 (the current default), but the asymmetry is a trap: anyone who later enables concurrent reconciles will hit a lock held across a blocking Docker close. Match the pattern:

	r.watcherMu.Lock()
	defer r.watcherMu.Unlock()

	if r.watcher != nil {
		return nil
	}

	w, err := newDockerWatcher(ctx, r.Client, r.reconcileChan)
	if err != nil {
		return err
	}
	r.watcher = w
	return nil
}

// stopWatcher stops the Docker watcher goroutine and waits for it to finish.
func (r *EngineReconciler) stopWatcher() {
	r.watcherMu.Lock()
	w := r.watcher
	r.watcher = nil
 	r.watcherMu.Lock()
 	watcherRunning := r.watcher != nil
-	watcherDied := watcherRunning && !r.watcher.alive()
+	var diedWatcher *dockerWatcher
+	if watcherRunning && !r.watcher.alive() {
+		diedWatcher = r.watcher
+		r.watcher = nil
+		watcherRunning = false
+	}
+	r.watcherMu.Unlock()
+	watcherDied := diedWatcher != nil
 	if watcherDied {
-		r.watcher.stop()
-		r.watcher = nil
-		watcherRunning = false
+		diedWatcher.stop()
 	}
-	r.watcherMu.Unlock()

Classification: gap, suggestion.

S2. ImageStatus.RepoDigests is sent to SSA in Docker's order Claude
			}
		}
	} else {
		// Dangling image (no tags). status.namespace is a required
		// field on the CRD, so set it here too even though it carries
		// no additional information beyond what the tagged branch sets.
		if err := w.applyImage(ctx,
			containersv1alpha1apply.Image(names[0], apiNamespace).
				WithFinalizers(mirrorFinalizer),
			imageStatusFromInspect(inspect).
				WithNamespace(containerNamespace),
		); err != nil {
			errs = append(errs, err)
		}
	}

	return names, errors.Join(errs...)
}

// imageStatusFromInspect builds a fresh ImageStatus apply config from a
// Docker inspect response. Returning a new value per call avoids the
// aliasing trap of a shallow struct copy — slice/map WithX calls on
// one "copy" would otherwise mutate the backing memory shared with
// other callers.

ImageStatus.RepoDigests in pkg/apis/containers/v1alpha1/image_types.go:28-31 is a plain []string with no +listType — atomic under SSA. Docker's RepoDigests is sourced from the reference store and is not documented to have a stable order, so an image with multiple references can reorder between inspects. Any reorder mints a new resourceVersion on every status apply, which fires watch events and re-triggers the engine reconciler — exactly the churn the port-sort already guards against.

	// multiple Image objects.  Images without tags will have this unset, but
	// only one Image object should exist in that case.
	//
	// +optional
	RepoTag string `json:"repoTag"`
	// RepoDigests are the signed digests of the image.
	//
	// +optional
	RepoDigests []string `json:"repoDigests,omitempty"`
	// CreatedAt is the time the image was created.
	//
	// +optional
	CreatedAt metav1.Time `json:"createdAt"`
	// Architecture associated with the image.

Fix: sort a copy before passing it in.

+    digests := slices.Clone(inspect.RepoDigests)
+    sort.Strings(digests)
     statusApply := containersv1alpha1apply.ImageStatus().
         WithID(inspect.ID).
-        WithRepoDigests(inspect.RepoDigests...).
+        WithRepoDigests(digests...).

Classification: gap, suggestion.

S3. ContainerPort.Bindings is sent to SSA in Docker's order Claude
	// Port bindings. Go map iteration is randomized, so sort by port
	// name before building the apply slice: even with
	// +listType=map +listMapKey=name on Status.Ports, the entries land
	// in the stored object in the order we provide, and a stable order
	// avoids spurious resourceVersion churn when consumers render the
	// list or diff it.
	var applyPorts []*containersv1alpha1apply.ContainerPortApplyConfiguration
	if inspect.NetworkSettings != nil {
		portNames := make([]mobynetwork.Port, 0, len(inspect.NetworkSettings.Ports))
		for p := range inspect.NetworkSettings.Ports {
			portNames = append(portNames, p)
		}
		sort.Slice(portNames, func(i, j int) bool {
			return portNames[i].String() < portNames[j].String()
		})
		for _, portName := range portNames {
			ports := inspect.NetworkSettings.Ports[portName]
			// Sort bindings by (HostIP, HostPort) for the same reason
			// portNames is sorted above: ContainerPort.Bindings is
			// atomic under SSA, and Docker can return dual-stack
			// bindings in either order, so an unsorted apply would
			// mint a new resourceVersion on every sync.
			sortedPorts := slices.Clone(ports)
			sort.Slice(sortedPorts, func(i, j int) bool {
				if sortedPorts[i].HostIP.String() != sortedPorts[j].HostIP.String() {
					return sortedPorts[i].HostIP.String() < sortedPorts[j].HostIP.String()
				}
				return sortedPorts[i].HostPort < sortedPorts[j].HostPort
			})
			var bindings []*containersv1alpha1apply.ContainerPortBindingApplyConfiguration
			for _, port := range sortedPorts {
				bindings = append(bindings, containersv1alpha1apply.ContainerPortBinding().
					WithHostIP(port.HostIP.String()).

Same pattern as S2, one nesting level deeper: the outer portNames slice is correctly sorted at line 198, but the inner bindings slice follows whatever order Docker returned (Dual-stack docker run -p 80:80 can emit either order depending on the kernel). ContainerPort.Bindings in container_types.go:48-52 has no +listType, so it's atomic. Sort by (HostIP, HostPort) before building the apply config. Classification: gap, suggestion.

	// Name of the port; in the form [port]/[protocol], e.g. "80/tcp".
	//
	// +required
	Name string `json:"name"`
	// Bindings to the host port; can contain multiple entries to e.g. express
	// IPv4 and IPv6 bindings.
	//
	// +required
	Bindings []ContainerPortBinding `json:"bindings"`
}

// ContainerSpec defines the configuration the container was created with.
type ContainerSpec struct {
	// State is the desired state of the container. Valid values are
	// "unknown", "created", and "running". The engine controller creates
S4. rdd set timeout error doesn't name the condition Claude
			return false, nil
		}
		return satisfied(obj), nil
	}

	if _, err := watchtools.UntilWithSync(ctx, lw, &unstructured.Unstructured{}, precondition, condition); err != nil {
		if errors.Is(err, context.DeadlineExceeded) {
			return errors.New("timed out waiting for App state")
		}
		if errors.Is(err, context.Canceled) {
			return errors.New("wait cancelled")
		}
		return fmt.Errorf("failed to watch App: %w", err)
	}
	return nil
}

// conditionInfo returns the status, observedGeneration, and presence of
// the named condition on an App. present is false when the condition is

When rdd set running=true times out, the user sees timed out waiting for App state with no hint that the wait was specifically on ContainerEngineReady=True. When this fix ships with I3, the error will also need to carry the failure reason. Plumb a descriptive label into watchCondition and return something like timed out waiting for ContainerEngineReady=True (5m) or timed out waiting for App to stop (5m). Classification: enhancement, suggestion.

S5. deleteAllOfType passes the stale list item to r.Delete Claude
// deleteAllOfType lists resources, strips finalizers, and deletes them.
// Finalizer removal uses retry.RetryOnConflict so a stale cache or a
// concurrent Update does not require the caller to requeue. Per-item
// errors are collected so one stuck object does not block the rest of
// the list.
func (r *EngineReconciler) deleteAllOfType(ctx context.Context, list client.ObjectList) error {
	if err := r.List(ctx, list, client.InNamespace(apiNamespace)); err != nil {
		return err
	}

	items, err := meta.ExtractList(list)
	if err != nil {
		return err
	}

	var errs []error
	for _, item := range items {
		obj := item.(client.Object)
		key := client.ObjectKeyFromObject(obj)
		// meta.ExtractList strips the TypeMeta from each item (the API
		// server only populates GVK on the top-level list), so
		// obj.GetObjectKind() is empty here. Format the concrete Go
		// type name with %T instead.
		retryErr := retry.RetryOnConflict(retry.DefaultRetry, func() error {
			latest := obj.DeepCopyObject().(client.Object)
			if err := r.Get(ctx, key, latest); err != nil {
				if apierrors.IsNotFound(err) {
					return nil
				}
				return err
			}
			if !removeFinalizer(latest, mirrorFinalizer) {
				return nil
			}
			return r.Update(ctx, latest)
		})
		if retryErr != nil {
			errs = append(errs, fmt.Errorf("failed to remove finalizer from %T %s: %w",
				obj, obj.GetName(), retryErr))
			continue

The final r.Delete passes obj — the stale list item — not latest from the retry closure. That's functionally correct because Delete keys off name+namespace and doesn't send resourceVersion, but the closure-scoped latest isn't visible outside the retry body and future maintainers could reasonably think they need to pass it. Add a one-line comment explaining that obj is acceptable because Delete doesn't use resourceVersion, or hoist latest out of the closure. Classification: gap, suggestion.

S6. DockerSocket() performs filesystem side effects at construction Claude
// neither — or the watcher will open a different socket from the one
// Lima is port-forwarding to, and mirroring will silently diverge per
// instance. Until then, multi-instance BATS runs cannot share the
// same host and the engine.bats test has a parallel TODO at line
// 12-14.
var DockerSocket = sync.OnceValue(func() string {
	home, err := os.UserHomeDir()
	if err != nil {
		panic(fmt.Errorf("could not get home directory: %w", err))
	}
	dir := filepath.Join(home, ".rd2")
	_ = os.MkdirAll(dir, 0o700)
	return filepath.Join(dir, "docker.sock")
})

Unlike the other instance.* accessors (Dir, LogDir, LimaHome), which are pure path computations, DockerSocket() performs os.MkdirAll and swallows the error. Any caller that just wants the path — rdd svc paths, CLI help, a test — silently creates the directory as a side effect. Two cleaner options:

  1. Move the MkdirAll to the watcher (newDockerWatcher) where the directory is actually needed, with proper error handling.
  2. Keep it in instance but rename to EnsureDockerSocket() and return the error.

Option 1 matches how LimaHome() stays pure and defers mkdir to the Lima controller. This will also need to change when the hardcoded ".rd2" is replaced by ShortDir() per the existing TODO at lines 127-135. Classification: gap, suggestion.


// DockerSocket returns the path to the Docker socket for this instance
// (e.g., ~/.rd2/docker.sock). This is the host-side socket that Lima
// port-forwards from the guest's /var/run/docker.sock.
//
// TODO: use ShortDir() once the Lima template derives the socket path
// from the instance suffix instead of hardcoding ".rd2". The two
// hardcodes (here and in pkg/controllers/app/app/lima-template.yaml)
// must move together — either both parameterized on RDD_INSTANCE or
// neither — or the watcher will open a different socket from the one
// Lima is port-forwarding to, and mirroring will silently diverge per
// instance. Until then, multi-instance BATS runs cannot share the
// same host and the engine.bats test has a parallel TODO at line
// 12-14.
var DockerSocket = sync.OnceValue(func() string {
	home, err := os.UserHomeDir()
	if err != nil {
		panic(fmt.Errorf("could not get home directory: %w", err))
	}
S7. watchCondition allocates a new dynamic client per call Claude

watchCondition constructs a fresh dynamic.NewForConfig(config) on every invocation. For rdd set today this is fine — one call per command — but if the helper is reused elsewhere it's wasteful. Consider accepting the dynamic client as a parameter and reusing the existing controller-runtime client from setAction where possible. Classification: enhancement, suggestion.

S8. Dead branches in the container state-transition webhook Claude
}

// ValidateUpdate implements [ctrlwebhookadmission.Validator].
// Only spec.state changes are allowed; all other spec fields are immutable.
func (c *immutableValidator) ValidateUpdate(_ context.Context, oldContainer, newContainer *v1alpha1.Container) (warnings ctrlwebhookadmission.Warnings, err error) {
	if oldContainer.Spec.State != newContainer.Spec.State {
		// Allow state transitions between "created", "running", and "unknown".
		switch newContainer.Spec.State {
		case v1alpha1.ContainerStatusCreated, v1alpha1.ContainerStatusRunning, v1alpha1.ContainerStatusUnknown:
			// Valid transition.
		default:
			return nil, fmt.Errorf("invalid target state %q: must be %q, %q, or %q",
				newContainer.Spec.State,
				v1alpha1.ContainerStatusCreated,
				v1alpha1.ContainerStatusRunning,
				v1alpha1.ContainerStatusUnknown)
		}
		// Compare specs with state normalized to check nothing else changed.
		oldCopy := oldContainer.Spec
		oldCopy.State = newContainer.Spec.State
		if !equality.Semantic.DeepEqual(oldCopy, newContainer.Spec) {
			return nil, fmt.Errorf("only spec.state may be changed: old: %v, new: %v", oldContainer.Spec, newContainer.Spec)
		}
		return nil, nil
	}

	if !equality.Semantic.DeepEqual(oldContainer.Spec, newContainer.Spec) {
		return nil, fmt.Errorf("the Container spec must not be modified: old: %v, new: %v", oldContainer.Spec, newContainer.Spec)
	}

The CRD enum validation in crd.yaml:285-288 (enum: [created, running, unknown]) runs before the webhook, so the default branch is unreachable through normal admission. The second equality.Semantic.DeepEqual check at line 51 is also dead code today because ContainerSpec has only the State field — but the moment another field is added (immutable ports, immutable labels, etc.) it becomes load-bearing. Either add a one-line comment noting the DeepEqual guards future fields, or drop both dead branches and trust the CRD enum until more fields land. Classification: enhancement, suggestion.

S9. reconcileContainerState silently ignores unknown desired states Claude
// Full state matrix (rows=desired from the spec.state enum, columns=actual
// from Docker). The webhook restricts desired to {unknown, created,
// running}; actual can be any value in the CRD enum.
//
//	desired \ actual  | running | paused   | created | exited | dead | pausing/restarting | removing | unknown
//	------------------+---------+----------+---------+--------+------+--------------------+----------+--------
//	running           | nil     | unpause  | start   | start  | start| start              | start    | start
//	created           | stop    | stop     | nil     | nil    | nil  | stop               | stop     | stop
//
// ContainerStop handles paused/restarting containers natively, so the
// created branch always dispatches Stop from any non-stopped state. The
// running branch handles paused explicitly because Docker rejects
// ContainerStart on a paused container. isStopped() treats
// created/exited/dead as terminal stopped states: the user's intent
// ("do not run") is already satisfied and redispatching Stop would be
// wasted work on every unrelated reconcile event.
func (w *dockerWatcher) reconcileContainerState(ctx context.Context, c *containersv1alpha1.Container) error {
	desired := c.Spec.State
	if desired == containersv1alpha1.ContainerStatusUnknown {
		return nil
	}

	actual := c.Status.Status
	log := logf.FromContext(ctx).WithName("docker-watcher")

	switch desired {
	case containersv1alpha1.ContainerStatusRunning:
		if actual == containersv1alpha1.ContainerStatusRunning {
			return nil
		}
		if actual == containersv1alpha1.ContainerStatusPaused {
			log.Info("Unpausing container", "id", c.Name)
			_, err := w.cli.ContainerUnpause(ctx, c.Name, dockerclient.ContainerUnpauseOptions{})
			return err
		}
		log.Info("Starting container", "id", c.Name)
		_, err := w.cli.ContainerStart(ctx, c.Name, dockerclient.ContainerStartOptions{})
		return err
	case containersv1alpha1.ContainerStatusCreated:
		if isStopped(actual) {
			return nil
		}

Today the webhook restricts desired to {unknown, running, created}, so the switch covers every reachable value. A future relaxation of the enum — or a direct storage manipulation that bypasses admission — would produce a silent no-op. Add a default branch that logs the unexpected state (at Error level) so regressions surface quickly instead of silently dropping state transitions. Classification: gap, suggestion.


Design Observations

Concerns

D1. reconcileContainerSpecs + processFinalizers fan out on every mirror event (in-scope). Claude The existing comment at engine_reconciler.go:177-183 spells out the cost: every Container/Image/Volume watch event triggers a reconcile that Lists every mirrored object to check for spec.state changes and pending finalizers. The long-term fix — per-resource reconcilers with watch predicates for deletionTimestamp!=nil and spec.state changes — is correct. On typical dev machines the load is bounded, but this will need attention before the feature supports larger workloads. Classification: future.

D2. Two field managers to work around SSA spec ownership (in-scope). Claude The explanation at sync_containers.go:144-157 is correct but it documents a design wart: server-side apply can't express "update only finalizers" without releasing spec from the original owner, so a second field manager (engine-controller-finalizer) was introduced. An alternative — keep SSA only for the creation path and switch to a strategic merge patch for finalizer re-assertion — would avoid the two-manager split and match how most controllers add finalizers. Classification: future.

Strengths

S1. Docker-daemon clock for the events Since anchor (in-scope). Claude Codex dockerEventsSince() at docker_watcher.go:107-122 reads Info.SystemTime from the daemon and uses that for the events Since filter rather than time.Now() on the host. The host/guest clock-skew hazard is real for anything running inside Lima; the fallback to a biased host clock is a sensible safety net.

S2. Cross-controller retry-on-conflict for App.Status.Conditions (in-scope). Claude Codex setEngineCondition (engine_reconciler.go:230-255) and the matching AppReconciler helper (app_controller.go:239-259) both re-Get inside a retry.RetryOnConflict loop. That's the right fix for controller-runtime not serializing writes across reconcilers that both touch the App's status.

S3. applyContainer spec-on-create-only with Get-gated Apply (in-scope). Claude The Get-gated WithSpec(unknown) at sync_containers.go:126-131 preserves user spec.state patches across subsequent syncs without needing a second field manager for spec — a cleaner alternative than the finalizer split in D2.

S4. Explicit state-matrix doc on reconcileContainerState (in-scope). Claude The comment at docker_watcher.go:340-360 tabulates every desired×actual combination and explains why paused must be handled for running but not for created. The matching BATS tests at engine.bats:264-320 verify both directions.

S5. Panic recovery + alive()-triggered restart (in-scope). Claude Gemini The recover() block in run() (docker_watcher.go:153-159) falls through to enqueueReconcile, so the next reconcile observes the dead watcher and fulls-syncs a fresh one. It's a clean handoff that avoids permanent lockups on unexpected event shapes.

S6. Port-binding sort to avoid SSA churn (in-scope). Claude Gemini sync_containers.go:197-201 sorts portNames before building the apply config, which was the same anti-pattern S2/S3 above flag at the inner level. It's the right instinct — the inner bindings slice just needs the same treatment.


Testing Assessment

Claude Codex Gemini The new BATS engine tests are well-designed: they cover tag/untag (tagged, dangling, in-use), volume lifecycle, container lifecycle, spec.state transitions including the paused variants, cleanup-on-shutdown, the containerd NotApplicable path, and the rdd set timeout semantics. The comment at engine.bats:272-275 ("park spec.state=unknown before pausing") is a good example of documenting test-only gotchas.

Untested scenarios ranked by risk:

  1. rdd set running=true <extra-prop>=<value> when already running (C1). The simple start/stop flow doesn't exercise the steady-state condition path, which is why the critical regression landed unnoticed. Add a test that sets an unrelated property with running=true from the running state and asserts the command returns well before the default timeout.
  2. rdd set against StartFailed/ConnectFailed/StopFailed (I3). The new default wait has zero coverage on failure paths — the tests only exercise success paths.
  3. docker rename event path (I4). No test renames a container and asserts .status.name refreshes.
  4. containernamespace/moby recreation (I1). The current test at engine.bats:364-370 asserts the broken "stays gone" invariant. Rewrite it to assert the namespace returns after a subsequent docker run.
  5. Engine watcher transient disconnect (related to I6). No test kills dockerd inside the Lima VM (e.g. limactl shell rdd-bats-lima systemctl restart docker) to verify the watcher dies, the reconciler restarts it, and mirrors re-converge without data loss.
  6. Transient Docker Inspect failures during syncAllImages (I5). A unit test with a fake Docker client that returns 503 on selected IDs would demonstrate the current mirror-deletion bug.
  7. Concurrent rdd set races. No test exercises the IsAlreadyExists retry at set.go:354-362 that the "two rdd sets" comment handles.
  8. In-flight Docker events during fullSync. The dockerEventsSince window at docker_watcher.go:71-90 is critical for correctness but has no test that creates/removes resources during the watcher startup window to verify replayed events land.

Codex ran make against the worktree during its review; the engine controller package tests were fine once pkg/guestagent/lima-guestagent.gz was built.


Documentation Assessment

Claude Codex


Commit Structure

Codex Claude This PR is broader than "Container Engine watcher" — the engine watcher itself is bundled with a new CLI wait contract (762c1e8), a large CI support-bundle wrapper (50bdc15, 7c8f356, c484dab, cf4b90e), a BATS fd-inheritance fix (9669b4b), and unrelated Lima template/runtime fixes (32a0109, 536ccb2). Splitting the CI infrastructure into its own PR would make the engine controller history easier to bisect, but at round 5 the review history is already anchored here — flag for the next large feature rather than re-slicing this one. The "Address review round N findings" commits are the repo convention and are fine as-is.


Acknowledged Limitations

Claude Codex


Unresolved Feedback

No unresolved inline review comments from earlier rounds — all of the PR's inline comments are marked resolved or addressed in the Address review round N findings commits.


Agent Performance Retro

Claude

Claude returned the most complete coverage — two important findings (I1, I2) and nine suggestions — and was the only agent to produce Design Observations. The two importants were both shared (with Codex on the ContainerNamespace/moby watch gap, with Gemini on the missing image/volume finalizer retry), but Claude reasoned through the SSA consequences in each case better than the other two. Its biggest miss was the critical C1 bug — the analysis touched setEngineCondition and the generation filter separately but never connected them to the steady-state wantWatcher=true && watcherRunning=true path. Claude also missed I3/I4/I5/I6. The per-file coverage table was exhaustive, the suggestions were tightly scoped (no speculative refactors), and every finding carried a concrete diff.

Codex

Codex found three unique important issues: the ContainerNamespace/moby watch gap (shared with Claude), the rdd set failure-state mishandling (I3), and the dropped docker rename event (I4). Both I3 and I4 were missed by the other two agents, and I4 in particular is the kind of event-taxonomy bug that's only visible if you go read the Docker SDK event enum — Codex did. It ran make against the worktree and briefly hit a build error (missing pkg/guestagent/lima-guestagent.gz) which it recovered from. Zero suggestions and zero design observations, which is on-brand — Codex stays at the important-bug bar and doesn't pad. Missed C1, I2, I5, I6.

Gemini

Gemini was the only agent to catch C1 — the critical rdd set hang — and its reasoning on how the generation filter interacts with the !watcherRunning branch was tight and correct. It also found I5 (transient Inspect failures deleting valid Image mirrors) and I6 (dropped events on handleEvent errors), both of which the other two agents missed. Gemini hit a tool error during its shell-call loop — one bash invocation mis-parsed a chunk of the bats-with-timeout.sh patch as a filename, producing ENAMETOOLONG: name too long, stat '/private/var/folders/.../review-gemini-pass-1-.../ &\n+cmd_pid=$!\n...'. The error didn't block the review; findings still landed. Zero suggestions and zero design observations. Per known behaviour, Gemini skipped git blame for regression classification, but the findings were all clearly regression-scoped from the diff alone.

Summary

Claude Opus 4.6Codex GPT 5.4Gemini 3.1 Pro
Duration21m 49s6m 37s4m 28s
Findings2I 9S3I1C 3I
Tool calls72 (Read 36, Bash 19, Grep 9)45 (shell 42, stdin 3)
Design observations200
False positives000
Unique insights923
Files reviewed303030
Coverage misses000
Totals2I 9S3I1C 3I
Downgraded000
Dropped000

Overall assessment. Gemini was the highest-value agent this round — it alone caught the critical regression that would hang the new default wait, and it also found two important issues nobody else saw. Codex matched it on unique-important count (I3, I4) and did the deepest event-taxonomy work. Claude delivered the broadest coverage and the most suggestions, but on the critical path it missed what Gemini caught. Without Gemini, the critical C1 bug would have shipped; without Codex, the failure-state wait and the rename event would have shipped. All three were necessary to get the full picture this round.


Review Process Notes

Skill improvements

None this round. The critical finding C1 was caught by one agent; the missed-by-two-agents pattern is already handled by the "unique findings" heuristic in the prompt. No new dimension or instruction would have helped Claude or Codex see C1 without also increasing noise.

Repo context updates

None this round. The existing deep-review-context.md adequately framed the cross-controller status-condition mechanics (the part that matters for C1) and the SSA field-manager patterns (the part that matters for I2). Both agents that missed C1 had access to the same context; the miss was an analysis gap, not a context gap.