Deep Review: 20260412-203409-pr-295

Date2026-04-12 20:34
Reporancher-sandbox/rancher-desktop-daemon
Round4 (of PR #295)
Author@jandubois
PR#295 — Container Engine watcher
Commits33db27d Address review round 3 findings for PR #295
cf4b90e ci: disable Spotlight indexing on macOS BATS runners
c484dab ci: capture memory pressure and API state in support bundle
3196b24 bats: park spec.state=unknown before pausing test-state
84c0736 engine: apply Container/Image/Volume capitalization convention
…plus 14 earlier commits
ReviewersClaude Opus 4.6, Codex GPT 5.4, Gemini 3.1 Pro
VerdictMerge with fixes — the spec.state=created convergence bug (I1) fires redundant ContainerStop calls on every unrelated watch event, and the App controller's blind Status().Update (I2) now races the new setEngineCondition writer. Neither is a functional break but both deserve a fix before merge.
Wall-clock time57 min 34 s


Executive Summary

This round adds an engine controller that mirrors Docker state (Container, Image, Volume) into the rdd Kubernetes API, forwards user-initiated deletes back to Docker, and makes rdd set wait for the desired state by default. The core structure — dockerWatcher for Docker-side I/O, EngineReconciler for controller-runtime integration — is clean and carefully commented, and the BATS coverage is thorough. Review round 3's findings have been addressed.

Structure: 0 critical, 4 important, 13 suggestions, 3 design observations.

The dominant issue is a single line of state-matching logic in reconcileContainerState: once a Container with spec.state=created reaches Docker's exited state, desired != actual keeps dispatching ContainerStop on every reconcile. This was independently caught by all three agents. Three other Important findings surface real concurrency or drift bugs that happen to be structural rather than catastrophic: the App controller's pre-existing Status().Update is no longer safe now that the engine controller writes the same conditions, rdd set's watch predicate ignores the ObservedGeneration the engine already stamps, and the status.ports list is built in non-deterministic map-iteration order which churns atomic-list SSA applies.


Critical Issues

None.


Important Issues

I1. reconcileContainerState dispatches redundant ContainerStop every reconcile once a created-desired container reaches exited Claude Codex Gemini
		return w.k8s.Update(ctx, latest)
	})
	if retryErr != nil {
		return fmt.Errorf("failed to remove finalizer from %s: %w", name, retryErr)
	}
	deleteTarget := obj.DeepCopyObject().(client.Object)
	deleteTarget.SetName(name)
	deleteTarget.SetNamespace(apiNamespace)
	return client.IgnoreNotFound(w.k8s.Delete(ctx, deleteTarget))
}

// reconcileContainerState checks the Container's spec.state against the
// observed status and dispatches the Docker action that bridges them.
// The engine creates Containers with spec.state="unknown", which the
// reconciler ignores.
//
// Full state matrix (rows=desired from the spec.state enum, columns=actual
// from Docker). The webhook restricts desired to {unknown, created,
// running}; actual can be any value in the CRD enum.
//
//	desired \ actual  | running | paused   | created | exited | dead | pausing/restarting | removing | unknown
//	------------------+---------+----------+---------+--------+------+--------------------+----------+--------
//	running           | nil     | unpause  | start   | start  | start| start              | start    | start
//	created           | stop    | stop     | nil     | nil    | nil  | stop               | stop     | stop
//
// ContainerStop handles paused/restarting containers natively, so the
// created branch always dispatches Stop from any non-stopped state. The
// running branch handles paused explicitly because Docker rejects
// ContainerStart on a paused container. isStopped() treats
// created/exited/dead as terminal stopped states: the user's intent
// ("do not run") is already satisfied and redispatching Stop would be
// wasted work on every unrelated reconcile event.
func (w *dockerWatcher) reconcileContainerState(ctx context.Context, c *containersv1alpha1.Container) error {
	desired := c.Spec.State
	if desired == containersv1alpha1.ContainerStatusUnknown {
		return nil
	}

	actual := c.Status.Status
	log := logf.FromContext(ctx).WithName("docker-watcher")

	switch desired {
	case containersv1alpha1.ContainerStatusRunning:
		if actual == containersv1alpha1.ContainerStatusRunning {
			return nil
		}
		if actual == containersv1alpha1.ContainerStatusPaused {

After the user patches spec.state=created on a running container, ContainerStop transitions the container to Docker's exited state (not created), and applyContainer writes status.status=exited at sync_containers.go:157. On the next reconcile — triggered by any Container/Image/Volume watch event via SetupWithManager's unconditional enqueueApp mapping — desired == "created" and actual == "exited", so the desired != actual check at line 341 falls through and dispatches ContainerStop again. Docker's ContainerStop on a non-running container returns NotModified which the client treats as success, so no error propagates — but the reconciler never reaches a quiescent state for this container. Cost scales with the rate of unrelated watch events times the number of created-desired stuck containers. The same issue applies to actual == "dead" or actual == "created" when desired == "created".

		finalizerOnly := containersv1alpha1apply.Container(inspect.ID, apiNamespace).
			WithFinalizers(mirrorFinalizer)
		if err := w.k8s.Apply(ctx, finalizerOnly,
			client.ForceOwnership, client.FieldOwner(finalizerFieldOwner)); err != nil {
			return fmt.Errorf("failed to reassert container finalizer %s: %w", inspect.ID, err)
		}
	}

	// Build status.
	statusApply := containersv1alpha1apply.ContainerStatus().
		WithName(name).

The new BATS test at engine.bats:231-248 only verifies the one-shot transition to exited and does not observe the subsequent churn, so this isn't caught by CI.

        "${image_ref}" --timeout=30s
}

# --- Container state transitions via spec ---

@test "patching spec.state=created stops Docker container" {
    docker run -d --name test-state busybox sleep 3600

    run -0 docker inspect test-state --format '{{.Id}}'
    cid=${output}

    rdd ctl wait --for=jsonpath='{.status.status}'=running \
        --namespace="${NAMESPACE}" container/"${cid}" --timeout=30s

    rdd ctl patch container "${cid}" --namespace="${NAMESPACE}" \
        --type=merge -p '{"spec":{"state":"created"}}'

    rdd ctl wait --for=jsonpath='{.status.status}'=exited \
        --namespace="${NAMESPACE}" container/"${cid}" --timeout=30s

    run -0 docker inspect test-state --format '{{.State.Status}}'
    assert_output "exited"
}

@test "patching spec.state=running restarts Docker container" {
    run -0 docker inspect test-state --format '{{.Id}}'
    cid=${output}

Fix: treat any stopped state (created, exited, dead) as satisfying desired=created, symmetric with the unpause handling in the Running branch at lines 354-363.

 func (w *dockerWatcher) reconcileContainerState(ctx context.Context, c *containersv1alpha1.Container) error {
     desired := c.Spec.State
     if desired == containersv1alpha1.ContainerStatusUnknown {
         return nil
     }

     actual := c.Status.Status
-    if desired == actual {
-        return nil
-    }
+    // Treat any stopped state as satisfying desired=created.
+    if desired == containersv1alpha1.ContainerStatusCreated {
+        switch actual {
+        case containersv1alpha1.ContainerStatusCreated,
+            containersv1alpha1.ContainerStatusExited,
+            containersv1alpha1.ContainerStatusDead:
+            return nil
+        }
+    }
+    if desired == actual {
+        return nil
+    }

(Gemini flagged this as Critical on the grounds of "Docker API spam loop"; Claude and Codex called it Important. Calibrated as Important here: the loop is bounded by unrelated event frequency, NotModified is a silent success, and no error propagates — the cost is wasted RPCs, not a hang or failure.)

I2. App.Status.Conditions now has two writers but the AppReconciler still uses a blind Status().Update Codex
		if err := r.Update(ctx, limaVM); err != nil {
			return ctrl.Result{}, fmt.Errorf("failed to propagate running state to LimaVM: %w", err)
		}
		return ctrl.Result{}, nil
	}

	// Mirror LimaVM status conditions onto the App status. The engine
	// reconciler writes ContainerEngineReady on the same object, so the
	// App's resourceVersion from the initial Get can be stale by the time
	// we write. retry.RetryOnConflict + re-Get matches the pattern in
	// EngineReconciler.setEngineCondition so concurrent writers no longer
	// 409-loop through controller-runtime requeue. Truncate messages
	// defensively: the LimaVM controller already truncates at source, but
	// guarding here ensures a future bypass can't cause the App status
	// update to fail CRD validation.
	if err := retry.RetryOnConflict(retry.DefaultRetry, func() error {
		latest := &v1alpha1.App{}
		if err := r.Get(ctx, client.ObjectKeyFromObject(&app), latest); err != nil {
			return err
		}
		changed := false
		for _, cond := range limaVM.Status.Conditions {
			changed = apimeta.SetStatusCondition(&latest.Status.Conditions, metav1.Condition{
				Type:               cond.Type,
				Status:             cond.Status,
				Reason:             cond.Reason,
				Message:            base.TruncateConditionMessage(cond.Message),
				ObservedGeneration: latest.Generation,
				LastTransitionTime: cond.LastTransitionTime,
			}) || changed
		}
		if !changed {
			return nil

This PR adds a second writer to App.Status.Conditions via setEngineCondition (engine_reconciler.go:223-240), which correctly wraps its update in retry.RetryOnConflict with a re-Get. The existing AppReconciler still does not: the comment at line 230-231 claims the resourceVersion from the initial Get is still current, which was true when the App controller was the sole writer but is no longer true once the engine reconciler can land a ContainerEngineReady update in between the Get and the Update. The repo overview explicitly calls this out: *"App.Status.Conditions has multiple concurrent writers (AppReconciler mirrors LimaVM conditions; EngineReconciler writes ContainerEngineReady). Flag any non-SSA, non-retry-on-conflict Status().Update on the App object."*

	if w != nil {
		w.stop()
	}
}

// setEngineCondition updates the ContainerEngineReady condition on the
// App resource. The App controller also writes App.Status.Conditions
// (to mirror LimaVM conditions) and controller-runtime does not
// serialize reconciles across controllers, so a naive Update can race
// and 409. retry.RetryOnConflict plus a re-Get inside the loop is the
// same pattern used elsewhere in this file (see removeMirrorResource,
// deleteAllOfType).
func (r *EngineReconciler) setEngineCondition(ctx context.Context, app *appv1alpha1.App, status metav1.ConditionStatus, reason, message string) error {
	return retry.RetryOnConflict(retry.DefaultRetry, func() error {
		latest := &appv1alpha1.App{}
		if err := r.Get(ctx, client.ObjectKey{Name: app.Name}, latest); err != nil {
			// The App can be deleted concurrently with the reconciler
			// (e.g., rdd svc delete mid-reconcile). Treat NotFound as a
			// no-op so we do not propagate a spurious error that would
			// requeue only to be IgnoreNotFound-dropped on the next
			// reconcile.
			if apierrors.IsNotFound(err) {
				return nil
			}
			return err
		}
		changed := meta.SetStatusCondition(&latest.Status.Conditions, metav1.Condition{
			Type:               conditionContainerEngineReady,

This isn't catastrophic — controller-runtime requeues on the returned conflict error and the next reconcile re-Gets — but it turns routine status churn into avoidable 409s and noisy reconcile-error logs.

Fix: convert the App controller's status write to the same retry.RetryOnConflict + re-Get pattern used in setEngineCondition.

if statusChanged {
    if err := retry.RetryOnConflict(retry.DefaultRetry, func() error {
        latest := &v1alpha1.App{}
        if err := r.Get(ctx, client.ObjectKeyFromObject(&app), latest); err != nil {
            return err
        }
        for _, cond := range limaVM.Status.Conditions {
            apimeta.SetStatusCondition(&latest.Status.Conditions, metav1.Condition{
                Type:               cond.Type,
                Status:             cond.Status,
                Reason:             cond.Reason,
                Message:            base.TruncateConditionMessage(cond.Message),
                ObservedGeneration: latest.Generation,
                LastTransitionTime: cond.LastTransitionTime,
            })
        }
        return r.Status().Update(ctx, latest)
    }); err != nil {
        log.Error(err, "Unable to update App status")
        return ctrl.Result{}, err
    }
}
I3. rdd set running=true wait ignores the ObservedGeneration the engine already stamps Claude

// waitForDesiredState waits for the App to reach the state implied by the
// properties that were set. If running=true was set, it waits for the
// container engine to be ready. If running=false was set, it waits for the
// App's Running condition to go to False — i.e. the VM has actually
// stopped, which is stricter than "container engine disconnected".
// Other property changes do not wait.
//
// minGen is the App's metadata.generation after our write. The predicate
// only accepts condition snapshots whose ObservedGeneration >= minGen,
// so a stale ContainerEngineReady=True from a previous backend (or a
// stale Running=False from a prior stop) cannot satisfy the wait before
// the reconciler has observed the new spec. EngineReconciler and
// AppReconciler both stamp ObservedGeneration when they update conditions.
//
// TODO: changes that trigger a VM restart without changing "running" (e.g.
// containerEngine.name alone) are still not waited on. A dedicated
// "Reconciled" condition on App would let the CLI detect when the full
// reconcile chain has settled for any property change.
func waitForDesiredState(ctx context.Context, config *rest.Config, properties map[string]string, timeout time.Duration, minGen int64) error {
	runningVal, ok := properties["running"]
	if !ok {
		return nil
	}

	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	if runningVal == "true" {
		logrus.Info("Waiting for container engine to be ready")
		return watchCondition(ctx, config, func(obj *unstructured.Unstructured) bool {
			status, gen, present := conditionInfo(obj, "ContainerEngineReady")

The predicate only checks Status == True and has no generation handoff from the preceding patch. The "Known hazard" comment at cmd/rdd/set.go:445-452 acknowledges that rdd set containerEngine.name=<other> running=true on an already-running VM can be satisfied by the stale Connected condition left over from the previous backend — but setEngineCondition already writes ObservedGeneration: latest.Generation at engine_reconciler.go:234, so the fix is cheap: capture the generation from the patch response in setAction and compare it in the predicate. The waitForDesiredState signature needs a minGen int64 parameter, patchApp/createAndPatchApp need to return the post-patch generation, and watchCondition's predicate gains cond.ObservedGeneration >= minGen.


	if dryRun {
		logrus.Info("App validated (dry run)")
	} else {
		logrus.Info("App updated")
	}
	return app.Generation, nil
}

// waitForDesiredState waits for the App to reach the state implied by the
// properties that were set. If running=true was set, it waits for the
// container engine to be ready. If running=false was set, it waits for the
// App's Running condition to go to False — i.e. the VM has actually
// stopped, which is stricter than "container engine disconnected".
// Other property changes do not wait.
//
// minGen is the App's metadata.generation after our write. The predicate
// only accepts condition snapshots whose ObservedGeneration >= minGen,
// deleteAllOfType).
func (r *EngineReconciler) setEngineCondition(ctx context.Context, app *appv1alpha1.App, status metav1.ConditionStatus, reason, message string) error {
	return retry.RetryOnConflict(retry.DefaultRetry, func() error {
		latest := &appv1alpha1.App{}
		if err := r.Get(ctx, client.ObjectKey{Name: app.Name}, latest); err != nil {
			// The App can be deleted concurrently with the reconciler
			// (e.g., rdd svc delete mid-reconcile). Treat NotFound as a
			// no-op so we do not propagate a spurious error that would
			// requeue only to be IgnoreNotFound-dropped on the next
			// reconcile.
			if apierrors.IsNotFound(err) {

Without the fix, rdd set containerEngine.name=containerd (without changing running) is the worst case: no wait triggers at all because no running argument is set, even though a full VM restart follows. With running=true included, the wait happens but can be satisfied by a pre-patch snapshot.

Fix sketch (conceptual, paired with I4 in rdd set.go):

// In setAction, after patchApp/createAndPatchApp:
var app appv1alpha1.App
if err := c.Get(ctx, client.ObjectKey{Name: "app"}, &app); err != nil {
    return fmt.Errorf("failed to get App after patch: %w", err)
}
minGen := app.Generation

// In waitForDesiredState predicate:
return conditionStatusForGeneration(obj, "ContainerEngineReady", metav1.ConditionTrue, minGen)

Claude flagged this as Important because the fix is already available (the stamp exists), not because the hazard itself is new.

I4. status.ports is built in non-deterministic map-iteration order and churns the atomic list on every sync Claude

	if inspect.State.Error != "" {
		statusApply.WithError(inspect.State.Error)
	}

	if t, err := time.Parse(time.RFC3339Nano, inspect.Created); err == nil {
		statusApply.WithCreatedAt(metav1.NewTime(t))
	}
	if t, err := time.Parse(time.RFC3339Nano, inspect.State.StartedAt); err == nil && !t.IsZero() {
		statusApply.WithStartedAt(metav1.NewTime(t))
	}
	if t, err := time.Parse(time.RFC3339Nano, inspect.State.FinishedAt); err == nil && !t.IsZero() {
		statusApply.WithFinishedAt(metav1.NewTime(t))
	}

	// Port bindings. Go map iteration is randomized, so sort by port
	// name before building the apply slice: even with
	// +listType=map +listMapKey=name on Status.Ports, the entries land
	// in the stored object in the order we provide, and a stable order
	// avoids spurious resourceVersion churn when consumers render the
	// list or diff it.
	var applyPorts []*containersv1alpha1apply.ContainerPortApplyConfiguration
	if inspect.NetworkSettings != nil {
		portNames := make([]mobynetwork.Port, 0, len(inspect.NetworkSettings.Ports))
		for p := range inspect.NetworkSettings.Ports {

inspect.NetworkSettings.Ports is a Go map, so iteration order is randomized per process. The Container CRD declares status.ports as a plain array with no x-kubernetes-list-type annotation (verified against pkg/controllers/containers/container/crd.yaml: only status.conditions carries x-kubernetes-list-type: map). Lists without a list-type default to atomic, which is position-sensitive: the apiserver treats [A, B] and [B, A] as different values and produces a new resourceVersion on every apply that reorders them. That fires a watch event, which the engine reconciler catches via Watches(&containersv1alpha1.Container{}, enqueueApp), which re-enters reconcileContainerSpecs + processFinalizers — a four-list sweep per watch event. For a container with two or more exposed ports, every unrelated Docker event (image pull, volume create, etc.) can amplify into a status re-apply and a full reconcile sweep.

Fix: sort inspect.NetworkSettings.Ports by port name before building the apply slice. The inner bindings loop is Docker-ordered (IPv4 then IPv6) and does not need sorting unless Docker's internal order changes.

 if inspect.NetworkSettings != nil {
+    portNames := make([]mobynat.Port, 0, len(inspect.NetworkSettings.Ports))
+    for p := range inspect.NetworkSettings.Ports {
+        portNames = append(portNames, p)
+    }
+    sort.Slice(portNames, func(i, j int) bool { return portNames[i] < portNames[j] })
-    for portName, ports := range inspect.NetworkSettings.Ports {
+    for _, portName := range portNames {
+        ports := inspect.NetworkSettings.Ports[portName]

Alternatively, annotate Ports with +listType=map + +listMapKey=name in container_types.go so the apiserver merger ignores ordering — this is more robust but requires a CRD regeneration and is a larger change.


Suggestions

S1. DockerSocket() silently swallows os.MkdirAll error Claude
// neither — or the watcher will open a different socket from the one
// Lima is port-forwarding to, and mirroring will silently diverge per
// instance. Until then, multi-instance BATS runs cannot share the
// same host and the engine.bats test has a parallel TODO at line
// 12-14.
var DockerSocket = sync.OnceValue(func() string {
	home, err := os.UserHomeDir()
	if err != nil {
		panic(fmt.Errorf("could not get home directory: %w", err))
	}
	dir := filepath.Join(home, ".rd2")
	_ = os.MkdirAll(dir, 0o700)
	return filepath.Join(dir, "docker.sock")
})

If MkdirAll fails (permission denied, read-only filesystem), the error is discarded and the returned path may point at a non-existent parent. dockerclient.WithHost("unix://" + socketPath) in docker_watcher.go:47 then fails at connect time with a cryptic "no such file or directory". Sibling helpers (LogDir, ShortDir, etc.) don't do MkdirAll — the directory is really Lima's responsibility, and creating it here is inconsistent.

// newDockerWatcher creates a Docker client, performs a full sync, and starts
// the event stream watcher goroutine.
func newDockerWatcher(ctx context.Context, k8s client.Client, reconcileChan chan<- event.GenericEvent) (*dockerWatcher, error) {
	socketPath := instance.DockerSocket()
	cli, err := dockerclient.New(
		dockerclient.WithHost("unix://" + socketPath),
	)
	if err != nil {
		return nil, fmt.Errorf("failed to create Docker client: %w", err)
	}

Fix: drop the MkdirAll call entirely and let Lima own the directory lifecycle.

S2. setEngineCondition does not tolerate App deletion during retry Claude
	if w != nil {
		w.stop()
	}
}

// setEngineCondition updates the ContainerEngineReady condition on the
// App resource. The App controller also writes App.Status.Conditions
// (to mirror LimaVM conditions) and controller-runtime does not
// serialize reconciles across controllers, so a naive Update can race
// and 409. retry.RetryOnConflict plus a re-Get inside the loop is the
// same pattern used elsewhere in this file (see removeMirrorResource,
// deleteAllOfType).
func (r *EngineReconciler) setEngineCondition(ctx context.Context, app *appv1alpha1.App, status metav1.ConditionStatus, reason, message string) error {
	return retry.RetryOnConflict(retry.DefaultRetry, func() error {
		latest := &appv1alpha1.App{}
		if err := r.Get(ctx, client.ObjectKey{Name: app.Name}, latest); err != nil {
			// The App can be deleted concurrently with the reconciler
			// (e.g., rdd svc delete mid-reconcile). Treat NotFound as a
			// no-op so we do not propagate a spurious error that would
			// requeue only to be IgnoreNotFound-dropped on the next
			// reconcile.
			if apierrors.IsNotFound(err) {
				return nil
			}
			return err
		}
		changed := meta.SetStatusCondition(&latest.Status.Conditions, metav1.Condition{
			Type:               conditionContainerEngineReady,
			Status:             status,

Reconcile wraps its initial Get in client.IgnoreNotFound, but if the App is deleted between that Get and a later setEngineCondition call (e.g., rdd svc delete runs concurrently), the inner Get returns NotFound and retry.RetryOnConflict does not treat NotFound as retriable — the error propagates up and the reconciler returns it, causing a gratuitous requeue-then-bail.

Fix: treat NotFound as a no-op inside the retry closure.

 return retry.RetryOnConflict(retry.DefaultRetry, func() error {
     latest := &appv1alpha1.App{}
     if err := r.Get(ctx, client.ObjectKey{Name: app.Name}, latest); err != nil {
+        if apierrors.IsNotFound(err) {
+            return nil
+        }
         return err
     }
S3. deleteImage may untag the wrong image in a tag-race Claude
		_, err := w.cli.ContainerStop(ctx, c.Name, dockerclient.ContainerStopOptions{})
		return err
	}
	return nil
}

// isStopped reports whether a container status represents a terminal
// non-running state. Every state listed here satisfies a desired state
// of "created": the container is not running, cannot transition without
// a user action (start/rm), and dispatching ContainerStop again would
// be a no-op.
func isStopped(s containersv1alpha1.ContainerStatusValue) bool {
	switch s {
	case containersv1alpha1.ContainerStatusCreated,
		containersv1alpha1.ContainerStatusExited,
		containersv1alpha1.ContainerStatusDead:
		return true
	}
	return false
}

// deleteContainer removes a container from Docker. NotFound errors are

Consider: (1) Image A is tagged busybox:latest, mirror A holds Status.RepoTag=busybox:latest. (2) docker pull busybox:latest brings in image B, moves the tag, emits untag(A)+tag(B). (3) Before the watcher processes those events, the user deletes mirror A. processImageFinalizers reads the stale mirror A, calls ImageRemove("busybox:latest"), which now hits image B and untags it. The ID-based path would have untagged A correctly.

The intent of using the tag is presumably to untag-only rather than remove — so that a multi-tagged image keeps its other tags. A safer approach re-inspects the image and verifies Inspect().RepoTags still contains the tag before calling ImageRemove, or removes by status.ID + a ForceOwnership: false flag.

S4. Image finalizer path has no empty-status guard, asymmetric with the Volume path Claude

processVolumeFinalizers at lines 426-455 guards w.deleteVolume(ctx, v.Status.Name) with if v.Status.Name != "" because a bare-skeleton mirror has no engine name to forward to. The Image path has no equivalent guard: if a user directly creates an Image mirror via kubectl (with the mirror finalizer) and then deletes it before status is populated, deleteImage passes an empty string to ImageRemove, which the daemon rejects. Since the Container webhook only intercepts Update not Create, such a bare mirror is technically reachable.

			}
			return r.Update(ctx, latest)
		})
		if retryErr != nil {
			errs = append(errs, fmt.Errorf("failed to remove finalizer from Container %s: %w", c.Name, retryErr))
		}
	}
	return errors.Join(errs...)
}

func (r *EngineReconciler) processImageFinalizers(ctx context.Context, w *dockerWatcher) error {
	var images containersv1alpha1.ImageList
	if err := r.List(ctx, &images, client.InNamespace(apiNamespace)); err != nil {
		return err
	}
	var errs []error
	for i := range images.Items {
		img := &images.Items[i]
		if img.DeletionTimestamp == nil || !hasFinalizer(img, mirrorFinalizer) {
			continue
		}
		// An Image mirror with no engine-populated status has no Docker
		// reference to forward the delete to — either a user created it
		// as a bare skeleton or it landed in the startup race window
		// before applyImage ran. Symmetric with processVolumeFinalizers'
		// empty-Status.Name guard: strip the finalizer and let the
		// Delete proceed.
		if img.Status.ID != "" || img.Status.RepoTag != "" {
			if err := w.deleteImage(ctx, img); err != nil {
				errs = append(errs, fmt.Errorf("failed to delete image %s from Docker: %w", img.Name, err))
				continue
			}
		}
		if removeFinalizer(img, mirrorFinalizer) {
			if err := client.IgnoreNotFound(r.Update(ctx, img)); err != nil {
				errs = append(errs, fmt.Errorf("failed to remove finalizer from Image %s: %w", img.Name, err))
			}
		}
	}
	return errors.Join(errs...)

Fix: symmetrically guard the Image path with img.Status.ID != "" || img.Status.RepoTag != "".

S5. parseContainerName("/") returns ("moby", "") which downstream CRD validation would reject Claude

The table test at sync_containers_test.go:42-51 pins the "/" result as ("moby", ""). That flows into ContainerStatus().WithName("") at sync_containers.go:151, and status.name is +required in container_types.go:73-75, so the apiserver would reject the apply with a validation error. In practice inspect.Name from Docker is never / because Docker always generates a name, so this is defensive only. But the current test pins a result that in the hot path would produce an invalid mirror.

			wantNamespace: containerNamespace,
			wantName:      "",
		},
		{
			input:         "/",
			wantNamespace: containerNamespace,
			wantName:      "",
		},
	}
	for _, tc := range tests {
		t.Run(tc.input, func(t *testing.T) {
			ns, name := parseContainerName(tc.input)
			if ns != tc.wantNamespace || name != tc.wantName {
				t.Errorf("parseContainerName(%q) = (%q, %q), want (%q, %q)",
					tc.input, ns, name, tc.wantNamespace, tc.wantName)
			}
		})
	}
}
		// controllerName. SSA treats fields absent from an apply config
		// as released by that manager; releasing spec from controllerName
		// would prune spec (no other manager owns it) and fail the
		// required-field validation on the Container CRD. A separate
		// manager that only ever touches finalizers keeps controllerName's
		// spec ownership untouched.
		finalizerOnly := containersv1alpha1apply.Container(inspect.ID, apiNamespace).
			WithFinalizers(mirrorFinalizer)
		if err := w.k8s.Apply(ctx, finalizerOnly,
			client.ForceOwnership, client.FieldOwner(finalizerFieldOwner)); err != nil {
			return fmt.Errorf("failed to reassert container finalizer %s: %w", inspect.ID, err)
	// +kubebuilder:validation:Enum=created;running;unknown
	State ContainerStatusValue `json:"state"`
}

// ContainerStatus defines the observed state of the container.
type ContainerStatus struct {
	// Name of the container; this is distinct from the container ID.
	//
	// +required
	Name string `json:"name"`
	// Namespace is the container namespace; refers to a `ContainerNamespace`
	// object in the same Kubernetes namespace.
	//

Fix: return an explicit error or fall back to a synthetic name (e.g., the container ID). A test assertion + a clear error is better than silently producing an invalid apply.

S6. applyContainer reasserts the finalizer even on a terminating mirror Claude

If a user has initiated kubectl delete container/... (mirror has DeletionTimestamp) and Docker then emits a container event (the container dies on its own), applyContainer reaches the else branch and tries to re-assert the mirror finalizer via SSA. Adding a finalizer to a deleting object is typically a no-op under SSA (the finalizer is still present), but if any other field manager stripped it the reassertion fails with a cryptic error and the next Docker event for the same container repeats the failure.

Fix: skip the finalizer reassertion when existing.DeletionTimestamp != nil.

S7. Docker state mapping assumes the moby enum is a strict subset of the CRD enum Claude
		finalizerOnly := containersv1alpha1apply.Container(inspect.ID, apiNamespace).
			WithFinalizers(mirrorFinalizer)
		if err := w.k8s.Apply(ctx, finalizerOnly,
			client.ForceOwnership, client.FieldOwner(finalizerFieldOwner)); err != nil {
			return fmt.Errorf("failed to reassert container finalizer %s: %w", inspect.ID, err)
		}
	}

	// Build status.
	statusApply := containersv1alpha1apply.ContainerStatus().
		WithName(name).

inspect.State.Status is a free-form Docker string, directly cast into the CRD enum without validation. The CRD enum at container_types.go:11 currently covers every documented Docker state, but a future Docker release that introduces a new state string (initializing, error, etc.) would cause SSA applies to fail with CRD validation errors, the per-container error would be logged and skipped by syncAllContainers:57-59, and that container's mirror would silently fall behind.


import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// +kubebuilder:validation:Enum=created;running;pausing;paused;restarting;removing;exited;dead;unknown

// ContainerStatusValue describes the status of a container.
type ContainerStatusValue string

// Possible values for ContainerStatusValue.

Fix: whitelist the known states and fall back to unknown on mismatch, so forward-compat with Docker is explicit rather than implicit.

S8. setAction does not hand off the post-patch generation to waitForDesiredState Claude

Pairs with I3: neither createAndPatchApp nor patchApp returns the metadata.generation of the post-write App object, so waitForDesiredState has no way to filter stale condition snapshots in the informer's initial list. The fix is a small plumbing change — return app.Generation from the patch path and thread it into watchCondition's predicate.

S9. Gemini: Missing channel closure check in Docker watcher loop Gemini
				return
			}
			log.Error(err, "Docker event stream error")
			w.enqueueReconcile()
			return
		case msg, ok := <-result.Messages:
			if !ok {
				// The daemon closed the event stream without writing to
				// result.Err. Treat the same way as an error: wake the
				// reconciler so it observes the dead watcher and starts
				// a fresh one.
				log.Info("Docker event stream closed")
				w.enqueueReconcile()
				return
			}
			if err := w.handleEvent(ctx, msg); err != nil {

If the messages channel closes (daemon disconnect), the select continuously yields a zero-value events.Message. handleEvent safely ignores unknown event types (default → return nil), so this isn't a hard spin — the loop will also see result.Err close and terminate — but explicit msg, ok := <-result.Messages with a closed-channel exit is safer and matches the Go idiom.

S10. processContainerFinalizers uses a blind r.Update for finalizer removal Gemini

removeMirrorResource wraps its finalizer-strip in retry.RetryOnConflict; processContainerFinalizers does not. On a 409, the whole reconcile requeues and re-runs deleteContainer, which is idempotent but wastes an API round-trip. Consistency across finalizer paths would simplify reasoning.

S11. bats-with-timeout.sh gives no context when the rdd binary is missing Claude
instance="${RDD_INSTANCE:-2}"

# Locate the rdd binary relative to this script rather than via PATH.
# CI runners do not add <repo>/bin to PATH, and bats targets invoke
# us with `../scripts/bats-with-timeout.sh` from the bats/ directory.
script_dir=$(cd "$(dirname "$0")" && pwd)
rdd_bin="${script_dir}/../bin/rdd"
if [ ! -x "$rdd_bin" ] && [ -x "${rdd_bin}.exe" ]; then
    rdd_bin="${rdd_bin}.exe"
fi
if [ ! -x "$rdd_bin" ]; then
    echo "bats-with-timeout: rdd binary not found at $rdd_bin; run 'make' first" >&2
    exit 2
fi

# `rdd svc paths log_dir` is a pure local computation (see
# cmd/rdd/service_paths.go): it resolves the path from the instance
# suffix without touching the running service, so it is safe to call
# even when the target under test is hung.
log_dir=$("$rdd_bin" svc paths log_dir)
mkdir -p "${log_dir}"
bundle_file="${log_dir}/support-bundle.log"

If neither bin/rdd nor bin/rdd.exe exists (a developer running the script before make), the $() subshell fails with a generic shell error and set -o errexit aborts without explaining what went wrong.

Fix: [ -x "$rdd_bin" ] || { echo "bats-with-timeout: rdd binary not found at $rdd_bin; run 'make' first" >&2; exit 2; } before the command.

S12. pgid log field asserts Unix-only invariant in prose, not in code Claude

The log field hardcodes pgid to haCmd.Process.Pid on the assumption that process.SetGroup makes the child a process-group leader. The comment explains the assumption but doesn't cross-reference SetGroup, so a future change that drops Setpgid would silently break downstream tooling (the new bats-with-timeout.sh pgid matching).

Fix: query the real pgid via syscall.Getpgid(haCmd.Process.Pid) after Start() and log the result. One extra syscall for a defensible invariant.

S13. DOCKER_HOST is a file-level export in engine.bats Claude

Top-level assignments run at file load and export DOCKER_HOST globally to every test in the file (and any subshell). This is harmless today but will leak through if another BATS suite ever sources this file. The documented .rd2 TODO replacement will be cleaner if DOCKER_HOST is scoped to local_setup_file instead of being module-level.


Design Observations

Concerns

  1. Engine reconciler iterates all containers on every unrelated watch eventpkg/controllers/app/engine/controllers/engine_reconciler.go:170-182 Claude Codex (future). The author's TODO at lines 170-176 acknowledges this. Concrete cost: every Container/Image/Volume event triggers a full reconcileContainerSpecs + processFinalizers sweep (4 List calls + O(N) Docker RPCs in the stuck-state case — see I1). Splitting into per-kind reconcilers with watch predicates (deletion timestamp set, spec.state changed) would bound cost to O(1) per event. Not blocking, but I1 is a direct consequence of this design.
  2. Engine namespace and containernamespace/moby are not repaired after user-initiated deletepkg/controllers/app/engine/controllers/engine_reconciler.go:457-475, pkg/controllers/app/engine/controllers/sync_volumes.go:56-66 Codex (in-scope). ensureNamespace and syncContainerNamespace run only in fullSync at watcher startup; SetupWithManager does not watch corev1.Namespace or ContainerNamespace. A user delete of either resource is therefore not repaired until the Docker watcher is torn down and restarted. For the moby ContainerNamespace this is documented at sync_volumes.go:56-61 as a deliberate design choice (to avoid a finalizer trap), and the BATS test at engine.bats:355-361 codifies the "stays deleted" behavior. For the rancher-desktop Namespace there is no such deliberation — if deleted, subsequent mirror applies would fail with NotFound until the next watcher restart. Users are unlikely to delete these resources in an embedded single-user context, so Codex's Important severity is demoted to a design observation; but a Watches(&corev1.Namespace{}, enqueueAppIfName("rancher-desktop")) would close the gap cleanly.
  3. Container.Status has concurrent SSA and non-SSA writerspkg/controllers/containers/container/container_controller.go (pre-PR) and pkg/controllers/app/engine/controllers/sync_containers.go (new) Claude (future). The pre-existing container reconciler writes status.Conditions via a plain r.Status().Update() without a field manager, while the new engine controller writes status.Name, status.Status, status.Ports, etc. via SSA with field manager engine-controller. Each Status().Update sends the full status object, so the container reconciler's client implicitly takes ownership of engine-written fields, which the engine controller then forcibly reclaims via ForceOwnership on the next sync. Functional today, but the repo-context rule about concurrent writers on App.Status.Conditions applies symmetrically here. Migrating the container reconciler to SSA with a distinct field manager would eliminate the ping-pong.

Strengths


Testing Assessment

BATS coverage is thorough for happy-path transitions and the non-obvious cases (untag-leaves-dangling, tagging-dangling-removes-mirror, in-use image finalizer, paused-to-created, paused-to-running). Untested scenarios ranked by risk:

  1. Steady-state reconcile cadence after spec.state=created (I1). No test verifies that once status.status=exited is observed, subsequent unrelated events stop dispatching ContainerStop. A unit test on reconcileContainerState with a counting fake Docker client would catch this without introducing flakiness.
  2. Watcher death and reconnect while the VM is still running. The watcherDied path in Reconcile at engine_reconciler.go:88-106 handles it, but nothing exercises the "Docker is still up but the event stream closed" scenario (proxy timeout, API version compaction). Simulatable by killing the Docker client's underlying TCP socket.
  3. Containers in dead/removing states with desired=created. Claude's I1 plus S7 flag this region as ambiguous; no test covers it.
  4. Concurrent create-then-patch race. sync_containers.go:117-122's comment explicitly relies on a "fail loudly on conflict" contract, but no test verifies the 409 is retried on the next reconcile.
  5. Full sync after rdd svc delete followed by rdd svc start with prior Docker state still present. Verifies the stale-mirror prune path actually runs.
  6. --timeout=0 preserves legacy no-wait behavior. Covered implicitly by rdd set --timeout=10s running=false at engine.bats:346 but not explicitly tested for =0.
  7. Windows moby startup with rdd set running=true. Currently skipped via skip_on_windows in local_setup_file; when WSL2 socket forwarding lands this becomes exercisable.

Documentation Assessment


Commit Structure

Commit history is unusually clean. Related chunks — the Docker client dependency, the engine controller, the webhook, docs, BATS tests, rdd set wait — each get their own commit. CI-only changes are separated from code changes. The four "Address review round N findings" commits are preserved to make each round's response visible against the baseline, which is acceptable under the project's git-workflow guidance.

The diff is large (~132KB, 27 files) but internally consistent. Splitting the engine controller from the CI infrastructure would be tempting, but the BATS support-bundle work is specifically motivated by debugging the engine controller's hangs in CI, so bundling them is defensible.


Acknowledged Limitations


Agent Performance Retro

Claude

Claude produced the longest, most detailed review (614 lines, 3 important + 12 suggestions + 3 design observations). Most unique value came from tracing pre-existing code: verifying moby daemon's stop.go to confirm ContainerStop on a non-running container returns NotModified, connecting the ports-ordering churn (I4) to the missing x-kubernetes-list-type annotation in crd.yaml, and spotting that the ObservedGeneration handoff for I3 is cheap because setEngineCondition already stamps the field. Notable miss: didn't flag the App controller's blind Status().Update even though the repo-context rule calls it out explicitly. Claude's S3 (deleteImage tag race) is a good find but the proposed diff is the wrong direction — Claude recovers in prose.

Codex

Codex produced the tightest, most-calibrated review: zero criticals, four importants, zero suggestions. Every finding was grounded in a concrete cross-file interaction — I2 (App controller blind Update) is the clearest example of reading the repo-context rule and applying it. Codex was the only agent to notice the Namespace repair gap (demoted to design observation here) and the latent Windows unix-socket issue (demoted to acknowledged limitation). Codex's severity calibration was more aggressive than Claude's and more accurate than Gemini's — four importants all landed as legitimate concerns after verification.

Gemini

Gemini's review was the shortest and the most Critical-happy: one critical (C1 spec.state convergence), one important (CrashLoopBackOff bypass), two suggestions. The Critical framing overstated the severity — the loop is bounded by unrelated event rate and NotModified is silent success — but the finding itself matched Claude and Codex. Gemini's S1 (channel closure check) and S2 (processContainerFinalizers missing RetryOnConflict) are legitimate minor cleanups. Gemini's CrashLoopBackOff observation is a real design concern worth surfacing even if not action-required today. Known stderr noise: Gemini's sandbox blocked one path outside the worktree (~/go/pkg/mod/...) and hit an ENAMETOOLONG stat error on what looks like an embedded bash snippet from the bats-with-timeout.sh diff — neither affected the final review.

Summary

Claude Opus 4.6Codex GPT 5.4Gemini 3.1 Pro
Duration25m 38s6m 34s4m 58s
Findings3I 11S2I1I 2S
Tool calls76 (Bash 41, Read 29, TodoWrite 5)21 (shell 21)9 (runshellcommand 5, grepsearch 2, readfile 2)
Design observations832
False positives000
Unique insights1322
Files reviewed272727
Coverage misses000
Totals3I 11S2I1I 2S
Downgraded02 (I→S)2 (I→S)
Dropped000

Claude provided the broadest coverage and deepest pre-existing-code tracing. Codex provided the best severity calibration and directly actionable findings grounded in the repo-context rules. Gemini's coverage was the narrowest but it independently caught the dominant issue (I1). For this PR, Codex was the highest-signal agent: every finding it raised survived consolidation at the same severity it proposed.

Reconciliation


Review Process Notes

Skill improvements

None. The repo-context rules covered every dimension needed; every Important finding traces back to an existing rule in deep-review-context.md.

Repo context updates