Deep Review: 20260412-031335-pr-295
| Date | 2026-04-12 03:13 |
| Repo | rancher-sandbox/rancher-desktop-daemon |
| Round | 2 (of PR #295) |
| Author | @jandubois |
| PR | #295 — Container Engine watcher |
| Commits | 7c8f356 ci: always capture support bundle and log hostagent pgid9669b4b bats: close fds 3 and 4 when invoking rdd50bdc15 ci: capture support bundle when BATS targets time out762c1e8 Make rdd set wait for the desired state by default65943da Add BATS integration tests for the engine controllerba6caf7 Document engine mirroring and container state transitions4db3f20 Allow container spec.state transitions via webhook93359f7 Add engine controller that mirrors Docker state into K8s87f1df1 Add DockerSocket() helper and Docker client dependency32a0109 Work around distro proxy bug; update docker socket path536ccb2 Rosetta is now a vz driver option, not a global option |
| Reviewers | Claude Opus 4.6, Codex GPT 5.4, Gemini 3.1 Pro |
| Verdict | Merge with fixes — untag handling is broken, container pause/unpause is asymmetric, and several watcher lifecycle races need addressing before merge |
| Wall-clock time | 29 min 41 s |
Executive Summary ¶
PR #295 introduces the engine controller — a new EngineReconciler plus long-lived dockerWatcher goroutine that mirrors Docker containers, images, and volumes into Kubernetes resources, forwards K8s-initiated deletions and spec.state transitions back to Docker, relaxes the container webhook to permit state changes, and adds rdd set --timeout wait semantics. It also bundles CI hang-diagnostics (bats-with-timeout.sh), Lima template updates (.rd2 socket path, vz rosetta nesting), a distro proxy workaround, and BATS integration coverage.
The overall architecture is sound — a singleton reconciler owning a watcher goroutine, finalizer-gated reverse deletes, and a Get-then-Apply create pattern to avoid fighting user intent on spec.state. But the watcher's Docker↔K8s event handling has several correctness gaps: ActionUnTag events never remove stale mirrors (the event payload does not carry the tag name, and the call site passes the image ID into a tag-based lookup), paused containers cannot be unpaused via spec.state=running (asymmetric with the tested pause→stop path), and a transient Docker event-stream blip wipes every mirror resource. There is also a bootstrap race where the initial fullSync finishes before the event stream subscribes, so mutations in that window are invisible, and a NotFound during Inspect during fullSync aborts the whole sync.
No CRITICAL findings: 11 IMPORTANT issues (1 unique to each of the three reviewers plus a several-agent cluster around untag and error handling) and 10 SUGGESTIONS. Structurally, two commits (4db3f20 and 762c1e8) bundle engine-controller behavior changes into commits advertised as webhook and CLI features — rework the commit history before merge. Test coverage is strong on the happy paths (K8s-initiated deletion, state transitions, moby⇄containerd backend switch) but misses paused→running, transient Docker disconnect, and Docker-initiated untag flows.
Critical Issues ¶
None.
Important Issues ¶
events.ActionUnPause,
events.ActionRestart:
log.V(1).Info("Container event", "action", msg.Action, "id", msg.Actor.ID)
return w.syncContainer(ctx, msg.Actor.ID)
case events.ActionDestroy:
log.V(1).Info("Container destroyed", "id", msg.Actor.ID)
return w.removeMirrorResource(ctx, &containersv1alpha1.Container{}, msg.Actor.ID)
default:
return nil
}
}
All three reviewers caught this. When Docker emits an untag event, msg.Actor.ID is the image ID hash, not the tag. removeImageByTag searches for status.repoTag == <hash> at pkg/controllers/app/engine/controllers/sync_images.go:185, which never matches. The per-tag Image mirror survives indefinitely, until a watcher restart runs fullSync again.
// reconcileImageByID re-inspects an image and reconciles every K8s Image
// mirror whose status.id matches. Tags still present are re-applied; K8s
// resources for tags that are no longer present are deleted. If the
// image has been fully removed, all mirrors with that status.id are
// deleted instead.
//
// This is the path for events that carry an image ID but not the tag
// name — notably Docker's untag events, where the event payload only
// contains the image ID (see handleImageEvent).
func (w *dockerWatcher) reconcileImageByID(ctx context.Context, id string) error {
result, err := w.cli.ImageInspect(ctx, id)
I verified the upstream Moby contract: daemon/images/image_delete.go:134,191,272 all call LogImageEvent(ctx, imgID.String(), imgID.String(), events.ActionUnTag), so both Actor.ID and Attributes["name"] are the image ID hash — the tag that was removed is not propagated through the event payload at all. Gemini proposed fixing this by reading msg.Actor.Attributes["name"], which is also wrong for this reason.
Fix (Codex's proposal): treat untag as a per-image reconciliation, not a direct tag delete. Re-inspect the image by ID, then delete any K8s Image objects whose status.id matches but whose status.repoTag is no longer in the fresh RepoTags. If the inspect returns NotFound, fall back to removeImagesByID.
case events.ActionUnTag:
- return w.removeImageByTag(ctx, msg.Actor.ID)
+ return w.syncImageAndPruneStaleTags(ctx, msg.Actor.ID)
// iteration starts from a fresh copy via DeepCopyObject and writes only
// that copy, so caller state is never observed.
func (w *dockerWatcher) removeMirrorResource(ctx context.Context, obj client.Object, name string) error {
key := client.ObjectKey{Name: name, Namespace: apiNamespace}
retryErr := retry.RetryOnConflict(retry.DefaultRetry, func() error {
latest := obj.DeepCopyObject().(client.Object)
if err := w.k8s.Get(ctx, key, latest); err != nil {
if apierrors.IsNotFound(err) {
return nil
}
return err
}
if !removeFinalizer(latest, mirrorFinalizer) {
return nil
}
return w.k8s.Update(ctx, latest)
})
if retryErr != nil {
return fmt.Errorf("failed to remove finalizer from %s: %w", name, retryErr)
}
The comment at docker_watcher.go:268-271 carefully explains that ContainerStop handles paused/restarting containers natively, and the BATS test patching spec.state=created stops a paused container at bats/tests/32-app-controllers/engine.bats:200-220 covers that direction. The symmetric case (desired=running, actual=paused) falls into ContainerStart, which Docker rejects for paused containers with "container is paused, unpause before starting". The user's intent is silently dropped and there is no test for this direction.
//
// The obj parameter is used as an empty object template: callers may
// pass either a zero-valued struct (e.g. &containersv1alpha1.Volume{})
// or a slice element from a prior List; in both cases each retry
// iteration starts from a fresh copy via DeepCopyObject and writes only
// that copy, so caller state is never observed.
func (w *dockerWatcher) removeMirrorResource(ctx context.Context, obj client.Object, name string) error {
key := client.ObjectKey{Name: name, Namespace: apiNamespace}
retryErr := retry.RetryOnConflict(retry.DefaultRetry, func() error {
latest := obj.DeepCopyObject().(client.Object)
if err := w.k8s.Get(ctx, key, latest); err != nil {
if apierrors.IsNotFound(err) {
return nil
}
run -0 docker inspect test-state --format '{{.Id}}'
cid=${output}
rdd ctl wait --for=jsonpath='{.status.status}'=running \
--namespace="${NAMESPACE}" container/"${cid}" --timeout=30s
rdd ctl patch container "${cid}" --namespace="${NAMESPACE}" \
--type=merge -p '{"spec":{"state":"created"}}'
rdd ctl wait --for=jsonpath='{.status.status}'=exited \
--namespace="${NAMESPACE}" container/"${cid}" --timeout=30s
run -0 docker inspect test-state --format '{{.State.Status}}'
assert_output "exited"
}
@test "patching spec.state=running restarts Docker container" {
run -0 docker inspect test-state --format '{{.Id}}'
cid=${output}
rdd ctl patch container "${cid}" --namespace="${NAMESPACE}" \
--type=merge -p '{"spec":{"state":"running"}}'
rdd ctl wait --for=jsonpath='{.status.status}'=running \
--namespace="${NAMESPACE}" container/"${cid}" --timeout=30s
run -0 docker inspect test-state --format '{{.State.Status}}'
assert_output "running"
}
@test "patching spec.state=created stops a paused container" {
Fix: dispatch ContainerUnpause when actual == paused and desired == running, and add a BATS test mirroring the existing stop-paused test.
switch desired {
case containersv1alpha1.ContainerStatusRunning:
+ if actual == containersv1alpha1.ContainerStatusPaused {
+ log.Info("Unpausing container", "id", c.Name)
+ _, err := w.cli.ContainerUnpause(ctx, c.Name, dockerclient.ContainerUnpauseOptions{})
+ return err
+ }
log.Info("Starting container", "id", c.Name)
_, err := w.cli.ContainerStart(ctx, c.Name, dockerclient.ContainerStartOptions{})
return err
k8s: k8s,
cancel: watchCancel,
done: make(chan struct{}),
reconcileChan: reconcileChan,
}
// Capture a Docker-relative timestamp before fullSync begins. The
// run() goroutine passes this to the event stream as the Since
// filter, so any mutation that happens while fullSync is snapshotting
// state is replayed once the stream opens. Without this, the window
// between the List-based snapshot and the event subscription is a
// blind spot during VM startup.
since := strconv.FormatInt(time.Now().Unix(), 10)
if err := w.fullSync(watchCtx); err != nil {
watchCancel()
cli.Close()
return nil, fmt.Errorf("failed to perform initial sync: %w", err)
newDockerWatcher() runs fullSync before run() subscribes to the Docker event stream at docker_watcher.go:108-110. Any container/image/volume mutation in Docker during that window is invisible — it happened after the sync snapshotted state, but before the event stream was open. This is especially plausible during VM startup where dockerd is actively accepting work as the App's Running condition flips.
}
}
// run is the main watcher goroutine that processes Docker events. The
// since parameter is the "Since" filter passed to the Docker events
// endpoint, captured before the initial fullSync so events that
// happened during the sync window are replayed once the stream opens.
func (w *dockerWatcher) run(ctx context.Context, since string) {
log := logf.FromContext(ctx).WithName("docker-watcher")
defer close(w.done)
// Recover from panics in event handling so an unexpected Docker
// event shape cannot crash the whole app-controller process. The
// engine reconciler notices the dead watcher via alive() on the
After startup, Reconcile() marks the engine connected and returns ctrl.Result{}, nil at engine_reconciler.go:143-146, so there is no immediate second sync to close the gap.
return ctrl.Result{}, r.setEngineCondition(ctx, &app, terminalStatus, terminalReason, terminalMessage)
}
if !watcherRunning {
log.Info("App is running, starting Docker watcher")
if err := r.startWatcher(ctx); err != nil {
log.Error(err, "Failed to start Docker watcher")
_ = r.setEngineCondition(ctx, &app, metav1.ConditionFalse, "ConnectFailed", err.Error())
return ctrl.Result{}, err
}
if err := r.setEngineCondition(ctx, &app, metav1.ConditionTrue, "Connected", "Container engine synced"); err != nil {
return ctrl.Result{}, err
}
return ctrl.Result{}, nil
Fix: open the event stream before or in parallel with fullSync, buffer events until the sync finishes, then replay. Alternatively, record a Since timestamp before fullSync starts and subscribe to the event stream with Since set to that timestamp so post-snapshot mutations are replayed after the sync.
r.watcher = nil
watcherRunning = false
}
r.watcherMu.Unlock()
// A watcher that dies while the App is still running is treated as a
// transient disconnect: log it, clear the watcher pointer, and fall
// through to the normal reconcile flow. If wantWatcher is still true
// below, the reconciler starts a fresh watcher whose fullSync
// reconciles any drift in place — downstream clients see no churn
// because the existing mirror resources keep their identity. An
// actual stop or backend change is handled by the !wantWatcher
// branch below, which does sweep the mirrors.
if watcherDied {
log.Info("Docker watcher died, will attempt to reconnect")
}
// The watcher should only run when the App is Running with the moby
// backend. In every other state (stopped, containerd, both) we stop
// the watcher and sweep mirror resources. The sweep is gated on the
// current ContainerEngineReady reason: once the condition reflects
// the terminal state, cleanup would be a no-op against an empty
// namespace, so we short-circuit to avoid four empty List calls per
Any transient error on the Docker event stream (docker_watcher.go:117-123 — e.g. user ran systemctl restart docker inside the guest) causes run() to return, flips the watcher to !alive on the next reconcile, and then cleanupMirrorResources deletes every Container/Image/Volume/ContainerNamespace in rancher-desktop. The subsequent reconcile restarts the watcher and fullSync recreates them, but every downstream client (front end, rdd ctl get, anything watching) sees the whole universe blink DELETED → ADDED with fresh UIDs. A single Docker blip is as disruptive as an intentional VM shutdown.
defer close(w.done)
// Recover from panics in event handling so an unexpected Docker
// event shape cannot crash the whole app-controller process. The
// engine reconciler notices the dead watcher via alive() on the
// next reconcile and starts a fresh one.
defer func() {
if r := recover(); r != nil {
log.Error(nil, "panic in Docker watcher goroutine", "recovered", r)
}
}()
eventFilter := dockerclient.Filters{}.
Add("type", string(events.ContainerEventType)).
Add("type", string(events.ImageEventType)).
Add("type", string(events.VolumeEventType))
result := w.cli.Events(ctx, dockerclient.EventsListOptions{
Fix: split "VM stopped / engine backend changed" (sweep everything — current behavior) from "transient watcher failure" (stop the watcher, leave mirror resources in place, attempt a bounded retry; on success fullSync reconciles drift in place). Only sweep if reconnect fails after a timeout. This preserves resource identity across transient outages and avoids flooding downstream watchers.
}
return errors.Join(errs...)
}
// syncContainer inspects a single container and creates/updates the K8s resource.
// NotFound is treated as success: the container raced a concurrent delete
// between List and Inspect, and the stale K8s mirror will be pruned by
// syncAllContainers' remove-stale step later in the same sync.
func (w *dockerWatcher) syncContainer(ctx context.Context, id string) error {
result, err := w.cli.ContainerInspect(ctx, id, dockerclient.ContainerInspectOptions{})
if cerrdefs.IsNotFound(err) {
return nil
}
if err != nil {
return fmt.Errorf("failed to inspect container %s: %w", id, err)
}
During fullSync, the controller lists all containers/images/volumes and then Inspects each by ID. If a resource is deleted in Docker between the List and Inspect calls (a routine TOCTOU), Inspect returns a NotFound error. That error propagates through syncAllContainers → fullSync → newDockerWatcher, which then tears down the watcher (watchCancel(), cli.Close()) and returns ConnectFailed. The engine reconciler then flips the App condition to ConnectFailed and waits for the next retry — over nothing more than a single concurrent docker rm.
Fix: swallow NotFound as success in syncContainer, syncImageFromSummary, and syncVolume — the stale K8s mirror will be swept in the "remove stale" step later in the sync.
result, err := w.cli.ContainerInspect(ctx, id, dockerclient.ContainerInspectOptions{})
if err != nil {
+ if cerrdefs.IsNotFound(err) {
+ return nil
+ }
return fmt.Errorf("failed to inspect container %s: %w", id, err)
}
WithRepoTag(tag).
WithNamespace(containerNamespace),
); err != nil {
errs = append(errs, err)
}
}
} else {
// Dangling image (no tags).
name := sanitizeKubernetesObjectName(inspect.ID)
names = append(names, name)
if err := w.applyImage(ctx,
containersv1alpha1apply.Image(name, apiNamespace).
WithFinalizers(mirrorFinalizer),
imageStatusFromInspect(inspect),
); err != nil {
errs = append(errs, err)
}
}
return names, errors.Join(errs...)
}
// imageStatusFromInspect builds a fresh ImageStatus apply config from a
// Docker inspect response. Returning a new value per call avoids the
// aliasing trap of a shallow struct copy — slice/map WithX calls on
// one "copy" would otherwise mutate the backing memory shared with
// other callers.
func imageStatusFromInspect(inspect mobyimage.InspectResponse) *containersv1alpha1apply.ImageStatusApplyConfiguration {
statusApply := containersv1alpha1apply.ImageStatus().
WithID(inspect.ID).
WithRepoDigests(inspect.RepoDigests...).
WithArchitecture(inspect.Architecture).
WithOS(inspect.Os).
WithSize(inspect.Size).
WithLabels(inspect.Config.Labels)
if t, err := time.Parse(time.RFC3339Nano, inspect.Created); err == nil {
statusApply.WithCreatedAt(metav1.NewTime(t))
}
return statusApply
}
When an image that was dangling (K8s mirror name <id>) is subsequently tagged, applyImageFromInspect creates the new tagged mirror <id>-<taghash> but never deletes the pre-existing dangling mirror. Both survive until the next fullSync at watcher restart.
This is the mirror image of I1: tags and the dangling form should transition cleanly in both directions, but neither direction is currently handled incrementally.
Fix: inside the len(inspect.RepoTags) > 0 branch, proactively remove the dangling resource:
w.removeMirrorResource(ctx, &containersv1alpha1.Image{}, sanitizeKubernetesObjectName(inspect.ID))
// the original spec.state patch) and then sit forever.
func (r *EngineReconciler) reconcileContainerSpecs(ctx context.Context) error {
r.watcherMu.Lock()
w := r.watcher
r.watcherMu.Unlock()
if w == nil {
return nil
}
var containers containersv1alpha1.ContainerList
if err := r.List(ctx, &containers, client.InNamespace(apiNamespace)); err != nil {
return err
}
var errs []error
for i := range containers.Items {
c := &containers.Items[i]
if c.DeletionTimestamp != nil {
continue
}
if err := w.reconcileContainerState(ctx, c); err != nil {
reconcileContainerState returns raw Docker API errors, but reconcileContainerSpecs only logs them and returns success. The reconcile then falls through to return ctrl.Result{}, nil, so a failed start/stop gets exactly one attempt: the user patch that changed spec.state produced the only watch event, and no requeue is requested when Docker rejects the operation. The mismatch sits forever until some unrelated event triggers a reconcile.
Gemini and Codex both flagged this as a correctness issue (Gemini at Critical, Codex at Important); Claude flagged it as Suggestion S5 pointing out the asymmetry with the per-item errors.Join pattern used in processContainerFinalizers (engine_reconciler.go:334-356). I settled on Important — the retry never happens unless another reconcile fires, but the reconcile loop's overall safety net (watch events from later Docker activity) provides an eventual fallback.
if err := r.processContainerFinalizers(ctx, w); err != nil {
return err
}
if err := r.processImageFinalizers(ctx, w); err != nil {
return err
}
return r.processVolumeFinalizers(ctx, w)
}
// processContainerFinalizers deletes the Docker-side container for every
// K8s Container pending deletion and only strips the mirror finalizer
// when the Docker delete succeeds. Per-item errors are collected so one
// stuck container does not block the rest; the reconciler retries the
// remaining stuck items on the next reconcile.
func (r *EngineReconciler) processContainerFinalizers(ctx context.Context, w *dockerWatcher) error {
var containers containersv1alpha1.ContainerList
if err := r.List(ctx, &containers, client.InNamespace(apiNamespace)); err != nil {
return err
}
var errs []error
for i := range containers.Items {
c := &containers.Items[i]
if c.DeletionTimestamp == nil || !hasFinalizer(c, mirrorFinalizer) {
continue
}
if err := w.deleteContainer(ctx, c.Name); err != nil {
errs = append(errs, fmt.Errorf("failed to delete container %s from Docker: %w", c.Name, err))
continue
}
if removeFinalizer(c, mirrorFinalizer) {
// NotFound is benign: a concurrent Docker destroy event may
// already have stripped the finalizer and deleted the
// mirror between our List and Update.
Fix: collect per-container errors with errors.Join and return them so controller-runtime requeues with backoff, matching the existing finalizer pattern.
+ var errs []error
for i := range containers.Items {
...
if err := w.reconcileContainerState(ctx, c); err != nil {
- logf.FromContext(ctx).Error(err, "Failed to reconcile container state", "container", c.Name)
+ errs = append(errs, fmt.Errorf("container %s: %w", c.Name, err))
}
}
- return nil
+ return errors.Join(errs...)
// current ContainerEngineReady reason: once the condition reflects
// the terminal state, cleanup would be a no-op against an empty
// namespace, so we short-circuit to avoid four empty List calls per
// unrelated reconcile. On failure, the condition stays pending and
// the next requeue re-tries the sweep.
wantWatcher := running && engineIsDocker
if !wantWatcher {
if watcherRunning {
log.Info("Stopping Docker watcher",
"running", running, "engine", app.Spec.ContainerEngine.Name)
r.stopWatcher()
}
terminalReason := "Stopped"
terminalStatus := metav1.ConditionFalse
terminalMessage := "Container engine stopped"
if running && !engineIsDocker {
terminalReason = "NotApplicable"
terminalStatus = metav1.ConditionTrue
terminalMessage = "Engine mirroring is only supported with the moby backend"
}
current := meta.FindStatusCondition(app.Status.Conditions, conditionContainerEngineReady)
alreadyClean := !watcherDied && current != nil && current.Reason == terminalReason && current.Status == terminalStatus
if !alreadyClean {
if err := r.cleanupMirrorResources(ctx); err != nil {
log.Error(err, "Failed to clean up mirror resources")
return ctrl.Result{}, err
}
}
return ctrl.Result{}, r.setEngineCondition(ctx, &app, terminalStatus, terminalReason, terminalMessage)
}
The comment defends unconditional cleanup ("so transient errors are retried on the next requeue"), but in the common "VM stopped" and "containerd backend" steady states every incoming App event triggers four List calls (Container/Image/Volume/ContainerNamespace), each returning empty. Since the reconciler also Watches Container/Image/Volume as trigger sources (engine_reconciler.go:421-423), any update on any of them in rancher-desktop cascades into four list calls per event.
}
// SetupWithManager sets up the controller with the Manager.
func (r *EngineReconciler) SetupWithManager(mgr ctrl.Manager) error {
r.reconcileChan = make(chan event.GenericEvent, 1)
// Map any container/image/volume event to a reconcile of the App singleton.
enqueueApp := handler.EnqueueRequestsFromMapFunc(
func(_ context.Context, _ client.Object) []reconcile.Request {
return []reconcile.Request{{NamespacedName: types.NamespacedName{Name: appName}}}
},
)
Fix: gate cleanup on a state change — only sweep on transition into the not-wanted state (watcher was running, or a dedicated "CleanupPending" bit is set). If ContainerEngineReady already reflects the terminal state and observedGeneration == app.Generation, short-circuit with no List calls.
// WithSpec(unknown), the engine-controller field owner would keep
// reasserting spec.state=unknown, clobbering any user patch to
// running/created. Gating the WithSpec call behind a Get — so spec is
// written exactly once, on creation — keeps the invariant "subsequent
// syncs never touch spec" without needing a second field owner.
func (w *dockerWatcher) applyContainer(ctx context.Context, inspect mobycontainer.InspectResponse) error {
namespace, name := parseContainerName(inspect.Name)
// Create the resource if it doesn't exist. spec.state is always set to
// "unknown" on creation — meaning the engine mirrors Docker state without
// expressing intent. The user can later set it to "running" or "created"
// to control the container; the reconciler ignores "unknown".
//
// Deliberately omit ForceOwnership on the create apply: if a user
// patch to spec.state landed in the window between the Get above
// and this Apply, ForceOwnership would silently clobber it. Without
// the flag, the Apply succeeds on a fresh resource (no other owner
// for spec.state yet) and fails loudly on a concurrent conflict
// (the Get-NotFound observation was racy).
var existing containersv1alpha1.Container
err := w.k8s.Get(ctx, client.ObjectKey{Name: inspect.ID, Namespace: apiNamespace}, &existing)
if apierrors.IsNotFound(err) {
The Get-then-Apply pattern (explained by the comment) is the right idea — spec must be written exactly once to avoid fighting user intent. But the Get is racy: a concurrent user patch spec.state=running that lands between the Get (returning NotFound) and the Apply is clobbered by the Apply, because ForceOwnership explicitly steals spec.state from whoever wrote it last. The window is small but real: in practice the first Docker create event fires very close to when rdd-ctl-driven user patches might arrive for pre-existing resources after a crash recovery.
Fix: drop ForceOwnership on the initial create apply, or use a typed Create call with explicit metav1.CreateOptions instead of SSA. The narrower fix is to remove client.ForceOwnership from the create path and leave it on the status-only subsequent apply.
return true, nil
}
}
return false, nil
}
condition := func(event watch.Event) (bool, error) {
obj, ok := event.Object.(*unstructured.Unstructured)
if !ok {
return false, nil
}
return satisfied(obj), nil
}
if _, err := watchtools.UntilWithSync(ctx, lw, &unstructured.Unstructured{}, precondition, condition); err != nil {
if errors.Is(err, context.DeadlineExceeded) {
return errors.New("timed out waiting for App state")
When the user interrupts rdd set mid-wait with Ctrl+C, ctx is cancelled and watchtools returns context.Canceled. The CLI reports "timed out waiting for App state", which is false — the user cancelled. A user debugging why rdd set "timed out" after 2 seconds will go hunting in VM logs for the symptom instead of realizing they pressed Ctrl+C.
Fix: distinguish the two cases.
- if errors.Is(err, context.DeadlineExceeded) || errors.Is(err, context.Canceled) {
+ if errors.Is(err, context.DeadlineExceeded) {
return errors.New("timed out waiting for App state")
}
+ if errors.Is(err, context.Canceled) {
+ return errors.New("wait cancelled")
+ }
return fmt.Errorf("failed to watch App: %w", err)
//
// TODO: changes that trigger a VM restart without changing "running" (e.g.
// containerEngine.name) are not waited on. This requires a "Reconciled"
// condition with observedGeneration on the App resource so the CLI can
// detect when the full reconcile chain has settled.
//
// Known hazard until the "Reconciled" condition lands: `rdd set
// containerEngine.name=<other> running=true` on an already-running VM
// returns immediately because the stale ContainerEngineReady=True from
// the previous backend satisfies the precondition check before the
// reconciler has observed the new spec. Users who switch backends
// on a running VM should either stop first (`rdd set running=false`)
// or re-query ContainerEngineReady after the set completes to confirm
// the backend has finished transitioning.
func waitForDesiredState(ctx context.Context, config *rest.Config, properties map[string]string, timeout time.Duration) error {
runningVal, ok := properties["running"]
if !ok {
return nil
}
ctx, cancel := context.WithTimeout(ctx, timeout)
defer cancel()
if runningVal == "true" {
rdd set containerEngine.name=containerd running=true on an already-running VM races with the stale cached ContainerEngineReady=True/Connected from the moby backend. The precondition check at set.go:489-495 fires on the stale state and returns immediately, leaving the user with a half-restarted VM even though rdd set succeeded.
return fmt.Errorf("failed to create dynamic client: %w", err)
}
lw := &cache.ListWatch{
ListFunc: func(options metav1.ListOptions) (runtime.Object, error) {
options.FieldSelector = "metadata.name=app"
return dynClient.Resource(appGVR).List(ctx, options)
},
WatchFunc: func(options metav1.ListOptions) (watch.Interface, error) {
options.FieldSelector = "metadata.name=app"
return dynClient.Resource(appGVR).Watch(ctx, options)
},
}
precondition := func(store cache.Store) (bool, error) {
for _, item := range store.List() {
if obj, ok := item.(*unstructured.Unstructured); ok && satisfied(obj) {
The test at bats/tests/32-app-controllers/engine.bats:266-274 explicitly documents this limitation: "Stop first so there is no stale True/Connected from moby to satisfy the CLI wait below before the engine reconciler has run." That comment is the confession — the code works around a race the test harness itself can't avoid without stopping first.
rdd ctl wait --for=jsonpath='{.status.status}'=paused \
--namespace="${NAMESPACE}" container/"${cid}" --timeout=30s
rdd ctl patch container "${cid}" --namespace="${NAMESPACE}" \
--type=merge -p '{"spec":{"state":"running"}}'
rdd ctl wait --for=jsonpath='{.status.status}'=running \
--namespace="${NAMESPACE}" container/"${cid}" --timeout=30s
run -0 docker inspect test-unpause --format '{{.State.Status}}'
assert_output "running"
docker rm --force test-unpause
}
# --- Cleanup on shutdown ---
@test "stopping VM removes all mirror resources" {
# Make sure we have at least one resource to verify cleanup.
The author's TODO at set.go:440-443 acknowledges the root cause and proposes a future Reconciled condition with observedGeneration. Until that lands: either narrow the documented behavior ("waits for engine readiness only when running is the only property"), or skip the precondition shortcut when properties include anything besides running. At minimum, document the hazard in the TODO itself so future maintainers know it's a real user-visible issue, not just a polish item.
// container engine to be ready. If running=false was set, it waits for the
// App's Running condition to go to False — i.e. the VM has actually
// stopped, which is stricter than "container engine disconnected".
// Other property changes do not wait.
//
// TODO: changes that trigger a VM restart without changing "running" (e.g.
// containerEngine.name) are not waited on. This requires a "Reconciled"
// condition with observedGeneration on the App resource so the CLI can
// detect when the full reconcile chain has settled.
//
// Known hazard until the "Reconciled" condition lands: `rdd set
// containerEngine.name=<other> running=true` on an already-running VM
// returns immediately because the stale ContainerEngineReady=True from
// the previous backend satisfies the precondition check before the
Suggestions ¶
log.V(1).Info("Image deleted", "id", msg.Actor.ID)
return w.removeImagesByID(ctx, msg.Actor.ID)
default:
return nil
}
}
// handleVolumeEvent processes volume events.
func (w *dockerWatcher) handleVolumeEvent(ctx context.Context, msg events.Message) error {
log := logf.FromContext(ctx).WithName("docker-watcher")
switch msg.Action {
case events.ActionCreate:
log.V(1).Info("Volume created", "name", msg.Actor.ID)
return w.syncVolume(ctx, msg.Actor.ID)
case events.ActionDestroy:
log.V(1).Info("Volume destroyed", "name", msg.Actor.ID)
return w.removeMirrorResource(ctx, &containersv1alpha1.Volume{},
volumeK8sName(msg.Actor.ID))
default:
return nil
}
}
// removeMirrorResource removes the finalizer from a mirror resource and deletes
// it. This is used when Docker has already deleted the object.
obj is caller-provided and retry iterations reuse the same pointer. That works in practice because controller-runtime's Get resets-and-populates, but it depends on an implementation detail. Callers like syncAllContainers at sync_containers.go:60-66 pass &k8sContainers.Items[i], a slice element with prior list state pre-populated. deleteAllOfType at engine_reconciler.go:257 already uses the fresher DeepCopyObject() inside the retry closure pattern — apply the same here.
// Remove stale K8s containers.
var k8sContainers containersv1alpha1.ContainerList
if err := w.k8s.List(ctx, &k8sContainers, client.InNamespace(apiNamespace)); err != nil {
return fmt.Errorf("failed to list K8s containers: %w", err)
}
for i := range k8sContainers.Items {
c := &k8sContainers.Items[i]
if !dockerIDs[c.Name] {
log.V(1).Info("Removing stale container", "id", c.Name)
if err := w.removeMirrorResource(ctx, c, c.Name); err != nil {
errs = append(errs, err)
}
}
}
return errors.Join(errs...)
return err
}
var errs []error
for _, item := range items {
obj := item.(client.Object)
key := client.ObjectKeyFromObject(obj)
kind := obj.GetObjectKind().GroupVersionKind().Kind
retryErr := retry.RetryOnConflict(retry.DefaultRetry, func() error {
latest := obj.DeepCopyObject().(client.Object)
// (e.g., ~/.rd2/docker.sock). This is the host-side socket that Lima
// port-forwards from the guest's /var/run/docker.sock.
//
// TODO: use ShortDir() once the Lima template derives the socket path
// from the instance suffix instead of hardcoding ".rd2". The two
// hardcodes (here and in pkg/controllers/app/app/lima-template.yaml)
// must move together — either both parameterized on RDD_INSTANCE or
// neither — or the watcher will open a different socket from the one
// Lima is port-forwarding to, and mirroring will silently diverge per
// instance. Until then, multi-instance BATS runs cannot share the
// same host and the engine.bats test has a parallel TODO at line
// 12-14.
var DockerSocket = sync.OnceValue(func() string {
home, err := os.UserHomeDir()
if err != nil {
panic(fmt.Errorf("could not get home directory: %w", err))
}
dir := filepath.Join(home, ".rd2")
_ = os.MkdirAll(dir, 0o700)
Commit 87f1df1 originally used filepath.Join(ShortDir(), "docker.sock"), which is the correct per-instance path. Commit 762c1e8 regressed it to a hardcoded .rd2 to match the (also-hardcoded) Lima template path at pkg/controllers/app/app/lima-template.yaml:30. The TODO at instance.go:127-128 acknowledges this, but the two hardcodes need to be fixed together — either both parameterized on the instance suffix, or a pair of tests that fails the build if they ever diverge.
- 0
guestSocket: /var/run/docker.sock
hostPortRange:
- 0
- 0
hostSocket: "{{.Home}}/.rd2/docker.sock"
provision:
- mode: system
script: |
#!/bin/bash
// DockerSocket returns the path to the Docker socket for this instance
// (e.g., ~/.rd2/docker.sock). This is the host-side socket that Lima
// port-forwards from the guest's /var/run/docker.sock.
//
// TODO: use ShortDir() once the Lima template derives the socket path
// from the instance suffix instead of hardcoding ".rd2". The two
// hardcodes (here and in pkg/controllers/app/app/lima-template.yaml)
// must move together — either both parameterized on RDD_INSTANCE or
// neither — or the watcher will open a different socket from the one
// Lima is port-forwarding to, and mirroring will silently diverge per
// instance. Until then, multi-instance BATS runs cannot share the
Also, os.MkdirAll silently swallows errors (_ =); while the directory is typically pre-created by rdd svc start, the silent failure is a trap. The BATS helper at bats/tests/32-app-controllers/engine.bats:12-14 carries a parallel TODO, confirming the author is aware.
# Engine controller tests — verify that the engine controller mirrors Docker
# containers, images, and volumes into K8s resources, and that K8s deletions
# and spec.state changes are forwarded to Docker.
NAMESPACE="rancher-desktop"
# TODO: use .rd${RDD_INSTANCE} once the Lima template derives the socket path
# from the instance suffix instead of hardcoding ".rd2".
DOCKER_HOST="unix://${HOME}/.rd2/docker.sock"
export DOCKER_HOST
local_setup_file() {
# The Docker socket access pattern used by these tests is not yet wired
# up for Windows/WSL2.
Unlike controller-runtime reconcilers (wrapped in HandleCrash), this raw goroutine has no panic recovery. A nil-pointer in handleEvent — e.g. an unexpected Moby event shape — takes down the whole app-controller process. Add defer utilruntime.HandleCrash(...) or at minimum defer func() { if r := recover(); r != nil { log.Error(...) } }().
Containers and volumes use Force: true, meaning rdd ctl delete container/foo force-stops a running container without giving processes a chance to exit gracefully. Images use no Force, so in-use images are correctly kept — but the inconsistency is surprising and undocumented. Either document the policy in a comment, or consider Force: false for containers (the user can patch spec.state=created first, then delete, which is the K8s-idiomatic pattern).
name := sanitizeKubernetesObjectName(inspect.ID)
names = append(names, name)
if err := w.applyImage(ctx,
containersv1alpha1apply.Image(name, apiNamespace).
WithFinalizers(mirrorFinalizer),
imageStatusFromInspect(inspect),
); err != nil {
errs = append(errs, err)
}
}
return names, errors.Join(errs...)
}
// imageStatusFromInspect builds a fresh ImageStatus apply config from a
// Docker inspect response. Returning a new value per call avoids the
// aliasing trap of a shallow struct copy — slice/map WithX calls on
// one "copy" would otherwise mutate the backing memory shared with
// other callers.
func imageStatusFromInspect(inspect mobyimage.InspectResponse) *containersv1alpha1apply.ImageStatusApplyConfiguration {
statusApply := containersv1alpha1apply.ImageStatus().
statusCopy := *statusApply is a shallow copy: RepoDigests (slice), Labels (map), and every *string/*int64 field share backing memory with the original across iterations. It is safe today because the loop only calls WithRepoTag and WithNamespace — both set-only scalar operations — but a future edit that adds WithLabels(extraLabels) or WithRepoDigests(extra) inside the loop would corrupt all prior iterations' payloads via in-place append/map mutation.
Fix: replace the shallow copy with a helper that returns a fresh ImageStatusApplyConfiguration each iteration, building from the common fields explicitly.
// Start watcher goroutine to track hostagent lifecycle and reap the process.
// The watcher enqueues reconciles as phase transitions occur.
r.startWatcher(ctx, limaVM.Name, limaVM.Namespace, haCmd, inst.Dir, begin)
// On Unix, SetGroup() sets Setpgid=true which makes the hostagent a
// new process group leader, so pgid == pid. The "pgid" log field
// reflects that Unix-only semantics and is consumed by
// bats-with-timeout.sh for post-mortem leak detection (attributing
// orphaned qemu grandchildren back to their hostagent ancestor).
// On Windows, SetGroup() sets CREATE_NEW_PROCESS_GROUP instead —
// a different console-control-event concept — and this log field
// will need to be renamed or recalculated when Windows CI arrives.
logger.Info("Hostagent started, watcher active",
"instance", limaVM.Name,
"pid", haCmd.Process.Pid,
"pgid", haCmd.Process.Pid)
On Unix the claim holds (Setpgid=true → pgid=pid). On Windows there is no POSIX process group; SetGroup sets CREATE_NEW_PROCESS_GROUP, which is a different console-control-event concept — logging the value as "pgid" is a misnomer. Harmless today (Windows CI is not exercised), but will bite when it's added. Either read the real pgid via syscall.Getpgid(haCmd.Process.Pid) on Unix only, or label the field with a platform-neutral name.
# --- K8s deletion removes Docker object ---
@test "deleting Container resource removes Docker container" {
docker run -d --name test-delete busybox sleep 3600
run -0 docker inspect test-delete --format '{{.Id}}'
cid=${output}
rdd ctl wait --for=create --namespace="${NAMESPACE}" \
container/"${cid}" --timeout=30s
Per the repo convention, prefer refute_output "" over [ -n ... ], which fails silently on violation. Swap both sites.
run -0 rdd ctl get image --namespace="${NAMESPACE}" \
--field-selector "status.repoTag=busybox:latest" -o name
+refute_output ""
image_ref=${output}
-[ -n "${image_ref}" ]
var containers containersv1alpha1.ContainerList
if err := r.List(ctx, &containers, client.InNamespace(apiNamespace)); err != nil {
return err
}
var errs []error
for i := range containers.Items {
c := &containers.Items[i]
if c.DeletionTimestamp == nil || !hasFinalizer(c, mirrorFinalizer) {
continue
}
if err := w.deleteContainer(ctx, c.Name); err != nil {
errs = append(errs, fmt.Errorf("failed to delete container %s from Docker: %w", c.Name, err))
continue
}
if removeFinalizer(c, mirrorFinalizer) {
r.Update(ctx, c) can return NotFound if the Docker ActionDestroy event handler concurrently stripped the finalizer and deleted the mirror object between this reconcile's List and Update. It's a benign race but produces noisy error logs and triggers unnecessary reconciles.
Fix: client.IgnoreNotFound(r.Update(ctx, c)). The same pattern applies to processImageFinalizers and processVolumeFinalizers.
## Engine Mirroring
The engine controller (`pkg/controllers/app/engine/`) watches the `App` resource
for the `Running` condition. When the VM is running with the `moby` backend,
the controller:
1. Connects to the Docker engine via the host socket.
2. Creates the `rancher-desktop` K8s namespace and the `moby`
`ContainerNamespace` resource.
3. Lists all containers, images, and volumes and creates corresponding K8s
resources.
4. Watches the Docker event stream for create, update, and delete events.
Containerd mirroring is not implemented yet; with `containerEngine.name=containerd`
The new doc text says the engine controller creates engine-specific namespaces for containerd, but the implementation does the opposite: engine_reconciler.go:115-133 shuts the watcher down and reports ContainerEngineReady=True, reason=NotApplicable whenever app.Spec.ContainerEngine.Name != "moby". That wording will mislead anyone trying to build against the API from the doc.
// current ContainerEngineReady reason: once the condition reflects
// the terminal state, cleanup would be a no-op against an empty
// namespace, so we short-circuit to avoid four empty List calls per
// unrelated reconcile. On failure, the condition stays pending and
// the next requeue re-tries the sweep.
wantWatcher := running && engineIsDocker
if !wantWatcher {
if watcherRunning {
log.Info("Stopping Docker watcher",
"running", running, "engine", app.Spec.ContainerEngine.Name)
r.stopWatcher()
}
terminalReason := "Stopped"
terminalStatus := metav1.ConditionFalse
terminalMessage := "Container engine stopped"
if running && !engineIsDocker {
terminalReason = "NotApplicable"
terminalStatus = metav1.ConditionTrue
terminalMessage = "Engine mirroring is only supported with the moby backend"
}
current := meta.FindStatusCondition(app.Status.Conditions, conditionContainerEngineReady)
alreadyClean := !watcherDied && current != nil && current.Reason == terminalReason && current.Status == terminalStatus
if !alreadyClean {
if err := r.cleanupMirrorResources(ctx); err != nil {
log.Error(err, "Failed to clean up mirror resources")
return ctrl.Result{}, err
}
}
return ctrl.Result{}, r.setEngineCondition(ctx, &app, terminalStatus, terminalReason, terminalMessage)
Fix:
-2. Creates the `rancher-desktop` K8s namespace and the appropriate
- `ContainerNamespace` resource (`moby` for dockerd, engine-specific
- namespaces for containerd).
+2. Creates the `rancher-desktop` K8s namespace and the `moby`
+ `ContainerNamespace` resource when the App uses the moby backend.
+ Containerd mirroring is not implemented yet; the controller reports
+ `ContainerEngineReady` with reason `NotApplicable` in that mode.
Separately, the Image example at docs/design/api_containers.md:218-219 still describes the old "random suffix" naming scheme even though the implementation now uses a deterministic hash — that text predates this PR but is stale.
docs/design/api_containers.md:306-312 still describes the old finalizer-untag semantics Claude¶
Delete the Image object; the finalizer will untag the image. If all tags of an
image are removed, _and_ it is not in use by a container (running or not), then
the image itself is removed.
This contradicts the current code: on a Docker untag the controller deletes the K8s resource directly, without going through the finalizer path. The doc change in ba6caf7 left this stale. Reconcile the doc with the implementation — the current code better matches the intent, but the description on this line should match what the code does.
Design Observations ¶
Concerns ¶
- Engine controller cleanup policy treats "VM stopped" and "watcher died" identically
(in-scope)— A transient event-stream blip triggers the same aggressive sweep as a user-initiated shutdown. See I4 for the specific fix; the broader design question is whether theReconcilestate machine should distinguish "intentional stop" (from App condition) from "involuntary failure" (from watcher liveness). Claude Gemini spec.state: unknownconflates "no user intent" with a real spec state(future)— A cleaner model would be a nullablespec.state(pointer) withnil = no opinion, or splitspec.desired.statefromstatus.managedBy. The current shape forces the webhook to enforce "stop trying to be unknown once a user expressed intent" — future contributors will trip over this. Not in-scope for this PR. ClaudecleanupMirrorResourcesassumes exclusive ownership ofrancher-desktop(future)—deleteAllOfTypeiterates every resource in the namespace and strips the mirror finalizer. The mock controller atpkg/controllers/mock/mock_controller.go:27uses the sameapiNamespace = "rancher-desktop"; if the two ever coexist (e.g. mock + real engine in a dev instance), cleanup will wipe the mock data. Filter by a label selector (engine.rancherdesktop.io/mirror=true) or by field owner. Claude- Engine feature is a single singleton reconciler owning bootstrap sync, event streaming, desired-state convergence, and finalizer processing
(future)— Splitting into per-resource reconcilers would remove the global O(N) sweeps, make retries object-scoped instead of best-effort, and eliminate the I3 bootstrap race by giving each object a durable reconcile path. The code already acknowledges this atengine_reconciler.go:149-155with an in-place comment. Codex
Strengths ¶
- The
Get-then-Applycomment atsync_containers.go:87-93explaining why spec is written only on create (to avoid SSA fights with user patches) is the kind of insight most code lacks. Claude processContainerFinalizers,processImageFinalizers, andprocessVolumeFinalizerscollect per-item errors witherrors.Join— one stuck object does not block the rest. This is the correct pattern for sweep loops. (The asymmetry withreconcileContainerSpecs, which does not do this, is I7.) Claude- The
docker_watcher.alive()check paired with the explicitstop() → <-w.donehandshake avoids the common goroutine-ownership ambiguity. Claude - The mirror reverse-delete path is designed around the repo's "no K8s GC" constraint: mirror resources carry an explicit engine finalizer, and
cleanupMirrorResourcessweeps all mirrored kinds on shutdown instead of assuming owner-reference garbage collection. Codex - The new
rdd setimplementation uses a watch-based wait rather than polling — the right control-plane pattern for this codebase. Codex - The
immutableValidatorwebhook correctly prevents drift by forcing explicit transitions; theequality.Semantic.DeepEqualcomparison withStatenormalized is the right way to verify nothing else changed. Gemini - Deterministic RFC 1123 volume names via SHA-256 hashing with the original name preserved in
status.nameelegantly handles Docker names that contain uppercase/underscore characters. Gemini
Testing Assessment ¶
Strong happy-path coverage: container lifecycle mirroring, docker run/stop/rm, volume create/delete (including uppercase names), image pull, K8s-initiated container deletion, in-use image finalizer retention, spec-state transitions in the running↔created↔paused(→stopped) direction, VM shutdown cleanup, and the moby↔containerd backend switch. The BATS test at bats/tests/32-app-controllers/engine.bats is readable, well-commented, and validates the main contracts end-to-end.
Gaps, in rough order of risk:
- Paused → running transition (I2). No test. High risk of silently-dropped user intent on unpause.
- Docker-initiated untag of in-use image (I1, I6). The existing
deleting an in-use Image keeps the finalizertest covers the K8s-initiated direction; the Docker-initiateddocker rmi <tag>path is not exercised. - Transient Docker disconnect (I4). No test. Would need to kill/restart Docker inside the VM or sever the host socket briefly.
NotFoundduringfullSyncInspect (I5). No test for a container being removed concurrently with the initial sync.- Watcher bootstrap race (I3). No test for mutations landing between
fullSyncand event-stream subscription. - Multi-property
rdd setwhile VM running (I11). The test atengine.bats:266-274documents the stale-condition race but works around it by stopping first. - Concurrent user patch during engine reconcile (I9). No test for a user patching
spec.state=runningin the window between the engine's Get and Apply on create. - Engine tests only cover a single instance and skip Windows/WSL2 — the hardcoded
.rd2socket path (S2) means multi-instance tests cannot be added until the Lima template is parameterized on the instance suffix.
Documentation Assessment ¶
docs/design/api_containers.mdis substantially updated with engine mirroring semantics,unknownstate rationale, and finalizer lifecycle — a net improvement.- The
State is the desired state of the containercomment atpkg/apis/containers/v1alpha1/container_types.go:56-61does not describe whatunknownmeans to the reconciler. Update to: "State is the desired state of the container;unknownmeans the engine controller mirrors without expressing intent and takes no start/stop action." - The
rdd sethelp text atcmd/rdd/set.go:68-72correctly mentions--timeout=0, but commit message762c1e8refers to a--wait=falseflag that does not exist. Fix the commit message (see I7 / Commit Structure). - S9 and S10 document stale doc text that should be updated with this PR.
Commit Structure ¶
This PR bundles the core engine-watcher feature with unrelated CI and Lima template work. Two commits in particular are mis-titled:
4db3f20"Allow container spec.state transitions via webhook" — the webhook diff is a small part of this commit. It also introduces theunknownenum value, changes the CRD default fromrunningtounknown, adds thereconcileContainerStateflow with paused-container handling, addsprocessContainerFinalizers/processImageFinalizers/processVolumeFinalizers, and reworkssync_containers.goto gate spec-apply on Get. A reader scanninggit log --onelinefor theunknownenum addition would not find it. Claude Codex762c1e8"Make rdd set wait for the desired state by default" — same problem, larger. ~300 lines of engine controller churn, theDockerSocketregression fromShortDir()to hardcoded.rd2(S2), and a significant reconciler rework. The commit message also refers to a--wait=falseflag that does not exist — the flag is--timeout=0. Claude Codex50bdc15and7c8f356— the second immediately flips the first from "on-timeout only" to "always". These should be a single commit. Claude
Crisp/self-contained commits (good examples): 9669b4b (bats fd close), ba6caf7 (docs), 32a0109 (distro proxy workaround), 536ccb2 (rosetta vz move).
Fix: split the commits along responsibility lines and correct the 762c1e8 commit message to reflect the actual flag name before the PR lands.
Acknowledged Limitations ¶
DockerSocket()hardcodes.rd2— TODO atpkg/instance/instance.go:127-128, mirrored by a parallel TODO atbats/tests/32-app-controllers/engine.bats:12-14. S2 is in findings because it's a regression from the previously correct state, not because the author is unaware.rdd setdoesn't wait for non-runningproperty changes that trigger a VM restart — TODO atcmd/rdd/set.go:440-443proposes aReconciledcondition withobservedGeneration. I11 is still in findings because the race is user-visible today, not just a future polish item.reconcileContainerSpecs+processFinalizersrun O(N) per child event — comment atengine_reconciler.go:149-155acknowledges the long-term fix (split into per-resource reconcilers). Not promoted to a finding given the typical scale.mobyContainerNamespace has no mirror finalizer by design — comment atsync_volumes.go:54-59explains; the BATS testdeleting containernamespace/moby completes without a finalizer hangverifies.- Containerd backend reports
ContainerEngineReady=True/NotApplicable— intentional so thatrdd set running=truereturns promptly. The test comment atengine.bats:266-273documents this. - Windows hostagent
pgidlogging — the comment atlimavm_lifecycle.go:479-481assumes Unix Setpgid semantics. S6 is in findings because the wording breaks down on Windows, not because the author is unaware.
Agent Performance Retro ¶
Claude ¶
Claude produced the most comprehensive review of the three by a wide margin (8 Important, 8 Suggestions, 3 Design observations), and carried the most unique signal: the paused→running asymmetry (I2), the ForceOwnership create race (I9), the watchCondition cancel/timeout confusion (I10), the multi-property rdd set stale-condition race (I11), and the idle-reconcile cleanupMirrorResources cost (I8). It also caught the shallow-copy trap (S5), the panic-recovery gap (S3), and the Force flag inconsistency (S4), plus the commit-structure critique. It correctly framed I1 as "no dangling mirror is ever created" rather than "the tag is not removed," which is a sharper diagnosis than Codex or Gemini's.
Claude's run time was unusual: it took 22 minutes — well past the nominal 15-minute timeout — but produced the deepest analysis. Worth noting that --effort max is paying for quality here.
Codex ¶
Codex produced a tight, well-calibrated review (3 Important, 1 Suggestion) with the single highest-signal unique finding of the round: the watcher bootstrap race (I3) between fullSync and event-stream subscription. That requires tracing the initialization order carefully — a class of bug where no raw code read would catch it. Codex also reached the correct fix for the untag bug (I1) independently before Claude got there, and correctly identified that the commit-structure and documentation issues needed flagging. It downgraded the error-swallowing from Gemini's "Critical" to Important (correct) and from Claude's "Suggestion" (correct). Zero false positives.
Gemini ¶
Gemini produced 4 "Critical" and 3 "Important" findings but hallucinated C1 — the "typo in BATS wrapper $ @ causes immediate CI failure" finding. The actual scripts/bats-with-timeout.sh:225 reads "$@" & with no space. I verified this directly. A CI-failure-level claim should never be made without verifying the literal character sequence in the source, and Gemini's own tool output earlier contained Error stating path " & — a garbled state during its own reading of the file that it apparently then committed to as fact.
Gemini also inflated severity on C3 and C4 (early return bypasses finalizers, container state errors swallowed) — both are real concerns but neither is Critical. And its suggested fix for the untag bug (use msg.Actor.Attributes["name"]) is also wrong, because Docker sets that attribute to the image ID hash too. That said, Gemini caught two things the others missed: the dangling-image leak on tag (I6) and the NotFound during fullSync issue (I5). When correctly calibrated, those are valuable finds.
Gemini's review ran in about 6 minutes — fastest of the three.
Summary ¶
| Claude Opus 4.6 | Codex GPT 5.4 | Gemini 3.1 Pro | |
|---|---|---|---|
| Duration | 22m 15s | 7m 00s | 6m 51s |
| Findings | 8I 8S | 3I 1S | 4I 1S |
| Tool calls | 99 (Bash 36, Read 36, Grep 20) | 49 (shell 47, stdin 2) | 4 (runshellcommand 3, googlewebsearch 1) |
| Design observations | 3 | 2 | 1 |
| False positives | 0 | 0 | 1 |
| Unique insights | 7 | 1 | 2 |
| Files reviewed | 26 | 26 | 26 |
| Coverage misses | 0 | 0 | 0 |
| Totals | 8I 8S | 3I 1S | 4I 1S |
| Downgraded | 0 | 0 | 3 (I→S) |
| Dropped | 0 | 0 | 1 |
Claude is the clear leader for depth and calibration on this review; Codex's bootstrap-race catch is the single most valuable unique finding; Gemini's hallucinated "critical CI failure" is the most concerning failure mode and reinforces the rule that model claims about literal file contents need to be verified before being trusted.
Reconciliation:
- Gemini C1 (shell typo, Critical → rejected as false positive): the claimed
"$ @"does not exist; the actual script has"$@". Verified by readingscripts/bats-with-timeout.sh:225. - Gemini C2 (untag broken, Critical → Important as I1): the bug is real but not a data-loss-level regression — stale resources survive until the next watcher restart's
fullSync. Also, Gemini's proposed fix is wrong (readingActor.Attributes["name"]returns the image ID hash per Mobydaemon/images/image_events.go:22). - Gemini C3 (early return bypasses finalizers, Critical → rejected): subsequent reconciles from the cleanup's own watch events process the deferred finalizers. Not critical in practice.
- Gemini C4 (state errors swallowed, Critical → Important as I7): real bug but not Critical; retry eventually happens via subsequent Docker events.
- Claude S5 (state errors swallowed, Suggestion → promoted to Important as I7): merged with Codex/Gemini framing.
- Gemini Design Observation (tear-down on disconnect → promoted to Important as I4): the issue is user-visible (UIDs churn, watch clients see DELETED→ADDED), so it warrants Important rather than future-design treatment.
Review Process Notes ¶
Skill improvements:
- None this round. The review dimensions already cover watcher lifecycle, initialization ordering, retry semantics, and Docker event-handling contracts. Gemini's hallucinated CI-failure Critical is a model-specific failure, not a prompt gap — no amount of prompt tweaking removes "verify the literal characters" as a model responsibility.
Repo context updates:
[repo]Add todeep-review-context.md: "When reviewing watcher or event-stream code, cross-check against upstream source for the event payload contract — Docker's image events in particular do not propagate tag names throughActor.IDorActor.Attributes. Agents must trace intodaemon/images/image_events.go(or equivalent) when proposing a fix that reads event fields."
Why: this review had two agents (Gemini, Codex initially) propose fixes based on a wrong assumption about what Actor.ID and Actor.Attributes["name"] contain for ActionUnTag. Future reviewers will save time if this is documented up front.