Deep Review: 20260412-171333-pr-295
| Date | 2026-04-12 17:13 |
| Repo | rancher-sandbox/rancher-desktop-daemon |
| Round | 3 |
| Author | @jandubois |
| PR | #295 — Container Engine watcher |
| Commits | c484dab ci: capture memory pressure and API state in support bundle3196b24 bats: park spec.state=unknown before pausing test-state84c0736 engine: apply Container/Image/Volume capitalization conventionc8fa377 generate: refresh containerspec.go and crd.yaml for spec.state doc62d222a scripts: resolve rdd binary relative to bats-with-timeout.sh66eb50e Address review round 2 findings for PR #2957c8f356 ci: always capture support bundle and log hostagent pgid9669b4b bats: close fds 3 and 4 when invoking rdd50bdc15 ci: capture support bundle when BATS targets time out762c1e8 Make rdd set wait for the desired state by default65943da Add BATS integration tests for the engine controllerba6caf7 Document engine mirroring and container state transitions4db3f20 Allow container spec.state transitions via webhook93359f7 Add engine controller that mirrors Docker state into K8s87f1df1 Add DockerSocket() helper and Docker client dependency32a0109 Work around distro proxy bug; update docker socket path536ccb2 Rosetta is now a vz driver option, not a global option |
| Reviewers | Claude Opus 4.6, Codex GPT 5.4, Gemini 3.1 Pro |
| Verdict | Merge with fixes — no critical blockers; four important findings are narrowly-scoped with trivial fixes |
| Wall-clock time | 31 min 5 s |
Executive Summary ¶
This PR introduces the engine controller that mirrors Docker container engine state (containers, images, volumes) into the containers.rancherdesktop.io API group via fullSync-then-watch, forwards user-initiated mirror deletions back to Docker through a finalizer, and dispatches Container start/stop on spec.state transitions. It also adds the rdd set wait-for-desired-state path, 389 lines of BATS integration coverage, a new CI support-bundle capture wrapper, and Lima template updates (new socket path, rosetta moved under vmOpts.vz, distro workaround).
The design is sound: idempotent fullSync, panic-recovery, per-item error aggregation with errors.Join, a narrow webhook allowing only spec.state transitions, and deliberate reasoning about server-side apply field ownership (documented inline in sync_containers.go). Verdict is merge with fixes: no critical regressions, four important findings with one-to-three-line fixes, and a cluster of smaller suggestions around condition semantics, clock assumptions, and finalizer edge cases.
Structure: 4 important, 12 suggestions, 0 critical. The important findings are self-healing gaps (panic recovery, watcher-restart early return, sequential finalizer short-circuit, ContainerNamespace resurrection) that the "list-then-watch" architecture would normally handle but for specific early-returns or missing watches.
Critical Issues ¶
None.
Important Issues ¶
return "", fmt.Errorf("failed to query Docker info: %w", err)
}
if result.Info.SystemTime == "" {
return "", errors.New("Docker daemon did not report SystemTime")
}
t, err := time.Parse(time.RFC3339Nano, result.Info.SystemTime)
if err != nil {
return "", fmt.Errorf("failed to parse Docker SystemTime %q: %w", result.Info.SystemTime, err)
}
return strconv.FormatInt(t.Unix(), 10), nil
}
// stop cancels the watcher goroutine and waits for it to finish.
func (w *dockerWatcher) stop() {
w.cancel()
The recover block logs the panic and then defer close(w.done) fires, marking the watcher dead. Unlike the normal error-stream exit path at docker_watcher.go:138-144 — which calls w.enqueueReconcile() before returning — the panic path emits no wake-up signal. The reconciler only learns the watcher died when another reconcile is triggered by an App spec change, a mirror watch event, or the channel source. While the watcher is dead, though, no Container/Image/Volume change will ever fire (the watcher is the source of those events), and an idle user triggers nothing. A panic can silently leave the engine controller in a degraded state for an arbitrarily long time without ContainerEngineReady flipping to false.
select {
case <-w.done:
return false
default:
return true
}
}
// run is the main watcher goroutine that processes Docker events. The
// since parameter is the "Since" filter passed to the Docker events
// endpoint, captured before the initial fullSync so events that
// happened during the sync window are replayed once the stream opens.
func (w *dockerWatcher) run(ctx context.Context, since string) {
log := logf.FromContext(ctx).WithName("docker-watcher")
defer close(w.done)
// Recover from panics in event handling so an unexpected Docker
// event shape cannot crash the whole app-controller process.
The comment at docker_watcher.go:113-116 explicitly claims "The engine reconciler notices the dead watcher via alive() on the next reconcile and starts a fresh one" — but there is no guaranteed next reconcile.
infoCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
defer cancel()
result, err := cli.Info(infoCtx, dockerclient.InfoOptions{})
if err != nil {
return "", fmt.Errorf("failed to query Docker info: %w", err)
}
if result.Info.SystemTime == "" {
return "", errors.New("Docker daemon did not report SystemTime")
}
t, err := time.Parse(time.RFC3339Nano, result.Info.SystemTime)
if err != nil {
return "", fmt.Errorf("failed to parse Docker SystemTime %q: %w", result.Info.SystemTime, err)
}
return strconv.FormatInt(t.Unix(), 10), nil
Fix: enqueue a reconcile from the panic-recovery defer.
defer func() {
if r := recover(); r != nil {
log.Error(nil, "panic in Docker watcher goroutine", "recovered", r)
+ w.enqueueReconcile()
}
}()
Claude additionally suggested capturing debug.Stack() so the panic is actionable — worth doing.
w := r.watcher
r.watcherMu.Unlock()
if w == nil {
return nil
}
var containers containersv1alpha1.ContainerList
if err := r.List(ctx, &containers, client.InNamespace(apiNamespace)); err != nil {
return err
}
var errs []error
for i := range containers.Items {
c := &containers.Items[i]
if c.DeletionTimestamp != nil {
continue
}
Each per-type helper internally uses errors.Join so one stuck Container does not block the rest of the Container list. But the wrapper short-circuits at the type level: a single failed processContainerFinalizers aborts the function before processImageFinalizers or processVolumeFinalizers runs. The new test at engine.bats:195-227 (deleting an in-use Image keeps the finalizer until the container is removed) demonstrates the exact scenario where this matters: a stuck image delete legitimately returns an error until the container is removed. While stuck, Volume finalizers on the same reconcile cycle are silently starved. This contradicts the per-item aggregation pattern used throughout the rest of the file (cleanupMirrorResources at engine_reconciler.go:221-238, reconcileContainerSpecs at engine_reconciler.go:294-318).
container/"${cid}" --timeout=30s
run -1 docker inspect test-delete
}
@test "deleting an in-use Image keeps the finalizer until the container is removed" {
docker run -d --name test-inuse busybox sleep 3600
run -0 docker inspect test-inuse --format '{{.Id}}'
cid=${output}
rdd ctl wait --for=jsonpath='{.status.status}'=running \
--namespace="${NAMESPACE}" container/"${cid}" --timeout=30s
# Resolve the Image mirror name from its repoTag.
run -0 rdd ctl get image --namespace="${NAMESPACE}" \
--field-selector "status.repoTag=busybox:latest" -o name
assert_output
image_ref=${output}
# Docker will refuse to remove an image referenced by a running
# container. With I3 fixed, processImageFinalizers leaves the
# finalizer in place and the Image mirror stays (in Terminating
# state) until the image is actually removable.
rdd ctl delete "${image_ref}" --namespace="${NAMESPACE}" --wait=false
run -0 rdd ctl get "${image_ref}" --namespace="${NAMESPACE}" \
-o jsonpath='{.metadata.deletionTimestamp}'
assert_output
run -0 rdd ctl get "${image_ref}" --namespace="${NAMESPACE}" \
-o jsonpath='{.metadata.finalizers[0]}'
assert_output "engine.rancherdesktop.io/docker-mirror"
# Remove the container so Docker permits the image removal. The
# next reconcile's finalizer retry succeeds and the Image mirror
# is finally collected.
docker rm --force test-inuse
rdd ctl wait --for=delete --namespace="${NAMESPACE}" \
"${image_ref}" --timeout=30s
}
# --- Container state transitions via spec ---
@test "patching spec.state=created stops Docker container" {
docker run -d --name test-state busybox sleep 3600
// setEngineCondition updates the ContainerEngineReady condition on the
// App resource. The App controller also writes App.Status.Conditions
// (to mirror LimaVM conditions) and controller-runtime does not
// serialize reconciles across controllers, so a naive Update can race
// and 409. retry.RetryOnConflict plus a re-Get inside the loop is the
// same pattern used elsewhere in this file (see removeMirrorResource,
// deleteAllOfType).
func (r *EngineReconciler) setEngineCondition(ctx context.Context, app *appv1alpha1.App, status metav1.ConditionStatus, reason, message string) error {
return retry.RetryOnConflict(retry.DefaultRetry, func() error {
latest := &appv1alpha1.App{}
if err := r.Get(ctx, client.ObjectKey{Name: app.Name}, latest); err != nil {
return err
}
changed := meta.SetStatusCondition(&latest.Status.Conditions, metav1.Condition{
Type: conditionContainerEngineReady,
Status: status,
Reason: reason,
Message: message,
ObservedGeneration: latest.Generation,
})
if !changed {
return nil
}
return r.Status().Update(ctx, latest)
})
}
// cleanupMirrorResources removes every mirror resource the engine
// obj.GetObjectKind() is empty here. Format the concrete Go
// type name with %T instead.
retryErr := retry.RetryOnConflict(retry.DefaultRetry, func() error {
latest := obj.DeepCopyObject().(client.Object)
if err := r.Get(ctx, key, latest); err != nil {
if apierrors.IsNotFound(err) {
return nil
}
return err
}
if !removeFinalizer(latest, mirrorFinalizer) {
return nil
}
return r.Update(ctx, latest)
})
if retryErr != nil {
errs = append(errs, fmt.Errorf("failed to remove finalizer from %T %s: %w",
obj, obj.GetName(), retryErr))
continue
}
if err := client.IgnoreNotFound(r.Delete(ctx, obj)); err != nil {
errs = append(errs, fmt.Errorf("failed to delete %T %s: %w",
obj, obj.GetName(), err))
}
}
return errors.Join(errs...)
}
// reconcileContainerSpecs handles `Container` spec.state changes by
// calling Docker start/stop. Per-Container errors are collected with
// errors.Join and returned so controller-runtime requeues with backoff
// — matching the pattern in processContainerFinalizers. Without the
// return, a failed start/stop would get exactly one attempt (the watch
// event from the original spec.state patch) and then sit forever.
func (r *EngineReconciler) reconcileContainerSpecs(ctx context.Context) error {
Fix: collect errors across all finalizer types.
- if err := r.processContainerFinalizers(ctx, w); err != nil {
- return err
- }
- if err := r.processImageFinalizers(ctx, w); err != nil {
- return err
- }
- return r.processVolumeFinalizers(ctx, w)
+ return errors.Join(
+ r.processContainerFinalizers(ctx, w),
+ r.processImageFinalizers(ctx, w),
+ r.processVolumeFinalizers(ctx, w),
+ )
continue
}
if removeFinalizer(img, mirrorFinalizer) {
if err := client.IgnoreNotFound(r.Update(ctx, img)); err != nil {
errs = append(errs, fmt.Errorf("failed to remove finalizer from Image %s: %w", img.Name, err))
}
}
}
return errors.Join(errs...)
}
func (r *EngineReconciler) processVolumeFinalizers(ctx context.Context, w *dockerWatcher) error {
var volumes containersv1alpha1.VolumeList
if err := r.List(ctx, &volumes, client.InNamespace(apiNamespace)); err != nil {
return err
}
var errs []error
for i := range volumes.Items {
v := &volumes.Items[i]
if v.DeletionTimestamp == nil || !hasFinalizer(v, mirrorFinalizer) {
continue
}
// A Volume mirror with an empty Status.Name has no
// engine-populated status yet — either a user created it as a
// bare skeleton or it landed in the startup race window before
// applyVolume ran. There is no Docker-side name to forward a
// delete to, so strip the finalizer and let the Delete proceed.
if v.Status.Name != "" {
if err := w.deleteVolume(ctx, v.Status.Name); err != nil {
syncContainerNamespace at sync_volumes.go:62-67 is called only from fullSync, which only runs at watcher startup. SetupWithManager does not watch ContainerNamespace, so user-initiated deletion produces no follow-up reconcile, and the hot loop in Reconcile (after line 153) never re-checks the moby mirror. The new test at engine.bats:355-361 confirms the gap from one side — kubectl delete containernamespace/moby succeeds without a finalizer hang — but there is no assertion that the engine re-creates the resource. The adjacent test engine.bats:349-353 ("restarting VM restores ContainerEngineReady and moby namespace") only validates recovery via a VM bounce.
// Unlike `Container` / `Image` / `Volume` mirrors, this resource has no
// mirror finalizer: Docker has no corresponding engine object to delete
// on the reverse path, and cleanupMirrorResources sweeps it
// unconditionally on VM stop, so a finalizer with no handler would only
// trap user-initiated deletes in Terminating until the next bounce.
func (w *dockerWatcher) syncContainerNamespace(ctx context.Context) error {
applyConfig := containersv1alpha1apply.ContainerNamespace(containerNamespace, apiNamespace)
return w.k8s.Apply(ctx, applyConfig,
client.ForceOwnership, client.FieldOwner(controllerName))
}
// syncAllVolumes lists all Docker volumes and creates/updates `Volume`
// mirrors, then removes stale ones.
func (w *dockerWatcher) syncAllVolumes(ctx context.Context) error {
log := logf.FromContext(ctx).WithName("docker-watcher")
# already stopped, the wait must complete immediately rather than
# hang on the already-False engine condition.
rdd set --timeout=10s running=false
}
@test "restarting VM restores ContainerEngineReady and moby namespace" {
rdd set running=true
rdd ctl wait --for=create --namespace="${NAMESPACE}" \
containernamespace/moby --timeout=10s
}
@test "deleting containernamespace/moby completes without a finalizer hang" {
# moby ContainerNamespace has no mirror finalizer, so a user delete
# must return promptly rather than get trapped in Terminating.
rdd ctl delete containernamespace/moby --namespace="${NAMESPACE}" --timeout=10s
run -0 rdd ctl get containernamespaces --namespace="${NAMESPACE}" --output=name
refute_output
}
# --- containerd backend ---
@test "containerd backend reports ContainerEngineReady=NotApplicable and skips mirroring" {
# Stop first so there is no stale True/Connected from moby to
}
if !watcherRunning {
log.Info("App is running, starting Docker watcher")
if err := r.startWatcher(ctx); err != nil {
log.Error(err, "Failed to start Docker watcher")
if condErr := r.setEngineCondition(ctx, &app, metav1.ConditionFalse, "ConnectFailed", err.Error()); condErr != nil {
log.Error(condErr, "Failed to update ContainerEngineReady to ConnectFailed")
}
return ctrl.Result{}, err
}
This also contradicts the "mirror Docker state into K8s resources" contract documented in docs/design/api_containers.md.
Fix: watch ContainerNamespace so the engine reconciler notices user deletions, then re-call syncContainerNamespace in the wantWatcher branch. The Apply is idempotent so the cost is negligible.
return ctrl.NewControllerManagedBy(mgr).
For(&appv1alpha1.App{}).
Named("engine-reconciler").
WatchesRawSource(source.Channel(r.reconcileChan, enqueueApp)).
Watches(&containersv1alpha1.Container{}, enqueueApp).
Watches(&containersv1alpha1.Image{}, enqueueApp).
+ Watches(&containersv1alpha1.ContainerNamespace{}, enqueueApp).
Watches(&containersv1alpha1.Volume{}, enqueueApp).
Complete(r)
Alternatively, document moby as a protected singleton and reject the delete in the webhook. The design doc should be updated either way to state what the intended behavior is.
terminalMessage = "Engine mirroring is only supported with the moby backend"
}
current := meta.FindStatusCondition(app.Status.Conditions, conditionContainerEngineReady)
alreadyClean := !watcherDied && current != nil && current.Reason == terminalReason && current.Status == terminalStatus
if !alreadyClean {
if err := r.cleanupMirrorResources(ctx); err != nil {
log.Error(err, "Failed to clean up mirror resources")
return ctrl.Result{}, err
}
}
return ctrl.Result{}, r.setEngineCondition(ctx, &app, terminalStatus, terminalReason, terminalMessage)
}
if !watcherRunning {
log.Info("App is running, starting Docker watcher")
if err := r.startWatcher(ctx); err != nil {
log.Error(err, "Failed to start Docker watcher")
if condErr := r.setEngineCondition(ctx, &app, metav1.ConditionFalse, "ConnectFailed", err.Error()); condErr != nil {
log.Error(condErr, "Failed to update ContainerEngineReady to ConnectFailed")
}
return ctrl.Result{}, err
}
After successfully starting a fresh watcher and setting the condition, the reconciler returns without running reconcileContainerSpecs or processFinalizers. In most cases this is self-healing: setEngineCondition flips ContainerEngineReady from Stopped/False to Connected/True, which triggers a new App reconcile that enters the hot loop. But in the watcher-restart path (the previous condition was already Connected/True from before the crash), meta.SetStatusCondition returns changed=false, no status write happens, and no App reconcile fires from that source. fullSync then becomes the only trigger: its status Apply calls usually change a timestamp or field and emit watch events, but if Docker state is unchanged during the restart window, the applies can be no-ops and no reconcile follows. Pending user intent (e.g., a spec.state=running patch made during the gap) stalls until some unrelated event triggers a reconcile.
Gemini framed this as critical — it is real, but the practical blast radius is narrower than a guaranteed stall. Downgrading to important. The fix is trivial and eliminates the worry entirely.
Fix: remove the early return so the reconciler falls through to the shared code path.
if err := r.setEngineCondition(ctx, &app, metav1.ConditionTrue, "Connected", "Container engine synced"); err != nil {
return ctrl.Result{}, err
}
- return ctrl.Result{}, nil
}
// reconcileContainerSpecs + processFinalizers each issue one or more
Suggestions ¶
k8s: k8s,
cancel: watchCancel,
done: make(chan struct{}),
reconcileChan: reconcileChan,
}
// Capture a Docker-relative timestamp before fullSync begins. The
// run() goroutine passes this to the event stream as the Since
// filter, so any mutation that happens while fullSync is snapshotting
// state is replayed once the stream opens. Without this, the window
// between the List-based snapshot and the event subscription is a
// blind spot during VM startup.
//
// Use the Docker daemon's own clock (Info.SystemTime) rather than
// time.Now() on the host. The daemon runs inside the Lima VM and
// evaluates Since against its own clock; any host/guest skew would
// silently drop events inside the skew window. Fall back to a
The comment claims "Docker-relative timestamp" but time.Now() is the host clock. The Docker server (inside the Lima VM) interprets the since query parameter against its own clock. If the host clock is ahead of the guest clock by N seconds, events generated in the window [now - N, start of event stream] are filtered out by the server and never reach the watcher — silently lost. With Lima/Vz the skew is normally millisecond-scale so practical risk is low, but the failure mode is silent rather than detectable.
Two cheap fixes:
// Option A: bias the timestamp into the past so any realistic skew is covered.
// fullSync is idempotent, so duplicate replay is safe.
since := strconv.FormatInt(time.Now().Add(-2*time.Minute).Unix(), 10)
// Option B: query the Docker server's view of "now" before fullSync.
if info, err := cli.Info(ctx); err == nil && info.SystemTime != "" {
if t, perr := time.Parse(time.RFC3339Nano, info.SystemTime); perr == nil {
since = strconv.FormatInt(t.Unix(), 10)
}
}
At minimum, fix the comment to say "host wall clock" and acknowledge the clock-skew assumption.
// always set to "unknown" on creation — meaning the engine mirrors
// Docker container state without expressing intent. The user can later
// set it to "running" or "created" to control the Docker container;
// the reconciler ignores "unknown".
//
// Deliberately omit ForceOwnership on the create apply: if a user
// patch to spec.state landed in the window between the Get above
// and this Apply, ForceOwnership would silently clobber it. Without
// the flag, the Apply succeeds on a fresh resource (no other owner
// for spec.state yet) and fails loudly on a concurrent conflict
// (the Get-NotFound observation was racy).
var existing containersv1alpha1.Container
err := w.k8s.Get(ctx, client.ObjectKey{Name: inspect.ID, Namespace: apiNamespace}, &existing)
if apierrors.IsNotFound(err) {
applyConfig := containersv1alpha1apply.Container(inspect.ID, apiNamespace).
WithFinalizers(mirrorFinalizer).
WithSpec(containersv1alpha1apply.ContainerSpec().WithState(containersv1alpha1.ContainerStatusUnknown))
if err := w.k8s.Apply(ctx, applyConfig, client.FieldOwner(controllerName)); err != nil {
return fmt.Errorf("failed to create container %s: %w", inspect.ID, err)
}
} else if err != nil {
return fmt.Errorf("failed to get container %s: %w", inspect.ID, err)
The mirror finalizer is set only on the IsNotFound (creation) branch. On any subsequent sync of the same Container, the code falls through to the status-only Apply at sync_containers.go:176 without re-asserting the finalizer. If the user removes the finalizer via kubectl edit, the next reconcile won't restore it, and a subsequent user delete bypasses the engine's Docker-side cleanup entirely — the K8s mirror is GC'd instantly while the Docker container survives.
if t, err := time.Parse(time.RFC3339Nano, inspect.State.FinishedAt); err == nil && !t.IsZero() {
statusApply.WithFinishedAt(metav1.NewTime(t))
}
// Port bindings.
var applyPorts []*containersv1alpha1apply.ContainerPortApplyConfiguration
if inspect.NetworkSettings != nil {
for portName, ports := range inspect.NetworkSettings.Ports {
var bindings []*containersv1alpha1apply.ContainerPortBindingApplyConfiguration
for _, port := range ports {
bindings = append(bindings, containersv1alpha1apply.ContainerPortBinding().
sync_images.go:112-133 and sync_volumes.go:128-130 reassert the finalizer on every sync because their applies are unconditional. The Container path is the only outlier because of its special "spec-only-on-create" gate.
var errs []error
if len(inspect.RepoTags) > 0 {
for _, tag := range inspect.RepoTags {
// Deterministic name from image ID + tag hash (same as mock controller).
name := fmt.Sprintf("%s-%x",
sanitizeKubernetesObjectName(inspect.ID),
sha256.Sum256([]byte(tag)))
names = append(names, name)
if err := w.applyImage(ctx,
containersv1alpha1apply.Image(name, apiNamespace).
WithFinalizers(mirrorFinalizer),
imageStatusFromInspect(inspect).
WithRepoTag(tag).
WithNamespace(containerNamespace),
); err != nil {
errs = append(errs, err)
}
}
} else {
// Dangling image (no tags). status.namespace is a required
// field on the CRD, so set it here too even though it carries
// no additional information beyond what the tagged branch sets.
name := sanitizeKubernetesObjectName(inspect.ID)
names = append(names, name)
if err := w.applyImage(ctx,
containersv1alpha1apply.Image(name, apiNamespace).
WithFinalizers(mirrorFinalizer),
imageStatusFromInspect(inspect).
WithNamespace(containerNamespace),
); err != nil {
}
// applyVolume creates or updates a `Volume` mirror from a Docker volume.
func (w *dockerWatcher) applyVolume(ctx context.Context, vol mobyvolume.Volume) error {
mirrorName := volumeMirrorName(vol.Name)
applyConfig := containersv1alpha1apply.Volume(mirrorName, apiNamespace).
WithFinalizers(mirrorFinalizer)
err := w.k8s.Apply(ctx, applyConfig,
client.ForceOwnership, client.FieldOwner(controllerName))
if err != nil {
return fmt.Errorf("failed to apply volume %s: %w", mirrorName, err)
Fix: split finalizer reassertion into a separate Apply that always runs:
finalizerOnly := containersv1alpha1apply.Container(inspect.ID, apiNamespace).
WithFinalizers(mirrorFinalizer)
if err := w.k8s.Apply(ctx, finalizerOnly,
client.ForceOwnership, client.FieldOwner(controllerName)); err != nil {
return fmt.Errorf("failed to assert container finalizer %s: %w", inspect.ID, err)
}
With SSA, applying only WithFinalizers preserves the user-owned spec.state field.
if err != nil {
return err
}
r.watcher = w
return nil
}
// stopWatcher stops the Docker watcher goroutine and waits for it to finish.
func (r *EngineReconciler) stopWatcher() {
r.watcherMu.Lock()
w := r.watcher
r.watcher = nil
r.watcherMu.Unlock()
if w != nil {
w.stop()
}
}
// setEngineCondition updates the ContainerEngineReady condition on the
// App resource. The App controller also writes App.Status.Conditions
// (to mirror LimaVM conditions) and controller-runtime does not
// serialize reconciles across controllers, so a naive Update can race
The AppReconciler at pkg/controllers/app/app/controllers/app_controller.go:247 already calls r.Status().Update(ctx, &app) to mirror LimaVM conditions onto App.Status.Conditions. The EngineReconciler now also calls Status().Update on the same object to set ContainerEngineReady. Both reconcilers run in the same manager and are not serialized across controllers (controller-runtime serializes within a controller, not across them). When both have a stale local copy and both issue a Status().Update, the loser gets a 409 conflict.
ObservedGeneration: app.Generation,
LastTransitionTime: cond.LastTransitionTime,
}) || statusChanged
}
if statusChanged {
if err := r.Status().Update(ctx, &app); err != nil {
log.Error(err, "Unable to update App status")
return ctrl.Result{}, err
}
}
In practice conflicts will be infrequent because meta.SetStatusCondition short-circuits no-op updates and the App reconciler reconciles less often. But the race is real.
Fix (either of two patterns already used elsewhere in this file):
- Use server-side apply with
client.Apply+ a distinctFieldOwnerfor the engine condition — field-level merge avoids resource-version conflict entirely. - Wrap the
Status().Updateinretry.RetryOnConflictplus a re-Get, mirroringremoveMirrorResourceatdocker_watcher.go:271-283anddeleteAllOfTypeatengine_reconciler.go:262-274.
return w.removeImagesByID(ctx, msg.Actor.ID)
default:
return nil
}
}
// handleVolumeEvent processes volume events.
func (w *dockerWatcher) handleVolumeEvent(ctx context.Context, msg events.Message) error {
log := logf.FromContext(ctx).WithName("docker-watcher")
switch msg.Action {
case events.ActionCreate:
log.V(1).Info("Volume created", "name", msg.Actor.ID)
return w.syncVolume(ctx, msg.Actor.ID)
case events.ActionDestroy:
log.V(1).Info("Volume destroyed", "name", msg.Actor.ID)
return w.removeMirrorResource(ctx, &containersv1alpha1.Volume{},
volumeMirrorName(msg.Actor.ID))
default:
return nil
errs = append(errs, fmt.Errorf("failed to delete Volumes: %w", err))
}
if err := r.deleteAllOfType(ctx, &containersv1alpha1.ImageList{}); err != nil {
errs = append(errs, fmt.Errorf("failed to delete Images: %w", err))
}
if err := r.deleteAllOfType(ctx, &containersv1alpha1.ContainerNamespaceList{}); err != nil {
errs = append(errs, fmt.Errorf("failed to delete ContainerNamespaces: %w", err))
}
return errors.Join(errs...)
}
// deleteAllOfType lists resources, strips finalizers, and deletes them.
// Finalizer removal uses retry.RetryOnConflict so a stale cache or a
// concurrent Update does not require the caller to requeue. Per-item
// errors are collected so one stuck object does not block the rest of
// the list.
func (r *EngineReconciler) deleteAllOfType(ctx context.Context, list client.ObjectList) error {
if err := r.List(ctx, list, client.InNamespace(apiNamespace)); err != nil {
return err
}
items, err := meta.ExtractList(list)
if err != nil {
// neither — or the watcher will open a different socket from the one
// Lima is port-forwarding to, and mirroring will silently diverge per
// instance. Until then, multi-instance BATS runs cannot share the
// same host and the engine.bats test has a parallel TODO at line
// 12-14.
var DockerSocket = sync.OnceValue(func() string {
home, err := os.UserHomeDir()
if err != nil {
panic(fmt.Errorf("could not get home directory: %w", err))
}
dir := filepath.Join(home, ".rd2")
_ = os.MkdirAll(dir, 0o700)
return filepath.Join(dir, "docker.sock")
})
Two concerns:
- Hardcoded directory: On any instance where
RDD_INSTANCE != "2"(e.g.,bats-app-controller, a developer'sdevinstance), this creates~/.rd2/instead ofinstance.ShortDir(). The TODO atinstance.go:124-133already acknowledges the hardcode, but the side effect of creating the directory early means every caller that reads this value (even from code paths that never use the socket) litters the user's home. Lima itself creates the directory when port-forwarding the socket, so theMkdirAllhere is likely redundant. - Swallowed error:
_ = os.MkdirAll(...)ignores any failure (permission, ENOSPC). If the directory cannot be created, the function still returns a path to a non-existent directory, and the later dial fails with an error unrelated to the real cause.
var LimaHome = sync.OnceValue(func() string {
return filepath.Join(ShortDir(), "lima")
})
// DockerSocket returns the path to the Docker socket for this instance
// (e.g., ~/.rd2/docker.sock). This is the host-side socket that Lima
// port-forwards from the guest's /var/run/docker.sock.
//
// TODO: use ShortDir() once the Lima template derives the socket path
// from the instance suffix instead of hardcoding ".rd2". The two
// hardcodes (here and in pkg/controllers/app/app/lima-template.yaml)
// must move together — either both parameterized on RDD_INSTANCE or
// neither — or the watcher will open a different socket from the one
// Lima is port-forwarding to, and mirroring will silently diverge per
// instance. Until then, multi-instance BATS runs cannot share the
// same host and the engine.bats test has a parallel TODO at line
// 12-14.
var DockerSocket = sync.OnceValue(func() string {
home, err := os.UserHomeDir()
if err != nil {
Fix (minimal): drop the MkdirAll entirely and let Lima own the directory.
- dir := filepath.Join(home, ".rd2")
- _ = os.MkdirAll(dir, 0o700)
- return filepath.Join(dir, "docker.sock")
+ return filepath.Join(home, ".rd2", "docker.sock")
If Lima does not pre-create it, propagate the error instead of ignoring it.
if watcherRunning {
log.Info("Stopping Docker watcher",
"running", running, "engine", app.Spec.ContainerEngine.Name)
r.stopWatcher()
}
terminalReason := "Stopped"
terminalStatus := metav1.ConditionFalse
terminalMessage := "Container engine stopped"
if running && !engineIsDocker {
// NotApplicable is reported as Status=True so that
// `rdd set running=true containerEngine.name=containerd`
// stops waiting on ContainerEngineReady — even though the
// engine controller is not mirroring anything in this
// backend. UI consumers that expect Container/Image/Volume
// resources should gate on the Reason as well, not on
// Status alone. The condition will be renamed when
// containerd mirroring lands.
terminalReason = "NotApplicable"
Setting ContainerEngineReady=True/NotApplicable for the containerd backend makes rdd set running=true complete immediately (matching the BATS test at engine.bats:365-389), but it tells any other consumer of ContainerEngineReady=True ("the engine is ready") something misleading: there is no mirroring at all, and any UI relying on Container/Image/Volume resources will see nothing. Status=True with Reason=NotApplicable conflicts with the condition name.
refute_output
}
# --- containerd backend ---
@test "containerd backend reports ContainerEngineReady=NotApplicable and skips mirroring" {
# Stop first so there is no stale True/Connected from moby to
# satisfy the CLI wait below before the engine reconciler has run.
rdd set running=false
# Start with containerd. rdd set waits for ContainerEngineReady=True,
# which the engine reconciler satisfies immediately with reason
# NotApplicable because engine mirroring only supports the moby
# backend.
rdd set containerEngine.name=containerd running=true
run -0 rdd ctl get app app \
-o jsonpath='{.status.conditions[?(@.type=="ContainerEngineReady")].reason}'
assert_output "NotApplicable"
# No mirror resources should exist in containerd mode.
run -0 rdd ctl get containers --namespace="${NAMESPACE}" --output=name
refute_output
run -0 rdd ctl get images --namespace="${NAMESPACE}" --output=name
refute_output
run -0 rdd ctl get volumes --namespace="${NAMESPACE}" --output=name
refute_output
run -0 rdd ctl get containernamespaces --namespace="${NAMESPACE}" --output=name
refute_output
}
Two cleaner alternatives:
- Rename the condition to
ContainerEngineMirroredsoTrue/NotApplicabledoesn't conflict with the type name. - Use
Status=Unknown/NotApplicable— still allowsrdd setto stop waiting but doesn't claim active readiness.
Option two is cheaper and keeps the condition name descriptive; the CLI wait at cmd/rdd/set.go:464-466 would need to accept Unknown as a terminal state.
ctx, cancel := context.WithTimeout(ctx, timeout)
defer cancel()
if runningVal == "true" {
logrus.Info("Waiting for container engine to be ready")
return watchCondition(ctx, config, func(obj *unstructured.Unstructured) bool {
return conditionStatus(obj, "ContainerEngineReady") == metav1.ConditionTrue
})
}
logrus.Info("Waiting for the VM to stop")
return watchCondition(ctx, config, func(obj *unstructured.Unstructured) bool {
// Running=False means the VM has stopped. Running absent means
// the VM was never started (nothing to wait for).
current := meta.FindStatusCondition(app.Status.Conditions, conditionContainerEngineReady)
alreadyClean := !watcherDied && current != nil && current.Reason == terminalReason && current.Status == terminalStatus
if !alreadyClean {
if err := r.cleanupMirrorResources(ctx); err != nil {
log.Error(err, "Failed to clean up mirror resources")
return ctrl.Result{}, err
}
}
return ctrl.Result{}, r.setEngineCondition(ctx, &app, terminalStatus, terminalReason, terminalMessage)
}
if !watcherRunning {
log.Info("App is running, starting Docker watcher")
if err := r.startWatcher(ctx); err != nil {
log.Error(err, "Failed to start Docker watcher")
The _ = discards any error from setEngineCondition. If the status update fails (e.g., a 409 conflict against a concurrent App reconciler update — see S3), the App keeps reporting its stale value (often True/Connected from the previous startup) until the next reconcile. The reconciler requeues on the original startWatcher error so the state eventually converges, but the API-server-visible condition is wrong in the interim.
log.Error(err, "Failed to start Docker watcher")
- _ = r.setEngineCondition(ctx, &app, metav1.ConditionFalse, "ConnectFailed", err.Error())
+ if condErr := r.setEngineCondition(ctx, &app, metav1.ConditionFalse, "ConnectFailed", err.Error()); condErr != nil {
+ log.Error(condErr, "Failed to update ContainerEngineReady=ConnectFailed")
+ }
return ctrl.Result{}, err
if removeFinalizer(c, mirrorFinalizer) {
// NotFound is benign: a concurrent Docker destroy event may
// already have stripped the finalizer and deleted the
// mirror between our List and Update.
if err := client.IgnoreNotFound(r.Update(ctx, c)); err != nil {
errs = append(errs, fmt.Errorf("failed to remove finalizer from Container %s: %w", c.Name, err))
}
}
}
return errors.Join(errs...)
}
func (r *EngineReconciler) processImageFinalizers(ctx context.Context, w *dockerWatcher) error {
var images containersv1alpha1.ImageList
if err := r.List(ctx, &images, client.InNamespace(apiNamespace)); err != nil {
return err
}
var errs []error
for i := range images.Items {
img := &images.Items[i]
if img.DeletionTimestamp == nil || !hasFinalizer(img, mirrorFinalizer) {
continue
}
if err := w.deleteImage(ctx, img); err != nil {
errs = append(errs, fmt.Errorf("failed to delete image %s from Docker: %w", img.Name, err))
continue
}
if removeFinalizer(img, mirrorFinalizer) {
if err := client.IgnoreNotFound(r.Update(ctx, img)); err != nil {
errs = append(errs, fmt.Errorf("failed to remove finalizer from Image %s: %w", img.Name, err))
}
}
}
If a Volume mirror is created through the K8s API by something other than the engine controller — or in a startup race window where the mirror has the finalizer but the engine has not yet populated its status — Status.Name will be empty. deleteVolume then calls Docker's VolumeRemove(""), which errors out, and the finalizer is never stripped. The Volume mirror is permanently stuck in Terminating.
Fix: short-circuit the finalizer strip when there is no Docker name to forward to.
if v.DeletionTimestamp == nil || !hasFinalizer(v, mirrorFinalizer) {
continue
}
+if v.Status.Name == "" {
+ if removeFinalizer(v, mirrorFinalizer) {
+ if err := client.IgnoreNotFound(r.Update(ctx, v)); err != nil {
+ errs = append(errs, fmt.Errorf("failed to remove finalizer from Volume %s: %w", v.Name, err))
+ }
+ }
+ continue
+}
if err := w.deleteVolume(ctx, v.Status.Name); err != nil {
reconcileChan chan<- event.GenericEvent
}
// newDockerWatcher creates a Docker client, performs a full sync, and starts
// the event stream watcher goroutine.
func newDockerWatcher(ctx context.Context, k8s client.Client, reconcileChan chan<- event.GenericEvent) (*dockerWatcher, error) {
socketPath := instance.DockerSocket()
cli, err := dockerclient.New(
dockerclient.WithHost("unix://" + socketPath),
)
if err != nil {
return nil, fmt.Errorf("failed to create Docker client: %w", err)
}
// Verify the connection by pinging Docker.
pingCtx, pingCancel := context.WithTimeout(ctx, 5*time.Second)
defer pingCancel()
Without dockerclient.WithAPIVersionNegotiation(), the client uses its compiled-in default API version. If the Docker engine inside the Lima VM is newer or older than the client's default, calls will fail with cryptic version-mismatch errors. RDD controls the Lima image today so this is mostly theoretical, but a user who sshes into the VM and runs zypper up, or a future Lima image bump, could trigger it unexpectedly.
cli, err := dockerclient.New(
dockerclient.WithHost("unix://" + socketPath),
+ dockerclient.WithAPIVersionNegotiation(),
)
Verify the exact option name in moby/moby/client@v0.4.0 — moby has historically used WithAPIVersionNegotiation and NegotiateAPIVersion interchangeably across versions.
# Check whether a cmdline belongs to the current RDD_INSTANCE. The
# instance name appears in any path derived from it (~/.rd<instance>/,
# rancher-desktop-<instance>/, ...) and in sh wrapper argv as
# `RDD_INSTANCE=<instance>`. Anchor on those exact markers rather than
# matching any substring, so a short instance like "bats" does not
# match every "bats-*" sibling target's processes and trigger cross-
# target SIGKILL cleanup.
matches_our_instance() {
case "$1" in
*"RDD_INSTANCE=${instance}"*) return 0 ;;
*"/.rd${instance}/"*) return 0 ;;
*"rancher-desktop-${instance}/"*) return 0 ;;
*"rancher-desktop-${instance} "*) return 0 ;;
esac
The helper is used by our_leaked_pids / sigquit_our_go_leaks / sigkill_our_leaks at lines 306-365 to scope SIGQUIT and SIGKILL cleanup to the current RDD_INSTANCE. A bare substring match is broader than the comment at bats-with-timeout.sh:68-70 promises: RDD_INSTANCE=bats would match every bats-* instance name in bats/Makefile:53-83, so the cleanup path can target sibling control planes rather than its own. The destructive scoping is the whole point of this helper — subtle over-matching here risks killing a parallel run's daemons.
ps -o command= -p "$1" 2>/dev/null || true
fi
}
# Check whether a cmdline belongs to the current RDD_INSTANCE. The
# instance name appears in any path derived from it (~/.rd<instance>/,
# rancher-desktop-<instance>/, ...) and in sh wrapper argv as
# `RDD_INSTANCE=<instance>`. Anchor on those exact markers rather than
# matching any substring, so a short instance like "bats" does not
# match every "bats-*" sibling target's processes and trigger cross-
# target SIGKILL cleanup.
matches_our_instance() {
case "$1" in
echo
dump_api_state
echo
echo "=== End of support bundle (${context}) ==="
} >>"${bundle_file}" 2>&1
}
# Enumerate leaked process PIDs belonging to the current RDD_INSTANCE.
# `go_only=1` restricts to Go binaries (for SIGQUIT goroutine dumps);
# otherwise includes qemu/limactl drivers too.
our_leaked_pids() {
local go_only=$1
local pid comm
while read -r pid comm; do
if [ "$go_only" = 1 ]; then
case "$comm" in
rdd|rdd.exe|*-controller|*-controller.exe|lima-guestagent|hostagent) ;;
*) continue ;;
esac
else
case "$comm" in
rdd|rdd.exe|*-controller|*-controller.exe|lima-guestagent|hostagent|qemu*|limactl*) ;;
*) continue ;;
esac
fi
if matches_our_instance "$(cmdline_of "$pid")"; then
echo "$pid"
fi
done < <(ps -axo pid=,ucomm= 2>/dev/null || ps -eo pid=,ucomm= 2>/dev/null || true)
}
# SIGQUIT Go processes belonging to our RDD_INSTANCE so their goroutine
# stacks land in the preserved stderr logs. No-op if nothing is leaked.
sigquit_our_go_leaks() {
local pid
local leaked
leaked=$(our_leaked_pids 1)
if [ -z "$leaked" ]; then
return
fi
{
echo
echo "=== SIGQUIT -> leaked Go processes (RDD_INSTANCE=${instance}) ==="
for pid in $leaked; do
echo "SIGQUIT pid=${pid} pgid=$(pgid_of "$pid") cmdline=$(cmdline_of "$pid")"
kill -QUIT "$pid" 2>/dev/null || true
done
} >>"${bundle_file}" 2>&1
# Let Go runtimes flush goroutine dumps to stderr before subsequent
# steps terminate them.
sleep 1
}
# SIGKILL any process still running under our RDD_INSTANCE so the CI
# runner is clean for later steps. No-op if nothing is leaked.
sigkill_our_leaks() {
local pid
local leaked
leaked=$(our_leaked_pids 0)
if [ -z "$leaked" ]; then
return
fi
{
echo
echo "=== SIGKILL -> leaked processes (RDD_INSTANCE=${instance}) ==="
for pid in $leaked; do
echo "SIGKILL pid=${pid} pgid=$(pgid_of "$pid") cmdline=$(cmdline_of "$pid")"
kill -KILL "$pid" 2>/dev/null || true
done
Fix: anchor the match on actual instance markers that appear in paths and environment assignments.
matches_our_instance() {
case "$1" in
- *"${instance}"*) return 0 ;;
+ *"RDD_INSTANCE=${instance}"*|*"/.rd${instance}/"*|*"rancher-desktop-${instance}"*) return 0 ;;
esac
return 1
}
}
// cleanupMirrorResources removes every mirror resource the engine
// controller owns. Errors are collected across all four kinds with
// errors.Join so one stuck resource type does not block the rest — the
// remaining types are still swept and the caller requeues on the
// combined error, retrying only what actually failed.
func (r *EngineReconciler) cleanupMirrorResources(ctx context.Context) error {
log := logf.FromContext(ctx)
log.Info("Cleaning up all mirror resources")
var errs []error
if err := r.deleteAllOfType(ctx, &containersv1alpha1.ContainerList{}); err != nil {
errs = append(errs, fmt.Errorf("failed to delete Containers: %w", err))
}
if err := r.deleteAllOfType(ctx, &containersv1alpha1.VolumeList{}); err != nil {
errs = append(errs, fmt.Errorf("failed to delete Volumes: %w", err))
}
if err := r.deleteAllOfType(ctx, &containersv1alpha1.ImageList{}); err != nil {
errs = append(errs, fmt.Errorf("failed to delete Images: %w", err))
}
if err := r.deleteAllOfType(ctx, &containersv1alpha1.ContainerNamespaceList{}); err != nil {
errs = append(errs, fmt.Errorf("failed to delete ContainerNamespaces: %w", err))
}
return errors.Join(errs...)
}
// deleteAllOfType lists resources, strips finalizers, and deletes them.
// Finalizer removal uses retry.RetryOnConflict so a stale cache or a
// concurrent Update does not require the caller to requeue. Per-item
// errors are collected so one stuck object does not block the rest of
// the list.
func (r *EngineReconciler) deleteAllOfType(ctx context.Context, list client.ObjectList) error {
if err := r.List(ctx, list, client.InNamespace(apiNamespace)); err != nil {
return err
}
items, err := meta.ExtractList(list)
if err != nil {
return err
}
var errs []error
for _, item := range items {
obj := item.(client.Object)
key := client.ObjectKeyFromObject(obj)
// meta.ExtractList strips the TypeMeta from each item (the API
// server only populates GVK on the top-level list), so
// obj.GetObjectKind() is empty here. Format the concrete Go
// type name with %T instead.
retryErr := retry.RetryOnConflict(retry.DefaultRetry, func() error {
Items extracted from a List response via meta.ExtractList have empty TypeMeta — the K8s API server only populates GVK on the top-level list object, not its items. Error messages log as failed to remove finalizer from /volume-name instead of Volume/volume-name, making operator debugging harder. Cosmetic (doesn't affect behavior) — downgraded to suggestion.
- errs = append(errs, fmt.Errorf("failed to remove finalizer from %s/%s: %w",
- kind, obj.GetName(), retryErr))
+ errs = append(errs, fmt.Errorf("failed to remove finalizer from %T %s: %w",
+ obj, obj.GetName(), retryErr))
Each syncAll* collects per-item errors with errors.Join, fullSync joins across all four sync types, and newDockerWatcher fails the whole startup on any non-nil error. A single permanently-broken inspect (engine-side corruption, rare race) prevents every other healthy resource from being mirrored, and the ContainerEngineReady=ConnectFailed condition stays sticky until the problem resource disappears.
Consider logging individual inspect errors and continuing the sync rather than failing the entire startup on any per-item error. The stale-prune step already runs regardless and will eventually clean up mirrors for items that stay broken.
# bats-with-timeout.sh captures a support bundle (process tree, Go
# goroutine dumps via SIGQUIT, open fds) before killing the target so
# we can diagnose CI hangs post-mortem. Use `=` so $@ resolves to the
# current target at recipe time.
#
# NOTE: setting BATS_TIMEOUT=0 disables the wrapper entirely, which
# also disables the support-bundle capture and leaked-process cleanup.
# If you are debugging a hang locally and need both the long runtime
# and the diagnostic capture, set BATS_TIMEOUT to a large value
# (e.g. 86400) instead of 0.
BATS_TIMEOUT ?= 1800
When a developer runs make bats-app BATS_TIMEOUT=0 to disable the timeout (e.g., for debugging a hang), the wrapper is not invoked at all — no support-bundle capture, no leaked-process cleanup. Document this in the Makefile comment, or split the wrapper so timeout=0 still runs it with the timeout disabled.
Design Observations ¶
Concerns ¶
- Reconciler scales linearly per child event (in-scope) Claude. The in-code comment at
engine_reconciler.go:155-161already calls this out: every Container/Image/Volume watch event triggers a reconcile viaSetupWithManager, and each reconcile issues full List() calls across all mirror types. Cost is O(N) per child event. The right long-term fix — per-resource reconcilers with watch predicates ondeletionTimestamp set/spec.state changed— is the correct structural answer. Worth doing before the controller is expected to handle more than a few dozen containers. - Two writers on
App.Status.Conditions(in-scope) Claude. The App reconciler mirrors LimaVM conditions; the engine reconciler setsContainerEngineReady. They are different field-level concerns butUpdatesends the entire status. See S3 for the immediate impact. A cleaner long-term design has all App.Status writers use SSA with distinct field owners; this eliminates conflicts entirely and makes per-condition ownership explicit. - Engine assumes host/guest clock parity in multiple places (in-scope) Claude. Beyond the
sincefilter (S1), the varioustime.Parse(time.RFC3339Nano, ...)calls inapplyContainer,imageStatusFromInspect, andapplyVolumeall trust Docker's timestamps as-is. None is wrong, but they all make the same assumption. A one-shot startup check (compareInfo().SystemTimetotime.Now()and warn if skew exceeds N seconds) would surface clock drift early. ContainerNamespaceis documented as a mirror resource but implemented as a startup-only singleton (in-scope) Codex. Two coherent models exist: reconcile it like the other mirrors, or reject deletion in the webhook and document it as protected state. Picking one simplifies the system; the current hybrid is the direct cause of I3.
Strengths ¶
- Idempotent fullSync + watch model (in-scope) Claude. The fundamental design — list-then-watch with full reconciliation on every watcher startup — handles every edge case (panic, transient disconnect, Docker errors) by self-healing. This is the right architecture.
- Captured
sincetimestamp before fullSync (in-scope) Codex. Capturing the event-stream cursor atdocker_watcher.go:70-84before the initial sync is the right way to close the startup blind spot between snapshot and stream subscription — separate from S1's clock concern. - Error aggregation pattern (in-scope) Claude. The per-resource
errors.Joinpattern incleanupMirrorResources,reconcileContainerSpecs, and per-finalizer helpers is a solid choice; onlyprocessFinalizersitself violates it (see I2). - Spec-state-once-on-create reasoning (in-scope) Claude. The comment at
sync_containers.go:111-116explaining whyForceOwnershipis deliberately omitted on the create Apply is excellent — it captures the exact race the author considered and why the chosen design avoids it. - Webhook restraint (in-scope) Claude Gemini. The new validator at
container_webhooks.go:35-62allows onlyspec.stateto change and normalizes it before comparing the rest. AsContainerSpecgrows, the comparison continues to work correctly. removeMirrorResourceretry-on-conflict (in-scope) Claude. The watcher-side finalizer strip usesretry.RetryOnConflictatdocker_watcher.go:271so concurrent reconciles don't require a manual requeue. This matches the pattern used indeleteAllOfType.- BATS coverage breadth (in-scope) Claude.
engine.batsis unusually thorough — it covers dangling/tagged transitions both directions, paused-container state transitions both directions, in-use image deletion, and the "rdd set running=false returns promptly" regression. The inline comments explain each test's design carefully. watchtools.UntilWithSyncinrdd set(in-scope) Codex. Solid choice for the new CLI wait path — it handles watch restarts without turning transient disconnects into user-visible flakes.
Testing Assessment ¶
Coverage is broad and intentional. Specific untested scenarios that warrant tests, ranked by risk:
- Watcher death and recovery while VM is running — no test causes a Docker event-stream error or panics the goroutine, so the self-healing claim is unverified end-to-end. Hard to test from BATS but doable with a fault-injection hook. Addresses I1.
ContainerNamespace/mobyre-creation after user delete — the existing test atengine.bats:355-361verifies the delete succeeds without a finalizer hang, but there is no assertion the engine re-creates it without bouncing the VM. Addresses I3.- Pending user work during watcher restart — a test that patches
spec.state=runningwhile the watcher is dying and asserts the patch is eventually processed without an external trigger. Addresses I4. - Concurrent App.Status updates from App + Engine reconcilers — a stress test that rapidly toggles
running=true/falseand asserts neither reconciler reportsConflicterrors. Addresses S3. - Empty
Volume.Status.Namefinalizer hang — create a Volume via the K8s API directly with empty status, set deletionTimestamp, assert finalizer cleanup proceeds. Addresses S7. fullSyncstale-prune race — force a container-removed-between-List-and-Inspect race and verify the stale mirror is pruned in the same sync.- Overlapping RDD_INSTANCE names in
bats-with-timeout.sh— a shell test covering the substring-match risk before the wrapper sends destructive signals. Addresses S9.
Unit test coverage gap: the only Go test in the new package is sync_containers_test.go for parseContainerName. The reconciler state machine, finalizer cleanup, and error-aggregation logic are exercised only by BATS — valuable but slow to iterate on. envtest-based unit tests for reconcile transitions, stuck finalizers, and the cleanup per-type error accumulation would shorten the feedback loop significantly.
Local verification: go test ./pkg/controllers/app/engine/controllers ./pkg/controllers/containers/container passed in Codex's worktree.
Documentation Assessment ¶
docs/design/api_containers.md is updated substantively — new Terminology section, Engine Mirroring section, finalizer lifecycle, container state transitions, and updated Volume name format. Gaps:
- The
ContainerEngineReady=NotApplicablereason is introduced in the design doc (lines 62-64) but the implication ("containerd users should not depend on this condition for mirroring readiness") is not stated. See S5. - The doc does not state whether deleting
containernamespace/mobyis supposed to be rejected or self-healed. Clarifying alongside I3 is the right place. instance.DockerSocket()has no doc-string explaining the~/.rd2hardcode beyond a TODO. Surfacing the assumption in the function comment (or name) would help the next reader.cmd/rdd/set.go's new--timeoutflag is documented in the command help, but the failure mode ("timed out waiting for App state") is not — users may expect a request-level timeout rather than a wait-for-condition timeout.engine.bats:147-149referencesvolumeK8sNamebut the actual helper isvolumeMirrorName. Stale comment.
Commit Structure ¶
17 commits, one explicit "Address review round 2 findings" fixup (66eb50e), several small capitalization/comment/script fixups (84c0736, c8fa377, 3196b24, 62d222a), and a cluster of CI infrastructure commits (50bdc15, 7c8f356, c484dab, 9669b4b) intermixed with the feature work. The fixups are review-driven and reasonable per the project convention. The CI support-bundle work is arguably independent of the engine controller — the new script does not depend on engine code, and engine code does not depend on the script — and could have been a separate PR that stood on its own merits. Bundling makes the diff larger and the relationship between the new tooling and the feature less clear, but not a blocker.
Acknowledged Limitations ¶
- DockerSocket hardcoded to
~/.rd2atpkg/instance/instance.go:127-135: TODO explicitly flags that multi-instance BATS runs cannot share a host until the Lima template derives the path from the instance suffix;engine.bats:13-14carries the parallel TODO. This PR makes the limitation more impactful but does not worsen it. rdd set containerEngine.name=<other> running=truereturns immediately on a running VM atcmd/rdd/set.go:445-452: the doc comment describes this hazard and points to the proposedReconciledcondition withobservedGenerationas the long-term fix.- per-reconcile O(N) List + Inspect at
engine_reconciler.go:155-161: acknowledged with a clear path to per-resource reconcilers with watch predicates. - Sequential per-object Inspect during fullSync at
sync_containers.go:30-36andsync_images.go:36-40: acknowledged with pointers to where an errgroup could go if startup latency becomes a concern. - Tag-event handling requires re-inspect at
docker_watcher.go:218-223: the untag event payload only carries the image ID hash, soreconcileImageByIDre-inspects and prunes stale tag mirrors. The asymmetry is understood. pgidlog field is Unix-only atlimavm_lifecycle.go:479-490: the comment flags that it will need renaming when Windows CI arrives.
Unresolved Feedback ¶
No unresolved review comments on the PR itself — round 2 comments were addressed in commit 66eb50e.
Agent Performance Retro ¶
[Claude] — Claude carried the review. It produced 3 of the 4 important findings (I1, I2, I3) and 10 of the 12 suggestions, including the only one that required cross-component reasoning (S3: concurrent App + Engine status writers, which depended on reading the App reconciler code outside the diff). Its unique finding S1 (clock-skew in the since filter) is a quiet bug the other two agents missed — the author's comment explicitly says "Docker-relative timestamp" and both Codex and Gemini accepted it without checking what time.Now() actually returns. Claude's S8 (API-version negotiation) is a genuine hedging concern but closer to a code review idiom than a bug. The review's prose wandered slightly in spots (S10, S11 on testing/Makefile are valid but low-value), but calibration across severities was accurate and every claim verified against current code.
[Codex] — Codex produced the two highest-signal findings it raised (I1/panic-recovery and I3/ContainerNamespace resurrection, both shared with Claude) plus one sharp suggestion (S9: substring match in the bats-with-timeout helper) that Claude missed entirely despite both agents reading the same script. Its finding density is low — three items total — but every one is load-bearing and the analysis traced through the webhook, SetupWithManager, and the new test code to prove the claim. No false positives, no padding. Codex also ran go test in its worktree and reported pass; this is the kind of verification step the prompt invites but only Codex exercised.
[Gemini] — Gemini produced both a high-value finding (I2/short-circuit finalizers, shared with Claude) and one legitimate critical-severity bug-adjacent catch (I4/early-return after startWatcher, downgraded to important after verification), but it also produced the review's only false positive (C1: claimed a "$ @" syntax error with a space in bats-with-timeout.sh that does not exist, with line numbers 159 and 349 that do not correspond to the cited function). The hallucinated critical is the single-worst event in this review — it would have blocked merge if trusted. Gemini's I2 was downgraded to a suggestion (S10) because it only affects error message quality. Net: one genuine important, one downgraded-critical, one downgraded-important, one fabricated critical. Calibration was inflated.
Summary ¶
| Claude Opus 4.6 | Codex GPT 5.4 | Gemini 3.1 Pro | |
|---|---|---|---|
| Duration | 22m 58s | 6m 21s | 5m 48s |
| Findings | 3I 11S | 2I 1S | 1I 3S |
| Tool calls | 77 (Read 39, Bash 30, TodoWrite 7) | 43 (shell 41, plan 1, stdin 1) | 1 (runshellcommand 1) |
| Design observations | 6 | 3 | 2 |
| False positives | 0 | 0 | 1 |
| Unique insights | 9 | 1 | 1 |
| Files reviewed | 27 | 27 | 27 |
| Coverage misses | 0 | 0 | 0 |
| Totals | 3I 11S | 2I 1S | 1I 3S |
| Downgraded | 0 | 0 | 2 (I→S) |
| Dropped | 0 | 0 | 1 |
Reconciliation. Gemini pass-1 C1 (bash syntax error) rejected as a false positive after reading scripts/bats-with-timeout.sh:202 directly and confirming the code is "$@" & with no space; cited line numbers 159/349 also do not match the file. Gemini pass-1 C2 (reconciler stalls after startWatcher) downgraded from critical to important I4 — the scenario is real but narrower than framed: setEngineCondition triggers a subsequent App reconcile in most cases, and only the watcher-restart path with unchanged condition + status-apply no-op actually stalls. Gemini pass-1 I2 (missing GVK in deleteAllOfType error) downgraded from important to suggestion S10 — cosmetic error-message issue, no behavior impact.
Overall, Claude contributed the most value on this review; Codex contributed the highest signal-to-noise; Gemini contributed one genuine finding per category but lost calibration at the top end.
Review Process Notes ¶
Skill improvements ¶
- Reinforce literal verification of short code snippets when an agent claims a bash/shell syntax error. Gemini fabricated a
"$ @"critical finding with line numbers that did not correspond to the cited function. Single-character syntax claims are cheap to verify via a direct file read, and this review only caught it because the orchestrator happened to have read the file already. A prompt-level nudge — "before reporting a single-character syntax error as critical, quote the exact line using a direct file read tool call and confirm the line number" — could head this off for future reviews.
Repo context updates ¶
- [repo] UPDATE
deep-review-context.md: engine controller now uses host wall clock for Dockersincefilter. Add a bullet: "When reviewing engine controller changes, flag any code that usestime.Now()for filters passed to the Docker daemon API. The Docker daemon is inside the Lima VM and filters on its own clock; host clock must be translated viaInfo().SystemTimeor biased into the past to cover skew." This was Claude's unique S1 finding and an easy miss for agents that don't know Docker runs in the VM rather than the host. - [repo] UPDATE
deep-review-context.md: App.Status has multiple writers. Add a bullet: "App.Status.Conditionsis now written by bothAppReconciler(LimaVM mirror) andEngineReconciler(ContainerEngineReady). Controllers run concurrently — flag any non-SSA, non-retry-on-conflictStatus().Updateon the App object." This generalizes S3 and would have helped both Gemini and Codex catch the concurrent-writer concern independently.