Efficient Test-Time Optimization for Depth Completion via Low-Rank Decoder Adaptation

*Equal contribution Corresponding author
Main figure

Our method establishes a new Pareto frontier, achieving the lowest error with highly efficient inference among test-time optimization-based depth completion methods. By contrast, existing test-time optimization approaches require several seconds of inference per image.

Abstract

Zero-shot depth completion has gained attention for its ability to generalize across environments without sensor-specific datasets or retraining. However, most existing approaches rely on diffusion-based test-time optimization, which is computationally expensive due to iterative denoising. Recent visual-prompt-based methods reduce training cost but still require repeated forward–backward passes through the full frozen network to optimize input-level prompts, resulting in slow inference. In this work, we show that adapting only the decoder is sufficient for effective test-time optimization, as depth foundation models concentrate depth-relevant information within a low-dimensional decoder subspace. Based on this insight, we propose a lightweight test-time adaptation method that updates only this low-dimensional subspace using sparse depth supervision. Our approach achieves state-of-the-art performance, establishing a new Pareto frontier between accuracy and efficiency for test-time adaptation. Extensive experiments on five indoor and outdoor datasets demonstrate consistent improvements over prior methods, highlighting the practicality of fast zero-shot depth completion.

Qualitative Results

Top: error maps w.r.t. ground truth. Bottom: predicted depth maps.
Blue dashed boxes highlight representative regions.

Quantitative Results

Our method establishes a new Pareto frontier, achieving the lowest error with highly efficient inference among zero-shot depth completion methods. Compared to TestPromptDC, it reduces MAE by 22.2% and RMSE by 19.6% on average across five datasets, achieving the best performance on most benchmarks.

Quantitative results
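For readers unfamiliar with the term, a method lies on the Pareto frontier when no other method is at least as fast and at least as accurate while being strictly better in one of the two. The short Python sketch below illustrates that definition only. The `Method` type, the `pareto_frontier` helper, the method names "A" to "D", and all runtime and error values are hypothetical placeholders, not measurements from the paper.

```python
from typing import List, NamedTuple


class Method(NamedTuple):
    name: str
    runtime_s: float  # inference time per image (lower is better)
    error: float      # e.g. MAE (lower is better)


def pareto_frontier(methods: List[Method]) -> List[Method]:
    """Return the methods that no other method dominates, sorted by runtime."""
    frontier = []
    for m in methods:
        dominated = any(
            o.runtime_s <= m.runtime_s
            and o.error <= m.error
            and (o.runtime_s < m.runtime_s or o.error < m.error)
            for o in methods
        )
        if not dominated:
            frontier.append(m)
    return sorted(frontier, key=lambda m: m.runtime_s)


# Placeholder values chosen only to exercise the definition.
candidates = [
    Method("A", 0.2, 0.30),
    Method("B", 1.0, 0.25),
    Method("C", 2.0, 0.28),  # slower and less accurate than B, so dominated
    Method("D", 5.0, 0.20),
]
frontier_names = [m.name for m in pareto_frontier(candidates)]
```

In this toy data, "C" is excluded because "B" is both faster and more accurate; the remaining methods each represent a different speed and accuracy trade-off.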

Comparison with Prior Depth Completion Methods

(a) Training-based depth completion requires offline training with paired RGB–depth data. (b) Test-time optimization methods adapt latent variables or visual prompts at inference time, incurring high computational cost. (c) Our method adapts only a low-dimensional subspace of the decoder, enabling efficient test-time adaptation.

Comparison with prior methods
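The idea in panel (c) can be illustrated with a toy NumPy sketch. This is not the authors' implementation: it assumes a two-layer linear "decoder" over precomputed features, a squared-error loss on the sparse pixels, plain gradient descent, and a LoRA-style factorization of the weight update. The function name `adapt_low_rank` and every size and hyperparameter are illustrative assumptions. The frozen weights `w1` and `w2` are never modified; only the two small factors `down` and `up` are optimized against the sparse depth measurements.

```python
import numpy as np


def adapt_low_rank(feats, w1, w2, sparse_depth, mask, rank=4, lr=0.05, steps=300, seed=0):
    """Fit a rank-`rank` update to the frozen weight w1 using only masked pixels.

    feats: (N, d_in) per-pixel features, w1: (d_in, d_hidden), w2: (d_hidden,)
    sparse_depth: (N,) depth values, valid where mask is True.
    """
    rng = np.random.default_rng(seed)
    d_in, d_hidden = w1.shape
    # LoRA-style init (assumption): random down-projection, zero up-projection,
    # so the adapted decoder starts out identical to the frozen one.
    down = rng.normal(size=(d_in, rank)) / np.sqrt(d_in)
    up = np.zeros((rank, d_hidden))
    n_obs = mask.sum()

    def residual_and_loss():
        pred = feats @ (w1 + down @ up) @ w2              # (N,) predicted depth
        residual = np.where(mask, pred - sparse_depth, 0.0)
        return residual, 0.5 * np.sum(residual ** 2) / n_obs

    losses = []
    for _ in range(steps):
        residual, loss = residual_and_loss()
        losses.append(loss)
        # Gradient w.r.t. the full (d_in, d_hidden) weight update ...
        grad_delta = np.outer(feats.T @ residual / n_obs, w2)
        # ... projected onto the two low-rank factors via the chain rule.
        grad_down = grad_delta @ up.T
        grad_up = down.T @ grad_delta
        down -= lr * grad_down
        up -= lr * grad_up
    losses.append(residual_and_loss()[1])                  # loss after the last step
    return down, up, losses


# Synthetic setup: the "true" scene differs from the frozen decoder's prediction.
rng = np.random.default_rng(0)
n_pixels, d_in, d_hidden = 256, 32, 16
feats = rng.normal(size=(n_pixels, d_in))
w1 = rng.normal(size=(d_in, d_hidden)) / np.sqrt(d_in)
w2 = rng.normal(size=d_hidden) / np.sqrt(d_hidden)
dense_depth = feats @ (w1 + 0.3 * rng.normal(size=(d_in, d_hidden))) @ w2

mask = np.zeros(n_pixels, dtype=bool)
mask[rng.choice(n_pixels, size=64, replace=False)] = True   # sparse measurements
sparse_depth = np.where(mask, dense_depth, 0.0)

w1_before = w1.copy()
down, up, losses = adapt_low_rank(feats, w1, w2, sparse_depth, mask)
n_trainable = down.size + up.size                            # 32*4 + 4*16 = 192
```

Under these assumptions each optimization step touches 192 trainable values instead of the 512 entries of `w1`, and no gradients flow through anything upstream of the decoder, which is the source of the efficiency argument. A real system would replace the linear toy decoder with the foundation model's decoder and would likely use a different loss and optimizer.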

Motivation, Efficiency–Accuracy Trade-off, and Iterative Refinement

BibTeX

@article{seo2026efficient,
      title={Efficient Test-Time Optimization for Depth Completion via Low-Rank Decoder Adaptation},
      author={Seo, Minseok and Lee, Wonjun and Jang, Jaehyuk and Kim, Changick},
      journal={arXiv preprint arXiv:2603.01765},
      year={2026},
}