Learned B-spline parametrization of lattice focal coding for monocular RGBD imaging

Abstract

Monocular RGBD imaging, also known as simultaneous all-in-focus (AiF) imaging and monocular depth estimation (MDE), is an important yet challenging task in computer vision. The crux lies in devising optical coding techniques that maximize the modulation transfer function (MTF) across depths while minimizing their cross-correlation, and that remain matched to the capabilities of the image processing algorithms. End-to-end design of optics and algorithms offers a promising route to this joint objective, but such approaches require solving non-convex inverse problems with millions of parameters. In this paper, we introduce a lattice-focal shape that nearly achieves the MTF bound and use it as the initial solution. We then represent the surface geometry with a B-spline parameterization to reduce the number of optimization variables. Combining this optical design with a Restormer-based neural network, which captures global context, yields high-quality RGBD imaging. Compared with state-of-the-art monocular RGBD imaging methods, our approach improves the imaging peak signal-to-noise ratio (PSNR) by 3.0 dB and reduces the depth mean absolute error (MAE) by 39%. Experiments in real indoor and outdoor scenes validate the effectiveness of our method.
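
To illustrate why a B-spline parameterization reduces the number of optimization variables, the following is a minimal sketch and not the authors' implementation. It assumes a uniform cubic B-spline height map over the unit square; the spline degree, knot layout, and control-grid size used in the paper are not stated in the abstract, and the names cubic_bspline_basis, surface_height, and sample_surface are hypothetical. The point is only that a dense surface is driven by a small grid of control values.

```python
def cubic_bspline_basis(t):
    """Four uniform cubic B-spline blending weights for local t in [0, 1]."""
    return (
        (1 - t) ** 3 / 6.0,
        (3 * t**3 - 6 * t**2 + 4) / 6.0,
        (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6.0,
        t**3 / 6.0,
    )


def _locate(x, n):
    """Map x in [0, 1] to a segment index and local parameter for n control values."""
    segments = n - 3                      # a cubic segment uses 4 consecutive controls
    s = x * segments
    i = min(int(s), segments - 1)         # clamp so x = 1 falls in the last segment
    return i, s - i


def surface_height(control, u, v):
    """Evaluate the spline height at (u, v) in [0, 1]^2.

    `control` is a rows x cols grid (each dimension >= 4) of control heights.
    These values are the only optimization variables in this sketch.
    """
    i, tu = _locate(u, len(control))
    j, tv = _locate(v, len(control[0]))
    wu, wv = cubic_bspline_basis(tu), cubic_bspline_basis(tv)
    return sum(
        wu[a] * wv[b] * control[i + a][j + b]
        for a in range(4)
        for b in range(4)
    )


def sample_surface(control, resolution):
    """Sample the spline surface on a resolution x resolution grid."""
    step = 1.0 / (resolution - 1)
    return [
        [surface_height(control, r * step, c * step) for c in range(resolution)]
        for r in range(resolution)
    ]


if __name__ == "__main__":
    # Assumed sizes for illustration: a 6 x 6 control grid drives a 64 x 64 height map.
    control = [[0.01 * (r - c) for c in range(6)] for r in range(6)]
    heights = sample_surface(control, 64)
    n_params = len(control) * len(control[0])
    n_samples = len(heights) * len(heights[0])
    print(f"optimization variables: {n_params}, surface samples: {n_samples}")
```

In an end-to-end pipeline the control values would be updated by gradient descent instead of the per-sample heights. Because the surface is a fixed linear combination of the control values, gradients pass through the evaluation directly.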

Publication
IEEE International Conference on Computational Photography 2024