TY - JOUR
T1 - Sparse cost volume for efficient stereo matching
AU - Lu, Chuanhua
AU - Uchiyama, Hideaki
AU - Thomas, Diego
AU - Shimada, Atsushi
AU - Taniguchi, Rin-ichiro
N1 - Funding Information:
A part of this research was funded by JSPS KAKENHI grant number JP17H01768.
Publisher Copyright:
© 2018 by the authors.
PY - 2018/11/1
Y1 - 2018/11/1
AB - Stereo matching has been solved as a supervised learning task with convolutional neural networks (CNNs). However, CNN-based approaches generally require a large amount of memory. In addition, it is still challenging to find correct correspondences between images in ill-posed regions, such as dim areas and areas affected by sensor noise. To solve these problems, we propose Sparse Cost Volume Net (SCV-Net), which achieves high accuracy, low memory cost, and fast computation. The idea of the cost volume for stereo matching was initially proposed in GC-Net. In our work, by making the cost volume compact and proposing an efficient similarity evaluation for the volume, we achieve faster stereo matching while improving accuracy. Moreover, we propose using weight normalization instead of the commonly used batch normalization for stereo matching tasks. This improves robustness not only to sensor noise in images but also to the batch size used in training. We evaluated the proposed network on the Scene Flow and KITTI 2015 datasets, where its overall performance surpasses that of GC-Net. Compared with GC-Net, SCV-Net (1) reduces GPU memory cost by 73.08%; (2) reduces processing time by 61.11%; and (3) improves the three-pixel error (3PE) from 2.87% to 2.61% on the KITTI 2015 dataset.
UR - http://www.scopus.com/inward/record.url?scp=85057102637&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85057102637&partnerID=8YFLogxK
U2 - 10.3390/rs10111844
DO - 10.3390/rs10111844
M3 - Article
AN - SCOPUS:85057102637
SN - 2072-4292
VL - 10
JO - Remote Sensing
JF - Remote Sensing
IS - 11
M1 - 1844
ER -