Sparse cost volume for efficient stereo matching

Research output: Contribution to journal

Abstract

Stereo matching has been addressed as a supervised learning task using convolutional neural networks (CNNs). However, CNN-based approaches typically require a large amount of memory. In addition, it is still challenging to find correct correspondences between images in ill-posed regions, such as dim areas and areas affected by sensor noise. To solve these problems, we propose the Sparse Cost Volume Net (SCV-Net), which achieves high accuracy, low memory cost, and fast computation. The idea of the cost volume for stereo matching was initially proposed in GC-Net. In our work, by making the cost volume compact and proposing an efficient similarity evaluation for the volume, we achieve faster stereo matching while improving accuracy. Moreover, we propose using weight normalization instead of the commonly used batch normalization for stereo matching tasks. This improves robustness not only to sensor noise in images but also to the batch size used in training. We evaluated the proposed network on the Scene Flow and KITTI 2015 datasets, where its overall performance surpasses that of GC-Net. Compared with GC-Net, SCV-Net (1) reduces GPU memory cost by 73.08%; (2) reduces processing time by 61.11%; and (3) improves the three-pixel error (3PE) from 2.87% to 2.61% on the KITTI 2015 dataset.
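The 3PE figure quoted in the abstract is a three-pixel error rate. As a rough illustration, and not the authors' evaluation code, the sketch below computes the percentage of valid pixels whose predicted disparity differs from the ground truth by more than a threshold. The function name `three_pixel_error`, the 3.0 px default threshold, and the convention that non-positive ground-truth values mark invalid pixels are all assumptions made for this example; the official KITTI 2015 outlier definition differs in detail and is not reproduced here.

```python
from typing import Sequence


def three_pixel_error(
    pred: Sequence[float],
    truth: Sequence[float],
    threshold: float = 3.0,
) -> float:
    """Return the percentage of valid pixels whose absolute disparity
    error is strictly greater than `threshold` (in pixels).

    Assumption: a ground-truth disparity <= 0 marks a pixel with no
    measurement, so that pixel is skipped.
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")

    # Absolute errors over valid ground-truth pixels only.
    errors = [abs(p - t) for p, t in zip(pred, truth) if t > 0]
    if not errors:
        raise ValueError("no valid ground-truth pixels")

    bad = sum(1 for e in errors if e > threshold)
    return 100.0 * bad / len(errors)


# Hypothetical flattened disparity maps; the last pixel has no ground truth.
predicted = [10.0, 20.0, 35.0, 5.0]
reference = [10.5, 24.0, 30.0, 0.0]
print(three_pixel_error(predicted, reference))
```

Under this simplified definition, a lower value is better, which is the sense in which the abstract reports an improvement from 2.87% to 2.61%.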

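Original language: English
Article number: 1844
Journal: Remote Sensing
Volume: 10
Issue number: 11
DOI: https://doi.org/10.3390/rs10111844
Publication status: Published - 1 Nov 2018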

Fingerprint

cost
sensor
normalisation
evaluation
supervised learning

All Science Journal Classification (ASJC) codes

  • Earth and Planetary Sciences(all)

Cite this

Sparse cost volume for efficient stereo matching. / Lu, Chuanhua; Uchiyama, Hideaki; Thomas, Diego Gabriel Francis; Shimada, Atsushi; Taniguchi, Rin-Ichiro.

In: Remote Sensing, Vol. 10, No. 11, 1844, 01.11.2018.

@article{482fbe182a3e486185f9d10342a44109,
title = "Sparse cost volume for efficient stereo matching",
author = "Chuanhua Lu and Hideaki Uchiyama and Thomas, {Diego Gabriel Francis} and Atsushi Shimada and Rin-Ichiro Taniguchi",
year = "2018",
month = "11",
day = "1",
doi = "10.3390/rs10111844",
language = "English",
volume = "10",
journal = "Remote Sensing",
issn = "2072-4292",
publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",
number = "11",

}

TY - JOUR

T1 - Sparse cost volume for efficient stereo matching

AU - Lu, Chuanhua

AU - Uchiyama, Hideaki

AU - Thomas, Diego Gabriel Francis

AU - Shimada, Atsushi

AU - Taniguchi, Rin-Ichiro

PY - 2018/11/1

Y1 - 2018/11/1

UR - http://www.scopus.com/inward/record.url?scp=85057102637&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85057102637&partnerID=8YFLogxK

U2 - 10.3390/rs10111844

DO - 10.3390/rs10111844

M3 - Article

VL - 10

JO - Remote Sensing

JF - Remote Sensing

SN - 2072-4292

IS - 11

M1 - 1844

ER -