SalientDSO: Bringing Attention to Direct Sparse Odometry

dc.contributor.advisor: Aloimonos, Yiannis
dc.contributor.advisor: Fermuller, Cornelia
dc.contributor.author: Liang, Huai-Jen
dc.contributor.department: Electrical Engineering
dc.contributor.publisher: Digital Repository at the University of Maryland
dc.contributor.publisher: University of Maryland (College Park, Md.)
dc.date.accessioned: 2018-07-17T05:54:47Z
dc.date.available: 2018-07-17T05:54:47Z
dc.date.issued: 2018
dc.description.abstract: Although cluttered indoor scenes contain a wealth of high-level semantic information useful for mapping and localization, most Visual Odometry (VO) algorithms rely on geometric features such as points, lines, and planes. Lately, driven by this idea, joint optimization over semantic labels and odometry has gained popularity in the robotics community. Joint optimization yields accurate results but is generally very slow. At the same time, in the vision community, direct and sparse approaches to VO have struck the right balance between speed and accuracy. We merge the successes of these two communities and present a way to incorporate semantic information, in the form of visual saliency, into Direct Sparse Odometry -- a highly successful direct sparse VO algorithm. We also present a framework to filter the visual saliency based on scene parsing. Our framework, SalientDSO, relies on widely successful deep-learning-based approaches for visual saliency and scene parsing to drive the feature selection, yielding highly accurate and robust VO even with as few as 40 point features per frame. We provide an extensive quantitative evaluation of SalientDSO on the ICL-NUIM and TUM monoVO datasets and show that we outperform DSO and ORB-SLAM, two very popular state-of-the-art approaches in the literature. We also collect and publicly release the CVL-UMD dataset, which contains two cluttered indoor sequences on which we show qualitative evaluations. To our knowledge, this is the first framework to use visual saliency and scene parsing to drive the feature selection in direct VO.
dc.identifier: https://doi.org/10.13016/M2HX15T8P
dc.identifier.uri: http://hdl.handle.net/1903/20869
dc.language.iso: en
dc.subject.pqcontrolled: Robotics
dc.subject.pqcontrolled: Computer science
dc.subject.pquncontrolled: Direct Sparse Odometry
dc.subject.pquncontrolled: Scene Parsing
dc.subject.pquncontrolled: SLAM
dc.subject.pquncontrolled: Visual Saliency
dc.title: SalientDSO: Bringing Attention to Direct Sparse Odometry
dc.type: Thesis
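
The abstract above describes filtering a visual-saliency map with scene-parsing labels and letting the result drive feature selection. The following minimal Python sketch illustrates one way such a scheme could work; the function name, the label ids treated as uninformative, and the weighted-sampling strategy are illustrative assumptions, not the thesis implementation.

    # Minimal sketch of saliency-driven point selection, assuming a
    # precomputed saliency map and scene-parsing label map (both H x W).
    # Label ids and the sampling scheme are placeholders for illustration.
    import numpy as np

    def select_salient_points(saliency, labels, uninformative=(0, 1),
                              n_points=40, rng=None):
        """Sample pixels with probability proportional to saliency, after
        suppressing saliency on uninformative scene-parsing classes
        (e.g., wall/floor -- the ids here are hypothetical)."""
        rng = np.random.default_rng() if rng is None else rng
        s = saliency.astype(np.float64).copy()
        s[np.isin(labels, list(uninformative))] = 0.0  # filter by scene parsing
        p = s.ravel()
        p /= p.sum()  # normalize to a probability distribution (assumes some saliency survives)
        idx = rng.choice(p.size, size=n_points, replace=False, p=p)
        ys, xs = np.unravel_index(idx, saliency.shape)
        return np.stack([xs, ys], axis=1)  # (n_points, 2) pixel coordinates

    # Example: random arrays stand in for real saliency / parsing outputs.
    H, W = 480, 640
    saliency = np.random.rand(H, W)
    labels = np.random.randint(0, 5, size=(H, W))
    points = select_salient_points(saliency, labels, n_points=40)

Weighted sampling concentrates the sparse point budget (as few as 40 per frame, per the abstract) on salient, semantically informative regions instead of spreading it uniformly over the image.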

Files

Original bundle
Name: Liang_umd_0117N_18848.pdf
Size: 8.73 MB
Format: Adobe Portable Document Format