Research
We work on various problems in
geometric and semantic computer vision and machine learning,
with applications to mobile robots, automobiles, and augmented/virtual reality.
Visual SLAM / Visual Inertial Odometry
ROVO: Robust Omnidirectional Visual Odometry
In this paper we propose a robust visual odometry system for a wide-baseline camera rig with wide field-of-view (FOV) fisheye lenses, which provides full omnidirectional stereo observations of the environment. For more robust and accurate ego-motion estimation we add three components to the standard VO pipeline: 1) a hybrid projection model for improved feature matching, 2) a multi-view P3P RANSAC algorithm for pose estimation, and 3) online updating of the rig extrinsic parameters. The proposed system is extensively evaluated on synthetic datasets with ground truth and on real sequences captured in highly dynamic environments, and its superior performance is demonstrated.
Hochang Seok, Jongwoo Lim*, “ROVO: Robust Omnidirectional Visual Odometry for Wide-baseline Wide-FOV Camera Systems,” in ICRA 2019
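As a rough illustration of the multi-view pose estimation step, the sketch below shows a generic RANSAC loop that scores a candidate rig pose against bearing observations from all cameras of the rig. The minimal solver p3p_solver, the angular inlier test, and all variable names are our assumptions for this sketch, not the paper's implementation.

    import numpy as np

    def ransac_rig_pose(points, bearings, cam_extrinsics, p3p_solver,
                        iters=500, angle_deg=1.0):
        """points: (N, 3) landmarks; bearings: (N, 3) unit rays in camera frames;
        cam_extrinsics: list of (Rc, tc) mapping the rig frame to each camera frame;
        p3p_solver: hypothetical minimal solver returning candidate (R, t) rig poses."""
        cos_thr = np.cos(np.deg2rad(angle_deg))
        best, best_inl = None, 0
        for _ in range(iters):
            idx = np.random.choice(len(points), 3, replace=False)
            sample_ext = [cam_extrinsics[i] for i in idx]
            for R, t in p3p_solver(points[idx], bearings[idx], sample_ext):
                inl = 0
                for p, b, (Rc, tc) in zip(points, bearings, cam_extrinsics):
                    pc = Rc @ (R @ p + t) + tc                # landmark in this camera
                    if pc @ b >= cos_thr * np.linalg.norm(pc):  # angular inlier test
                        inl += 1
                if inl > best_inl:
                    best, best_inl = (R, t), inl
        return best, best_inl

The angular test on bearing vectors (rather than pixel reprojection error) is a natural choice for fisheye lenses, whose distortion makes pixel distances non-uniform across the image.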
Visual Inertial Odometry Using Coupled Nonlinear Optimization
Visual inertial odometry (VIO) has recently gained much interest for efficient and accurate ego-motion estimation of robots and automobiles. With a monocular camera and a rigidly attached inertial measurement unit (IMU), VIO aims to estimate the 3D pose trajectory of the device in a global metric space. We propose a novel visual inertial odometry algorithm that directly optimizes the camera poses with noisy IMU data and visual feature locations.
Euntae Hong, Jongwoo Lim, “Visual inertial odometry using coupled nonlinear optimization,” in IROS 2017, [link]
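To give a feel for what coupled optimization means here, the toy below jointly fits a 2D trajectory to noisy IMU-like displacement measurements and visual range measurements to known landmarks in a single least-squares problem. The setup is ours and far simpler than the paper's formulation; it only shows both measurement types entering one residual vector.

    import numpy as np
    from scipy.optimize import least_squares

    rng = np.random.default_rng(0)
    landmarks = np.array([[0.0, 5.0], [10.0, 5.0]])             # known landmarks
    true = np.cumsum(np.vstack([[0.0, 0.0],
                                np.tile([1.0, 0.1], (9, 1))]), axis=0)  # 10 poses
    imu_delta = np.diff(true, axis=0) + 0.05 * rng.standard_normal((9, 2))
    ranges = (np.linalg.norm(true[:, None, :] - landmarks[None], axis=2)
              + 0.05 * rng.standard_normal((10, 2)))

    def residuals(x):
        X = x.reshape(-1, 2)
        r_imu = (np.diff(X, axis=0) - imu_delta).ravel()        # inertial term
        r_vis = (np.linalg.norm(X[:, None, :] - landmarks[None], axis=2)
                 - ranges).ravel()                              # visual term
        return np.concatenate([r_imu, r_vis])                   # one coupled cost

    sol = least_squares(residuals, true.ravel() + 0.5 * rng.standard_normal(20))
    est = sol.x.reshape(-1, 2)                                  # refined trajectory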
Online Environment Mapping (Metric-topological map)
Building a map of the environment for localization and navigation is critical for scene understanding and robot operation. We propose a metric-topological mapping framework that combines the benefits of both map types: the local accuracy of metric maps and the global scalability of topological maps.
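One way to picture such a hybrid map, sketched here under our own assumptions (the class and field names are invented, not from the paper): each topological node stores a small metric submap in its own coordinate frame, and edges store only relative transforms between neighboring submaps.

    import numpy as np

    class SubmapNode:
        """A topological node holding a local metric submap in its own frame."""
        def __init__(self, node_id):
            self.id = node_id
            self.keyframes = {}   # frame_id -> 4x4 pose within this submap
            self.landmarks = {}   # landmark_id -> 3D point within this submap

    class MetricTopologicalMap:
        def __init__(self):
            self.nodes = {}       # node_id -> SubmapNode
            self.edges = {}       # (src_id, dst_id) -> 4x4 relative transform

        def add_edge(self, src, dst, T_src_dst):
            self.edges[(src, dst)] = T_src_dst
            self.edges[(dst, src)] = np.linalg.inv(T_src_dst)

        def express_in(self, path, point_h):
            """Chain relative transforms along a node path to move a homogeneous
            point from the last node's frame into the first node's frame."""
            T = np.eye(4)
            for a, b in zip(path[:-1], path[1:]):
                T = T @ self.edges[(a, b)]
            return T @ point_h

Because geometry is only consistent within each submap, global consistency reduces to optimizing the relative transforms on the graph, which stays cheap as the environment grows.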
Camera Motion Estimation using Points and Lines
Point features are commonly used for structure from motion and ego-motion estimation. We investigated faster and more robust ways to use line features, together with points, for motion estimation of a stereo camera rig.
Vivek Pradeep, Jongwoo Lim, “Egomotion Estimation Using Assorted Features,” in IJCV, Vol. 98, Issue 2, pp. 202-216, 2012 [link]
Earlier version: Egomotion using Assorted Features, in CVPR 2010 [link]
Manmohan Chandraker, Jongwoo Lim, David Kriegman, “Moving in Stereo: Efficient Structure and Motion Using Lines,” in ICCV 2009 [link]
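The sketch below shows one standard way to form residuals from both feature types (our illustration, not the papers' exact parameterization): points contribute reprojection errors, while a 3D line segment contributes the signed distances of its projected endpoints to the observed image line l = (a, b, c), normalized so that a^2 + b^2 = 1.

    import numpy as np

    def point_residual(R, t, X, x_obs, K):
        """Reprojection error of a 3D point X against its 2D observation x_obs."""
        x = K @ (R @ X + t)
        return x[:2] / x[2] - x_obs

    def line_residual(R, t, X1, X2, line, K):
        """Distances of the projected 3D segment endpoints to the image line."""
        res = []
        for X in (X1, X2):
            x = K @ (R @ X + t)
            u = np.array([x[0] / x[2], x[1] / x[2], 1.0])
            res.append(line @ u)          # signed point-to-line distance
        return np.array(res)

Stacking both residual types into one nonlinear least-squares problem lets line features stabilize the estimate in low-texture scenes where point matches are scarce.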
Depth Estimation / 3D Modeling
Robust Stereo Matching Using Adaptive Random Walk with Restart Algorithm
In this paper, we propose a robust dense stereo reconstruction algorithm using a random walk with restart. The pixel-wise matching costs are aggregated into superpixels, and a modified random walk with restart algorithm updates the matching costs over all possible disparities across the superpixels.
Sehyung Lee, Jin Han Lee, Jongwoo Lim, Il Hong Suh, “Robust Stereo Matching using Adaptive Random Walk with Restart Algorithm,” in Image and Vision Computing 2015 [link]
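A generic random-walk-with-restart iteration over a superpixel graph looks like the sketch below; the paper's adaptive weighting is omitted, and the affinity matrix W, restart probability, and iteration count are our placeholders.

    import numpy as np

    def rwr_aggregate(cost, W, restart=0.2, iters=50):
        """cost: (S, D) matching cost per superpixel and disparity.
        W: (S, S) row-normalized affinity between adjacent superpixels."""
        e = cost.copy()                      # restart term: the initial costs
        r = cost.copy()
        for _ in range(iters):
            r = (1.0 - restart) * (W @ r) + restart * e   # diffuse, then restart
        return r

    # Winner-take-all disparity per superpixel after aggregation:
    # disparity = np.argmin(rwr_aggregate(cost, W), axis=1)

The restart term keeps each superpixel anchored to its own evidence, so smoothing propagates support from neighbors without washing out strong local matches.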
Visual Object Tracking
Tracking Persons-of-Interest via Adaptive Discriminative Features
Multi-face tracking in unconstrained videos is a challenging problem as faces of one person often appear drastically different in multiple shots due to significant variations in scale, pose, expression, illumination, and make-up. Low-level features used in existing multi-target tracking methods are not effective for identifying faces with such large appearance variations. In this paper, we tackle this problem by learning discriminative, video-specific face features using convolutional neural networks (CNNs).
Shun Zhang, Yihong Gong, Jia-Bin Huang, Jongwoo Lim, Jinjun Wang, Narendra Ahuja, Ming-Hsuan Yang, “Tracking Persons-of-Interest via Adaptive Discriminative Features,” in ECCV 2016, [link]
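A common recipe for this kind of video-specific metric learning, sketched here under our assumptions (positives drawn from the same tracklet, negatives from tracklets that co-occur in a frame and hence cannot be the same person), is a triplet loss on CNN embeddings:

    import torch
    import torch.nn.functional as F

    def triplet_loss(anchor, positive, negative, margin=0.5):
        """anchor/positive: embeddings of faces from the same tracklet;
        negative: embedding of a face from a co-occurring tracklet."""
        d_pos = F.pairwise_distance(anchor, positive)
        d_neg = F.pairwise_distance(anchor, negative)
        return F.relu(d_pos - d_neg + margin).mean()

    # embeddings = cnn(face_crops); minimize triplet_loss over mined triplets,
    # then link tracklets whose embeddings are close in the learned space.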
Visual Tracking Benchmark
For decades many visual trackers have been proposed, but there has been little effort to quantitatively measure and compare their performance. In this work we provide a dataset of common test videos with hand-labeled ground truth. A tracker library with a standardized interface for large-scale evaluation enables researchers to easily test and compare their trackers against the state of the art.
Yi Wu, Jongwoo Lim*, Ming-Hsuan Yang, “Online Object Tracking: A Benchmark,” in CVPR 2013, [link] [project page, code]
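The standardized interface is, in spirit, as simple as the sketch below (method and function names are illustrative, not the benchmark's actual API): every tracker is initialized with the first frame and a bounding box, queried frame by frame, and its predictions are scored by overlap against the ground truth.

    import numpy as np

    def run_sequence(tracker, frames, init_box):
        """tracker must expose init(frame, box) and update(frame) -> box."""
        tracker.init(frames[0], init_box)
        return [init_box] + [tracker.update(f) for f in frames[1:]]

    def iou(a, b):
        """Overlap of two boxes (x, y, w, h), as used for success plots."""
        x1, y1 = max(a[0], b[0]), max(a[1], b[1])
        x2 = min(a[0] + a[2], b[0] + b[2])
        y2 = min(a[1] + a[3], b[1] + b[3])
        inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
        return inter / (a[2] * a[3] + b[2] * b[3] - inter)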
Deep Learning
DFT-based Transformation Invariant Pooling Layer for Visual Classification
We propose a DFT-based pooling layer for convolutional neural networks. The proposed DFT magnitude pooling satisfies translation-invariance and shape-preserving properties: based on the shift theorem, it pools the DFT magnitudes of the last convolutional feature map, which are unchanged under translation of the input. Networks with the proposed layer improve performance on various visual classification tasks, and we validate the transformation invariance through extensive experiments in the paper.
Jongbin Ryu, Ming-Hsuan Yang, Jongwoo Lim*, “DFT-based Transformation Invariant Pooling Layer for Visual Classification,” in ECCV 2018, pp. 84-99 [project]
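The invariance follows directly from the shift theorem: a circular translation of the feature map multiplies its DFT by a unit-magnitude phase factor, so the magnitude spectrum is unchanged. A minimal numpy sketch (ours, not the paper's code):

    import numpy as np

    def dft_magnitude_pool(feat):
        """feat: (C, H, W) last convolutional feature map -> translation-invariant
        DFT magnitudes per channel."""
        return np.abs(np.fft.fft2(feat, axes=(-2, -1)))

    # Sanity check: a circular shift of the input leaves the output unchanged.
    x = np.random.rand(4, 8, 8)
    shifted = np.roll(x, shift=(2, 3), axis=(-2, -1))
    assert np.allclose(dft_magnitude_pool(x), dft_magnitude_pool(shifted))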
Research Project Pages
Omnidirectional Stereo Dataset (ICCV19, TPAMI21)
Tracker Benchmark (CVPR13, TPAMI15)