Published on Sat Feb 06 2021

MOTS R-CNN: Cosine-margin-triplet loss for multi-object tracking

Amit Satish Unde, Renu M. Rameshan

One of the central tasks of multi-object tracking is learning a distance metric that is consistent with the semantic similarities of objects. We propose scale-invariant tracking using a multi-layer feature aggregation scheme that makes the model robust against scale variations and occlusions.

Abstract

One of the central tasks of multi-object tracking is learning a distance metric that is consistent with the semantic similarities of objects. Designing a loss function that encourages discriminative feature learning is among the most crucial challenges in deep neural network-based metric learning. Despite significant progress, existing contrastive and triplet loss based deep metric learning methods suffer from slow convergence and poor local optima, which calls for a better solution. In this paper, we propose cosine-margin-contrastive (CMC) and cosine-margin-triplet (CMT) losses by reformulating both the contrastive and triplet loss functions from the perspective of cosine distance. The reformulation as a cosine loss is achieved by feature normalization, which distributes the learned features on a hypersphere. We then propose the MOTS R-CNN framework for joint multi-object tracking and segmentation, targeted particularly at improving tracking performance. Specifically, the tracking problem is addressed through deep metric learning based on the proposed loss functions. We propose scale-invariant tracking using a multi-layer feature aggregation scheme that makes the model robust against object scale variations and occlusions. MOTS R-CNN achieves state-of-the-art tracking performance on the KITTI MOTS dataset. We show that MOTS R-CNN reduces identity switching on both cars and pedestrians in comparison to Track R-CNN.
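
To make the idea of a cosine-based margin loss concrete, the following is a minimal sketch, not the paper's exact formulation (the abstract does not state the equations). It assumes a common hinge form in which features are L2-normalized onto the unit hypersphere and the margin is applied to cosine similarities. The function names, the default `margin=0.3`, and the specific contrastive form are illustrative assumptions.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so it lies on the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_similarity(u, v):
    """Cosine similarity, computed as the dot product of normalized vectors."""
    u, v = l2_normalize(u), l2_normalize(v)
    return sum(a * b for a, b in zip(u, v))


def cosine_margin_triplet(anchor, positive, negative, margin=0.3):
    """Assumed hinge form: the anchor should be closer (in cosine terms)
    to the positive than to the negative by at least `margin`."""
    sim_pos = cosine_similarity(anchor, positive)
    sim_neg = cosine_similarity(anchor, negative)
    return max(0.0, sim_neg - sim_pos + margin)


def cosine_margin_contrastive(u, v, same_identity, margin=0.3):
    """Assumed pairwise form: pull matching pairs toward cosine 1,
    penalize non-matching pairs whose cosine exceeds `margin`."""
    sim = cosine_similarity(u, v)
    if same_identity:
        return 1.0 - sim
    return max(0.0, sim - margin)


if __name__ == "__main__":
    # Toy embeddings: the positive points the same way as the anchor,
    # the negative is orthogonal to it, so the triplet is already satisfied.
    print(cosine_margin_triplet([1.0, 0.0], [2.0, 0.0], [0.0, 5.0]))
```

Because every feature is normalized before comparison, rescaling an embedding leaves the loss unchanged; only the angle between embeddings matters. This is the property that feature normalization is meant to provide.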

Thu Jun 11 2020
Computer Vision
Quasi-Dense Similarity Learning for Multiple Object Tracking
Quasi-Dense Similarity Learning densely samples hundreds of region proposals on a pair of images. It achieves 68.7 MOTA at 20.3 FPS on MOT17 without using external training data. Compared to methods with similar detectors, it boosts almost 10 points of MOTA.
Wed Sep 28 2016
Machine Learning
Similarity Mapping with Enhanced Siamese Network for Multi-Object Tracking
Multi-object tracking has recently become an important area of computer vision. Despite growing attention, achieving high performance tracking is still challenging. We introduce a novel tracking system based on similarity mapping.
Thu Apr 08 2021
Computer Vision
Multiple Object Tracking with Correlation Learning
Sat Jun 15 2019
Computer Vision
How To Train Your Deep Multi-Object Tracker
The recent trend in vision-based multi-object tracking (MOT) is heading toward leveraging the representational power of deep learning. However, existing methods train only certain sub-modules using loss functions that often do not correlate with established tracking evaluation measures. As these measures are not differentiable,
Wed Mar 15 2017
Computer Vision
Large Margin Object Tracking with Circulant Feature Maps
A large margin object tracking method that absorbs the strong discriminative ability of the structured output SVM and is sped up by the correlation filter algorithm. A multimodal target localization technique is proposed to improve the target localization precision.
Fri Feb 09 2018
Computer Vision
Multiple Target Tracking by Learning Feature Representation and Distance Metric Jointly
This paper proposes a novel affinity model by learning feature representation and distance metric jointly in a unified deep architecture. We design a CNN network to obtain appearance cue tailored towards person Re-ID, and an LSTM network for motion cue to predict target position.