Published on Mon Aug 26 2019

SPGNet: Semantic Prediction Guidance for Scene Parsing

Bowen Cheng, Liang-Chieh Chen, Yunchao Wei, Yukun Zhu, Zilong Huang, Jinjun Xiong, Thomas Huang, Wen-Mei Hwu, Honghui Shi

Multi-scale context modules and single-stage encoder-decoder structures are commonly employed for semantic segmentation. The Semantic Prediction Guidance (SPG) module learns to re-weight local features under the guidance of pixel-wise semantic predictions.

Abstract

Multi-scale context modules and single-stage encoder-decoder structures are commonly employed for semantic segmentation. The multi-scale context module refers to operations that aggregate feature responses from a large spatial extent, while the single-stage encoder-decoder structure encodes high-level semantic information in the encoder path and recovers boundary information in the decoder path. In contrast, multi-stage encoder-decoder networks have been widely used in human pose estimation and show superior performance to their single-stage counterparts. However, few attempts have been made to bring this effective design to semantic segmentation. In this work, we propose a Semantic Prediction Guidance (SPG) module, which learns to re-weight local features under the guidance of pixel-wise semantic predictions. We find that by carefully re-weighting features across stages, a two-stage encoder-decoder network coupled with our proposed SPG module can significantly outperform its one-stage counterpart with similar parameters and computations. Finally, we report experimental results on the semantic segmentation benchmark Cityscapes, on which our SPGNet attains 81.1% on the test set using only 'fine' annotations.
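The abstract describes the core SPG idea only at a high level: local features are re-weighted using pixel-wise semantic predictions as guidance. The paper's exact formulation is not given here, so the following is a minimal numpy sketch of the general mechanism under one plausible assumption: the per-pixel prediction confidence (max softmax probability) is used as a spatial attention map. The function name `spg_reweight` and this particular reduction are illustrative, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=0):
    # Numerically stable softmax along the class axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spg_reweight(features, logits):
    """Re-weight local features using pixel-wise semantic predictions.

    features: (C, H, W) feature map from one stage of the encoder-decoder
    logits:   (K, H, W) per-pixel class scores over K classes

    Hypothetical reduction of the SPG idea: the per-pixel confidence of
    the semantic prediction becomes a spatial attention map applied to
    every feature channel.
    """
    probs = softmax(logits, axis=0)            # (K, H, W) class probabilities
    confidence = probs.max(axis=0)             # (H, W) per-pixel confidence
    return features * confidence[None, :, :]   # broadcast over channels

rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 4, 4))
logits = rng.standard_normal((19, 4, 4))       # 19 classes, as in Cityscapes
out = spg_reweight(feats, logits)
print(out.shape)  # (8, 4, 4)
```

Since the confidence map lies in (0, 1], the re-weighting can only attenuate features; a learned variant would typically pass the guidance through trainable layers instead of using raw confidence.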

Tue Apr 06 2021
Computer Vision
DCANet: Dense Context-Aware Network for Semantic Segmentation
Thu Jul 29 2021
Computer Vision
A Unified Efficient Pyramid Transformer for Semantic Segmentation
Semantic segmentation is a challenging problem due to difficulties in modeling context in complex scenes and class confusion along boundaries. We advocate a unified framework (UN-EPT) that segments objects by considering both context information and boundary artifacts.
Tue Mar 05 2019
Computer Vision
Decoders Matter for Semantic Segmentation: Data-Dependent Decoding Enables Flexible Feature Aggregation
DUpsampling is a data-dependent upsampling method that replaces bilinear upsampling. It takes advantage of the redundancy in the label space of semantic segmentation and is able to recover pixel-wise predictions from the low-resolution outputs of CNNs.
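The mechanism behind data-dependent upsampling can be sketched briefly: instead of bilinear interpolation, each low-resolution vector is linearly projected to an r x r patch of class logits by a matrix learned from the label space. This numpy sketch only shows the reconstruction step; in the actual method the matrix would be learned (e.g. by compressing ground-truth label patches), and the function name `dupsample` is illustrative.

```python
import numpy as np

def dupsample(low_res, W, r, num_classes):
    """Sketch of the DUpsampling reconstruction step.

    low_res: (h, w, C) low-resolution output of the CNN
    W:       (r*r*num_classes, C) linear reconstruction matrix
             (learned from label-space redundancy in the real method)

    Each low-res vector is projected to an r x r patch of class logits,
    and the patches are tiled into a full-resolution prediction.
    """
    h, w, C = low_res.shape
    patches = low_res.reshape(h * w, C) @ W.T             # (h*w, r*r*K)
    patches = patches.reshape(h, w, r, r, num_classes)    # per-pixel patches
    # Interleave patch rows/cols into a (h*r, w*r, K) logit map.
    return patches.transpose(0, 2, 1, 3, 4).reshape(h * r, w * r, num_classes)

rng = np.random.default_rng(1)
low = rng.standard_normal((2, 2, 32))
W = rng.standard_normal((4 * 4 * 19, 32))   # r=4, 19 classes (illustrative)
hi = dupsample(low, W, r=4, num_classes=19)
print(hi.shape)  # (8, 8, 19)
```

Because the projection is a single matrix multiply per pixel, this decoder stays cheap while letting the upsampling adapt to the structure of the label space rather than assuming smoothness, as bilinear interpolation does.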
Fri Jun 29 2018
Computer Vision
Gated Feedback Refinement Network for Coarse-to-Fine Dense Semantic Image Labeling
We develop two encoder-decoder based deep learning architectures to address this problem. We propose a Label Refinement Network (LRN) that predicts segmentation labels in a coarse-to-fine fashion at several spatial resolutions, and a Gated Feedback Refinement Network that addresses its limitations.
Fri Jul 30 2021
Computer Vision
Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation
Semantic segmentation requires per-pixel prediction for a given image. Most previous methods employ upsampling decoders to recover the spatial resolution. Here, we propose a novel decoder, termed the dynamic neural representational decoder (NRD).
Wed Apr 03 2019
Computer Vision
GFF: Gated Fully Fusion for Semantic Segmentation
Semantic segmentation generates comprehensive understanding of scenes through densely predicting the category for each pixel. High-level features from Deep Convolutional Neural Networks already demonstrate their effectiveness. We propose a new architecture to selectively fuse features from multiple levels.