Abstract— The rapid development and commercialization of Unmanned Aerial Vehicle (UAV) technology have made it possible to extract urban traffic information from UAV images. However, the large scale variations of targets, the complex foregrounds and backgrounds of urban scenes, and severe occlusion by trees and shadows pose great challenges for car and road extraction from UAV images. In this study, we propose a lightweight Efficient Dual Contextual Parsing Network (EDCPNet) to address these issues. The proposed EDCP module in EDCPNet is mainly composed of spatial contextual parsing (SCP) and channel contextual parsing (CCP), which effectively acquire rich contextual features in both the spatial and channel dimensions, adaptively recalibrate the attention weights, emphasize the salient features of targets, and suppress irrelevant elements. This leads to improved performance and adaptability, facilitating practical large-scale urban traffic monitoring in complex urban scenes. We conduct experiments on two benchmark datasets (UAVid and UDD), comparing the proposed EDCPNet with six competing methods, i.e., U-Net, PSPNet, DeepLabv3+, SegNet, ESNet, and ERFNet, and validate the effectiveness of the proposed EDCP module via extensive ablation studies. The results show that the proposed network outperforms all competing methods in car and road extraction from UAV images at a balanced computational cost. Its strong performance and low computational demand (only 2.37M model parameters) facilitate deployment on edge computing devices with memory constraints.
Keywords—Unmanned aerial vehicle images, road extraction, attention mechanism, lightweight network
DOI: 10.17148/IJARCCE.2023.125130
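To make the dual spatial/channel recalibration idea described above concrete, the following is a minimal illustrative sketch of a generic dual attention block in PyTorch. It is not the authors' EDCP implementation: all design details (average/max pooling for context aggregation, the reduction ratio, kernel size, and sigmoid gating) are assumptions chosen only to show how channel-wise and spatial-wise attention weights can recalibrate a feature map.

```python
# Illustrative sketch only: a generic dual (spatial + channel) attention block.
# The exact EDCP/SCP/CCP design in the paper may differ; all layer choices
# (pooling, reduction ratio, kernel size, sigmoid gating) are assumptions.
import torch
import torch.nn as nn


class ChannelContextualParsing(nn.Module):
    """Channel branch: squeeze spatial context, recalibrate channel weights."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.fc(self.pool(x))  # per-channel gating


class SpatialContextualParsing(nn.Module):
    """Spatial branch: aggregate channel context, recalibrate pixel weights."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        avg_map = x.mean(dim=1, keepdim=True)      # channel-wise average
        max_map, _ = x.max(dim=1, keepdim=True)    # channel-wise maximum
        attn = self.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return x * attn                            # per-pixel gating


class DualContextualParsingBlock(nn.Module):
    """Applies channel then spatial recalibration to a feature map."""
    def __init__(self, channels: int):
        super().__init__()
        self.ccp = ChannelContextualParsing(channels)
        self.scp = SpatialContextualParsing()

    def forward(self, x):
        return self.scp(self.ccp(x))


if __name__ == "__main__":
    feats = torch.randn(1, 64, 128, 128)                 # dummy UAV feature map
    print(DualContextualParsingBlock(64)(feats).shape)   # torch.Size([1, 64, 128, 128])
```

Such a block is cheap (a few thousand parameters per stage), which is consistent with the lightweight design goal, but the ordering of the branches and their internal layers here are illustrative assumptions only.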