DSA-SOLO: Double Split Attention SOLO for Side-Scan Sonar Target Segmentation
Abstract
1. Introduction
- (1) We propose a novel model, DSA-SOLO, for instance segmentation of side-scan sonar images, and demonstrate experimentally that it segments objects in such images effectively.
- (2) We propose a DSA module that fuses spatial and channel attention to extract target features. This module improves segmentation accuracy with almost no loss of inference speed.
- (3) Experiments on the SCTD dataset [21] show that the proposed DSA-SOLO outperforms existing instance segmentation methods.
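
To make the idea of fusing channel and spatial attention in contribution (2) concrete, the following is a minimal, parameter-free sketch in plain Python. It is not the DSA module itself: the real module contains learned layers whose details are not reproduced here. The function names (`channel_gate`, `spatial_gate`, `fuse_both_orders`) and the choice to average the two application orders are assumptions made only for illustration.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_gate(x):
    """Toy channel attention: scale each channel by the sigmoid of its global mean.

    x is a feature map stored as x[channel][row][col] (nested lists).
    """
    out = []
    for ch in x:
        n = sum(len(row) for row in ch)
        weight = sigmoid(sum(sum(row) for row in ch) / n)
        out.append([[weight * v for v in row] for row in ch])
    return out


def spatial_gate(x):
    """Toy spatial attention: scale each pixel by the sigmoid of its mean over channels."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    gate = [[sigmoid(sum(x[k][i][j] for k in range(c)) / c) for j in range(w)]
            for i in range(h)]
    return [[[gate[i][j] * x[k][i][j] for j in range(w)] for i in range(h)]
            for k in range(c)]


def fuse_both_orders(x):
    """Assumed fusion: average channel-then-spatial and spatial-then-channel outputs."""
    cs = spatial_gate(channel_gate(x))
    sc = channel_gate(spatial_gate(x))
    return [[[0.5 * (a + b) for a, b in zip(row_a, row_b)]
             for row_a, row_b in zip(ch_a, ch_b)]
            for ch_a, ch_b in zip(cs, sc)]


# Two channels of size 2 x 2: one with signal, one empty.
features = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
fused = fuse_both_orders(features)
print(fused[0])
```

Both gates only rescale the input, so the output keeps the input shape and an all-zero channel stays zero. A learned version would replace the fixed sigmoid-of-mean gates with small trainable layers.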
2. Literature Review
2.1. Instance Segmentation for Sonar Images
2.2. Attention Mechanisms
3. Methods
3.1. DSA-SOLO
3.2. Double Split Attention (DSA) Module
3.2.1. Channel Attention
3.2.2. Spatial Attention
3.3. Loss Function
4. Experimental Results
4.1. Dataset
4.2. Implementation Detail and Evaluation Indexes
4.3. Comparative Experiments
4.4. Ablation Experiments
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Guo, Y.; Wei, L.; Xu, X. A sonar image segmentation algorithm based on quantum-inspired particle swarm optimization and fuzzy clustering. Neural Comput. Appl. 2018, 32, 16775–16782.
- Huo, G.; Yang, S.; Li, Q.; Zhou, Y. A Robust and Fast Method for Sidescan Sonar Image Segmentation Using Nonlocal Despeckling and Active Contour Model. IEEE Trans. Cybern. 2017, 47, 855–872.
- Steele, S.; Ejdrygiewicz, J.; Dillon, J. Automated Synthetic Aperture Sonar Image Segmentation using Spatially Coherent Clustering. In Proceedings of the OCEANS 2021: San Diego - Porto, San Diego, CA, USA, 20–23 September 2021.
- Chabane, A.N.; Islam, N.; Zerr, B. Incremental clustering of sonar images using self-organizing maps combined with fuzzy adaptive resonance theory. Ocean Eng. 2017, 142, 133–144.
- Liu, Y.; Li, Q.; Huo, G. Robust and fast-converging level set method for side-scan sonar image segmentation. J. Electron. Imaging 2017, 26, 063021.
- Imen, K.; Fablet, R.; Boucher, J.M.; Augustin, J.M. Region-based and incidence angle dependent segmentation of seabed sonar images using a level set approach combined to local texture statistics. In Proceedings of the OCEANS 2006 - Asia Pacific, Singapore, 16–19 May 2006.
- Wang, L.; Ye, X.; Wang, G.; Wang, L. A Fast Hierarchical MRF Sonar Image Segmentation Algorithm. Int. J. Robot. Autom. 2017, 32, 48–54.
- Li, J.; Jiang, P.; Zhu, H. A Local Region-Based Level Set Method With Markov Random Field for Side-Scan Sonar Image Multi-Level Segmentation. IEEE Sens. J. 2021, 21, 510–519.
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv […]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Identity Mappings in Deep Residual Networks. arXiv 2016.
- […] Networks for Instance Segmentation. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017.
- Gao, N.; Shan, Y.; Wang, Y.; Zhao, X.; Huang, K. SSAP: Single-Shot Instance Segmentation With Affinity Pyramid. IEEE Trans. Circuits Syst. Video Technol. 2021, 31, 661–673.
- Bolya, D.; Zhou, C.; Xiao, F.; Lee, Y.J. YOLACT: Real-time Instance Segmentation. arXiv 2019, arXiv:1904.02689v2.
- Xie, E.; Sun, P.; Song, X.; Wang, W.; Liang, D.; Shen, C.; Luo, P. PolarMask: Single Shot Instance Segmentation with Polar Representation. arXiv 2020, arXiv:1909.13226v4.
- Wang, X.; Zhang, R.; Shen, C.; Kong, T.; Li, L. SOLO: A Simple Framework for Instance Segmentation. arXiv 2021, arXiv:2106.15974v1.
- Xu, F.; Huang, H.; Wu, J.; Jiang, L. Active Mask-Box Scoring R-CNN for Sonar Image Instance Segmentation. Electronics 2022, 11, 2048.
- Fan, Z.; Xia, W.; Liu, X.; Li, H. Detection and segmentation of underwater objects from forward-looking sonar based on a modified Mask RCNN. Signal Image Video Process. 2021, 15, 1135–1143.
- Kessel, R.T. Using sonar speckle to identify regions of interest and for mine detection. Proc. Detect. Remediat. Technol. Mines Minelike Targets 2002, 4742, 440–451.
- Jaderberg, M.; Simonyan, K.; Zisserman, A.; Kavukcuoglu, K. Spatial transformer networks. arXiv 2015, arXiv:1506.02025.
- Almahairi, A.; Ballas, N.; Cooijmans, T.; Zheng, Y.; Larochelle, H.; Courville, A. Dynamic Capacity Networks. arXiv 2015, arXiv:1511.07838v7.
- Hu, J.; Shen, L.; Albanie, S.; Sun, G.; Wu, E. Squeeze-and-Excitation Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 2011–2023.
- Park, J.; Woo, S.; Lee, J.; Kweon, I.S. BAM: Bottleneck Attention Module. arXiv 2018, arXiv:1807.06514v2.
- Lin, T.-Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. arXiv 2016, arXiv:1612.03144v2.
- Wu, Y.; He, K. Group Normalization. arXiv 2018, arXiv:1803.08494.
- Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. Proc. Int. Conf. Mach. Learn. 2015, 37, 448–456.
- Ba, J.L.; Kiros, J.R.; Hinton, G.E. Layer normalization. arXiv 2016, arXiv:1607.06450.
- Ulyanov, D.; Vedaldi, A.; Lempitsky, V. Instance normalization: The missing ingredient for fast stylization. arXiv 2016, arXiv:1607.08022.
- Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 318–327.
- Fu, J.; Liu, J.; Tian, H.; Li, Y.; Bao, Y.; Fang, Z.; Lu, H. Dual Attention Network for Scene Segmentation. arXiv 2019, arXiv:1809.02983v4.

Stage | Output Size | Backbone |
---|---|---|
Conv1 | 112 × 112 | 7 × 7 conv, 64 |
Conv2 | 56 × 56 | 3 × 3 max pooling |
Conv3 | 28 × 28 | |
Conv4 | 14 × 14 | |
Conv5 | 7 × 7 | |

Label | Train | Val |
---|---|---|
ship | 295 | 83 |
plane | 72 | 18 |
body | 37 | 9 |
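
The split in the dataset table above is strongly imbalanced across classes. As a small worked example, the snippet below recomputes the totals, the share of each class, and the validation fraction directly from the table. The variable names are illustrative, and the table does not say whether the counts are images or object instances, so they are treated simply as counts.

```python
# (train, val) counts copied from the dataset table.
counts = {"ship": (295, 83), "plane": (72, 18), "body": (37, 9)}

train_total = sum(train for train, _ in counts.values())  # 404
val_total = sum(val for _, val in counts.values())        # 110
total = train_total + val_total                            # 514

# Share of each class in the whole dataset, and the validation fraction.
class_share = {name: (train + val) / total for name, (train, val) in counts.items()}
val_fraction = val_total / total

for name, share in class_share.items():
    print(f"{name}: {share:.1%}")
print(f"validation fraction: {val_fraction:.1%}")
```

Ships make up roughly three quarters of the samples and the "body" class under one tenth, which is worth keeping in mind when reading the per-model mAP figures.
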

Model | mAP.5:.95 | mAP.5 | mAP.75 | FPS |
---|---|---|---|---|
Mask R-CNN | 37.8% | 71.8% | 31.6% | 7.9 |
YOLACT | 33.6% | 65.7% | 15.3% | 13.13 |
PolarMask | 34.5% | 68.6% | 17.6% | 11.97 |
SOLOv2 | 40.0% | 73.3% | 35.8% | 18.64 |
DSA-SOLO | 42.7% | 78.4% | 43.2% | 18.14 |
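
To read the comparison table above as differences rather than absolute values, the following sketch computes the gain of DSA-SOLO over each baseline in percentage points (and in FPS for the last column). The numbers are copied from the table; the helper name `gain_over` is hypothetical.

```python
# (mAP.5:.95, mAP.5, mAP.75, FPS) per model, copied from the comparison table.
results = {
    "Mask R-CNN": (37.8, 71.8, 31.6, 7.9),
    "YOLACT": (33.6, 65.7, 15.3, 13.13),
    "PolarMask": (34.5, 68.6, 17.6, 11.97),
    "SOLOv2": (40.0, 73.3, 35.8, 18.64),
    "DSA-SOLO": (42.7, 78.4, 43.2, 18.14),
}


def gain_over(baseline, model="DSA-SOLO"):
    """Return model minus baseline for each column, rounded to two decimals."""
    return tuple(round(m - b, 2) for m, b in zip(results[model], results[baseline]))


for name in results:
    if name != "DSA-SOLO":
        print(name, gain_over(name))
```

Against SOLOv2, for instance, this gives +2.7, +5.1 and +7.4 points on the three mAP columns at a cost of 0.5 FPS.
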

Attention Module | mAP.5:.95 | mAP.5 | mAP.75 | FPS |
---|---|---|---|---|
SENet | 37.8% | 72.6% | 31.8% | 18.89 |
STN | 36.9% | 73.5% | 32.8% | 18.35 |
CBAM | 39.2% | 74.3% | 39.8% | 19.04 |
DANet | 38.1% | 73.8% | 31.6% | 19.02 |
DSA | 42.7% | 78.4% | 43.2% | 18.14 |

Model | mAP.5:.95 | mAP.5 | mAP.75 | FPS |
---|---|---|---|---|
SOLOv2 | 40.0% | 73.3% | 35.8% | 18.64 |
C-S Unit Only | 40.2% | 75.2% | 35.7% | 19.02 |
S-C Unit Only | 41.9% | 75.9% | 44.3% | 18.39 |
DSA | 42.7% | 78.4% | 43.2% | 18.14 |
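
The mAP columns in the tables above are indexed by IoU threshold. Assuming they follow the usual COCO convention (an assumption, since the evaluation protocol is not spelled out here), mAP.5 and mAP.75 use a single mask-IoU threshold, and mAP.5:.95 averages over ten thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below shows only the mask IoU and the threshold test, not a full mAP computation; all names are illustrative.

```python
def mask_iou(pred, gt):
    """Intersection over union of two binary masks given as nested lists of 0/1."""
    inter = union = 0
    for row_p, row_g in zip(pred, gt):
        for p, g in zip(row_p, row_g):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    return inter / union if union else 0.0


# COCO-style thresholds 0.50, 0.55, ..., 0.95 (built from integers to avoid drift).
IOU_THRESHOLDS = [(50 + 5 * i) / 100 for i in range(10)]


def matched_thresholds(iou):
    """Thresholds at which a prediction with this IoU counts as a true positive."""
    return [t for t in IOU_THRESHOLDS if iou >= t]


gt = [[1, 1, 0],
      [1, 1, 0],
      [0, 0, 0]]
pred = [[1, 1, 1],
        [1, 1, 0],
        [0, 0, 0]]

iou = mask_iou(pred, gt)  # 4 shared pixels / 5 pixels in the union
print(iou, matched_thresholds(iou))
```

A prediction like this one counts as correct for mAP.5 and mAP.75 but fails the strictest thresholds, which is why mAP.5:.95 is always the lowest of the three columns.
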
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Huang, H.; Zuo, Z.; Sun, B.; Wu, P.; Zhang, J. DSA-SOLO: Double Split Attention SOLO for Side-Scan Sonar Target Segmentation. Appl. Sci. 2022, 12, 9365. https://doi.org/10.3390/app12189365