Point Cloud Instance Segmentation with Inaccurate Bounding-Box Annotations
Abstract
1. Introduction
- To the best of our knowledge, this is the first work to simultaneously explore inexact and inaccurate annotations in the point cloud instance segmentation task.
- We propose a novel self-distillation framework, SDPH, that applies consistency regularization and label refurbishment using data perturbation and history information.
- Extensive experiments demonstrate the effectiveness of our method: on ScanNet-v2, SDPH achieves performance comparable to that of densely and accurately supervised methods.
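The two mechanisms named in the contributions above can be illustrated with a toy, per-point sketch. This excerpt does not give the paper's loss functions or update rules, so everything below is an assumption made for illustration: the mean-squared consistency loss, the exponential moving average used as "history", the `momentum` and `threshold` values, and all function names are hypothetical.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def consistency_loss(probs_a, probs_b):
    """Mean squared difference between predictions for two perturbed views."""
    return sum((a - b) ** 2 for a, b in zip(probs_a, probs_b)) / len(probs_a)

def update_history(history, probs, momentum=0.9):
    """Exponential moving average of past predictions (assumed form of 'history')."""
    return [momentum * h + (1.0 - momentum) * p for h, p in zip(history, probs)]

def refurbish_label(noisy_label, history, threshold=0.8):
    """Swap the label only when the history confidently favours another class."""
    best = max(range(len(history)), key=lambda c: history[c])
    if best != noisy_label and history[best] >= threshold:
        return best
    return noisy_label

rng = random.Random(0)
clean_logits = [0.0, 3.0]  # hypothetical network output favouring class 1
noisy_label = 0            # hypothetical wrong label from an inaccurate box
history = [0.5, 0.5]       # uninformative starting history
losses = []

for epoch in range(20):
    # Two independently perturbed views of the same point (assumed Gaussian jitter).
    view_a = softmax([x + rng.gauss(0.0, 0.1) for x in clean_logits])
    view_b = softmax([x + rng.gauss(0.0, 0.1) for x in clean_logits])
    losses.append(consistency_loss(view_a, view_b))
    history = update_history(history, view_a)

refurbished = refurbish_label(noisy_label, history)
print("refurbished label:", refurbished)
```

In this toy run the accumulated history ends up favouring class 1 strongly enough to override the noisy label, while a label whose history stays below the threshold is left unchanged.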
2. Related Works
2.1. Point Cloud Instance Segmentation
2.1.1. Proposal-Based Methods
2.1.2. Proposal-Free Methods
2.2. Weakly Supervised Point Cloud Segmentation
3. Our Method
3.1. Overview
3.2. Pseudo-Label Generation
3.3. Point Cloud Instance Segmentation Network
3.4. Self-Distillation Based on Perturbation and History
3.4.1. Perturbation-Based Consistency Regularization
3.4.2. History-Guided Label Refurbishment
3.4.3. Temporal Consistency Regularization
3.5. Total Loss
4. Experiments
4.1. Experimental Settings
4.1.1. Dataset
4.1.2. Evaluation Metrics
4.1.3. Implementation Details
4.2. Instance Segmentation Results
4.3. Ablation Study
4.4. Analysis of Label Refurbishment
4.5. Complexity Analysis
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- El Madawi, K.; Rashed, H.; El Sallab, A.; Nasr, O.; Kamel, H.; Yogamani, S. RGB and LiDAR fusion based 3D Semantic Segmentation for Autonomous Driving. In Proceedings of the 2019 IEEE Intelligent Transportation Systems Conference (ITSC), Auckland, New Zealand, 27–30 October 2019; pp. 7–12.
- Yan, Z.; Duckett, T.; Bellotto, N. Online learning for 3D LiDAR-based human detection: Experimental analysis of point cloud clustering and classification methods. Auton. Robot. 2020, 44, 147–164.
- Zhao, Y.; Zhang, L.; Liu, Y.; Meng, D.; Cui, Z.; Gao, C.; Gao, X.; Lian, C.; Shen, D. Two-Stream Graph Convolutional Network for Intra-Oral Scanner Image Segmentation. IEEE Trans. Med. Imaging 2022, 41, 826–835.
- Guo, Y.; Wang, H.; Hu, Q.; Liu, H.; Liu, L.; Bennamoun, M. Deep Learning for 3D Point Clouds: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 4338–4364.
- Lu, H.; Shi, H. Deep Learning for 3D Point Cloud Understanding: A Survey. arXiv […]
- […] for 3D Instance Segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, 13–19 June 2020.
- Vu, T.; Kim, K.; Luu, T.M.; Nguyen, T.; Yoo, C.D. SoftGroup for 3D Instance Segmentation on Point Clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 19–20 June 2022; pp. 2708–2717.
- Zhou, Z.H. A brief introduction to weakly supervised learning. Natl. Sci. Rev. 2018, 5, 44–53.
- Zhang, Y.; Li, Z.; […]
- Reed, S.; Lee, H.; Anguelov, D.; Szegedy, C.; Erhan, D.; Rabinovich, A. […] arXiv 2014.
- Song, H.; Kim, M.; Lee, J.G. SELFIE: Refurbishing Unclean Samples for Robust Deep Learning. In Machine Learning Research, Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; Chaudhuri, K., Salakhutdinov, R., Eds.; PMLR: San Diego, CA, USA, 2019; Volume 97, pp. 5907–5915.
- Nguyen, T.; Mummadi, C.K.; Ngo, T.P.N.; Nguyen, T.H.P.; Beggel, L.; Brox, T. SELF: Learning to filter noisy labels with self-ensembling. In Proceedings of the International Conference on Learning Representations (ICLR), Virtual, 26–30 April 2020.
- Lahoud, J.; Ghanem, B.; Pollefeys, M.; Oswald, M.R. 3D Instance Segmentation via Multi-Task Metric Learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019.
Method | Metric | 0% | 10% | 20% | 30% | 40% | 50% | 60% |
---|---|---|---|---|---|---|---|---|
Box2Mask [20] | AP | 39.1 | 37.5 | 36.3 | 36.3 | 35.2 | 33.6 | 32.0 |
Box2Mask [20] | AP50 | 59.7 | 57.5 | 55.8 | 55.4 | 53.3 | 50.4 | 46.7 |
Box2Mask [20] | AP25 | 71.8 | 69.8 | 68.8 | 67.3 | 65.8 | 62.6 | 58.2 |
SDPH | AP | 40.1 | 41.2 | 40.8 | 40.0 | 40.4 | 37.6 | 36.5 |
SDPH | AP50 | 60.4 | 60.4 | 60.3 | 58.7 | 58.6 | 55.1 | 52.5 |
SDPH | AP25 | 73.0 | 72.1 | 71.7 | 70.7 | 69.0 | 65.4 | 61.9 |
Improvements | AP | 1.0 | 3.7 | 4.5 | 3.7 | 5.2 | 4.0 | 4.5 |
Improvements | AP50 | 0.7 | 2.9 | 4.5 | 3.3 | 5.3 | 4.7 | 5.8 |
Improvements | AP25 | 1.2 | 2.3 | 2.9 | 3.4 | 3.2 | 2.8 | 3.7 |
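As a quick arithmetic check, the "Improvements" rows in the table above can be recomputed as the element-wise difference between the SDPH and Box2Mask rows. The snippet below is only a sketch for the reader; the values are copied from the table, and the variable names are illustrative. Rows are kept in table order rather than keyed by metric name.

```python
# Values copied from the table above, one list per metric row,
# ordered by the column headers 0% ... 60%.
box2mask = [
    [39.1, 37.5, 36.3, 36.3, 35.2, 33.6, 32.0],
    [59.7, 57.5, 55.8, 55.4, 53.3, 50.4, 46.7],
    [71.8, 69.8, 68.8, 67.3, 65.8, 62.6, 58.2],
]
sdph = [
    [40.1, 41.2, 40.8, 40.0, 40.4, 37.6, 36.5],
    [60.4, 60.4, 60.3, 58.7, 58.6, 55.1, 52.5],
    [73.0, 72.1, 71.7, 70.7, 69.0, 65.4, 61.9],
]

# Element-wise difference, rounded to the one decimal place used in the table.
improvements = [
    [round(s - b, 1) for s, b in zip(s_row, b_row)]
    for s_row, b_row in zip(sdph, box2mask)
]

for row in improvements:
    print(row)
```

The recomputed differences agree with the three "Improvements" rows as reported.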
Setting | Method | Avg. | Bathtub | Bed | Bookshelf | Cabinet | Chair | Counter | Curtain | Desk | Door | Other furniture | Picture | Refrigerator | Shower curtain | Sink | Sofa | Table | Toilet | Window |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Full | SegCluster [35] | 13.4 | 16.4 | 13.5 | 11.7 | 11.8 | 18.9 | 13.7 | 12.4 | 12.2 | 11.1 | 12.0 | 0.0 | 11.2 | 18.0 | 18.9 | 14.6 | 13.8 | 19.5 | 11.5 |
Full | SGPN [11] | 22.2 | 0.0 | 31.5 | 13.6 | 20.7 | 31.6 | 17.4 | 22.2 | 14.1 | 16.6 | 18.6 | 0.0 | 0.0 | 0.0 | 52.4 | 40.6 | 31.9 | 72.9 | 15.3 |
Full | 3D-SIS [35] | 35.7 | 57.6 | 66.3 | 16.9 | 32.0 | 65.3 | 22.1 | 22.6 | 35.1 | 26.7 | 21.1 | 0.0 | 28.6 | 37.2 | 39.6 | 56.4 | 29.4 | 74.9 | 10.1 |
Full | MTML [52] | 55.4 | 79.4 | 80.6 | 45.3 | 34.6 | 87.7 | 9.7 | 54.2 | 49.9 | 45.8 | 33.5 | 19.8 | 44.1 | 74.9 | 44.5 | 80.3 | 67.4 | 98.0 | 47.2 |
Full | PointGroup [38] | 71.3 | 86.5 | 79.5 | 74.4 | 67.3 | 92.5 | 64.8 | 61.6 | 74.1 | 54.8 | 65.4 | 48.2 | 38.3 | 71.1 | 82.8 | 85.1 | 74.2 | 100 | 63.6 |
Full | 3D-MPA [9] | 72.4 | 90.3 | 83.4 | 78.3 | 69.9 | 87.6 | 62.5 | 66.0 | 69.2 | 56.6 | 48.6 | 48.0 | 61.4 | 93.1 | 75.2 | 76.1 | 74.8 | 99.2 | 62.2 |
Weak | SPIB [22] | 61.4 | 87.4 | 86.8 | 48.8 | 45.4 | 89.0 | 49.6 | 47.8 | 52.3 | 49.2 | 45.5 | 9.9 | 48.3 | 82.6 | 63.2 | 88.1 | 66.2 | 95.9 | 41.9 |
Weak | Box2Mask [20] | 71.8 | 87.1 | 83.8 | 68.2 | 59.5 | 94.5 | 58.5 | 65.1 | 78.6 | 59.8 | 67.1 | 45.6 | 46.9 | 77.4 | 79.5 | 87.0 | 75.5 | 96.9 | 61.4 |
Weak | SDPH | 73.0 | 87.1 | 82.6 | 73.6 | 62.1 | 95.2 | 63.0 | 61.5 | 85.5 | 61.1 | 63.1 | 43.5 | 46.7 | 82.0 | 85.4 | 86.3 | 78.2 | 98.3 | 59.3 |
Enabled modules (of PCR, HLR, TCR) | AP | AP50 | AP25 |
---|---|---|---|
None | 35.2 | 53.3 | 65.8 |
One | 37.1 | 53.7 | 65.1 |
One | 37.6 | 55.4 | 66.6 |
One | 37.8 | 56.7 | 67.8 |
Two | 39.5 | 58.1 | 67.9 |
Two | 37.1 | 54.8 | 65.6 |
Two | 39.5 | 57.4 | 68.8 |
All three | 40.4 | 58.6 | 69.0 |
Method | Training Time (ms) | Inference Time (ms) |
---|---|---|
Box2Mask [20] | 444 | 1044 |
SDPH | 722 | 1026 |
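To put the timings above in relative terms, the short sketch below computes the ratio of SDPH to Box2Mask for each column. The numbers are copied from the table; what a single timing covers (for example, one iteration or one scene) is not stated in this excerpt, so only the ratios are interpreted.

```python
# Timings in milliseconds, copied from the table above.
train_ms = {"Box2Mask": 444, "SDPH": 722}
infer_ms = {"Box2Mask": 1044, "SDPH": 1026}

# Ratio of SDPH to Box2Mask; values above 1 mean SDPH is slower.
train_ratio = train_ms["SDPH"] / train_ms["Box2Mask"]
infer_ratio = infer_ms["SDPH"] / infer_ms["Box2Mask"]

print(f"training: {train_ratio:.2f}x, inference: {infer_ratio:.2f}x")
```

SDPH's training time is roughly 1.6 times that of Box2Mask, while its inference time is essentially the same.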
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Peng, Y.; Feng, H.; Chen, T.; Hu, B. Point Cloud Instance Segmentation with Inaccurate Bounding-Box Annotations. Sensors 2023, 23, 2343. https://doi.org/10.3390/s23042343