Pixel-Reasoning-Based Robotics Fine Grasping for Novel Objects with Deep EDINet Structure
Abstract
1. Introduction
- We propose a fine grasping representation model that generates parallel-jaw gripper configurations, which effectively avoids collisions among cluttered objects. In addition, the adaptive grasping width suits both deformable and rigid objects during the grasping process;
- We propose the EDINet network to generate pixel-level gripper configurations, which avoids missing potential ground-truth grasp poses and reduces computation time. EDINet meets real-time requirements, running within 25 ms, and strikes a good balance between the speed and accuracy of grasp reasoning;
- Owing to its network structure, our system achieves superior performance on the Cornell grasp dataset, and it is proven effective for novel objects in cluttered scenes. In real robot grasping, our method achieves an average grasp success rate of 97.2% in single-object scenes and 93.7% in cluttered scenes, outperforming state-of-the-art algorithms in real applications;
- Our network uses RGB-D multi-modal data to enhance the diversity and saliency of features, which makes the model easier to train and effectively improves the accuracy and success rate of grasp detection.
2. Related Work
2.1. Robotic Grasping
2.2. Grasping Representation
2.3. Network for Grasping
3. Robot Grasp Representation
4. Proposed Methods
4.1. The Robotics Grasping System
4.2. The EDINet Architecture
4.3. Grasping Training
- Grasp confidence: We treat grasp confidence as a binary label and express it as a score between 0 and 1; the closer the score is to 1, the higher the probability of a successful grasp.
- Grasp width: To achieve depth invariance, we set the grasp width $w$ in the range $[0, w_{\max}]$, where $w_{\max}$ is the maximum opening width of the gripper. During training, we first scale the width to $[0, 1]$, and then use the camera parameters and the measured depth to recover the physical grasp width.
- Grasp angle: Encoding the grasp rectangle angle $\theta$ as the vector components $\sin(2\theta)$ and $\cos(2\theta)$ on the unit circle produces values in the range $[-1, 1]$ and eliminates the discontinuity when the angle wraps around $\pm\pi/2$. We recover the grasp angle as $\theta = \frac{1}{2}\arctan\frac{\sin(2\theta)}{\cos(2\theta)}$ (a minimal code sketch of this encoding and decoding follows this list).
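To make these three targets concrete, the following is a minimal sketch of the per-pixel encoding and decoding described above, assuming the $\sin(2\theta)/\cos(2\theta)$ angle encoding and a pinhole camera model for the width conversion; the names `encode_targets`, `decode_targets`, `pixel_width_to_meters`, and the constant `W_MAX` are illustrative, not from the paper.

```python
import numpy as np

W_MAX = 150.0  # assumed maximum gripper opening, in pixels

def encode_targets(width_px, theta):
    """Map raw labels to training targets: width scaled to [0, 1],
    angle encoded as sin(2*theta), cos(2*theta), each in [-1, 1]."""
    w = np.clip(width_px, 0.0, W_MAX) / W_MAX
    return w, np.sin(2.0 * theta), np.cos(2.0 * theta)

def decode_targets(w, sin2t, cos2t):
    """Invert the encoding: recover pixel width and grasp angle."""
    width_px = w * W_MAX
    theta = 0.5 * np.arctan2(sin2t, cos2t)  # continuous in (-pi/2, pi/2]
    return width_px, theta

def pixel_width_to_meters(width_px, depth_m, fx):
    """Pinhole-model conversion of a pixel width at a measured depth
    to a metric gripper width (fx: focal length in pixels)."""
    return width_px * depth_m / fx
```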
4.4. Loss Function
4.5. Pixel-Level Grasping Detection
5. Implementation Details
5.1. Training Dataset
5.2. Metrics for Grasp Detection
A predicted grasp is considered correct if both of the following conditions hold:
- (1) The rotation angle difference between the predicted grasp rectangle and the ground-truth rectangle is less than 30°;
- (2) The Jaccard index between the predicted grasp rectangle and the ground-truth rectangle is more than 0.25, where the Jaccard index is defined as $J(G, \hat{G}) = \frac{|G \cap \hat{G}|}{|G \cup \hat{G}|}$, with $G$ the ground-truth rectangle and $\hat{G}$ the predicted rectangle (a small code sketch of this metric follows below).
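As a concrete reading of this rectangle metric, the sketch below checks both conditions for rotated grasp rectangles. It uses the `shapely` library for the polygon intersection and union; the rectangle parameterization (center, width, height, angle) and the function names are assumptions for illustration, not the paper's implementation.

```python
import numpy as np
from shapely.geometry import Polygon  # polygon intersection/union for the Jaccard index

def rect_polygon(cx, cy, w, h, theta):
    """Corner polygon of a grasp rectangle given center, size, and rotation."""
    c, s = np.cos(theta), np.sin(theta)
    corners = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return Polygon([(cx + c * x - s * y, cy + s * x + c * y) for x, y in corners])

def is_correct_grasp(pred, gt, angle_tol_deg=30.0, jaccard_thresh=0.25):
    """A grasp counts as correct if the angle difference is under 30 degrees
    and the Jaccard index J = |A∩B| / |A∪B| exceeds 0.25."""
    # Angle difference, accounting for the pi-periodicity of the rectangle.
    d = abs(pred["theta"] - gt["theta"]) % np.pi
    if min(d, np.pi - d) >= np.deg2rad(angle_tol_deg):
        return False
    a = rect_polygon(pred["cx"], pred["cy"], pred["w"], pred["h"], pred["theta"])
    b = rect_polygon(gt["cx"], gt["cy"], gt["w"], gt["h"], gt["theta"])
    jaccard = a.intersection(b).area / a.union(b).area
    return jaccard > jaccard_thresh
```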
5.3. Test in Datasets
6. Results and Analysis
6.1. Ablation Experiment on Network
6.2. Test Results on the Cornell Grasp Dataset
6.3. Test Results on the Jacquard Dataset
7. Robot Fine Grasping
7.1. Adaptive Closing Width Test
7.2. Grasping with Adaptive Opening Test
8. Unknown Objects Grasping
8.1. Single-Target Grasping Test
8.2. Cluttered Grasping Test
9. Conclusions
Author Contributions
Funding
Baseline | Encoder–Decoder | Inception Module | Up-Sampling Module | IW (%) | OW (%)
---|---|---|---|---|---
√ | | | | 94.9 | 94.7
√ | √ | | | 96.2 | 95.9
√ | √ | √ | | 98.3 | 97.3
√ | √ | √ | √ | 98.9 | 97.7

(IW: image-wise split; OW: object-wise split.)
Authors | Algorithm | Accuracy IW (%) | Accuracy OW (%) | Speed (ms)
---|---|---|---|---
Wang et al. [21] | DDNet | 96.1 | 95.5 | -
Yu et al. [22] | TsGNet | 93.13 | 92.99 | -
Yu et al. [26] | SE-ResUNet | 98.2 | 97.1 | 25
Park et al. [32] | DNNs | 97.7 | 96.1 | 7
Song et al. [13] | RPN | 96.2 | 95.6 | -
Asif et al. [35] | DGDG | 97.5 | - | 111
Kumra et al. [36] | ResNet-50x2 | 89.2 | 88.9 | 103
Morrison et al. [38] | GG-CNN | 73 | 69 | 19
Ainetter et al. [39] | Det_Seg_refine | 98.2 | - | 32
Cao et al. [41] | RSEN | 96.4 | - | -
Chen et al. [42] | FCN | 82.8 | 81.9 | -
Zhou et al. [43] | FCGN, ResNet-101 | 97.7 | 96.6 | 117
Shao et al. [44] | SAE+BN+SAE | 95.51 | - | -
Depierre et al. [45] | Grasp Regression | 95.2 | - | -
Yu et al. [46] | Multilevel CNNs | 95.8 | 96.2 | -
Liu et al. [47] | Mask-RCNN Q-Net, Y-Net | 95.2 | - | -
Redmon et al. [48] | AlexNet | 88.0 | 87.1 | 76
Asif et al. [49] | GraspNet | 90.2 | 90.6 | 24
Guo et al. [50] | ZF-net | 93.2 | 89.1 | -
Karaoguz et al. [51] | GPRN | 88.7 | - | 200
Kumra et al. [52] | GR-ConvNet | 97.7 | 96.6 | 20
Chu et al. [53] | Faster R-CNN | 96.0 | 96.1 | 120
Zhang et al. [54] | ROI-GD | 93.6 | 93.5 | 40
Ours | EDINet-RGB | 97.8 | 96.6 | 24
Ours | EDINet-D | 95.5 | 93.2 | 24
Ours | EDINet-RGBD | 98.9 | 97.7 | 25
Accuracy (%) at different Jaccard index thresholds:

Authors | Splitting | 0.20 | 0.25 | 0.30 | 0.35 | 0.40
---|---|---|---|---|---|---
Song et al. [13] | IW | - | 95.6 | 94.9 | 91.2 | 87.6
Chu et al. [28] | IW | - | 96.0 | 94.9 | 92.1 | 84.7
Zhou et al. [43] | IW | 98.31 | 97.74 | 96.61 | 95.48 | -
Ours | IW | 99.1 | 98.9 | 98.2 | 97.2 | 96.7
Song et al. [13] | OW | - | 97.1 | 97.1 | 96.4 | 93.4
Chu et al. [28] | OW | - | 96.1 | 92.7 | 87.6 | 82.6
Zhou et al. [43] | OW | 97.74 | 96.61 | 93.78 | 91.53 | -
Ours | OW | 98.9 | 97.7 | 97.6 | 97.1 | 96.5
Authors | Algorithm | Accuracy (%)
---|---|---
Song et al. [13] | RPN | 91.5
Yu et al. [26] | ResUNet | 95.7
Ainetter et al. [39] | Det_Seg_refine | 94.86
Liu et al. [47] | Mask-RCNN Q-Net, Y-Net | 92.1
Depierre et al. [45] | Grasp Regression | 85.74
Morrison et al. [38] | GG-CNN2 | 84
Kumra et al. [52] | GR-ConvNet | 94.6
Depierre et al. [55] | AlexNet | 74.2
Ours | EDINet-RGB | 95.5
Ours | EDINet-D | 94.9
Ours | EDINet-RGBD | 96.1