Grasp Label Format

Raw Label Format

The raw label is composed of two parts, i.e. labels for all grasp candidates on each object and collision masks for each scene.

Labels on Objects

The raw label on each object is a list of numpy arrays.

>>> import numpy as np
>>> l = np.load('000_labels.npz') # GRASPNET_ROOT/grasp_label/000_labels.npz
>>> l.files
['points', 'offsets', 'collision', 'scores']
>>> l['points'].shape
(3459, 3)
>>> l['offsets'].shape
(3459, 300, 12, 4, 3)
>>> l['collision'].shape
(3459, 300, 12, 4)
>>> l['collision'].dtype
dtype('bool')
>>> l['scores'].shape
(3459, 300, 12, 4)
>>> l['scores'][0][0][0][0]
-1.0
  • ‘points’ records the grasp center point coordinates in model frame.

  • ‘offsets’ records the in-plane rotation, depth and width of the gripper respectively in the last dimension.

  • ‘collision’ records the bool mask for if the grasp pose collides with the model.

  • ‘scores’ records the minimum coefficient of friction between the gripper and object to achieve a stable grasp.

Note

In the raw label, the lower score the grasp has, the better it is. However, -1.0 score means the grasp pose is totally not acceptable.

300, 12, 4 denote view id, in-plane rotation id and depth id respectively. The views are defined here in graspnetAPI/utils/utils.py.

1
2
3
4
5
6
7
8
        cloud = cloud.reshape([-1, 3])
    return cloud

def generate_views(N, phi=(np.sqrt(5)-1)/2, center=np.zeros(3, dtype=np.float32), R=1):
    ''' Author: chenxi-wang
    View sampling on a sphere using Febonacci lattices.

    **Input:**

Collision Masks on Each Scene

Collision mask on each scene is a list of numpy arrays.

>>> import numpy as np
>>> c = np.load('collision_labels.npz') # GRASPNET_ROOT/collision_label/scene_0000/collision_labels.npz
>>> c.files
['arr_0', 'arr_4', 'arr_5', 'arr_2', 'arr_3', 'arr_7', 'arr_1', 'arr_8', 'arr_6']
>>> c['arr_0'].shape
(487, 300, 12, 4)
>>> c['arr_0'].dtype
dtype('bool')
>>> c['arr_0'][10][20][3]
array([ True,  True,  True,  True])

‘arr_i’ is the collision mask for the i th object in the object_id_list.txt for each scene whose shape is (num_points, 300, 12, 4). num_points, 300, 12, 4 denote the number of points in the object, view id, in-plane rotation id and depth id respectively.

Users can refer to graspnetAPI.GraspNet.loadGrasp() for more details of how to use the labels.

API Loaded Labels

Dealing with the raw labels are time-consuming and need high familiarity with graspnet. So the API also provides an easy access to the labels.

By calling graspnetAPI.GraspNet.loadGrasp(), users can get all the positive grasp labels in a scene with their parameters and scores.

There are totally four kinds of data structures for loaded grasp labels: Grasp, GraspGroup, RectGrasp and RectGraspGroup. The internal data format of each class is a numpy array which is more efficient than the Python list. Their definitions are given in grasp.py

Example Labels

Before looking into the details, an example is given below.

Loading a GraspGroup instance.

__author__ = 'mhgou'
__version__ = '1.0'

from graspnetAPI import GraspNet, Grasp, GraspGroup
import open3d as o3d
import cv2
import numpy as np

# GraspNetAPI example for loading grasp for a scene.
# change the graspnet_root path

####################################################################
graspnet_root = '/disk1/graspnet' # ROOT PATH FOR GRASPNET
####################################################################

sceneId = 1
annId = 3

# initialize a GraspNet instance  
g = GraspNet(graspnet_root, camera='kinect', split='train')

# load grasps of scene 1 with annotation id = 3, camera = kinect and fric_coef_thresh = 0.2
_6d_grasp = g.loadGrasp(sceneId = sceneId, annId = annId, format = '6d', camera = 'kinect', fric_coef_thresh = 0.2)
print('6d grasp:\n{}'.format(_6d_grasp))

# _6d_grasp is an GraspGroup instance defined in grasp.py
print('_6d_grasp:\n{}'.format(_6d_grasp))

Users can access elements by index or slice.

# index
grasp = _6d_grasp[0]
print('_6d_grasp[0](grasp):\n{}'.format(grasp))

# slice
print('_6d_grasp[0:2]:\n{}'.format(_6d_grasp[0:2]))
print('_6d_grasp[[0,1]]:\n{}'.format(_6d_grasp[[0,1]]))

Each element of GraspGroup is a Grasp instance. The properties of Grasp can be accessed via provided methods.

# grasp is a Grasp instance defined in grasp.py
# access and set properties
print('grasp.translation={}'.format(grasp.translation))
grasp.translation = np.array([1.0, 2.0, 3.0])
print('After modification, grasp.translation={}'.format(grasp.translation))
print('grasp.rotation_matrix={}'.format(grasp.rotation_matrix))
grasp.rotation_matrix = np.eye(3).reshape((9))
print('After modification, grasp.rotation_matrix={}'.format(grasp.rotation_matrix))
print('grasp.width={}, height:{}, depth:{}, score:{}'.format(grasp.width, grasp.height, grasp.depth, grasp.score))
print('More operation on Grasp and GraspGroup can be seen in the API document')

RectGrasp is the class for rectangle grasps. The format is different from Grasp. But the provided APIs are similar.

# load rectangle grasps of scene 1 with annotation id = 3, camera = realsense and fric_coef_thresh = 0.2
rect_grasp_group = g.loadGrasp(sceneId = sceneId, annId = annId, format = 'rect', camera = 'realsense', fric_coef_thresh = 0.2)
print('rectangle grasp group:\n{}'.format(rect_grasp_group))

# rect_grasp is an RectGraspGroup instance defined in grasp.py
print('rect_grasp_group:\n{}'.format(rect_grasp_group))

# index
rect_grasp = rect_grasp_group[0]
print('rect_grasp_group[0](rect_grasp):\n{}'.format(rect_grasp))

# slice
print('rect_grasp_group[0:2]:\n{}'.format(rect_grasp_group[0:2]))
print('rect_grasp_group[[0,1]]:\n{}'.format(rect_grasp_group[[0,1]]))

# properties of rect_grasp 
print('rect_grasp.center_point:{}, open_point:{}, height:{}, score:{}'.format(rect_grasp.center_point, rect_grasp.open_point, rect_grasp.height, rect_grasp.score))

6D Grasp

Actually, 17 float numbers are used to define a general 6d grasp. The width, height, depth, score and attached object id are also part of the definition.

Note

In the loaded label, the higher score the grasp has, the better it is which is different from raw labels. Actually, score = 1.1 - raw_score (which is the coefficient of friction)

                target_points = target_points[mask1]
                target_points = transform_points(target_points, trans)
                target_points = transform_points(target_points, np.linalg.inv(camera_pose))

The detailed defition of each parameter is shown in the figure.

_images/grasp_definition.png
class Grasp():
    def __init__(self, *args):
        '''
        **Input:**

        - args can be a numpy array or tuple of the score, width, height, depth, rotation_matrix, translation, object_id

        - the format of numpy array is [score, width, height, depth, rotation_matrix(9), translation(3), object_id]

        - the length of the numpy array is 17.
        '''
        if len(args) == 0:
            self.grasp_array = np.array([0, 0.02, 0.02, 0.02, 1, 0, 0, 0, 1 ,0 , 0, 0, 1, 0, 0, 0, -1], dtype = np.float64)
        elif len(args) == 1:
            if type(args[0]) == np.ndarray:
                self.grasp_array = copy.deepcopy(args[0])
            else:
                raise TypeError('if only one arg is given, it must be np.ndarray.')
        elif len(args) == 7:
            score, width, height, depth, rotation_matrix, translation, object_id = args
            self.grasp_array = np.concatenate([np.array((score, width, height, depth)),rotation_matrix.reshape(-1), translation, np.array((object_id)).reshape(-1)]).astype(np.float64)
        else:
            raise ValueError('only 1 or 7 arguments are accepted')

6D Grasp Group

Usually, there are a lot of grasps in a scene, GraspGroup is a class for these grasps. Compared with Grasp, GraspGroup contains a 2D numpy array, the additional dimension is the index for each grasp.

class GraspGroup():
    def __init__(self, *args):
        '''
        **Input:**

        - args can be (1) nothing (2) numpy array of grasp group array (3) str of the npy file.
        '''
        if len(args) == 0:
            self.grasp_group_array = np.zeros((0, GRASP_ARRAY_LEN), dtype=np.float64)
        elif len(args) == 1:
            if isinstance(args[0], np.ndarray):
                self.grasp_group_array = args[0]
            elif isinstance(args[0], str):
                self.grasp_group_array = np.load(args[0])
            else:
                raise ValueError('args must be nothing, numpy array or string.')
        else:
            raise ValueError('args must be nothing, numpy array or string.')

Common operations on a list such as indexing, slicing and sorting are implemented. Besides, one important function is that users can dump a GraspGroup into a numpy file and load it in another program by calling GraspGroup.save_npy() and GraspGroup.from_npy().

Rectangle Grasp

7 float numbers are used to define a general rectangle grasp, i.e. the center point, the open point, height, score and the attached object id. The detailed defition of each parameter is shown in the figure above and the coordinates for center point and open point are in the pixel frame.

class RectGrasp():
    def __init__(self, *args):
        '''
        **Input:**

        - args can be a numpy array or tuple of the center_x, center_y, open_x, open_y, height, score, object_id

        - the format of numpy array is [center_x, center_y, open_x, open_y, height, score, object_id]

        - the length of the numpy array is 7.
        '''
        if len(args) == 1:
            if type(args[0]) == np.ndarray:
                self.rect_grasp_array = copy.deepcopy(args[0])
            else:
                raise TypeError('if only one arg is given, it must be np.ndarray.')
        elif len(args) == RECT_GRASP_ARRAY_LEN:
            self.rect_grasp_array = np.array(args).astype(np.float64)
        else:
            raise ValueError('only one or six arguments are accepted')

Rectangle Grasp Group

The format for RectGraspGroup is similar to that of RectGrasp and GraspGroup.

class RectGraspGroup():
    def __init__(self, *args):
        '''
        **Input:**

        - args can be (1) nothing (2) numpy array of rect_grasp_group_array (3) str of the numpy file.
        '''
        if len(args) == 0:
            self.rect_grasp_group_array = np.zeros((0, RECT_GRASP_ARRAY_LEN), dtype=np.float64)
        elif len(args) == 1:
            if isinstance(args[0], np.ndarray):
                self.rect_grasp_group_array = args[0]
            elif isinstance(args[0], str):
                self.rect_grasp_group_array = np.load(args[0])
            else:
                raise ValueError('args must be nothing, numpy array or string.')
        else:
            raise ValueError('args must be nothing, numpy array or string.')

Note

We recommend users to access and modify the labels by provided functions although users can also manipulate the data directly but it is Not Recommended. Please refer to the Python API for more details.

Grasp and GraspGroup Transformation

Users can transform a Grasp or GraspGroup giving a 4x4 matrix.

# transform grasp
g = Grasp() # simple Grasp
frame = o3d.geometry.TriangleMesh.create_coordinate_frame(0.1)

# Grasp before transformation
o3d.visualization.draw_geometries([g.to_open3d_geometry(), frame])
g.translation = np.array((0,0,0.01))

# setup a transformation matrix
T = np.eye(4)
T[:3,3] = np.array((0.01, 0.02, 0.03))
T[:3,:3] = np.array([[0,0,1.0],[1,0,0],[0,1,0]])
g.transform(T)

# Grasp after transformation
o3d.visualization.draw_geometries([g.to_open3d_geometry(), frame])

g1 = Grasp()
gg = GraspGroup()
gg.add(g)
gg.add(g1)

# GraspGroup before transformation
o3d.visualization.draw_geometries([*gg.to_open3d_geometry_list(), frame])

gg.transform(T)

# GraspGroup after transformation
o3d.visualization.draw_geometries([*gg.to_open3d_geometry_list(), frame])
_images/transformation.png