graspnetAPI package

Subpackages

Submodules

graspnetAPI.grasp module

class graspnetAPI.grasp.Grasp(*args)[source]

Bases: object

property depth

Output:

  • float of the depth.

from_npy(npy_file_path)[source]

Input:

  • npy_file_path: string of the file path.

property height

Output:

  • float of the height.

property object_id

Output:

  • int of the object id that this grasp grasps.

property rotation_matrix

Output:

  • np.array of shape (3, 3) of the rotation matrix.

save_npy(npy_file_path)[source]

Input:

  • npy_file_path: string of the file path.

property score

Output:

  • float of the score.

to_open3d_geometry(color=None)[source]

Input:

  • color: optional, tuple of shape (3) denotes (r, g, b), e.g., (1,0,0) for red

Output:

  • list of open3d.geometry.Geometry of the gripper.

transform(T)[source]

Input:

  • T: np.array of shape (4, 4)

Output:

  • Grasp instance after transformation. The transformation is applied in place, so the original Grasp is also changed.
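To illustrate what a (4, 4) transform means for a grasp pose, the sketch below applies a homogeneous matrix T to a rotation matrix and a translation using the conventional rigid-body rule R' = R_T R, t' = R_T t + t_T. It uses plain Python lists so that it runs without dependencies. apply_transform and matmul3 are hypothetical helpers, not part of graspnetAPI, and the library's own implementation may differ in detail.

```python
def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply_transform(T, rotation, translation):
    """Apply a 4x4 homogeneous transform T to a pose (rotation, translation).

    Hypothetical helper for illustration only; assumes the usual rule
    R' = R_T @ R and t' = R_T @ t + t_T.
    """
    R_T = [row[:3] for row in T[:3]]   # upper-left 3x3 block
    t_T = [row[3] for row in T[:3]]    # last column, first three rows
    new_rotation = matmul3(R_T, rotation)
    new_translation = [
        sum(R_T[i][k] * translation[k] for k in range(3)) + t_T[i]
        for i in range(3)
    ]
    return new_rotation, new_translation


# Rotate 90 degrees about z, then shift by 1 along x.
T = [[0, -1, 0, 1],
     [1,  0, 0, 0],
     [0,  0, 1, 0],
     [0,  0, 0, 1]]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
R_new, t_new = apply_transform(T, identity, [1, 0, 0])
print(R_new)  # [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
print(t_new)  # [1, 1, 0]
```

Unlike this helper, which returns new values, the documented transform also changes the object it is called on.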

property translation

Output:

  • np.array of shape (3,) of the translation.

property width

Output:

  • float of the width.

class graspnetAPI.grasp.GraspGroup(*args)[source]

Bases: object

__getitem__(index)[source]

Input:

  • index: int, slice, list or np.ndarray.

Output:

  • if index is int, return Grasp instance.

  • if index is slice, np.ndarray or list, return GraspGroup instance.

__len__()[source]

Output:

  • int of the length.

add(element)[source]

Input:

  • element: Grasp instance or GraspGroup instance.

property depths

Output:

  • numpy array of shape (-1, ) of the depths.

from_npy(npy_file_path)[source]

Input:

  • npy_file_path: string of the file path.

property heights

Output:

  • numpy array of shape (-1, ) of the heights.

nms(translation_thresh=0.03, rotation_thresh=0.5235987755982988)[source]

Input:

  • translation_thresh: float of the translation threshold.

  • rotation_thresh: float of the rotation threshold.

Output:

  • GraspGroup instance after nms.

property object_ids

Output:

  • numpy array of shape (-1, ) of the object ids.

random_sample(numGrasp=20)[source]

Input:

  • numGrasp: int of the number of sampled grasps.

Output:

  • GraspGroup instance of sample grasps.

remove(index)[source]

Input:

  • index: list of the index of grasp

property rotation_matrices

Output:

  • np.array of shape (-1, 3, 3) of the rotation matrices.

save_npy(npy_file_path)[source]

Input:

  • npy_file_path: string of the file path.

property scores

Output:

  • numpy array of shape (-1, ) of the scores.

sort_by_score(reverse=False)[source]

Input:

  • reverse: bool of the sort order. If False, grasps are sorted from high to low score; if True, from low to high.

Output:

  • no output; the grasp group is sorted in place.
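The meaning of reverse here is the opposite of Python's built-in sorted: the default (False) puts the highest score first. A minimal sketch of that ordering on a plain list of scores follows; order_scores is a hypothetical helper, not a graspnetAPI function.

```python
def order_scores(scores, reverse=False):
    """Return scores ordered as documented for sort_by_score.

    reverse=False -> high to low; reverse=True -> low to high.
    Hypothetical helper operating on a plain list, for illustration only.
    """
    # Python's sorted() is ascending by default, so the flag is inverted.
    return sorted(scores, reverse=not reverse)


print(order_scores([0.2, 0.9, 0.5]))                # [0.9, 0.5, 0.2]
print(order_scores([0.2, 0.9, 0.5], reverse=True))  # [0.2, 0.5, 0.9]
```

Unlike this helper, sort_by_score returns nothing and reorders the group itself, so it is called as a statement rather than assigned to a variable.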

to_open3d_geometry_list()[source]

Output:

  • list of open3d.geometry.Geometry of the grippers.

to_rect_grasp_group(camera)[source]

Input:

  • camera: string of type of camera, ‘realsense’ or ‘kinect’.

Output:

  • RectGraspGroup instance or None.

transform(T)[source]

Input:

  • T: np.array of shape (4, 4)

Output:

  • GraspGroup instance after transformation. The transformation is applied in place, so the original GraspGroup is also changed.

property translations

Output:

  • np.array of shape (-1, 3) of the translations.

property widths

Output:

  • numpy array of shape (-1, ) of the widths.

class graspnetAPI.grasp.RectGrasp(*args)[source]

Bases: object

property center_point

Output:

  • tuple (x, y) of the center point.

get_key_points()[source]

Output:

  • center, open_point, upper_point, each of them is a numpy array of shape (2,)

property height

Output:

  • float of the height.

property object_id

Output:

  • int of the object id that this grasp grasps.

property open_point

Output:

  • tuple (x, y) of the open point.

property score

Output:

  • float of the score.

to_grasp(camera, depths, depth_method=<function center_depth>)[source]

Input:

  • camera: string of type of camera, ‘kinect’ or ‘realsense’.

  • depths: numpy array of the depths image.

  • depth_method: function of calculating the depth.

Output:

  • grasp: Grasp instance, or None if the depth is not valid.

to_opencv_image(opencv_rgb)[source]

Input:

  • opencv_rgb: numpy array of opencv BGR format.

Output:

  • numpy array of opencv RGB format that shows the rectangle grasp.

class graspnetAPI.grasp.RectGraspGroup(*args)[source]

Bases: object

__getitem__(index)[source]

Input:

  • index: int, slice, list or np.ndarray.

Output:

  • if index is int, return RectGrasp instance.

  • if index is slice, np.ndarray or list, return RectGraspGroup instance.

__len__()[source]

Output:

  • int of the length.

add(rect_grasp)[source]

Input:

  • rect_grasp: RectGrasp instance

batch_get_key_points()[source]

Output:

  • centers, open_points, upper_points, each of them is a numpy array of shape (-1, 2)

property center_points

Output:

  • numpy array of shape (-1, 2) of the center points.

from_npy(npy_file_path)[source]

Input:

  • npy_file_path: string of the file path.

property heights

Output:

  • numpy array of the heights.

property object_ids

Output:

  • numpy array of the ids of the objects that these grasps grasp.

property open_points

Output:

  • numpy array of shape (-1, 2) of the open points.

random_sample(numGrasp=20)[source]

Input:

  • numGrasp: int of the number of sampled grasps.

Output:

  • RectGraspGroup instance of sample grasps.

remove(index)[source]

Input:

  • index: list of the index of rect_grasp

save_npy(npy_file_path)[source]

Input:

  • npy_file_path: string of the file path.

property scores

Output:

  • numpy array of the scores.

sort_by_score(reverse=False)[source]

Input:

  • reverse: bool of the sort order. If False, grasps are sorted from high to low score; if True, from low to high.

Output:

  • no output; the grasp group is sorted in place.

to_grasp_group(camera, depths, depth_method=<function batch_center_depth>)[source]

Input:

  • camera: string of type of camera, ‘kinect’ or ‘realsense’.

  • depths: numpy array of the depths image.

  • depth_method: function of calculating the depth.

Output:

  • grasp_group: GraspGroup instance or None.

Note

The number of grasps returned may differ from the number in the input, because some depths may be invalid.

to_opencv_image(opencv_rgb, numGrasp=0)[source]

Input:

  • opencv_rgb: numpy array of opencv BGR format.

  • numGrasp: int of the number of grasp, 0 for all.

Output:

  • numpy array of opencv RGB format that shows the rectangle grasps.

graspnetAPI.graspnet module

class graspnetAPI.graspnet.GraspNet(root, camera='kinect', split='train')[source]

Bases: object

checkDataCompleteness()[source]

Check whether the dataset files are complete.

Output:

  • bool, True for complete, False for not complete.

getDataIds(sceneIds=None)[source]

Input:

  • sceneIds: int or list of int of the scene ids.

Output:

  • a list of int of the data ids. The data can be accessed by calling self.loadData(ids).

getObjIds(sceneIds=None)[source]

Input:

  • sceneIds: int or list of int of the scene ids.

Output:

  • a list of int of the object ids in the given scenes.

getSceneIds(objIds=None)[source]

Input:

  • objIds: int or list of int of the object ids.

Output:

  • a list of int of the scene ids that contain all the given objects.

loadBGR(sceneId, camera, annId)[source]

Input:

  • sceneId: int of the scene index.

  • camera: string of type of camera, ‘realsense’ or ‘kinect’

  • annId: int of the annotation index.

Output:

  • numpy array of the rgb in BGR order.

loadCollisionLabels(sceneIds=None)[source]

Input:

  • sceneIds: int or list of int of the scene ids.

Output:

  • dict of the collision labels.

loadData(ids=None, *extargs)[source]

Input:

  • ids: int or list of int of the data ids.

  • extargs: extra arguments. This function can also be called as loadData(sceneId, camera, annId).

Output:

  • if ids is int, returns a tuple of data path

  • if ids is not specified or is a list, returns a tuple of data path lists

loadDepth(sceneId, camera, annId)[source]

Input:

  • sceneId: int of the scene index.

  • camera: string of type of camera, ‘realsense’ or ‘kinect’

  • annId: int of the annotation index.

Output:

  • numpy array of the depth with dtype = np.uint16

loadGrasp(sceneId, annId=0, format='6d', camera='kinect', grasp_labels=None, collision_labels=None, fric_coef_thresh=0.4)[source]

Input:

  • sceneId: int of scene id.

  • annId: int of annotation id.

  • format: string of grasp format, ‘6d’ or ‘rect’.

  • camera: string of camera type, ‘kinect’ or ‘realsense’.

  • grasp_labels: dict of grasp labels. Call self.loadGraspLabels if not given.

  • collision_labels: dict of collision labels. Call self.loadCollisionLabels if not given.

  • fric_coef_thresh: float of the friction coefficient threshold of the grasp.

ATTENTION

the LOWER the friction coefficient is, the better the grasp is.

Output:

  • If format == ‘6d’, return a GraspGroup instance.

  • If format == ‘rect’, return a RectGraspGroup instance.
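A typical sequence of calls can be sketched as follows, using only methods documented on this page. Here dataset is assumed to be a GraspNet instance constructed elsewhere, for example GraspNet(root, camera='kinect', split='train'). The helper name best_grasps, the threshold value 0.2 and the parameter k are illustrative choices, not library defaults.

```python
def best_grasps(dataset, scene_id, ann_id=0, camera="kinect", k=10):
    """Load 6d grasps for one view and keep the k highest-scoring ones.

    Hypothetical helper; `dataset` is assumed to be a GraspNet instance.
    """
    # A lower friction coefficient means a better grasp, so a smaller
    # threshold keeps fewer, higher-quality grasps.
    grasp_group = dataset.loadGrasp(
        sceneId=scene_id,
        annId=ann_id,
        format="6d",
        camera=camera,
        fric_coef_thresh=0.2,
    )
    grasp_group = grasp_group.nms()   # documented default thresholds
    grasp_group.sort_by_score()       # no return value; high to low by default
    return grasp_group[:k]            # slicing returns a GraspGroup
```

Passing format="rect" instead would yield a RectGraspGroup, which offers the same sort_by_score and indexing methods but no nms method on this page.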

loadGraspLabels(objIds=None)[source]

Input:

  • objIds: int or list of int of the object ids.

Output:

  • a dict of grasplabels of each object.

loadMask(sceneId, camera, annId)[source]

Input:

  • sceneId: int of the scene index.

  • camera: string of type of camera, ‘realsense’ or ‘kinect’

  • annId: int of the annotation index.

Output:

  • numpy array of the mask with dtype = np.uint16

loadObjModels(objIds=None)[source]

Function:

  • load object 3D models of the given obj ids

Input:

  • objIds: int or list of int of the object ids

Output:

  • a list of open3d.geometry.PointCloud of the models

loadObjTrimesh(objIds=None)[source]

Function:

  • load object 3D trimesh of the given obj ids

Input:

  • objIds: int or list of int of the object ids

Output:

  • a list of trimesh.Trimesh of the models

loadRGB(sceneId, camera, annId)[source]

Input:

  • sceneId: int of the scene index.

  • camera: string of type of camera, ‘realsense’ or ‘kinect’

  • annId: int of the annotation index.

Output:

  • numpy array of the rgb in RGB order.

loadSceneModel(sceneId, camera='kinect', annId=0, align=False)[source]

Input:

  • sceneId: int of the scene index.

  • camera: string of type of camera, ‘realsense’ or ‘kinect’

  • annId: int of the annotation index.

  • align: bool of whether to align to the table frame.

Output:

  • list of open3d.geometry.PointCloud of the scene models.

loadScenePointCloud(sceneId, camera, annId, align=False, format='open3d', use_workspace=False, use_mask=True, use_inpainting=False)[source]

Input:

  • sceneId: int of the scene index.

  • camera: string of type of camera, ‘realsense’ or ‘kinect’

  • annId: int of the annotation index.

  • align: bool of whether to align to the table frame.

  • format: string of the returned type, ‘open3d’ or ‘numpy’.

  • use_workspace: bool of whether to crop the point cloud to the workspace.

  • use_mask: bool of whether to crop the point cloud using the mask (z>0). The False option is supported only with open3d 0.9.0.

    Set this to False only if you know what you are doing.

  • use_inpainting: bool of whether to inpaint the depth image to fill in missing information.

Output:

  • open3d.geometry.PointCloud instance of the scene point cloud.

  • or tuple of numpy array of point locations and colors.
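The two return forms are selected with the format argument. A short sketch, assuming dataset is a GraspNet instance created elsewhere; scene_points_and_colors is a hypothetical wrapper and not part of the API.

```python
def scene_points_and_colors(dataset, scene_id, ann_id=0, camera="kinect"):
    """Return the scene point cloud as a (points, colors) tuple.

    Hypothetical wrapper; `dataset` is assumed to be a GraspNet instance.
    """
    # format="numpy" requests the tuple form instead of an Open3D object.
    points, colors = dataset.loadScenePointCloud(
        sceneId=scene_id,
        camera=camera,
        annId=ann_id,
        format="numpy",
    )
    return points, colors
```

With the default format='open3d' the call returns a single open3d.geometry.PointCloud, so the tuple unpacking above would not apply.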

loadWorkSpace(sceneId, camera, annId)[source]

Input:

  • sceneId: int of the scene index.

  • camera: string of type of camera, ‘realsense’ or ‘kinect’

  • annId: int of the annotation index.

Output:

  • tuple of the bounding box coordinates (x1, y1, x2, y2).

show6DPose(sceneIds, saveFolder='save_fig', show=False, perObj=False)[source]

Input:

  • sceneIds: int or list of scene ids.

  • saveFolder: string of the folder to store the image.

  • show: bool of whether to show the image.

  • perObj: bool, show grasps on each object

Output:

  • No output; the rendered image is saved and, if show is True, displayed.

showObjGrasp(objIds=[], numGrasp=10, th=0.5, maxWidth=0.08, saveFolder='save_fig', show=False)[source]

Input:

  • objIds: int or list of int of the object ids.

  • numGrasp: how many grasps to show in the image.

  • th: threshold of the coefficient of friction.

  • maxWidth: float, only visualize grasps with width<=maxWidth

  • saveFolder: string of the path to save the rendered image.

  • show: bool of whether to show the image.

Output:

  • No output; the rendered image is saved and, if show is True, displayed.

showSceneGrasp(sceneId, camera='kinect', annId=0, format='6d', numGrasp=20, show_object=True, coef_fric_thresh=0.1)[source]

Input:

  • sceneId: int of the scene index.

  • camera: string of the camera type, ‘realsense’ or ‘kinect’.

  • annId: int of the annotation index.

  • format: string of the annotation type, ‘rect’ or ‘6d’.

  • numGrasp: int of the displayed grasp number, grasps will be randomly sampled.

  • coef_fric_thresh: float of the friction coefficient threshold of grasps.

graspnetAPI.graspnet_eval module

class graspnetAPI.graspnet_eval.GraspNetEval(root, camera, split='test')[source]

Bases: graspnetAPI.graspnet.GraspNet

Class for evaluation on GraspNet dataset.

Input:

  • root: string of root path for the dataset.

  • camera: string of type of the camera.

  • split: string of the data split.

eval_all(dump_folder, proc=2)[source]

Input:

  • dump_folder: string of the folder that saves the npy files.

  • proc: int of the number of processes to use to evaluate.

Output:

  • res: numpy array of the detailed accuracy.

  • ap: float of the AP for all split.

eval_novel(dump_folder, proc=2)[source]

Input:

  • dump_folder: string of the folder that saves the npy files.

  • proc: int of the number of processes to use to evaluate.

Output:

  • res: numpy array of the detailed accuracy.

  • ap: float of the AP for novel split.

eval_scene(scene_id, dump_folder, TOP_K=50, return_list=False, vis=False, max_width=0.1)[source]

Input:

  • scene_id: int of the scene index.

  • dump_folder: string of the folder that saves the dumped npy files.

  • TOP_K: int of the top number of grasp to evaluate

  • return_list: bool of whether to return the result list.

  • vis: bool of whether to show the result

  • max_width: float of the maximum gripper width in evaluation

Output:

  • scene_accuracy: np.array of shape (256, 50, 6) of the accuracy tensor.
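One way to summarise such an accuracy tensor is to average all of its entries. The sketch below does this for a nested list. The reading of the shape as (views, top-k ranks, friction thresholds) and the reduction by a plain mean are assumptions, not statements made on this page. mean_of_tensor is a hypothetical helper.

```python
def mean_of_tensor(tensor):
    """Average every number in an arbitrarily nested list."""
    total, count = 0.0, 0
    stack = [tensor]
    while stack:
        item = stack.pop()
        if isinstance(item, (list, tuple)):
            stack.extend(item)   # descend into nested dimensions
        else:
            total += item
            count += 1
    return total / count if count else 0.0


# Toy stand-in with shape (2, 3, 2) instead of (256, 50, 6).
toy_accuracy = [
    [[1.0, 1.0], [0.5, 1.0], [0.0, 0.5]],
    [[1.0, 1.0], [1.0, 1.0], [0.5, 0.5]],
]
print(mean_of_tensor(toy_accuracy))  # 0.75
```

With the real NumPy array the same reduction is scene_accuracy.mean().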

eval_seen(dump_folder, proc=2)[source]

Input:

  • dump_folder: string of the folder that saves the npy files.

  • proc: int of the number of processes to use to evaluate.

Output:

  • res: numpy array of the detailed accuracy.

  • ap: float of the AP for seen split.

eval_similar(dump_folder, proc=2)[source]

Input:

  • dump_folder: string of the folder that saves the npy files.

  • proc: int of the number of processes to use to evaluate.

Output:

  • res: numpy array of the detailed accuracy.

  • ap: float of the AP for similar split.
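The per-split methods share one signature and one return shape, so they can be driven from a loop. A minimal sketch, assuming evaluator is a GraspNetEval instance created elsewhere; evaluate_test_splits is a hypothetical helper.

```python
def evaluate_test_splits(evaluator, dump_folder, proc=2):
    """Collect the AP of the seen, similar and novel splits in a dict.

    Hypothetical helper; `evaluator` is assumed to be a GraspNetEval instance.
    """
    results = {}
    for name, method in (
        ("seen", evaluator.eval_seen),
        ("similar", evaluator.eval_similar),
        ("novel", evaluator.eval_novel),
    ):
        # Each method returns (res, ap); only the scalar AP is kept here.
        _res, ap = method(dump_folder, proc=proc)
        results[name] = ap
    return results
```

Keeping res as well would preserve the detailed accuracy array for each split.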

get_model_poses(scene_id, ann_id)[source]

Input:

  • scene_id: int of the scene index.

  • ann_id: int of the annotation index.

Output:

  • obj_list: list of int of object index.

  • pose_list: list of 4x4 matrices of object poses.

  • camera_pose: 4x4 matrix of the camera pose relative to the first frame.

  • align_mat: 4x4 matrix of camera relative to the table.

get_scene_models(scene_id, ann_id)[source]

Return the models in model coordinates.

parallel_eval_scenes(scene_ids, dump_folder, proc=2)[source]

Input:

  • scene_ids: list of int of scene index.

  • dump_folder: string of the folder that saves the npy files.

  • proc: int of the number of processes to use to evaluate.

Output:

  • scene_acc_list: list of the scene accuracy.

graspnetAPI.moving_graspnet module

class graspnetAPI.moving_graspnet.MovingGraspNet(root)[source]

Bases: object

API for MovingGraspNet.

Parameters

root (str) – root directory for MovingGraspNet.

get_camera_obj_ids(scene_name, camera_sn)[source]

Get object indices in a camera

Parameters
  • scene_name (str) – scene name.

  • camera_sn (str) – camera serial number.

Returns

the object indices.

Return type

list

get_depth_path(scene_name, camera_sn, frame)[source]

Get depth image path

Parameters
  • scene_name (str) – the scene name.

  • camera_sn (str) – the camera serial number.

  • frame (str) – the frame id.

Returns

path of the depth image.

Return type

str

get_frame_list(scene_name, camera_sn)[source]

Get all frames in a scene.

Parameters
  • scene_name (str) – the scene name.

  • camera_sn (str) – the camera serial number.

Returns

the frame ids.

Return type

list(str)

get_near_frames(scene_name, camera_sn, frame, max_distance)[source]

Find the near frame indices.

Parameters
  • scene_name (str) – the scene name.

  • camera_sn (str) – the camera serial number.

  • frame (str) – the given frame id.

  • max_distance (int) – maximum distance allowed, in milliseconds (1/1000 second).

Returns

the list of frame indices within the max distance.

Return type

list(str)
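The selection can be pictured as a simple filter over frame ids. The sketch below assumes that every frame id is a string holding an integer timestamp in milliseconds, that the bound is inclusive, and that the query frame itself is returned; none of these details is confirmed by this page. near_frames is a hypothetical helper.

```python
def near_frames(frame_ids, frame, max_distance):
    """Return the frame ids within max_distance milliseconds of `frame`.

    Hypothetical helper; assumes every frame id is a string holding an
    integer timestamp in milliseconds.
    """
    reference = int(frame)
    return [f for f in frame_ids if abs(int(f) - reference) <= max_distance]


frames = ["1000", "1033", "1066", "1100", "1500"]
print(near_frames(frames, "1066", 40))  # ['1033', '1066', '1100']
```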

get_registered_object_pose_filename(scene_name, camera_sn, frame, obj_id)[source]

Get the new object pose filename

Parameters
  • scene_name (str) – the scene name.

  • camera_sn (str) – the camera serial number.

  • frame (str) – the frame id.

  • obj_id (int) – the index of the object.

Returns

the new object pose filename.

Return type

str

get_rgb_path(scene_name, camera_sn, frame)[source]

Get RGB image path

Parameters
  • scene_name (str) – the scene name.

  • camera_sn (str) – the camera serial number.

  • frame (str) – the frame id.

Returns

path of the RGB image.

Return type

str
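The frame list and the two path getters combine naturally when walking a sequence. A minimal sketch, assuming dataset is a MovingGraspNet instance created elsewhere; frame_image_paths is a hypothetical helper.

```python
def frame_image_paths(dataset, scene_name, camera_sn):
    """Map each frame id to its (rgb_path, depth_path) pair.

    Hypothetical helper; `dataset` is assumed to be a MovingGraspNet instance.
    """
    paths = {}
    for frame in dataset.get_frame_list(scene_name, camera_sn):
        rgb_path = dataset.get_rgb_path(scene_name, camera_sn, frame)
        depth_path = dataset.get_depth_path(scene_name, camera_sn, frame)
        paths[frame] = (rgb_path, depth_path)
    return paths
```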

load_cam_intrinsic(camera_sn)[source]

Get camera intrinsic matrix for a camera.

Parameters

camera_sn (str) – camera serial number.

Returns

the intrinsic matrices.

Return type

dict

load_frame_collision_labels(scene_name, camera_sn, frame, obj_ids=None, split='multiobj_25frame')[source]

Load collision labels of the given frame. Author: Chenxi Wang

Parameters
  • scene_name (str) – the scene name.

  • camera_sn (str) – the camera serial number.

  • frame (str) – the given frame id.

  • obj_ids (int/list) – object indices.

  • split (str) – which collision labels to load, support ‘multiobj_25frame’ and ‘initial_frame’.

Returns

collision labels of the given frame.

Return type

collision_labels(dict)

load_frame_grasp_poses(scene_name, camera_sn, frame, obj_ids=None, grasp_labels=None, fric_coef_thresh=0.4, num_sample=None, collision_split='multiobj_25frame')[source]

Load grasp poses of the given frame. Author: Chenxi Wang

Parameters
  • scene_name (str) – the scene name.

  • camera_sn (str) – the camera serial number.

  • frame (str) – the given frame id.

  • obj_ids (int/list) – object indices.

  • grasp_labels (dict) – grasp labels of target objects.

  • fric_coef_thresh (float) – maximum friction coefficient for grasps.

  • num_sample (int) – number of samples.

  • collision_split (str) – which collision labels to load, support ‘multiobj_25frame’ and ‘initial_frame’.

Returns

grasp poses of the given frame in camera coordinates.

Return type

grasp_group

load_obj_grasp_labels(obj_ids=None)[source]

Load grasp labels of objects. Author: Chenxi Wang

Parameters

obj_ids (int/list) – object indices.

Returns

grasp labels of target objects.

Return type

grasp_labels(dict)

load_obj_grasp_poses(obj_id, grasp_labels=None, collision_labels=None, fric_coef_thresh=0.4)[source]

Load grasp poses on an object. Author: Chenxi Wang

Parameters
  • obj_id (int) – the index of the object.

  • grasp_labels (dict) – grasp labels of target objects.

  • fric_coef_thresh (float) – maximum friction coefficient for grasps.

Returns

grasp pose of the object in object coordinates.

Return type

grasp_group

load_object_point_cloud(obj_id)[source]

Load object point cloud

Parameters

obj_id (int) – object index.

Returns

the object point cloud.

Return type

open3d.geometry.PointCloud

load_object_pose(scene_name, camera_sn, frame, obj_id, registered=True)[source]

Get object 6d pose. None for no pose available.

Parameters
  • scene_name (str) – the scene name.

  • camera_sn (str) – the camera serial number.

  • frame (str) – the frame id.

  • obj_id (int) – the index of the object.

  • registered (bool) – whether to use the registered pose.

Returns

the object pose.

Return type

np.array(4x4)

load_point_cloud(scene_name, camera_sn, frame)[source]

Get open3d point cloud.

Parameters
  • scene_name (str) – the scene name.

  • camera_sn (str) – the camera serial number.

  • frame (str) – the frame id.

Returns

the point cloud.

Return type

open3d.geometry.PointCloud

load_scene_metadata(scene_name)[source]

Get metadata of a scene.

Parameters

scene_name (str) – the scene name.

Returns

the metadata of the scene.

Return type

dict

load_scene_object_list(scene_name, camera_sn, frame)[source]

Get the list of object indices within a scene.

Parameters
  • scene_name (str) – the scene name.

  • camera_sn (str) – the camera serial number.

  • frame (str) – the frame id.

Returns

the object indices.

Return type

list(int)

load_scene_registered_object_list(scene_name, camera_sn, frame)[source]

Get the list of registered object indices within a scene.

Parameters
  • scene_name (str) – the scene name.

  • camera_sn (str) – the camera serial number.

  • frame (str) – the frame id.

Returns

the object indices.

Return type

list(int)

load_scene_with_object(scene_name, camera_sn, frame, registered=True)[source]

Get open3d point cloud for both scene and object(s).

Parameters
  • scene_name (str) – the scene name.

  • camera_sn (str) – the camera serial number.

  • frame (str) – the frame id.

  • registered (bool) – whether to use registered object pose.

Returns

the point clouds.

Return type

list(open3d.geometry.PointCloud)

graspnetAPI.moving_graspnet_eval module

class graspnetAPI.moving_graspnet_eval.GraspDist(translation=0.1, rotation=0.5235987755982988)[source]

Bases: object

Grasp pose distance.

Parameters
  • translation (float) – translational distance.

  • rotation (float) – rotational distance

__gt__(other)[source]

Greater than a threshold.

__lt__(other)[source]

Less than a threshold.

classmethod dist_from_grasp_pose(g1: graspnetAPI.grasp.Grasp, g2: graspnetAPI.grasp.Grasp)[source]

Calculate the grasp distance between two grasp poses.

Parameters
  • g1 (Grasp) – grasp pose 1.

  • g2 (Grasp) – grasp pose 2.

Returns

distance between the two grasp poses.

Return type

GraspDist
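A common way to measure the distance between two poses is the Euclidean distance between their translations together with the angle of their relative rotation. The sketch below implements that idea with plain Python. It illustrates the concept and is not the library's implementation: the helper names, and the rule that a pair is within threshold only when both components are below their limits, are assumptions. The default limits are taken from the GraspDist signature above (0.1 and 0.5235987755982988, which is pi/6 radians or 30 degrees).

```python
import math


def translation_distance(t1, t2):
    """Euclidean distance between two 3-vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(t1, t2)))


def rotation_angle(r1, r2):
    """Angle in radians of the relative rotation r1^T r2 (3x3 nested lists)."""
    # trace(r1^T r2) equals the sum of element-wise products of r1 and r2.
    trace = sum(r1[i][j] * r2[i][j] for i in range(3) for j in range(3))
    cos_angle = (trace - 1.0) / 2.0
    # Clamp to [-1, 1] to guard against rounding error before acos.
    return math.acos(max(-1.0, min(1.0, cos_angle)))


def within_threshold(t1, r1, t2, r2,
                     max_translation=0.1, max_rotation=math.pi / 6):
    """True when both distances are below their limits (assumed rule)."""
    return (translation_distance(t1, t2) < max_translation
            and rotation_angle(r1, r2) < max_rotation)


def rot_z(angle):
    """Rotation matrix for a rotation of `angle` radians about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


identity = rot_z(0.0)
print(within_threshold([0, 0, 0], identity,
                       [0.05, 0, 0], rot_z(math.radians(20))))  # True
print(within_threshold([0, 0, 0], identity,
                       [0.05, 0, 0], rot_z(math.radians(45))))  # False
```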

class graspnetAPI.moving_graspnet_eval.MovingGraspNetEval(root, pred_dir, dist_thresh=GraspDist:(t: 0.1, r:0.5235987755982988))[source]

Bases: graspnetAPI.moving_graspnet.MovingGraspNet

Moving GraspNet Evaluation class.

Parameters
  • root (str) – MovingGraspNet root directory.

  • pred_dir (str) – prediction files directory.

  • dist_thresh (GraspDist) – threshold grasp distance.

eval_mgta_all()[source]

Evaluate MGTA for all sequences.

Returns

average MGTA for all sequences.

Return type

float

get_seq_mgta(scene_name, camera_sn)[source]

Calculate MGTA score from dumped files.

Parameters
  • scene_name (str) – the scene name.

  • camera_sn (str) – the camera serial number.

Returns

MGTA score.

Return type

float

load_query_pose(scene_name, camera_sn)[source]

Load the grasp pose queries.

Parameters
  • scene_name (str) – scene name.

  • camera_sn (str) – camera serial number.

Returns

dict{“grasp_pose”: gg, “start_frame”: frame, “object_id”: id}

Return type

list

Module contents