Open3D Format: Complete Guide for Robotics Data
Open3D reads and writes standard point cloud and mesh formats and provides processing tools for 3D robotics data. This guide covers PCD, PLY, Open3D's RGBD dataset layout, and its tensor-based data model.
Schema and Structure
Open3D works with three primary 3D data formats, each with distinct strengths. PCD (Point Cloud Data) files, originally defined by the Point Cloud Library (PCL), store point clouds with a structured header specifying fields (e.g., x, y, z, rgb, normal_x, normal_y, normal_z), their sizes and types, the total point count, and the data encoding (ascii, binary, or binary_compressed). The binary_compressed variant uses LZF compression, reducing file sizes by 30-50% for typical robotics point clouds while maintaining fast decompression. PLY (Polygon File Format, also called Stanford Triangle Format) supports both point clouds and polygon meshes, with a header defining element types (vertex, face) and their properties. PLY is the preferred format when both point data and mesh topology are needed, such as for reconstructed 3D scenes or object models used in grasp planning.
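To make the PCD header structure concrete, here is a minimal sketch that parses an in-memory ASCII PCD file using only the Python standard library. The sample file contents and the helper names (parse_pcd_header, read_ascii_points) are illustrative assumptions, not part of Open3D or PCL; in practice o3d.io.read_point_cloud does this work.

```python
import io

# A tiny ASCII PCD file held in memory (three XYZ points; values are illustrative).
SAMPLE_PCD = """\
# .PCD v0.7 - Point Cloud Data file format
VERSION 0.7
FIELDS x y z
SIZE 4 4 4
TYPE F F F
COUNT 1 1 1
WIDTH 3
HEIGHT 1
VIEWPOINT 0 0 0 1 0 0 0
POINTS 3
DATA ascii
0.0 0.0 0.0
1.0 0.0 0.0
0.0 1.0 0.5
"""


def parse_pcd_header(stream):
    """Read header lines up to and including DATA and return them as a dict."""
    header = {}
    for line in stream:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition(" ")
        header[key] = value.split()
        if key == "DATA":
            break  # the point records start on the next line
    return header


def read_ascii_points(stream, header):
    """Read the remaining lines as one point per line (ASCII encoding only)."""
    n_fields = len(header["FIELDS"])
    points = []
    for line in stream:
        values = line.split()
        if len(values) == n_fields:
            points.append([float(v) for v in values])
    return points


stream = io.StringIO(SAMPLE_PCD)
header = parse_pcd_header(stream)
points = read_ascii_points(stream, header)
print(header["FIELDS"], header["POINTS"], header["DATA"])
print(points)
```

The binary and binary_compressed encodings share the same text header; only the payload after the DATA line differs.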
Open3D's dataset layout for RGBD reconstruction pipelines stores registered sequences as a directory structure: trajectory.json (one camera pose per frame), a depth/ directory with 16-bit PNG depth images, and a color/ directory with RGB images. The trajectory.json file contains a 4x4 homogeneous transformation matrix for each frame representing the camera-to-world transform, which is what TSDF (Truncated Signed Distance Function) volumetric integration needs to fuse the frames into a dense 3D surface. Note that the integrate() call on Open3D's TSDF volumes takes the extrinsic (world-to-camera) matrix, so a camera-to-world pose must be inverted before it is passed in. This layout is used by Open3D's reconstruction pipeline (open3d.pipelines.integration.ScalableTSDFVolume) and is interoperable with ElasticFusion, BundleFusion, and other real-time reconstruction systems.
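The way a depth image and a camera pose combine can be sketched with plain NumPy. This is a minimal illustration under stated assumptions: a pinhole camera with hypothetical intrinsics (fx, fy, cx, cy), raw depth stored in millimeters (depth_scale of 1000), and a pose given as camera-to-world. The function names are made up for this example.

```python
import numpy as np


def backproject_depth(depth_raw, fx, fy, cx, cy, depth_scale=1000.0):
    """Convert a raw uint16 depth image to an (N, 3) array of camera-frame points."""
    height, width = depth_raw.shape
    u, v = np.meshgrid(np.arange(width), np.arange(height))
    z = depth_raw.astype(np.float64) / depth_scale  # raw units -> meters
    valid = z > 0  # a raw value of zero means "no measurement"
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x[valid], y[valid], z[valid]], axis=1)


def to_world(points_cam, camera_to_world):
    """Apply a 4x4 camera-to-world transform to (N, 3) camera-frame points."""
    rotation = camera_to_world[:3, :3]
    translation = camera_to_world[:3, 3]
    return points_cam @ rotation.T + translation


# Synthetic 4x4 depth image with a single valid pixel at the principal point.
depth_raw = np.zeros((4, 4), dtype=np.uint16)
depth_raw[2, 2] = 1500  # 1.5 m in millimeters

# Hypothetical pose: camera shifted 1 m along the world x axis, no rotation.
pose = np.eye(4)
pose[:3, 3] = [1.0, 0.0, 0.0]

points_cam = backproject_depth(depth_raw, fx=500.0, fy=500.0, cx=2.0, cy=2.0)
points_world = to_world(points_cam, pose)
print(points_world)
```

TSDF integration performs the same projection internally for every frame; the sketch is only meant to show which quantities the three parts of the dataset supply.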
Open3D also provides a tensor-based data model (open3d.t.geometry), available in recent releases alongside the legacy geometry classes, that stores point clouds and meshes as dictionaries of named tensors on CPU or GPU. This tensor geometry system supports arbitrary per-point and per-vertex attributes stored as named Open3D tensors, enabling interoperability with PyTorch and TensorFlow via DLPack zero-copy transfer. For ML training on 3D data, the tensor API eliminates the conversion overhead between Open3D's legacy geometry objects and framework-specific tensor types, achieving throughput of 50-100 million points per second for operations like voxel downsampling and normal estimation on modern GPUs.
Frameworks and Models Using Open3D Format
Open3D
A widely used 3D processing library for robotics, providing I/O, registration, reconstruction, and visualization for PCD, PLY, and custom formats.
Open3D-ML
Machine learning extension for Open3D supporting 3D semantic segmentation models (RandLA-Net, KPConv, PointTransformer) with native data loaders.
CloudCompare
Open-source 3D point cloud processing and comparison tool supporting PCD, PLY, LAS, E57, and 20+ other formats.
PCL (Point Cloud Library)
C++ library for point cloud processing that defined the PCD format and interoperates with Open3D through shared file formats.
PyTorch3D
Meta's 3D deep learning library that loads PLY meshes and point clouds for differentiable rendering and 3D understanding.
MeshLab
Open-source mesh processing system supporting PLY, OBJ, STL, and other formats for 3D model cleaning and decimation.
Reading and Writing 3D Data with Open3D in Python
Open3D provides unified I/O for all supported 3D formats. With the library imported as o3d (import open3d as o3d), reading a point cloud is a single call: pcd = o3d.io.read_point_cloud('scene.pcd') or o3d.io.read_point_cloud('scene.ply'), returning an open3d.geometry.PointCloud with .points (Nx3 float64), .colors (Nx3 float64, 0-1 range), and .normals (Nx3 float64) attributes. For meshes: mesh = o3d.io.read_triangle_mesh('model.ply') returns an object with .vertices, .triangles, .vertex_normals, and .vertex_colors. The format is auto-detected from the file extension. read_point_cloud handles PCD (ascii/binary/binary_compressed) and PLY (ascii/binary), and read_triangle_mesh additionally handles OBJ, STL, OFF, and glTF.
Writing follows the same pattern: o3d.io.write_point_cloud('output.pcd', pcd, write_ascii=False, compressed=True) writes a binary compressed PCD file. For best compatibility with PCL-based C++ code, use PCD format with binary encoding. For exchange with Blender, MeshLab, or web viewers, PLY binary is preferred. The key performance consideration for large point clouds (10M+ points) is to avoid ASCII encoding, which is 5-10x slower to read and 3-5x larger on disk than binary. For RGBD sequences, o3d.io.read_pinhole_camera_trajectory('trajectory.json') loads the camera poses, and the reconstruction volume is created with o3d.pipelines.integration.ScalableTSDFVolume(voxel_length=0.005, sdf_trunc=0.04, color_type=o3d.pipelines.integration.TSDFVolumeColorType.RGB8); the color_type argument is required.
For ML workflows, Open3D's tensor-based API (open3d.t.io and open3d.t.geometry) interoperates with PyTorch through DLPack without copying data. Loading a point cloud as tensors: tpcd = o3d.t.io.read_point_cloud('scene.ply'). Because .to() returns a new tensor rather than moving the original in place, keep the result: gpu_positions = tpcd.point.positions.to(o3d.core.Device('cuda:0')). DLPack interop with PyTorch is then torch.from_dlpack(tpcd.point.positions.to_dlpack()), which exposes the same memory as a PyTorch tensor. This pipeline achieves sub-millisecond conversion overhead for point clouds up to 1 million points, making it practical for on-the-fly data loading in training loops. Open3D-ML extends this with dataset classes for S3DIS, ScanNet, SemanticKITTI, and Toronto3D that handle format-specific loading and provide standardized train/val/test splits.
When to Use Open3D Formats vs Alternatives
PCD and PLY are the most widely supported point cloud formats, but specialized formats may be better for specific applications.
| Format | Best For | Mesh Support | Compression | Ecosystem |
|---|---|---|---|---|
| PCD | Point clouds with custom fields | No (points only) | LZF (binary_compressed) | PCL, Open3D, ROS |
| PLY | Point clouds + meshes | Yes (vertices + faces) | None (binary is compact) | Universal (Blender, MeshLab, etc.) |
| LAS / LAZ | Aerial LiDAR, surveying | No | LAZ (4-8x ratio) | PDAL, LAStools, CloudCompare |
| E57 | Terrestrial laser scanning | No | Built-in (efficient) | FARO, Leica, CloudCompare |
| NumPy (.npy/.npz) | ML training pipelines | No (array only) | npz with zlib | PyTorch, TensorFlow, NumPy |
Converting from Other Formats
| Source Format | Tool / Library | Complexity | Notes |
|---|---|---|---|
| ROS PointCloud2 | open3d + ros_numpy | trivial | Convert ROS messages to Open3D point clouds via numpy intermediate: o3d.geometry.PointCloud(o3d.utility.Vector3dVector(points)). |
| Depth images (RGBD) | o3d.geometry.PointCloud.create_from_depth_image() | trivial | Project depth maps to 3D with camera intrinsics; supports both Open3D and custom pinhole models. |
| LiDAR binary (KITTI .bin) | numpy + open3d | trivial | Load the flat float32 array with np.fromfile(), reshape it to (N, 4) (x, y, z, intensity), and wrap the xyz columns as an Open3D PointCloud with Vector3dVector. |
| LAS / LAZ (aerial LiDAR) | laspy + open3d | trivial | Read with laspy, extract xyz/rgb arrays, construct Open3D PointCloud objects. |
| OBJ / STL meshes | o3d.io.read_triangle_mesh() | trivial | Direct loading of OBJ (with materials) and STL (binary or ascii) mesh files. |
| NumPy arrays | o3d.utility.Vector3dVector() | trivial | Wrap (N,3) numpy arrays directly as Open3D point positions, colors, or normals. |
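As an example of one row in the table, the KITTI .bin conversion can be sketched as follows. The scan here is synthetic and the helper name load_kitti_bin is an assumption for illustration; the layout of four float32 values per point (x, y, z, intensity) is the standard KITTI Velodyne format.

```python
import os
import tempfile

import numpy as np


def load_kitti_bin(path):
    """Load a KITTI Velodyne scan stored as flat float32 (x, y, z, intensity) records."""
    scan = np.fromfile(path, dtype=np.float32).reshape(-1, 4)
    xyz = scan[:, :3].astype(np.float64)  # Vector3dVector expects float64
    intensity = scan[:, 3]
    return xyz, intensity


# Write a synthetic 5-point scan so the example runs without real data.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "000000.bin")
    fake_scan = np.arange(20, dtype=np.float32).reshape(5, 4)
    fake_scan.tofile(path)
    xyz, intensity = load_kitti_bin(path)

print(xyz.shape, intensity)
```

The resulting xyz array can be passed to o3d.utility.Vector3dVector and assigned to the .points attribute of an Open3D PointCloud.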
Open3D in Robotics Perception and Manipulation Pipelines
Open3D serves as the 3D processing backbone for many robotic manipulation systems. In a typical pick-and-place pipeline, depth cameras capture RGBD frames that are converted to point clouds using Open3D's create_from_rgbd_image() with the camera intrinsic matrix. These point clouds are then processed through a chain of Open3D operations: statistical outlier removal (remove_statistical_outlier) to clean noise, plane segmentation (segment_plane) to remove the table surface, DBSCAN clustering (cluster_dbscan) to segment individual objects, and ICP registration (registration_icp) to align detected objects against known 3D models for pose estimation. This entire pipeline runs at 5-15 Hz on a modern CPU depending on point cloud density.
For 3D scene reconstruction in robotics, Open3D's TSDF integration pipeline converts a sequence of RGBD frames with known camera poses into a dense 3D mesh or point cloud. The ScalableTSDFVolume class uses a voxel hashing approach that allocates voxels only where data exists, enabling memory-efficient reconstruction of large scenes. A typical reconstruction with 5mm voxel resolution and 500 RGBD frames at 640x480 produces a mesh with 1-5 million vertices, consuming 2-4 GB of memory. The resulting mesh is directly usable for collision checking in motion planning (via Open3D's raycasting scene), navigation map generation, and digital twin creation.
Open3D-ML extends the library with deep learning models for 3D semantic segmentation, which is increasingly important for scene understanding in robotics. Models like RandLA-Net (efficient random sampling for large-scale point clouds), KPConv (kernel point convolution for deformable 3D features), and PointTransformer (attention-based 3D understanding) are available as pre-trained Open3D-ML pipelines that ingest Open3D point cloud objects directly. Training on custom robotics datasets requires providing point clouds with per-point label arrays, which can be stored as custom attributes on Open3D's tensor-based PointCloud (open3d.t.geometry.PointCloud); the legacy PointCloud class has no slot for arbitrary per-point attributes.
Claru Data Delivery in Open3D Format
Claru delivers 3D data in Open3D-compatible formats (PCD binary compressed, PLY binary, or NumPy arrays) with calibration files for direct loading into Open3D processing pipelines. Point clouds include per-point RGB colors, surface normals, and optional semantic labels as custom PCD fields. For RGBD sequence datasets, we provide Open3D-compatible trajectory.json files with camera poses, enabling direct use with the TSDF reconstruction pipeline.
For manipulation datasets, Claru provides pre-segmented object point clouds with per-object PLY meshes suitable for ICP-based pose refinement and collision checking. For 3D semantic segmentation training, we deliver point clouds with per-point labels in Open3D-ML compatible format, with class taxonomies defined for your target environment (warehouse shelving, kitchen surfaces, factory floor). Large-scale point cloud deliveries (100M+ points) are pre-partitioned into spatial tiles with overlap regions to enable distributed processing and training.
Frequently Asked Questions
What is the difference between the PCD and PLY formats?
PCD (Point Cloud Data) is optimized for point clouds with arbitrary per-point fields (position, color, normals, intensity, custom features) and supports LZF compression in binary_compressed mode. PLY (Polygon File Format) supports both point clouds and meshes (vertices + faces with texture coordinates), making it the preferred choice when mesh topology is needed alongside point data. For pure point clouds in robotics pipelines, PCD is simpler and has native support in PCL and ROS. For 3D models, reconstructed meshes, and exchange with CAD/graphics tools (Blender, MeshLab), PLY is more broadly compatible.
How does Open3D handle large point clouds?
Open3D handles point clouds with tens of millions of points on a single machine. Key techniques for large-scale processing include voxel downsampling (o3d.geometry.PointCloud.voxel_down_sample with typical voxel sizes of 1-5cm), octree-based spatial indexing (o3d.geometry.Octree) for efficient neighbor queries, and GPU-accelerated operations via the Open3D tensor API. For point clouds exceeding 100 million points (e.g., city-scale LiDAR scans), the recommended approach is spatial tiling followed by per-tile processing, which Open3D's crop() function supports efficiently.
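The spatial tiling step can be sketched with NumPy alone. The helper name tile_indices and the choice of square tiles in the XY plane are assumptions for illustration; overlap regions and the actual per-tile processing are left out.

```python
import numpy as np


def tile_indices(xyz, tile_size):
    """Group point indices by the square XY tile each point falls into."""
    keys = np.floor(xyz[:, :2] / tile_size).astype(np.int64)
    unique_keys, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.ravel()  # flatten in case NumPy returns a 2-D inverse
    order = np.argsort(inverse, kind="stable")
    counts = np.bincount(inverse, minlength=len(unique_keys))
    groups = np.split(order, np.cumsum(counts)[:-1])
    return {tuple(key): idx for key, idx in zip(unique_keys.tolist(), groups)}


xyz = np.array(
    [
        [0.1, 0.1, 0.0],
        [0.9, 0.2, 0.0],
        [1.5, 0.5, 0.0],
        [-0.5, 0.1, 0.0],
    ]
)
tiles = tile_indices(xyz, tile_size=1.0)
for key, idx in sorted(tiles.items()):
    print(key, idx.tolist())
```

Each index array can then be handed to PointCloud.select_by_index, or the tile bounds can be turned into an axis-aligned bounding box for crop(), so that downstream steps only ever hold one tile in memory.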
What is Open3D's RGBD dataset format?
Open3D's RGBD dataset format uses a directory with trajectory.json (array of 4x4 camera-to-world transformation matrices), depth/ (16-bit PNG depth images where the pixel value divided by depth_scale gives meters; Open3D's default depth_scale is 1000, i.e. depth stored in millimeters), and color/ (RGB images as PNG or JPEG). The trajectory file stores one 4x4 matrix per frame as a flat 16-element array; files written by Open3D's own camera trajectory serializer use column-major order, so the translation occupies the last four elements together with the trailing 1. This format integrates directly with Open3D's TSDF integration pipeline: ScalableTSDFVolume for memory-efficient reconstruction, or UniformTSDFVolume for fixed-resolution voxel grids.
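Turning a flat 16-element array back into a matrix, and inverting a rigid pose, can be sketched in plain Python. The storage order is exposed as a parameter because it is worth confirming against a file written by your own toolchain; the function names and the sample pose are assumptions for illustration.

```python
def unflatten_4x4(flat, column_major=True):
    """Rebuild a 4x4 matrix (list of rows) from a flat 16-element list."""
    if len(flat) != 16:
        raise ValueError("expected 16 values for a 4x4 matrix")
    if column_major:
        return [[flat[col * 4 + row] for col in range(4)] for row in range(4)]
    return [[flat[row * 4 + col] for col in range(4)] for row in range(4)]


def invert_rigid(T):
    """Invert a rigid transform [R | t] as [R^T | -R^T t]."""
    R_t = [[T[c][r] for c in range(3)] for r in range(3)]
    t = [T[r][3] for r in range(3)]
    new_t = [-sum(R_t[r][c] * t[c] for c in range(3)) for r in range(3)]
    return [R_t[r] + [new_t[r]] for r in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


# Sample pose: 90 degree rotation about z plus translation (1, 2, 3),
# written column by column.
flat = [
    0.0, 1.0, 0.0, 0.0,
    -1.0, 0.0, 0.0, 0.0,
    0.0, 0.0, 1.0, 0.0,
    1.0, 2.0, 3.0, 1.0,
]
camera_to_world = unflatten_4x4(flat, column_major=True)
world_to_camera = invert_rigid(camera_to_world)
for row in world_to_camera:
    print(row)
```

The closed-form inverse avoids a general matrix inversion and is only valid when the upper-left 3x3 block is a pure rotation.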
How do I convert Open3D point clouds to PyTorch tensors?
Open3D's tensor API (o3d.t.geometry) provides zero-copy conversion via DLPack. Load a point cloud as tensors: tpcd = o3d.t.io.read_point_cloud('scene.ply'). Access positions as an Open3D tensor: positions = tpcd.point.positions. Convert to PyTorch: torch_positions = torch.from_dlpack(positions.to_dlpack()). The reverse direction goes through a DLPack capsule as well: tpcd.point.positions = o3d.core.Tensor.from_dlpack(torch.utils.dlpack.to_dlpack(torch_tensor)). This zero-copy path avoids memory allocation overhead and achieves sub-millisecond conversion for point clouds of up to millions of points.
Which machine learning models does Open3D-ML support?
Open3D-ML provides pre-trained pipelines for 3D semantic segmentation: RandLA-Net (efficient for large-scale outdoor scans, 200K+ points per scene), KPConv (high accuracy on indoor scenes with deformable kernel points), and PointTransformer (attention-based architecture for fine-grained understanding). These models accept Open3D PointCloud objects with per-point label arrays for training and return per-point predictions for inference. Open3D-ML includes dataset loaders for S3DIS (indoor rooms), ScanNet (indoor reconstructions), SemanticKITTI (outdoor driving), and Toronto3D (urban mapping) with standardized data splits.