Bimanual Data Collection

The DK1 is designed from the ground up for bimanual teleoperation data collection. This guide covers wiring both arms and cameras, running the leader/follower recording procedure, understanding the bimanual dataset schema, and getting your data ready for ACT training.

Before Recording

Hardware Connections for Bimanual Recording

Bimanual recording requires more connections than a single-arm setup. Verify every connection before starting LeRobot: a connection that is missing or drops mid-session corrupts the episode.

Leader Arm (Dynamixel XL330)

USB-C from leader arm to host PC. This arm is moved by the operator's hand. Use a short cable (1 m) to avoid accidental disconnections during teleop. Verify: ls /dev/ttyACM0

Follower Arm (DM4340 + power)

USB-C from follower arm to host PC plus DC power supply. The follower arm requires external power — never run on USB power alone. Verify: ls /dev/ttyACM1

Wrist Camera (follower arm)

Mount a USB webcam to the follower arm's end-effector. This is the primary manipulation camera. Connect via USB 3.0. Verify: ls /dev/video0

Overhead / Workspace Camera

Fixed camera above the bimanual workspace at ~70 cm height, angled 30° down. Captures both arms simultaneously. Second USB 3.0 port. Verify: ls /dev/video2
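
The four "Verify:" checks above can be run in one pass. The following is a minimal sketch, assuming the device paths match the ones listed in this guide; the helper name missing_devices is illustrative and not part of LeRobot.

```python
import os

# Device paths taken from the "Verify:" hints above; adjust them if your
# system enumerates the arms or cameras differently.
EXPECTED_DEVICES = {
    "leader arm": "/dev/ttyACM0",
    "follower arm": "/dev/ttyACM1",
    "wrist camera": "/dev/video0",
    "overhead camera": "/dev/video2",
}


def missing_devices(devices, exists=os.path.exists):
    """Return the names of devices whose path is not present."""
    return [name for name, path in devices.items() if not exists(path)]


if __name__ == "__main__":
    missing = missing_devices(EXPECTED_DEVICES)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All four devices present")
```

A present device node only shows that the kernel enumerated the device; it does not confirm that the arm is powered or that the camera delivers frames.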

Critical: bimanual synchronization. With two arms and two cameras, synchronization is the most important data quality factor for the DK1. LeRobot timestamps all streams from the host PC clock. To minimize timestamp skew: (1) use separate USB bus controllers for cameras and arms, (2) use USB 3.0 hubs with stable clocks, (3) set CPU governor to performance mode. Target: <5 ms skew between all four streams. A 10 ms desync between left and right arm states can cause ACT training failures on contact-rich tasks.
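
To make the 5 ms target concrete, skew for one frame can be taken as the gap between the earliest and latest capture time across the four streams. This is a sketch only: the stream names and timestamps below are hypothetical, and how you obtain per-stream capture times depends on your setup.

```python
def max_skew_ms(timestamps_s):
    """Largest gap, in milliseconds, between any two stream timestamps.

    timestamps_s maps a stream name to the host-clock time (in seconds)
    at which that stream's sample for one frame was captured.
    """
    values = list(timestamps_s.values())
    return (max(values) - min(values)) * 1000.0


# Hypothetical capture times for a single frame (seconds, host clock).
frame = {
    "leader_arm": 12.0000,
    "follower_arm": 12.0012,
    "wrist_cam": 12.0031,
    "overhead_cam": 12.0024,
}

skew = max_skew_ms(frame)
print(f"skew = {skew:.1f} ms, within 5 ms target: {skew < 5.0}")
```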

Recording Workflow

Leader/Follower Teleoperation Recording Procedure

Follow these steps for every DK1 recording session. The bimanual procedure has a few extra steps compared to single-arm collection.

1

Pre-session safety check

Clear the shared workspace between both arms (1.5 m × 1 m). Verify that both arms can reach the shared workspace without colliding. Test the E-stop before recording. See the Safety page.

2

Connect and verify both arms

# Verify serial ports are available
ls /dev/ttyACM*
# Expected: /dev/ttyACM0 (leader) and /dev/ttyACM1 (follower)

# Quick connection test
python -m lerobot.scripts.control_robot \
  --robot.type=bi_dk1_follower \
  --robot.config=~/.lerobot/robots/dk1_bimanual.yaml \
  --control.type=none

3

Verify camera feeds

Both cameras must be streaming before starting LeRobot. A missing camera will silently produce episodes with null image frames.

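python3 -c "
import cv2
# Indices 0 and 2 correspond to /dev/video0 (wrist) and /dev/video2 (overhead)
for i in [0, 2]:
    cap = cv2.VideoCapture(i)
    # read() can fail even when the device opens, so check its return flag
    ret, frame = cap.read() if cap.isOpened() else (False, None)
    if ret:
        print(f'Camera {i}: OK ({frame.shape[1]}x{frame.shape[0]})')
    else:
        print(f'Camera {i}: FAILED')
    cap.release()
"

4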

Move arms to starting position

Manually move the leader arm to the starting teleop position. The follower arm will mirror it. Hold the leader arm steady for 2–3 seconds to confirm synchronization before the warmup period starts.

5

Set up the task scene

Place objects in consistent starting positions for both arms. Photograph the starting configuration. For bimanual tasks, mark exact positions with tape — scene consistency is even more critical because both arm trajectories must be compatible.

6

Start bimanual LeRobot recording

source ~/.venvs/dk1/bin/activate
python -m lerobot.scripts.control_robot \
  --robot.type=bi_dk1_follower \
  --robot.config=~/.lerobot/robots/dk1_bimanual.yaml \
  --control.type=record \
  --control.fps=30 \
  --control.repo_id=your-username/dk1-bimanual-pick-place-v1 \
  --control.num_episodes=50 \
  --control.single_task="Pick up block with left arm, place in bin with right arm" \
  --control.warmup_time_s=5 \
  --control.reset_time_s=15

Use a longer reset_time_s for bimanual tasks — resetting two arms and the scene takes more time than single-arm setups.
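
When choosing reset_time_s, it helps to see how it affects the length of a session. The estimate below is a rough sketch: it assumes the warmup runs once per session and that every episode is followed by one reset period, and episode_time_s is a hypothetical per-episode duration that this guide does not specify.

```python
def session_minutes(num_episodes, episode_time_s, warmup_time_s=5, reset_time_s=15):
    """Rough wall-clock estimate for one recording session, in minutes.

    Assumes one warmup at the start, then (record + reset) per episode.
    """
    total_s = warmup_time_s + num_episodes * (episode_time_s + reset_time_s)
    return total_s / 60.0


# 50 episodes with the warmup and reset values from the command above,
# and a hypothetical 20 s per episode.
print(f"{session_minutes(50, 20):.1f} min")
```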

7

Review and replay episodes

After each batch of 10 episodes, replay and review before continuing. Pay attention to arm coordination — lag between left and right arms will appear as jitter in the follower's movements.

python -m lerobot.scripts.visualize_dataset \
  --repo_id=your-username/dk1-bimanual-pick-place-v1 \
  --episode_index=0

8

Push to the Hugging Face Hub

huggingface-cli login
python -m lerobot.scripts.push_dataset_to_hub \
  --repo_id=your-username/dk1-bimanual-pick-place-v1

Dataset Format

LeRobot Dataset Format for Bimanual (DK1)

The DK1 bimanual dataset schema doubles the joint state fields compared to a single-arm recording. Each episode contains synchronized observations from both leader and follower arms plus all cameras.

Directory structure

your-username/dk1-bimanual-pick-place-v1/
├── meta/
│   ├── info.json          # Dataset metadata, fps, shapes, robot_type
│   ├── episodes.jsonl     # Per-episode metadata (task, length, outcome)
│   └── stats.json         # Min/max/mean/std for all fields
├── data/
│   └── chunk-000/
│       ├── episode_000000.parquet
│       └── ...
└── videos/
    └── chunk-000/
        ├── observation.images.wrist_cam/
        │   ├── episode_000000.mp4
        │   └── ...
        └── observation.images.overhead_cam/
            └── ...
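
The layout above maps an episode index to one Parquet file and one video per camera. The sketch below builds those relative paths. The chunk size of 1000 episodes per chunk is an assumption (the tree only shows chunk-000), and episode_paths is an illustrative helper, not a LeRobot function.

```python
from pathlib import PurePosixPath


def episode_paths(episode_index, cameras=("wrist_cam", "overhead_cam"), chunk_size=1000):
    """Return (parquet_path, {camera: video_path}) relative to the dataset root.

    chunk_size is an assumed value; check meta/info.json for your dataset.
    """
    chunk = f"chunk-{episode_index // chunk_size:03d}"
    name = f"episode_{episode_index:06d}"
    data = PurePosixPath("data") / chunk / f"{name}.parquet"
    videos = {
        cam: PurePosixPath("videos") / chunk / f"observation.images.{cam}" / f"{name}.mp4"
        for cam in cameras
    }
    return data, videos


data_path, video_paths = episode_paths(0)
print(data_path)
print(video_paths["wrist_cam"])
```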

Episode data schema (bimanual)

Fields in each episode Parquet file: DK1 bimanual (bi_dk1_follower)

Field                            Type         Description
observation.state                float32[16]  Follower arm joint positions: (7 DOF + gripper) × 2 arms = 16 values
observation.state_left           float32[8]   Left follower arm: 7 joint positions + 1 gripper (rad)
observation.state_right          float32[8]   Right follower arm: 7 joint positions + 1 gripper (rad)
action                           float32[16]  Target positions for both follower arms (from leader arm teleop)
action_left                      float32[8]   Target positions for left arm from leader
action_right                     float32[8]   Target positions for right arm from leader
observation.images.wrist_cam     video path   Wrist-mounted camera on follower arm end-effector
observation.images.overhead_cam  video path   Fixed overhead camera showing full bimanual workspace
timestamp                        float64      Host PC Unix timestamp. Both arms are sampled at this timestamp.
arm_sync_delta_ms                float32      DK1-specific: time delta between left and right arm state reads. Flag episodes where this exceeds 10 ms.
next.done                        bool         True on the last frame of an episode
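
The combined 16-value fields and the per-arm 8-value fields carry the same information, so one can be checked against the other. The sketch below assumes the left arm occupies indices 0-7 and the right arm indices 8-15 of the combined vector; the schema does not state the ordering, so confirm it on your own data. Both helper names are illustrative.

```python
ARM_DIM = 8  # 7 joint positions + 1 gripper per arm, per the schema


def split_bimanual(vec16):
    """Split a 16-value state or action vector into (left, right) halves.

    Assumes left = indices 0-7 and right = indices 8-15 (not confirmed
    by the schema; verify against your dataset).
    """
    if len(vec16) != 2 * ARM_DIM:
        raise ValueError(f"expected {2 * ARM_DIM} values, got {len(vec16)}")
    return list(vec16[:ARM_DIM]), list(vec16[ARM_DIM:])


def state_matches_per_arm_fields(row, tol=1e-6):
    """Check observation.state against observation.state_left/right for one frame."""
    left, right = split_bimanual(row["observation.state"])
    ref_left = list(row["observation.state_left"])
    ref_right = list(row["observation.state_right"])
    if len(ref_left) != ARM_DIM or len(ref_right) != ARM_DIM:
        return False
    pairs = list(zip(left, ref_left)) + list(zip(right, ref_right))
    return all(abs(a - b) <= tol for a, b in pairs)


# Synthetic frame for illustration only.
example_row = {
    "observation.state": [0.1 * i for i in range(16)],
    "observation.state_left": [0.1 * i for i in range(8)],
    "observation.state_right": [0.1 * i for i in range(8, 16)],
}
print(state_matches_per_arm_fields(example_row))
```
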

Quality Assurance

Quality Checklist for Bimanual Demos

Bimanual datasets have stricter quality requirements than single-arm data. Poor coordination between arms is the leading cause of DK1 policy training failure.

  • 1
    Arm synchronization delta is under 10 ms
    Check the arm_sync_delta_ms field in each episode. Spikes above 10 ms indicate USB bus contention or a dropped serial packet. Delete episodes with sustained high deltas.
  • 2
    No follower arm oscillation during contact
    Review follower arm trajectories at contact points (grasp, handoff, placement). Oscillation appears as high-frequency noise in observation.state. Reduce PD gains if present. See software troubleshooting.
  • 3
    Both arms complete the task in the same episode
    For bimanual tasks, an episode is only valid if both arms complete their assigned subtasks. If the left arm succeeded but the right arm dropped the object, mark the episode as failed and delete or annotate it.
  • 4
    No missing camera frames
    Both camera streams must have the expected number of frames. Missing frames from either camera corrupt the visuomotor policy's input. Check with lerobot.scripts.visualize_dataset.
  • 5
    Task scene was reset identically between episodes
    Both arms' workspace must be reset for each episode. Object position, arm starting configuration, and camera angles must all match. Use the photographed starting configuration as reference.
  • 6
    Episode length is consistent
    All successful episodes should be within ±25% of median length. Bimanual tasks often have higher variance than single-arm tasks, but extreme outliers (3× median) should be discarded.
  • 7
    Dataset stats are symmetric for both arms
    In meta/stats.json, check that action_left and action_right stats are plausible for your task geometry. If one arm shows zero variance, that arm was not moving; check port assignments.
  • 8
    Teleop demonstration style is consistent
    All demonstrations should use the same approach path, grasp strategy, and handoff technique. Mixed strategies produce multimodal action distributions that confuse ACT training. Use a single operator per task version.
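
Checklist items 1 and 6 are easy to automate. The sketch below works on plain lists so it stays independent of how you load the Parquet files. Treating "sustained" as three or more consecutive frames above the limit is an assumption of this example, as are the function names and the "review" label for episodes outside the ±25% band.

```python
from statistics import median


def has_sustained_desync(deltas_ms, limit_ms=10.0, min_run=3):
    """True if arm_sync_delta_ms exceeds limit_ms for min_run consecutive frames.

    min_run=3 is an assumed definition of "sustained"; isolated spikes pass.
    """
    run = 0
    for delta in deltas_ms:
        run = run + 1 if delta > limit_ms else 0
        if run >= min_run:
            return True
    return False


def length_flags(lengths, band=0.25, discard_factor=3.0):
    """Classify each episode length against the median of all lengths.

    Returns {episode_index: "ok" | "review" | "discard"}.
    """
    med = median(lengths)
    flags = {}
    for index, n_frames in enumerate(lengths):
        if n_frames >= discard_factor * med:
            flags[index] = "discard"  # extreme outlier (3x median or more)
        elif abs(n_frames - med) > band * med:
            flags[index] = "review"   # outside the +/-25% band
        else:
            flags[index] = "ok"
    return flags


# Hypothetical per-episode frame counts.
print(length_flags([300, 310, 290, 400, 950]))
```
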

Next Step

Training ACT on Your Bimanual Dataset

Once your dataset passes the quality checklist, train ACT or Diffusion Policy directly with LeRobot. ACT is recommended for DK1 bimanual tasks — its chunked action prediction handles the coordination between arms better than single-step policies.

Train ACT (recommended for bimanual)

python -m lerobot.scripts.train \
  --policy.type=act \
  --dataset.repo_id=your-username/dk1-bimanual-pick-place-v1 \
  --policy.chunk_size=100 \
  --policy.n_action_steps=100 \
  --training.num_epochs=5000 \
  --training.batch_size=8 \
  --output_dir=outputs/dk1-act-bimanual
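
For intuition about the chunk settings in the command above, the time span covered by one action chunk is the chunk size divided by the control rate. The sketch below assumes the dataset was recorded at the 30 fps used earlier in this guide, and that n_action_steps is the number of predicted actions executed before the policy is queried again.

```python
def chunk_horizon(chunk_size, n_action_steps, fps):
    """Seconds of motion predicted per chunk and executed per policy query."""
    return {
        "predicted_s": chunk_size / fps,
        "executed_s": n_action_steps / fps,
    }


horizon = chunk_horizon(chunk_size=100, n_action_steps=100, fps=30)
print(f"predicted: {horizon['predicted_s']:.2f} s, executed: {horizon['executed_s']:.2f} s")
```

With these values each chunk covers a little over three seconds of motion, so a handoff that takes longer than that spans more than one chunk.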

Train Diffusion Policy (for contact-rich tasks)

python -m lerobot.scripts.train \
  --policy.type=diffusion \
  --dataset.repo_id=your-username/dk1-bimanual-pick-place-v1 \
  --training.num_epochs=8000 \
  --output_dir=outputs/dk1-diffusion-bimanual

Go deeper: Read the full Data Collection Pipeline Overview in the Robotics Library for a thorough treatment of episode structure, dataset versioning, synchronization strategies, and multi-task bimanual dataset composition.
