Camera Only
===========

Overview
--------

The Camera Only perception system provides vision-based cone detection using YOLO object detection models combined with Intel RealSense depth cameras. This module processes RGB-D (color + depth) images to detect cones and determine their 3D positions relative to the vehicle.

**Historical Note:** The system originally used a single camera setup, but the field of view (FOV) was not large enough to adequately cover the track width and detect cones on both sides. The dual camera configuration significantly improves lateral coverage and detection reliability.

System Architecture
-------------------

The dual camera system uses two Intel RealSense depth cameras:

- **Left Camera (D435i)**: Positioned on the left side of the vehicle, mounted upside down
- **Right Camera (D435)**: Positioned on the right side of the vehicle

Each camera independently:

1. Captures RGB color images
2. Captures aligned depth images
3. Runs YOLO-based cone detection on color images
4. Uses depth data to compute 3D cone positions
5. Publishes detected cones with color classification

Dual Camera Node
----------------

The ``DualCamera`` class manages both cameras and performs synchronized processing.

Key Features
~~~~~~~~~~~~

- **Independent Processing**: Each camera processes its data stream independently
- **YOLO Segmentation**: Uses a YOLO model with instance segmentation for precise cone detection
- **Depth Integration**: Combines segmentation masks with depth data for 3D localization
- **Color Classification**: Detects and classifies blue and yellow cones
- **Confidence Filtering**: Only publishes detections above a configurable confidence threshold
- **Visualization Support**: Optional depth overlay and visualization outputs

ROS Topics
~~~~~~~~~~

**Subscribed Topics:**

.. list-table::
   :widths: 40 30 30
   :header-rows: 1

   * - Topic
     - Type
     - Description
   * - ``/camera/realsense_imu/color/image_raw``
     - sensor_msgs/Image
     - Left camera RGB image
   * - ``/camera/realsense_imu/aligned_depth_to_color/image_raw``
     - sensor_msgs/Image
     - Left camera depth image aligned to RGB
   * - ``/camera/realsense2/color/image_raw``
     - sensor_msgs/Image
     - Right camera RGB image
   * - ``/camera/realsense2/aligned_depth_to_color/image_raw``
     - sensor_msgs/Image
     - Right camera depth image aligned to RGB

**Published Topics:**

.. list-table::
   :widths: 40 30 30
   :header-rows: 1

   * - Topic
     - Type
     - Description
   * - ``/realsense/d435i/cones``
     - ConesCartesian
     - Left camera cone detections
   * - ``/realsense/d435/cones``
     - ConesCartesian
     - Right camera cone detections
   * - ``/perception_cones_viz``
     - PointCloud
     - Visualization of detected cones

Processing Pipeline
-------------------

For each camera, the processing pipeline follows these steps:

1. **Image Reception**

   Color and depth images are received via ROS subscriptions. Processing only occurs when both images are available.

2. **YOLO Detection**

   The color image is passed through a YOLO model to detect cone instances with segmentation masks.

3. **Depth Extraction**

   .. code-block:: python

      # For each detected cone segmentation mask:
      # 1. Extract depth values within the mask
      depths = depth_image[mask == 1]

      # 2. Back-project pixels to 3D camera coordinates
      pixels = np.vstack((u, v, ones))
      camera_coords = (inv_K @ pixels) * depths

      # 3. Compute median position
      cone_position = np.median(camera_coords, axis=1)

4. **Coordinate Transform**

   Camera coordinates are converted to the vehicle frame (see the sketch after this list):

   - X (forward): Camera Z-axis
   - Y (left): Camera X-axis
   - Z (up): Camera Y-axis (not published)

5. **Color Classification**

   YOLO class outputs are mapped to FEB color codes:

   - Blue cones (class 8) → color = 2
   - Yellow cones (class 1) → color = 1
   - Other cones → color = 0

6. **Publishing**

   Cone detections are published as ``ConesCartesian`` messages with timestamp and evaluation time.
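A minimal sketch of the coordinate transform and color classification steps is shown below. The function names and constants are illustrative only and do not mirror the actual ``DualCamera`` implementation; the axis mapping and FEB color codes follow the description above.

.. code-block:: python

   import numpy as np

   # Illustrative sketch of pipeline steps 4 and 5; names are hypothetical
   # and do not reflect the real DualCamera code.

   # FEB color codes as documented above
   COLOR_BLUE, COLOR_YELLOW, COLOR_OTHER = 2, 1, 0
   YOLO_CLASS_TO_COLOR = {8: COLOR_BLUE, 1: COLOR_YELLOW}


   def camera_to_vehicle(cone_camera_xyz):
       """Map a cone position from camera coordinates to the vehicle frame.

       Follows the axis mapping documented above:
       vehicle X (forward) = camera Z, vehicle Y (left) = camera X.
       The vertical component is dropped because it is not published.
       """
       x_cam, y_cam, z_cam = cone_camera_xyz
       return float(z_cam), float(x_cam)


   def classify_color(yolo_class_id):
       """Map a YOLO class id to the FEB color code (blue=2, yellow=1, other=0)."""
       return YOLO_CLASS_TO_COLOR.get(yolo_class_id, COLOR_OTHER)


   # Example: median cone position from step 3, detected as a blue cone (class 8)
   cone_cam = np.array([0.3, -0.1, 4.0])
   x_vehicle, y_vehicle = camera_to_vehicle(cone_cam)
   color = classify_color(8)
   print(x_vehicle, y_vehicle, color)  # 4.0 0.3 2

In the real node this mapping is applied to the median position computed in step 3 before the detection is published.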
Configuration Parameters
------------------------

From ``CameraOnlySettings``:

.. list-table::
   :widths: 40 60
   :header-rows: 1

   * - Parameter
     - Description
   * - ``yolo_minimum_confidence``
     - Minimum YOLO confidence threshold for detections (e.g., 0.5)
   * - ``publish_visual``
     - Enable/disable visualization publishing
   * - ``dual_camera``
     - Enable dual camera mode (vs. single camera)

From ``EarlyStageFusionSettings``:

.. list-table::
   :widths: 40 60
   :header-rows: 1

   * - Parameter
     - Description
   * - ``left_intrinsic``
     - 3x3 camera intrinsic matrix for the left camera
   * - ``right_intrinsic``
     - 3x3 camera intrinsic matrix for the right camera

Camera Calibration
------------------

Both cameras require intrinsic calibration matrices for accurate 3D reconstruction. The intrinsic matrix K maps normalized camera coordinates (rays) to pixel coordinates:

.. math::

   \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \begin{bmatrix} X/Z \\ Y/Z \\ 1 \end{bmatrix}

.. math::

   K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}

Where:

- (fx, fy): Focal lengths in pixels
- (cx, cy): Principal point (optical center)
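As a worked illustration of the pinhole relation above, the sketch below back-projects a single pixel with a known depth into camera coordinates. The intrinsic values are made-up examples; on the vehicle they come from the ``left_intrinsic`` and ``right_intrinsic`` parameters.

.. code-block:: python

   import numpy as np

   # Example intrinsics (hypothetical values, not a real calibration)
   fx, fy, cx, cy = 615.0, 615.0, 320.0, 240.0
   K = np.array([[fx, 0.0, cx],
                 [0.0, fy, cy],
                 [0.0, 0.0, 1.0]])
   inv_K = np.linalg.inv(K)

   # A pixel (u, v) with a measured depth Z in meters
   # (after the millimeter-to-meter conversion described below)
   u, v, Z = 400.0, 250.0, 4.0

   # Back-project: scale the unit-depth ray by the measured depth
   ray = inv_K @ np.array([u, v, 1.0])   # equals [(u - cx)/fx, (v - cy)/fy, 1]
   X, Y, Z_cam = ray * Z                 # 3D point in camera coordinates

This is the same operation that the depth-extraction snippet in the processing pipeline performs in vectorized form over every pixel of a segmentation mask.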
Image Processing Notes
----------------------

**Depth Units:**

RealSense depth images are provided in millimeters and must be converted to meters:

.. code-block:: python

   depth_image = depth_image.astype(np.float32) / 1000.0

**Image Flipping:**

The left camera images are flipped 180 degrees due to the physical mounting orientation:

.. code-block:: python

   img = cv2.flip(img, -1)  # Flip both axes
   depth_image = cv2.flip(depth_image, -1)

**Depth Alignment:**

Depth images are pre-aligned to the color image frame by the RealSense driver, ensuring pixel correspondence.

.. important::

   To enable proper depth-to-color alignment, you must launch the cameras using the ``perception.py`` or ``camera.py`` launch files from the ``sn4_bringup`` package. These launch files configure the RealSense driver with the ``align_depth.enable: True`` parameter, which ensures the depth image is spatially aligned to match the color image frame. Without this alignment, depth values will not correspond to the correct pixel locations in the RGB image.

   .. code-block:: bash

      # Correct: launches the cameras with alignment enabled
      ros2 launch sn4_bringup camera.py

      # Or launch the full perception system
      ros2 launch sn4_bringup perception.py

Visualization and Debugging
---------------------------

The module includes several visualization tools:

**Depth Overlay:** Shows segmented cone regions overlaid on the original image with color-coded depth information (green intensity indicates depth).

**Interactive Depth Viewer:** A Matplotlib-based interactive viewer allows hovering over depth images to inspect exact depth values at each pixel.

Usage Example
-------------

.. code-block:: python

   import rclpy

   from camera_only.dual_camera import DualCamera

   rclpy.init()
   node = DualCamera()
   rclpy.spin(node)
   node.destroy_node()
   rclpy.shutdown()

Performance Considerations
--------------------------

- **Processing Rate**: Limited by YOLO inference time (~30-60 FPS depending on hardware)
- **Latency**: Includes image acquisition, YOLO inference, and depth processing
- **Accuracy**: Dependent on depth quality, lighting conditions, and cone visibility
- **Range**: Effective up to ~15 meters depending on cone size and lighting

Advantages of Dual Camera System
--------------------------------

**Increased FOV Coverage:**

- Single camera FOV: ~69° horizontal (D435)
- Dual camera coverage: ~120-140° effective horizontal coverage
- Eliminates blind spots on track edges

**Improved Reliability:**

- Redundancy if one camera fails or is occluded
- Better cone detection at different lateral positions
- Complementary viewpoints reduce false negatives

**Challenges:**

- Increased computational load (2x processing)
- Camera synchronization and calibration requirements
- More complex system integration

API Reference
-------------

.. automodule:: perception.camera_only.camera_only.dual_camera
   :members:
   :undoc-members:
   :show-inheritance:

.. automodule:: perception.camera_only.camera_only.model_operations
   :members:
   :undoc-members:
   :show-inheritance: