SLAM Architecture Overview

This page provides a detailed comparison of our two GraphSLAM implementations and explains when to use each approach.

Developer Note: The Late Stage Fusion (MWEKF) implementation was developed as an advanced robotics project to maximize sensor utilization and state estimation accuracy. This approach is preferred because it fully leverages the camera’s higher update rate (30 Hz vs. LiDAR’s 10 Hz), incorporates MPC control outputs directly into the dynamics model, and provides a flexible framework for adding future sensors (GPS, steering encoders, etc.) to the Extended Kalman Filter.

Fusion Strategy Comparison

Early Stage Fusion vs Late Stage Fusion

| Aspect           | Early Stage Fusion                     | Late Stage Fusion (MWEKF)         |
|------------------|----------------------------------------|-----------------------------------|
| Input            | Single fused cone message              | Separate camera & LiDAR messages  |
| Update Rate      | Limited by slower sensor (LiDAR)       | Full rate of both sensors         |
| Architecture     | GraphSLAM only                         | MWEKF + GraphSLAM                 |
| State Estimation | IMU + wheelspeeds between updates      | Continuous EKF with MPC dynamics  |
| Complexity       | Simpler, fewer components              | More complex, two-stage filtering |
| Latency          | Moderate (depends on solve frequency)  | Low (EKF runs at sensor rates)    |

When to Use Each Approach

Early Stage Fusion is recommended when:

  • Simpler system architecture is preferred

  • Computational resources are limited

  • Perception module already provides well-fused observations

  • Lower maintenance overhead is desired

  • Real-time performance is less critical

Late Stage Fusion (MWEKF) is recommended when:

  • Maximum sensor utilization is required

  • High-frequency state estimates are needed (for fast control loops)

  • MPC controller is available and providing control inputs

  • Camera and LiDAR operate at different rates

  • More accurate state estimation is critical

  • Computational resources are available

Preferred Approach: Late Stage Fusion (MWEKF) is generally preferred as it fully utilizes both sensor update rates instead of downsampling to match the slower sensor.

GraphSLAM Backend

Both implementations share the same core GraphSLAM optimization backend, which represents the SLAM problem as a factor graph:

State Variables:

  • Robot poses: \(\mathbf{x}_t = [x_t, y_t]\) at each timestep \(t\)

  • Landmark positions: \(\mathbf{l}_i = [x_i, y_i]\) for each cone \(i\)

Factor Graph Constraints:

  1. Odometry factors: \(\mathbf{x}_t - \mathbf{x}_{t-1} = \Delta\mathbf{x}_t\)

    • Weight: dx_weight (default: 2.0)

    • Represents relative motion between timesteps

  2. Observation factors: \(\mathbf{l}_i - \mathbf{x}_t = \mathbf{z}_{i,t}\)

    • Weight: z_weight (default: 1.0)

    • Represents observed cone position relative to robot

Optimization:

The system is solved as a sparse least squares problem:

\[\min_{\mathbf{x}, \mathbf{l}} \|A[\mathbf{x}; \mathbf{l}] - \mathbf{b}\|^2\]

Where:

  • \(A\) is a sparse constraint matrix (built in LIL format, converted to CSR for solving)

  • \(\mathbf{b}\) contains the constraint values

  • Solution uses scipy.sparse.linalg.spsolve with normal equations: \(A^T A \mathbf{x} = A^T \mathbf{b}\)

Performance: Typical solve time is ~630 microseconds with optimized sparse matrix operations.
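
The following is a minimal Python sketch of this build-and-solve step, using the dx_weight/z_weight defaults and the LIL-to-CSR conversion described above. The factor layout and function signature are illustrative, not the project's actual code; unlike the interleaved ordering shown in the next section, this sketch stores all poses before all landmarks in the state vector.

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import spsolve

def solve_graph(odometry, observations, dx_weight=2.0, z_weight=1.0):
    """odometry: list of (dx, dy) between consecutive poses.
    observations: list of (pose_index, landmark_index, zx, zy)."""
    n_poses = len(odometry) + 1
    n_landmarks = 1 + max(obs[1] for obs in observations)
    n_rows = 2 * (1 + len(odometry) + len(observations))  # prior + factors
    A = lil_matrix((n_rows, 2 * (n_poses + n_landmarks)))
    b = np.zeros(n_rows)

    # Anchor the first pose at the origin (rows 1-2 of the examples below).
    A[0, 0] = A[1, 1] = dx_weight
    row = 2

    # Odometry factors: x_t - x_{t-1} = dx_t, weighted by dx_weight.
    for t, step in enumerate(odometry):
        for k in range(2):
            A[row, 2 * t + k] = -dx_weight
            A[row, 2 * (t + 1) + k] = dx_weight
            b[row] = dx_weight * step[k]
            row += 1

    # Observation factors: l_i - x_t = z_{i,t}, weighted by z_weight.
    for t, i, zx, zy in observations:
        for k, z in enumerate((zx, zy)):
            A[row, 2 * t + k] = -z_weight
            A[row, 2 * (n_poses + i) + k] = z_weight
            b[row] = z_weight * z
            row += 1

    # Normal equations A^T A x = A^T b, solved in CSR form.
    A = A.tocsr()
    return spsolve(A.T @ A, A.T @ b)
```

For the two-pose example worked through below, `solve_graph([(0, 2)], [(0, 0, -2, 3), (1, 0, -2, 1)])` returns `[0, 0, 0, 2, -2, 3]`: both poses, then the cone at (-2, 3).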

Matrix Accumulation Visualization

The GraphSLAM algorithm builds a sparse constraint matrix incrementally as the vehicle moves and observes landmarks. Here’s how the system matrix grows:

Initial Conditions and First Observation:

Starting from origin \(\mathbf{P}_0 = (0,0)\) with a cone observed at \((-2, 3)\) relative to the vehicle:

\[\begin{split}\underbrace{\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ 0 & -1 & 0 & 1 \end{bmatrix}}_{\text{Constraint Matrix } A} \underbrace{\begin{bmatrix} P_{x_0} \\ P_{y_0} \\ l_{1_x} \\ l_{1_y} \end{bmatrix}}_{\text{State Vector}} = \underbrace{\begin{bmatrix} 0 \\ 0 \\ -2 \\ 3 \end{bmatrix}}_{\text{Measurements}}\end{split}\]

After Vehicle Motion:

Vehicle moves to \((0, 2)\) and observes the same cone from a new pose:

\[\begin{split}\begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ -1 & 0 & 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 & 0 & -1 \end{bmatrix} \begin{bmatrix} P_{x_0} \\ P_{y_0} \\ l_{1_x} \\ l_{1_y} \\ P_{x_1} \\ P_{y_1} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ -2 \\ 3 \\ 0 \\ 2 \\ -2 \\ 1 \end{bmatrix}\end{split}\]

Where:

  • Rows 1-2: Initial pose constraint \(P_{x_0} = 0, P_{y_0} = 0\)

  • Rows 3-4: First landmark observation \(l_1 - P_0 = (-2, 3)\)

  • Rows 5-6: Odometry constraint \(P_1 - P_0 = (0, 2)\)

  • Rows 7-8: Second observation of same landmark \(l_1 - P_1 = (-2, 1)\)
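
As a sanity check, the 8×6 system above can be reproduced and solved densely with NumPy (the real solver stays sparse); this snippet is purely illustrative:

```python
import numpy as np

A = np.array([
    [ 1,  0, 0, 0,  0,  0],   # P_x0 = 0
    [ 0,  1, 0, 0,  0,  0],   # P_y0 = 0
    [-1,  0, 1, 0,  0,  0],   # l_1x - P_x0 = -2
    [ 0, -1, 0, 1,  0,  0],   # l_1y - P_y0 =  3
    [-1,  0, 0, 0,  1,  0],   # P_x1 - P_x0 =  0
    [ 0, -1, 0, 0,  0,  1],   # P_y1 - P_y0 =  2
    [ 0,  0, 1, 0, -1,  0],   # l_1x - P_x1 = -2
    [ 0,  0, 0, 1,  0, -1],   # l_1y - P_y1 =  1
], dtype=float)
b = np.array([0, 0, -2, 3, 0, 2, -2, 1], dtype=float)

x, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x)  # -> [0, 0, -2, 3, 0, 2]: both observations agree on l_1 = (-2, 3)
```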

General Structure:

As the vehicle continues to move and observe landmarks, the matrix grows in a specific pattern:

\[\begin{split}\begin{bmatrix} P_w & 0 & 0 & 0 & 0 & 0 & \cdots \\ 0 & P_w & 0 & 0 & 0 & 0 & \cdots \\ -P_w & 0 & L_w & 0 & 0 & 0 & \cdots \\ 0 & -P_w & 0 & L_w & 0 & 0 & \cdots \\ -P_w & 0 & 0 & 0 & P_w & 0 & \cdots \\ 0 & -P_w & 0 & 0 & 0 & P_w & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix} \begin{bmatrix} P_{x_0} \\ P_{y_0} \\ l_{1_x} \\ l_{1_y} \\ P_{x_1} \\ P_{y_1} \\ \vdots \end{bmatrix}\end{split}\]

Where:

  • \(P_w\) = Position weight (dx_weight, default: 2.0)

  • \(L_w\) = Landmark observation weight (z_weight, default: 1.0)

Sparsity Pattern:

The constraint matrix is highly sparse because:

  1. Each pose only connects to adjacent poses (odometry)

  2. Each landmark only connects to poses where it was observed

  3. Most matrix entries remain zero, enabling efficient sparse solvers

Key Properties:

  • Block structure: Each pose and landmark occupies two state-vector entries (x and y), and each constraint contributes two rows to \(A\)

  • Incremental growth: Matrix expands as vehicle explores new areas

  • Loop closure: Re-observing landmarks creates additional constraints, improving accuracy

  • Sparse solution: Solved using \(A^T A x = A^T b\) with CSR format for efficiency

Data Association

Both implementations support two data association strategies:

Nearest Neighbor (Default)

  • Matches new observations to existing landmarks based on Euclidean distance

  • New landmark created if distance exceeds max_landmark_distance

  • Fast and works well when odometry is reasonably accurate

  • Used by default in early stage fusion
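
A minimal sketch of this matching rule, assuming observations and landmarks are (x, y) points in the map frame; the max_landmark_distance default shown is illustrative, the real value comes from configuration:

```python
import numpy as np

def associate(observations, landmarks, max_landmark_distance=1.0):
    """Return (matches, new_landmarks); matches maps obs index -> landmark index."""
    matches, new_landmarks = {}, []
    for j, obs in enumerate(observations):
        if len(landmarks) == 0:
            new_landmarks.append(obs)
            continue
        dists = np.linalg.norm(np.asarray(landmarks) - obs, axis=1)
        i = int(np.argmin(dists))
        if dists[i] <= max_landmark_distance:
            matches[j] = i             # existing landmark re-observed
        else:
            new_landmarks.append(obs)  # too far from everything: new cone
    return matches, new_landmarks
```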

ICP Refinement (Optional)

  • Applies Iterative Closest Point algorithm before nearest neighbor matching

  • Optimizes for translation and rotation to align new observations with existing map

  • Cost function clips per-point distances at dclip to ignore outliers (see the sketch after this list)

  • Helps with loop closure and odometry drift correction

  • Currently optional in both implementations (can be enabled in configuration)
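
A compact sketch of the idea, using scipy.optimize as a stand-in for whatever optimizer the implementation actually uses; the dclip default is illustrative:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial import cKDTree

def refine_alignment(observations, landmarks, dclip=1.0):
    """Find (tx, ty, theta) aligning new observations with the existing map."""
    tree = cKDTree(landmarks)  # nearest map point for each observation

    def cost(params):
        tx, ty, theta = params
        c, s = np.cos(theta), np.sin(theta)
        R = np.array([[c, -s], [s, c]])
        transformed = observations @ R.T + np.array([tx, ty])
        d, _ = tree.query(transformed)
        return np.sum(np.minimum(d, dclip) ** 2)  # clipped distances ignore outliers

    result = minimize(cost, x0=np.zeros(3), method="Nelder-Mead")
    return result.x  # apply before nearest-neighbor matching
```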

Coordinate Frame Transforms

The system handles multiple coordinate frames:

  1. Sensor Frames: Camera and LiDAR have their own coordinate systems

  2. Vehicle Frame: Center of Gravity (CoG) of the car

  3. Map Frame: Global coordinate system for SLAM

Important Transforms:

  • LiDAR to CoG: Forward offset of -0.9652 m (configured in early stage fusion)

  • Camera frames: Handled via rotate_and_translate_cones() in late stage fusion

  • All cone observations are transformed to the vehicle CoG before being added to the graph
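
A sketch of the sensor-to-CoG transform, using the LiDAR offset above as an example; the real rotate_and_translate_cones() signature and frame conventions may differ:

```python
import numpy as np

LIDAR_TO_COG = np.array([-0.9652, 0.0])  # forward offset from the config above (m)

def sensor_to_cog(cones, mount_yaw=0.0, offset=LIDAR_TO_COG):
    """cones: (N, 2) array in the sensor frame -> (N, 2) array in the CoG frame."""
    c, s = np.cos(mount_yaw), np.sin(mount_yaw)
    R = np.array([[c, -s], [s, c]])  # fixed sensor-to-vehicle mounting rotation
    return cones @ R.T + offset
```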

Lap Detection

Both implementations include automatic lap counting functionality:

Orange Cone Gate Detection:

  1. System identifies large orange cones (color code 0)

  2. Computes vertical gate boundaries when sufficient cones detected

  3. Monitors the vehicle trajectory for gate crossings using a line segment intersection test (sketched after this list)

  4. Increments the lap counter when a crossing is detected (with a minimum 20-30 second cooldown)
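
The gate-crossing check in step 3 reduces to a standard segment-intersection test between the pose-to-pose trajectory segment and the gate segment; here is a sketch with our own variable names:

```python
def _ccw(a, b, c):
    """Signed area of triangle (a, b, c): sign gives orientation."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def segments_intersect(p1, p2, g1, g2):
    """True if trajectory segment p1-p2 properly crosses gate segment g1-g2."""
    d1, d2 = _ccw(g1, g2, p1), _ccw(g1, g2, p2)
    d3, d4 = _ccw(p1, p2, g1), _ccw(p1, p2, g2)
    return (d1 * d2 < 0) and (d3 * d4 < 0)  # endpoints on opposite sides, both ways
```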

Behavior Changes:

  • After First Lap: System publishes global map containing all discovered cones

  • Lap Count: Included in published /slam/state message for race management

  • Race End: Can trigger end-of-race signal after configured number of laps

State Message Format

The published state message (feb_msgs/State) contains:

State:
  x: float          # Global x position (meters)
  y: float          # Global y position (meters)
  velocity: float   # Forward velocity (m/s)
  heading: float    # Yaw angle (radians)
  lap_count: int    # Current lap number

This state is used by:

  • Local path planner for trajectory generation

  • Global path planner for long-term planning

  • Control module for feedback

  • Visualization and logging systems
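
A hypothetical consumer of this message, assuming a ROS 2/rclpy setup; feb_msgs/State and the /slam/state topic name are taken from this page:

```python
import rclpy
from rclpy.node import Node
from feb_msgs.msg import State

class StateListener(Node):
    def __init__(self):
        super().__init__("state_listener")
        # Subscribe to the SLAM state published at sensor/solve rate.
        self.create_subscription(State, "/slam/state", self.on_state, 10)

    def on_state(self, msg: State):
        # Feed pose, velocity, and lap count into planning or control here.
        self.get_logger().info(
            f"pose=({msg.x:.2f}, {msg.y:.2f}) v={msg.velocity:.2f} "
            f"yaw={msg.heading:.2f} lap={msg.lap_count}"
        )

def main():
    rclpy.init()
    rclpy.spin(StateListener())

if __name__ == "__main__":
    main()
```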