-
Notifications
You must be signed in to change notification settings - Fork 34
CMP3103 Week 10
- Learning Objectives
- Background
- Environment Setup
- Building the Package
- Running the Simulation
- Workshop Tasks
- Expected Results
- Hints and Common Issues
- Stretch Tasks
- Key References
By the end of this workshop you will be able to:
| # | Skill |
|---|---|
| 1 | Receive and display images from a simulated robot camera using ROS 2 Python and cv_bridge |
| 2 | Apply the HOG (Histogram of Oriented Gradients) pedestrian detector from OpenCV to find people in a scene |
| 3 | Select the most confident detection and extract its bounding box |
| 4 | Implement a simple proportional controller to steer a differential-drive robot towards a detected person |
| 5 | Understand the structure of a ROS 2 Python node (subscriber, publisher, callback) |
Clone the following repository, a ROS package that contains helper functions and the node that template that you need to expand during the workshop
git clone https://github.com/athapoly/hri_workshopThe hri_world.world file defines a 10 × 10 m walled arena containing a single walking human actor. The actor is scaled to approximately 0.9 m (child-sized) so it is clearly visible in the camera at 2–4 m range without filling the entire frame. It patrols a rectangular route:
(-3, -3) → (-3, 3) → (3, 3) → (3, -3) → repeat
The Histogram of Oriented Gradients (HOG) descriptor, combined with a linear SVM, is one of the classical computer vision methods for detecting upright pedestrians. OpenCV ships with a pre-trained SVM (HOGDescriptor_getDefaultPeopleDetector()) so no model download is required. The detector works at its best when the person occupies roughly 64–128 pixels of height in the image.
This workshop runs inside the provided VS Code dev container. All required software is pre-installed:
| Software | Version |
|---|---|
| ROS 2 | Humble Hawksbill |
| Gazebo Classic | 11.x |
| OpenCV | 4.x (system) |
| Python | 3.10 |
| cv_bridge | ros-humble-cv-bridge |
Open VS Code and the dev container will start automatically. The integrated terminal gives you a shell with ROS 2 already sourced.
# Check ROS 2
ros2 --version
# Confirm OpenCV is available in Python
python3 -c "import cv2; print(cv2.__version__)"
# Check the Limo simulation package exists
ros2 pkg list | grep limo_gazebosimAll commands should be run from the workspace root.
cd /workspaces/hri
# Build with symlink install so Python file changes are picked up immediately
colcon build --symlink-install --packages-select hri_workshop
# Source the workspace overlay
source install/setup.bashTip: After the first build you only need to re-run
source install/setup.bashwhen you add new entry points. Because of--symlink-install, edits to Python files insrc/are reflected immediately without rebuilding.
Open two terminals inside the dev container (use the + button in the VS Code terminal panel or split the terminal).
source /workspaces/hri/install/setup.bash
ros2 launch hri_workshop hri_workshop.launch.pyWait until you see Gazebo started successfully in the log (can take 20–30 seconds on first run). A Gazebo window should open showing the arena with the walking human actor.
Optional flags:
# Launch with RViz2 for sensor visualisation
ros2 launch hri_workshop hri_workshop.launch.py use_rviz:=true
# Headless (no GUI) – useful on slow machines
ros2 launch hri_workshop hri_workshop.launch.py headless:=truesource /workspaces/hri/install/setup.bash
# List all active topics
ros2 topic list
# Verify the camera is publishing
ros2 topic hz /limo_camera/image
# Inspect a single camera message (press Ctrl+C to stop)
ros2 topic echo /limo_camera/image --onceYou should see /limo_camera/image publishing at approximately 10 Hz.
Open the template node in your editor:
src/hri_workshop/hri_workshop/human_detector.py
All four tasks are marked with TODO blocks inside the image_callback method. Complete them in order.
Goal: Confirm you are receiving images from the simulation.
What to do:
- Find the
TODO – Task 1block inimage_callback. - Add a
cv2.imshow()call to display the raw camera frame.
cv2.imshow('Limo Camera', cv_image)
cv2.waitKey(1)- Optionally log the image shape every N frames using a counter:
# In __init__:
self.frame_count = 0
# In image_callback:
self.frame_count += 1
if self.frame_count % 30 == 0:
self.get_logger().info(f'Image shape: {cv_image.shape}')- Run the node:
ros2 run hri_workshop human_detectorCheckpoint ✓ An OpenCV window opens and you can see the live camera feed. The walking human should be visible when the actor passes in front of the robot.
Goal: Use self.hog.detectMultiScale() to find bounding boxes around people.
What to do:
- Find the
TODO – Task 2block. - Call
detectMultiScaleoncv_image:
rects, weights = self.hog.detectMultiScale(
cv_image,
winStride=(8, 8),
padding=(4, 4),
scale=1.05,
)- Log the number of detections:
self.get_logger().info(f'Detections: {len(rects)}')Parameters explained:
| Parameter | Meaning | Effect of increasing |
|---|---|---|
winStride |
Step size of the sliding window | Faster, but may miss detections |
padding |
Border around each window | More context, slightly slower |
scale |
Pyramid downscaling factor (>1) | Fewer scales → faster, coarser |
Performance tip: If the node is too slow, resize the image to half size before detection:
small = cv2.resize(cv_image, (0, 0), fx=0.5, fy=0.5) rects, weights = self.hog.detectMultiScale(small, ...) # Scale bounding boxes back up rects = rects * 2 # multiply all coordinates by 2
Checkpoint ✓ When the actor walks past you see log messages like Detections: 1 or Detections: 2.
Goal: Pick the highest-confidence bounding box, draw it on the image, and compute the horizontal centre of the person.
What to do:
- Find the
TODO – Task 3block. - Handle the case where no person is detected:
if len(rects) == 0:
self.get_logger().warn('No person detected – stopping robot')
self.cmd_vel_pub.publish(Twist())
cv2.imshow('Limo Camera', cv_image)
cv2.waitKey(1)
return- Select the best (highest-weight) detection:
best_idx = int(np.argmax(weights))
bx, by, bw, bh = rects[best_idx]- Draw all detections in green, the best one in red:
for (x, y, w, h) in rects:
cv2.rectangle(cv_image, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.rectangle(cv_image, (bx, by), (bx + bw, by + bh), (0, 0, 255), 3)- Compute the horizontal centre:
cx = bx + bw // 2- Mark the centre on the image:
cv2.circle(cv_image, (cx, by + bh // 2), 6, (255, 0, 0), -1)
cv2.imshow('Limo Camera', cv_image)Checkpoint ✓ The camera window shows green boxes around all detections and a red box with a blue dot at the centre of the best one.
Goal: Use the horizontal pixel error to steer the robot so it faces and approaches the detected person.
What to do:
- Find the
TODO – Task 4block. - Compute the horizontal error (distance from image centre to detection centre):
image_center_x = self.image_width // 2
error = image_center_x - cx
# Positive error → person is to the LEFT → turn left (positive angular.z)
# Negative error → person is to the RIGHT → turn right (negative angular.z)- Compute angular velocity using the proportional gain:
angular_z = self.angular_gain * error- Publish the Twist command:
twist = Twist()
twist.linear.x = self.linear_speed
twist.angular.z = angular_z
self.cmd_vel_pub.publish(twist)- Add a dead-band to avoid jitter when the person is already centred:
if abs(error) < 20: # 20 pixel threshold
angular_z = 0.0Checkpoint ✓ The Limo robot rotates towards the detected person and then drives forward, maintaining the person roughly in the centre of its camera view.
After completing all four tasks your node should:
- Display a live annotated camera window showing bounding boxes around the person.
- Stop the robot (
Twist()with all zeros) when no person is visible. - Rotate the robot towards the person when they are detected off-centre.
- Drive the robot forward while keeping the person centred.
You can verify the velocity commands being sent:
ros2 topic echo /cmd_velExpected output when person is visible and centred:
linear:
x: 0.2
y: 0.0
z: 0.0
angular:
x: 0.0
y: 0.0
z: ~0.0 # close to zero when centred
Make sure the devcontainer's virtual display is running. Open a browser to http://localhost:5801 (noVNC) to see the graphical desktop, or set the DISPLAY variable:
export DISPLAY=:1The image may have an unexpected format. Print cv_image.dtype and cv_image.shape to check. HOG expects uint8 data.
- The actor may be too small (< 40 px height) or too close (> ~300 px height). Try adjusting the
scaleparameter to1.02for finer multi-scale search. - HOG performs best on upright pedestrians. Make sure the camera is level.
- Lower the minimum detection size: add
minSize=(30, 60)todetectMultiScale.
- Check
ros2 topic echo /cmd_vel– are you publishing? - Make sure the simulation is running (Terminal 1).
- The
twist_watchdognode stops the robot if no message arrives for 0.5 s – keep publishing even when the robot should be stationary.
You need to rebuild and re-source:
colcon build --symlink-install --packages-select hri_workshop
source install/setup.bashCompleted the four main tasks early? Try these extensions:
The depth image (topic /limo_camera/depth/camera_image_depth) gives per-pixel depth in metres. Sample the depth at the detection centre to estimate distance and reduce linear.x as the robot approaches.
detectMultiScale may return overlapping boxes for the same person. Apply OpenCV's groupRectangles or implement IoU-based NMS to clean up the detections.
Replace HOG with the pre-trained YOLOv8-nano or MobileNet SSD (available via ultralytics or cv2.dnn). Compare detection quality at different scales.
Instead of detecting from scratch every frame, implement a tracker (e.g. cv2.TrackerCSRT_create()) to follow the bounding box between detections and reduce CPU load.
Subscribe to /scan (sensor_msgs/LaserScan) and use the closest obstacle distance in the forward arc to stop the robot before colliding with the person.
| Resource | URL |
|---|---|
| OpenCV HOG Descriptor | https://docs.opencv.org/4.x/d5/d33/structcv_1_1HOGDescriptor.html |
| cv_bridge ROS 2 | https://docs.ros.org/en/humble/p/cv_bridge/ |
| geometry_msgs/Twist | https://docs.ros2.org/humble/api/geometry_msgs/msg/Twist.html |
| sensor_msgs/Image | https://docs.ros2.org/humble/api/sensor_msgs/msg/Image.html |
| rclpy API | https://docs.ros.org/en/humble/p/rclpy/ |
| ROS 2 Humble tutorials | https://docs.ros.org/en/humble/Tutorials.html |
| Dalal & Triggs HOG paper | https://lear.inrialpes.fr/people/triggs/pubs/Dalal-cvpr05.pdf |
| AgileX Limo robot | https://global.agilex.ai/products/limo |
Copyright by Lincoln Centre for Autonomous Systems
-
CMP3103
- Using the Docker Image
- Week 1
- Week 2
- Week 3
- Week 4
- Week 5: Start working towards Coursework 2526
- Using the real Limo Robot