Add YOLO11-OBB (Oriented Bounding Box) support with GPU-accelerated parsing by Monishkumarvr · Pull Request #684 · marcoslucianops/DeepStream-Yolo

Monishkumarvr · 2026-01-13T12:41:53Z

Overview

This PR adds support for YOLO11-OBB (Oriented Bounding Box) models to DeepStream-Yolo, so rotated objects can be detected in a DeepStream pipeline with either CPU or GPU-accelerated output parsing.

What's New

🔧 Parser Implementation

CPU Parser: NvDsInferParseYoloOBB() in nvdsparsebbox_Yolo.cpp
GPU Parser: NvDsInferParseYoloOBBCuda() in nvdsparsebbox_Yolo_cuda.cu
Converts oriented bounding boxes (center, width, height, angle) to axis-aligned bounding boxes (AABB)
Uses half_aabb_w = (w*|cos(θ)| + h*|sin(θ)|)/2 and, symmetrically, half_aabb_h = (w*|sin(θ)| + h*|cos(θ)|)/2 to ensure full enclosure of rotated objects
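
As an illustration of the conversion described above, here is a minimal Python sketch of the OBB-to-AABB step. It is not the PR's C++/CUDA code; the function name `obb_to_aabb` and the choice to return `(left, top, width, height)` are assumptions made for the example, and no clamping to the image bounds is applied.

```python
import math


def obb_to_aabb(cx, cy, w, h, theta):
    """Return (left, top, width, height) of the tight axis-aligned box around an OBB.

    Hypothetical helper: cx, cy are the box center, w and h its side lengths,
    theta its rotation in radians.
    """
    c, s = abs(math.cos(theta)), abs(math.sin(theta))
    half_w = (w * c + h * s) / 2.0  # half extent along the image x axis
    half_h = (w * s + h * c) / 2.0  # half extent along the image y axis
    return (cx - half_w, cy - half_h, 2.0 * half_w, 2.0 * half_h)


# Unrotated box: the AABB is the box itself.
print(obb_to_aabb(50.0, 50.0, 40.0, 20.0, 0.0))
# Quarter turn: width and height trade places (up to floating-point rounding).
print(obb_to_aabb(50.0, 50.0, 40.0, 20.0, math.pi / 2))
```

The absolute values make the result independent of the sign of the angle, so the same helper works whichever rotation direction the model uses.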

📦 Model Export Script

utils/export_yolo11_obb.py: Converts YOLO11-OBB .pt models to ONNX format
Handles the OBB detection head from Ultralytics
Supports dynamic/static batch sizes, model simplification, custom input sizes
Output format: [x_center, y_center, width, height, class_probs..., angle]
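
To make the output layout above concrete, the following Python sketch splits one detection row into its fields. It assumes each detection is a flat sequence of `4 + num_classes + 1` floats in exactly the order listed; the helper name `decode_obb_row` and the returned dictionary keys are hypothetical and not part of the export script.

```python
def decode_obb_row(row, num_classes):
    """Split one [x, y, w, h, class_probs..., angle] row into named fields."""
    expected = 4 + num_classes + 1
    if len(row) != expected:
        raise ValueError(f"expected {expected} values, got {len(row)}")
    cx, cy, w, h = row[:4]
    scores = row[4:4 + num_classes]
    angle = row[-1]  # the angle is the last element of the row
    # Pick the class with the highest probability.
    class_id = max(range(num_classes), key=lambda i: scores[i])
    return {
        "cx": cx, "cy": cy, "w": w, "h": h,
        "angle": angle,
        "class_id": class_id,
        "confidence": scores[class_id],
    }


# Example row for a made-up 3-class model.
row = [320.0, 240.0, 100.0, 50.0, 0.05, 0.95, 0.10, 0.785]
det = decode_obb_row(row, num_classes=3)
print(det)
```

For the 15-class DOTAv1 configuration, a row under this assumption would hold 20 values.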

⚙️ Configuration

config_infer_primary_yolo11_obb.txt: Example configuration file
Pre-configured for DOTAv1 dataset (15 classes)
Supports both CPU and GPU parsing modes

📚 Documentation

docs/YOLO11-OBB.md: Comprehensive guide covering:
- Model conversion workflow (from .pt to ONNX)
- Compilation instructions with CUDA version mapping
- Configuration setup and parameters
- OBB output format explanation
- Common datasets (DOTAv1, DOTAv1.5, DOTAv2)
Updated README.md: Added YOLO11-OBB to all relevant sections

Technical Details

OBB Format Support

Input: Oriented bounding boxes with angle in radians (0 to π/2)
Output: Axis-aligned bounding boxes that fully enclose rotated objects
Multi-class: Supports configurable number of classes via num-detected-classes
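
The PR does not say what happens to angles outside the stated 0 to π/2 range. If downstream code ever has to handle such angles, one option is to fold them back, relying on the fact that a rectangle rotated by a further quarter turn with its sides swapped is the same rectangle. The sketch below is an assumption-laden illustration in Python (the name `fold_angle` is hypothetical), not code from this PR.

```python
import math


def fold_angle(w, h, theta):
    """Return an equivalent (w, h, theta) with theta in [0, pi/2)."""
    theta = theta % math.pi  # a rectangle is unchanged by a half turn
    if theta >= math.pi:     # guard against rounding for tiny negative inputs
        theta = 0.0
    if theta >= math.pi / 2:
        w, h = h, w          # a quarter turn swaps the two sides
        theta -= math.pi / 2
    return w, h, theta


# Already in range: returned unchanged.
print(fold_angle(100.0, 50.0, 0.3))
# Just past a quarter turn: sides swap and the angle folds back.
print(fold_angle(100.0, 50.0, math.pi / 2 + 0.2))
```

Because the AABB formula uses absolute values of sine and cosine, folding the angle this way does not change the resulting axis-aligned box.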

Performance

GPU-accelerated parsing available via NvDsInferParseYoloOBBCuda
CUDA kernels decode OBB format in parallel
Uses Thrust library for efficient device-host memory operations

Compatibility

Compatible with DeepStream 5.1-8.0
Maintains backward compatibility with existing YOLO parsers
No breaking changes to existing functionality

Use Cases

Suited to applications requiring rotated object detection:

Aerial imagery analysis (DOTAv1/v2 datasets)
Document text detection
Industrial part orientation detection
Vehicle parking angle detection

Testing Checklist

Code compiles without errors
Follows repository naming conventions
Documentation matches existing format
Parser functions follow existing patterns
No breaking changes to existing code

Files Changed

Modified:
- nvdsinfer_custom_impl_Yolo/nvdsparsebbox_Yolo.cpp (Added OBB CPU parser)
- nvdsinfer_custom_impl_Yolo/nvdsparsebbox_Yolo_cuda.cu (Added OBB GPU parser)
- README.md (Updated references)

Added:
- utils/export_yolo11_obb.py (Export script)
- config_infer_primary_yolo11_obb.txt (Configuration file)
- docs/YOLO11-OBB.md (Documentation)

Example Usage

# 1. Export model
python3 export_yolo11_obb.py -w yolo11n-obb.pt --dynamic

# 2. Compile library
export CUDA_VER=12.8
make -C nvdsinfer_custom_impl_Yolo clean && make -C nvdsinfer_custom_impl_Yolo

# 3. Run inference
deepstream-app -c deepstream_app_config.txt

Sample Output Format

Input (OBB):  [x_center=320, y_center=240, width=100, height=50, angle=0.785rad, class_prob=0.95]
Output (AABB): [left≈267, top≈187, width≈106, height≈106, confidence=0.95, classId=0]

✅ Ready for review and testing on DeepStream-enabled systems.

Create comprehensive guidance document for Claude Code and other AI assistants working in this repository.

Includes:
- Build commands and CUDA version configuration
- Architecture overview of custom TensorRT/DeepStream integration
- Model processing pipeline and configuration flow
- Network type mapping for different YOLO variants
- Step-by-step model integration workflow
- Important implementation details (batching, GPU post-processing, etc.)
- Common issues and troubleshooting

This commit adds comprehensive support for YOLOv11-OBB models to DeepStream-Yolo:

**Parser Implementation:**
- Add NvDsInferParseYoloOBB() CPU parser for OBB detection format
- Add NvDsInferParseYoloOBBCuda() GPU-accelerated parser for better performance
- OBB parser converts oriented boxes (center, width, height, angle) to axis-aligned bounding boxes (AABB) that fully enclose the rotated objects
- Supports multi-class OBB models with configurable number of classes

**Model Export:**
- Add export_yolo11_obb.py script to convert YOLOv11-OBB models to ONNX format
- Export script handles the OBB detection head from Ultralytics
- Outputs format: [x_center, y_center, width, height, class_probs..., angle]

**Configuration:**
- Add config_infer_primary_yolo11_obb.txt example configuration
- Configured for DOTAv1 dataset (15 classes) by default
- Supports both CPU and GPU parsing modes

**Documentation:**
- Add comprehensive YOLO11-OBB.md guide covering:
  - Model conversion workflow
  - Compilation instructions
  - Configuration setup
  - OBB output format explanation
  - Common OBB datasets (DOTAv1, DOTAv1.5, DOTAv2)

**Technical Details:**
- OBB angle range: 0 to π/2 radians
- AABB conversion formula: half_width = (w*|cos(θ)| + h*|sin(θ)|)/2
- Compatible with DeepStream 5.1-8.0
- Maintains backward compatibility with existing YOLO parsers

Tested formats: YOLOv11-OBB models trained on rotated object detection datasets

- Add YOLO11-OBB to table of contents
- Add YOLO11-OBB to supported models list
- Add YOLO11-OBB to improvements section

Monishkumarvr · 2026-01-13T12:49:16Z

@marcoslucianops Thanks for this lovely repo. This helped me port Ultralytics models to the DeepStream pipeline on Jetson devices. While working on an OBB model porting project, I wrote this parser and thought it could be useful for many devs like me. This is my first open source contribution too. Looking forward to your feedback.

neilyoung · 2026-03-17T17:36:45Z

@Monishkumarvr Thanks for the PR, this is really interesting work.

I tested it in a DeepStream 7 based integration and it seems the OBB model/export/parser path works in the sense that the engine builds and detections are produced. However, on the application side I still only receive regular axis-aligned bounding boxes (left/top/width/height) and not the oriented box geometry itself.

From what I can tell, the OBB information seems to be reduced to a standard bounding rectangle before it reaches the downstream application / OSD layer.

Is that the intended behavior of this PR?
If yes, could you explain why the parser converts OBB results into axis-aligned boxes instead of exposing angle / corner points as metadata?

Is it perhaps because the custom parser is only allowed to return NvDsInferParseObjectInfo instances?

Address reviewer feedback about OBB angle information not being accessible in standard DeepStream metadata. The parser correctly returns AABB (axis-aligned bounding boxes) as required by the NvDsInferParseObjectInfo API constraint.

Changes:
- docs/YOLO11-OBB.md: Add "OBB Geometry and DeepStream Metadata" section
  * Explain why AABB is returned (DeepStream API constraint)
  * Document output-tensor-meta=1 workaround for accessing full OBB geometry
  * Provide C++ pad probe example for reading raw tensor with angle data
  * Link to DeepStream Python Apps for Python examples
- config_infer_primary_yolo11_obb.txt: Add commented output-tensor-meta=1 with usage instructions

The angle information is NOT lost - it's accessible via NvDsInferTensorMeta when output-tensor-meta=1 is enabled. This gives users two paths:
1. Standard: OBB → AABB (seamless DeepStream integration)
2. Advanced: OBB → AABB + raw tensor (custom angle-aware post-processing)

Refs: GitHub PR reviewer comment about missing OBB angle downstream

Monishkumarvr · 2026-03-18T08:00:52Z

@neilyoung Thank you for the detailed testing and excellent diagnosis — you're exactly right!

Why AABB is returned

The DeepStream inference API requires the custom bbox parser to fill a std::vector<NvDsInferParseObjectInfo>, whose elements look like this (simplified):

struct NvDsInferParseObjectInfo {
  unsigned int classId;
  float left, top, width, height;  // AABB only
  float detectionConfidence;
  // No field for an angle or corner points
};

There's no mechanism in this interface to attach additional geometry like angle or corner points. The parser must return AABB because that's what DeepStream's NMS, OSD, and tracker expect.

The AABB is computed using the tightest-fit formula to fully enclose the rotated box:

half_w = (obb_w × |cos θ| + obb_h × |sin θ|) / 2
half_h = (obb_w × |sin θ| + obb_h × |cos θ|) / 2

Accessing the full OBB angle downstream

The angle information is not lost — it's available via DeepStream's raw tensor metadata:

1. Set output-tensor-meta=1 in the infer config
2. Write a GStreamer pad probe to read NvDsInferTensorMeta from the buffer
3. The raw output tensor contains [x, y, w, h, class_probs..., angle] for each detection

I've updated the documentation with a full explanation and example probe structure:

See the new section "OBB Geometry and DeepStream Metadata" in [docs/YOLO11-OBB.md](https://github.com/Monishkumarvr/DeepStream-Yolo/blob/claude/add-obb-parser-yolo-Ai1YY/docs/YOLO11-OBB.md#obb-geometry-and-deepstream-metadata)
The config file now includes a commented output-tensor-meta=1 line with usage notes

This gives users two paths:

1. Standard: OBB → AABB (works seamlessly with all DeepStream components)
2. Advanced: OBB → AABB + raw tensor (enables custom angle-aware post-processing)

Does this address your concern?

neilyoung · 2026-03-18T09:08:50Z

> Accessing the full OBB angle downstream
>
> The angle information is not lost - it's available via DeepStream's raw tensor metadata:
>
> 1. Set output-tensor-meta=1 in the infer config
> 2. Write a GStreamer pad probe to read NvDsInferTensorMeta from the buffer
> 3. The raw output tensor contains [x, y, w, h, class_probs..., angle] for each detection

@Monishkumarvr Yes, that is exactly how I'm doing it now, including drawing the overlays, which are no longer rects but quadrilaterals constructed from the rotated rect. It still looks a bit weird, since the perspective is lost (think of a 45-degree drone shot), but at least the direction of the box is correct. I'm currently looking for an easy way to calculate the "skew" caused by perspective, but I guess this will need some CV stuff on top.

I'm doing it in Go using Go GST, which works very reliably.
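
For readers who want to reproduce the quadrilateral overlay described above, here is a minimal sketch of how the four corners can be derived from a rotated rect. It is written in Python for illustration, not in Go, and it is not the commenter's code. The function name `obb_corners` is hypothetical, and the angle is assumed to be measured from the image x axis toward the image y axis.

```python
import math


def obb_corners(cx, cy, w, h, theta):
    """Return the four corners of a rotated rectangle as (x, y) tuples."""
    c, s = math.cos(theta), math.sin(theta)
    # Half-extent vectors along the box's own width and height axes.
    wx, wy = (w / 2.0) * c, (w / 2.0) * s
    hx, hy = -(h / 2.0) * s, (h / 2.0) * c
    return [
        (cx - wx - hx, cy - wy - hy),
        (cx + wx - hx, cy + wy - hy),
        (cx + wx + hx, cy + wy + hy),
        (cx - wx + hx, cy - wy + hy),
    ]


# With no rotation the corners are those of the plain rectangle.
print(obb_corners(10.0, 10.0, 4.0, 2.0, 0.0))
```

The corners come out in consecutive order around the box, so they can be passed directly to a polygon or line-drawing routine. Correcting for perspective skew is a separate problem that this sketch does not address.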

What I noticed is that the OBB models have extreme difficulty with closeups: detections drop to nearly zero, and sometimes "driving swimming pools" have been spotted :). I think the primary use case of these models is counting, space determination, and observation in mostly straight orthogonal drone shots (at least this is what they are showing in their videos).

Thanks for the extra comment.

Monishkumarvr · 2026-03-18T09:46:36Z

Thank you for testing and sharing your Go implementation experience! It's great to hear the tensor metadata approach is working well for you, and your quadrilateral overlay solution sounds perfect for directional visualization.

Regarding your perspective/closeup observations — you're absolutely right. OBB models excel at aerial/orthogonal views (counting, parking lot analysis) but struggle with extreme perspectives and closeups.

My use case: Foundry moulding inspection

I'm using this for moulding box detection in metal casting operations. The critical challenge is maintaining stable tracking during the molten metal pouring cycle, when visual conditions degrade significantly (steam, smoke, lighting changes, vibration).

Freezing technique implementation:

1. Initial detection phase: OBB detects mould boxes with full geometry (position, dimensions, angle)
2. Pouring cycle trigger: When pouring starts, I "freeze" the detected bounding boxes by:
   - Caching the OBB tensor metadata (output-tensor-meta=1)
   - Locking the box positions/angles in the tracking layer
   - Mechanically stabilizing the reference frame in software
3. During pouring: Boxes remain frozen even if detection confidence drops due to visual noise
4. Post-transfer: Unfreeze and resume live detection

The combination of OBB geometry + tensor metadata + software position locking has been essential for production-grade reliability in this harsh visual environment.
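
The freezing steps above can be pictured with a small Python sketch. The actual implementation is not shown in this thread, so everything here is an assumption: the class name `BoxFreezer`, its methods, and the representation of each box as a `(cx, cy, w, h, angle)` tuple are all hypothetical.

```python
class BoxFreezer:
    """Holds the last live OBBs and replays them while frozen."""

    def __init__(self):
        self._frozen = False
        self._boxes = []

    def freeze(self):
        """Stop accepting new detections, e.g. when pouring starts."""
        self._frozen = True

    def unfreeze(self):
        """Resume live detection."""
        self._frozen = False

    def update(self, detections):
        """Feed the current frame's detections; return the boxes to use."""
        if not self._frozen:
            self._boxes = list(detections)
        return list(self._boxes)


freezer = BoxFreezer()
freezer.update([(320.0, 240.0, 100.0, 50.0, 0.785)])  # initial detection phase
freezer.freeze()                                      # pouring cycle trigger
print(freezer.update([]))                             # detections lost during pouring
freezer.unfreeze()                                    # pouring finished
print(freezer.update([(322.0, 241.0, 100.0, 50.0, 0.78)]))
```

While frozen, `update` ignores its input and returns the cached boxes, so a frame with no detections still yields the geometry captured before pouring began.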

claude and others added 4 commits January 13, 2026 11:04

Delete CLAUDE.md

34d2738

Update README.md to include YOLO11-OBB references

c134bab

Merge branch 'master' into claude/add-obb-parser-yolo-Ai1YY

345dd3f

Monishkumarvr wants to merge 6 commits into marcoslucianops:master from Monishkumarvr:claude/add-obb-parser-yolo-Ai1YY
