YOLOv11: A New Iteration of “You Only Look Once”

YOLO (You Only Look Once) is a cutting-edge object-detection algorithm first introduced in a 2015 research paper by Joseph Redmon et al. In the realm of real-time object detection, the YOLO architecture represented a significant advancement over earlier two-stage approaches such as the Region-based Convolutional Neural Network (R-CNN).

Rather than scanning region proposals, YOLO takes the entire image as input and, in a single pass through one neural network, predicts bounding boxes and class probabilities. In this post, we will delve into YOLOv11, the most recent iteration developed by Ultralytics.

About us: Viso Suite is an End-to-End Computer Vision Infrastructure that offers all the necessary tools to train, build, deploy, and manage computer vision applications at scale. By combining accuracy, reliability, and lower total cost of ownership, Viso Suite is ideal for multi-use case, multi-location deployments. To explore enterprise-grade computer vision infrastructure, schedule a demo of Viso Suite with our team of experts.

Viso Suite is the end-to-end enterprise computer vision solution.

What is YOLOv11?

YOLOv11 is the latest iteration of YOLO, an advanced real-time object detection system. The YOLO family enters a new era with YOLOv11, a more capable and adaptable model that pushes the boundaries of computer vision.

The model supports various computer vision tasks such as pose estimation and instance segmentation. The computer vision community that has utilized previous YOLO versions will appreciate YOLOv11 for its improved efficiency and optimized architecture.

Glenn Jocher, founder and CEO of Ultralytics, stated: “With YOLOv11, we aimed to create a model that combines power and practicality for real-world applications. Its enhanced accuracy and efficiency make it a versatile tool tailored to address specific challenges across different sectors.”

Crowd counting with YOLOv11
Supported Tasks

Ultralytics’ YOLOv11 is a widely used tool among developers and researchers due to its innovative architecture. The computer vision community can leverage YOLOv11 to create innovative solutions and advanced models. It enables a variety of computer vision tasks, including:

  • Object Detection
  • Instance Segmentation
  • Pose Estimation
  • Oriented Object Detection (OBB)
  • Classification
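
Each of these tasks maps to its own pretrained checkpoint in the Ultralytics package. The short sketch below assumes the documented Ultralytics naming scheme for the nano-sized YOLOv11 weights (the -seg, -pose, -obb, and -cls suffixes); larger variants (s, m, l, x) follow the same pattern.

from ultralytics import YOLO
# Each task has its own pretrained weights in the Ultralytics release
detector = YOLO("yolo11n.pt")        # object detection
segmenter = YOLO("yolo11n-seg.pt")   # instance segmentation
pose_model = YOLO("yolo11n-pose.pt") # pose estimation
obb_model = YOLO("yolo11n-obb.pt")   # oriented object detection
classifier = YOLO("yolo11n-cls.pt")  # image classification
# Any of these models can be called directly on an image path
results = segmenter("path/to/image.jpg")
results[0].show()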

Some of the key enhancements include improved feature extraction, more accurate detail capture, higher accuracy with fewer parameters, and faster processing rates that significantly enhance real-time performance.

An Overview of YOLO Models

Here is a summary of the YOLO family of models up to YOLOv11.

Model | Release | Authors | Tasks | Paper
YOLO | 2015 | Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi | Object Detection, Basic Classification | You Only Look Once: Unified, Real-Time Object Detection
YOLOv2 | 2016 | Joseph Redmon, Ali Farhadi | Object Detection, Improved Classification | YOLO9000: Better, Faster, Stronger
YOLOv3 | 2018 | Joseph Redmon, Ali Farhadi | Object Detection, Multi-scale Detection | YOLOv3: An Incremental Improvement
YOLOv4 | 2020 | Alexey Bochkovskiy, Chien-Yao Wang, Hong-Yuan Mark Liao | Object Detection, Basic Object Tracking | YOLOv4: Optimal Speed and Accuracy of Object Detection
YOLOv5 | 2020 | Ultralytics | Object Detection, Basic Instance Segmentation (via custom modifications) | No official paper
YOLOv6 | 2022 | Chuyi Li et al. | Object Detection, Instance Segmentation | YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications
YOLOv7 | 2022 | Chien-Yao Wang, Alexey Bochkovskiy, Hong-Yuan Mark Liao | Object Detection, Object Tracking, Instance Segmentation | YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
YOLOv8 | 2023 | Ultralytics | Object Detection, Instance Segmentation, Panoptic Segmentation, Keypoint Estimation | No official paper
YOLOv9 | 2024 | Chien-Yao Wang, I-Hau Yeh, Hong-Yuan Mark Liao | Object Detection, Instance Segmentation | YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
YOLOv10 | 2024 | Ao Wang, Hui Chen, Lihao Liu, Kai Chen, Zijia Lin, Jungong Han, Guiguang Ding | Object Detection | YOLOv10: Real-Time End-to-End Object Detection

Key Advantages of YOLOv11

YOLOv11 represents an improvement over YOLOv9 and YOLOv10, both released earlier in 2024. It features enhanced architectural designs, more effective feature extraction algorithms, and improved training methods. The unique combination of YOLOv11’s speed, accuracy, and efficiency sets it apart as one of Ultralytics’ most powerful models to date.

YOLOv11 boasts an improved design that enables more precise detection of intricate details – even in challenging scenarios. It also offers enhanced feature extraction, allowing for the extraction of multiple patterns and details from images.

Compared to its predecessors, Ultralytics’ YOLOv11 introduces several significant improvements. Key advancements include:

YOLOv11 model performance compared to its predecessors
  • Better accuracy with fewer parameters: YOLOv11m is more computationally efficient without sacrificing accuracy. It achieves greater mean Average Precision (mAP) on the COCO dataset with 22% fewer parameters than YOLOv8m.
  • Wide variety of tasks supported: YOLOv11 handles a wide range of computer vision tasks, including object detection, instance segmentation, image classification, pose estimation, and oriented object detection (OBB).
  • Improved speed and efficiency: faster processing rates come from refined architectural designs and training pipelines that balance accuracy and speed.
  • Fewer parameters: a leaner model runs faster without significantly affecting YOLOv11’s accuracy; these figures can be checked directly, as shown in the sketch after this list.
  • Improved feature extraction: YOLOv11 has a better neck and backbone architecture to improve feature extraction capabilities, which leads to more accurate object detection.
  • Adaptability across contexts: YOLOv11 is adaptable to a wide range of contexts, such as cloud platforms, edge devices, and systems that are compatible with NVIDIA GPUs.
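
To sanity-check the parameter and efficiency claims, and to prepare a model for the edge and GPU contexts mentioned above, a minimal sketch is shown below. It assumes the standard Ultralytics info() and export() methods; ONNX is just one example export target.

from ultralytics import YOLO
# Print layer count, parameter count, and GFLOPs to compare the medium models
YOLO("yolo11m.pt").info()
YOLO("yolov8m.pt").info()
# Export a model to ONNX for edge or GPU-accelerated runtimes
model = YOLO("yolo11n.pt")
model.export(format="onnx")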

YOLOv11 – How to Use It?

As of October 10, 2024, Ultralytics has not yet published the YOLOv11 paper or its architecture diagram, but there is sufficient documentation available on GitHub to get started. The model is less resource-intensive than its predecessors while still handling complex tasks, which makes it a strong choice for demanding, large-scale AI projects.

The training process has improvements to the augmentation pipeline, making it easier for YOLOv11 to adapt to various tasks – whether small projects or large-scale applications. To start using YOLOv11, install the latest version of the Ultralytics package:

pip install "ultralytics>=8.3.0"
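
Since the training and augmentation pipeline is one of the advertised improvements, a minimal fine-tuning sketch is shown below. It assumes the standard Ultralytics train() and val() interface; coco8.yaml is the small sample dataset bundled with the package, and the epoch count and image size are placeholder values.

from ultralytics import YOLO
# Start from pretrained detection weights and fine-tune on a YOLO-format dataset
model = YOLO("yolo11n.pt")
model.train(data="coco8.yaml", epochs=100, imgsz=640)
# Evaluate the fine-tuned weights on the validation split
metrics = model.val()
print(metrics.box.map)  # mAP50-95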

You can utilize YOLOv11 for real-time object detection and other computer vision applications with just a few lines of code. Use this code to load a pre-trained YOLOv11 model and perform inference on an image:

from ultralytics import YOLO
# Load the YOLO11 model
model = YOLO("yolo11n.pt")
# Run inference on an image
results = model("path/to/image.jpg")
# Display results
results[0].show()
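
The returned Results object also exposes the raw detections programmatically. The attribute names below follow the documented Ultralytics Results API (boxes, conf, cls, names, xyxy); treat this as an illustrative sketch rather than a complete reference.

# Inspect the detections from the result above
result = results[0]
for box in result.boxes:
    cls_id = int(box.cls[0])               # class index
    conf = float(box.conf[0])              # confidence score
    x1, y1, x2, y2 = box.xyxy[0].tolist()  # corner coordinates
    print(f"{result.names[cls_id]}: {conf:.2f} at ({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")
# Save an annotated copy of the image to disk
result.save(filename="annotated.jpg")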

YOLOv11 for person detection on construction sites