YOLOv11: A New Iteration of "You Only Look Once"

YOLO (You Only Look Once) is a real-time object-detection algorithm first introduced in a 2015 research paper by Joseph Redmon et al. For real-time object detection, the original YOLO architecture was a significant advance over earlier detectors such as the Region-based Convolutional Neural Network (R-CNN).

YOLO takes an entire image as input and, in a single pass through a single neural network, predicts bounding boxes and class probabilities. In this post, we look at YOLOv11, the most recent iteration developed by Ultralytics.
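
To make the single-pass idea concrete, the sketch below computes the size of the one output tensor that the original 2015 YOLO network emits per image. It assumes the configuration reported in the original paper (a 7×7 grid, 2 boxes per cell, 20 classes). The function name is hypothetical, and later versions, including YOLOv11, use different prediction heads, so treat this as an illustration of the idea rather than a description of YOLOv11.

```python
def yolo_v1_output_shape(grid_size=7, boxes_per_cell=2, num_classes=20):
    """Shape of the single prediction tensor in the original YOLO design.

    Each grid cell predicts `boxes_per_cell` boxes (x, y, w, h, confidence)
    plus one shared set of class probabilities.
    """
    values_per_box = 5  # x, y, w, h, confidence
    per_cell = boxes_per_cell * values_per_box + num_classes
    return (grid_size, grid_size, per_cell)


print(yolo_v1_output_shape())  # (7, 7, 30)
```

Because boxes and class scores for the whole image come out of one tensor, no separate region-proposal stage is needed.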

What is YOLOv11?

YOLOv11, which Ultralytics styles as YOLO11, is the latest iteration of YOLO, a real-time object detection system. Ultralytics positions it as a more capable and adaptable model than earlier members of the YOLO family.

Beyond object detection, the model supports computer vision tasks such as pose estimation and instance segmentation. Users of previous YOLO versions will find that YOLOv11 offers improved efficiency and an optimized architecture.

Glenn Jocher, founder and CEO of Ultralytics, stated: “With YOLOv11, we aimed to create a model that combines power and practicality for real-world applications. Its enhanced accuracy and efficiency make it a versatile tool tailored to address specific challenges across different sectors.”

Figure: Crowd counting with YOLOv11

Supported Tasks

Ultralytics’ YOLOv11 is aimed at developers and researchers who want to build computer vision solutions on a single framework. It supports a variety of computer vision tasks, including:

Object Detection
Instance Segmentation
Pose Estimation
Oriented Object Detection (OBB)
Image Classification

Some of the key enhancements include improved feature extraction, more accurate detail capture, higher accuracy with fewer parameters, and faster processing rates that significantly enhance real-time performance.
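
Each of the tasks listed above has its own pre-trained checkpoint in the Ultralytics package. The following is a minimal sketch of how a task could be mapped to a checkpoint file name. The helper names `weights_name` and `load_model` are hypothetical; the suffixes follow the naming of the published YOLO11 weights (for example `yolo11n-seg.pt`), and `load_model` assumes the `ultralytics` package is installed.

```python
# Suffixes used in the published YOLO11 checkpoint names.
TASK_SUFFIX = {
    "detect": "",        # object detection
    "segment": "-seg",   # instance segmentation
    "pose": "-pose",     # pose estimation
    "obb": "-obb",       # oriented object detection
    "classify": "-cls",  # image classification
}

MODEL_SIZES = ("n", "s", "m", "l", "x")


def weights_name(task: str, size: str = "n") -> str:
    """Build a checkpoint file name such as 'yolo11n-seg.pt'."""
    if task not in TASK_SUFFIX:
        raise ValueError(f"unknown task: {task!r}")
    if size not in MODEL_SIZES:
        raise ValueError(f"unknown model size: {size!r}")
    return f"yolo11{size}{TASK_SUFFIX[task]}.pt"


def load_model(task: str, size: str = "n"):
    """Load the pre-trained model for a task (downloads weights on first use)."""
    from ultralytics import YOLO  # imported lazily; requires ultralytics

    return YOLO(weights_name(task, size))


print(weights_name("segment"))      # yolo11n-seg.pt
print(weights_name("detect", "m"))  # yolo11m.pt
```

Calling `load_model("pose")` would then return a model that is used in the same way as the detection model, only with pose-specific outputs.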

An Overview of YOLO Models

Here is a summary of the YOLO family of models up to YOLOv11.

Model	Release	Authors	Tasks	Paper
YOLO	2015	Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi	Object Detection, Basic Classification	You Only Look Once: Unified, Real-Time Object Detection
YOLOv2	2016	Joseph Redmon, Ali Farhadi	Object Detection, Improved Classification	YOLO9000: Better, Faster, Stronger
YOLOv3	2018	Joseph Redmon, Ali Farhadi	Object Detection, Multi-scale Detection	YOLOv3: An Incremental Improvement
YOLOv4	2020	Alexey Bochkovskiy, Chien-Yao Wang, Hong-Yuan Mark Liao	Object Detection, Basic Object Tracking	YOLOv4: Optimal Speed and Accuracy of Object Detection
YOLOv5	2020	Ultralytics	Object Detection, Basic Instance Segmentation (via custom modifications)	No paper published
YOLOv6	2022	Chuyi Li, et al.	Object Detection, Instance Segmentation	YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications
YOLOv7	2022	Chien-Yao Wang, Alexey Bochkovskiy, Hong-Yuan Mark Liao	Object Detection, Object Tracking, Instance Segmentation	YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
YOLOv8	2023	Ultralytics	Object Detection, Instance Segmentation, Keypoint Estimation, Classification	No paper published
YOLOv9	2024	Chien-Yao Wang, I-Hau Yeh, Hong-Yuan Mark Liao	Object Detection, Instance Segmentation	YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
YOLOv10	2024	Ao Wang, Hui Chen, Lihao Liu, Kai Chen, Zijia Lin, Jungong Han, Guiguang Ding	Object Detection	YOLOv10: Real-Time End-to-End Object Detection

Key Advantages of YOLOv11

YOLOv11 represents an improvement over YOLOv9 and YOLOv10, both released earlier in 2024. It features enhanced architectural designs, more effective feature extraction algorithms, and improved training methods. The unique combination of YOLOv11’s speed, accuracy, and efficiency sets it apart as one of Ultralytics’ most powerful models to date.

YOLOv11’s improved design enables more precise detection of fine details, even in challenging scenarios. Its enhanced feature extraction also captures a wider range of patterns and details from images.

Compared to its predecessors, Ultralytics’ YOLOv11 introduces several significant improvements. Key advancements include:

Figure: YOLOv11 model performance compared to its predecessors

Better accuracy with fewer parameters: YOLOv11m is more computationally efficient without sacrificing accuracy. It achieves a higher mean Average Precision (mAP) on the COCO dataset with 22% fewer parameters than YOLOv8m.
Wide variety of tasks supported: YOLOv11 can perform a wide range of CV tasks, including pose estimation, object detection, image classification, instance segmentation, and oriented object detection (OBB).
Improved speed and efficiency: Improved architectural designs and training pipelines deliver faster processing while balancing accuracy and performance.
Fewer parameters: A smaller parameter count makes the models faster without significantly affecting YOLOv11’s accuracy.
Improved feature extraction: YOLOv11 has an improved backbone and neck architecture, which strengthens feature extraction and leads to more accurate object detection.
Adaptability across contexts: YOLOv11 can be deployed in a wide range of environments, such as cloud platforms, edge devices, and systems with NVIDIA GPUs.
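
The "22% fewer parameters" figure in the list above can be checked with simple arithmetic. The sketch below assumes parameter counts of roughly 25.9 million for YOLOv8m and 20.1 million for YOLOv11m; these numbers are not given in this article and are taken as assumptions from the model tables Ultralytics publishes, so verify them against the current documentation.

```python
def relative_reduction(old: float, new: float) -> float:
    """Percentage by which `new` is smaller than `old`."""
    return (old - new) / old * 100


# Assumed parameter counts in millions (not stated in this article).
YOLOV8M_PARAMS_M = 25.9
YOLO11M_PARAMS_M = 20.1

reduction = relative_reduction(YOLOV8M_PARAMS_M, YOLO11M_PARAMS_M)
print(f"{reduction:.1f}% fewer parameters")  # 22.4% fewer parameters
```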

YOLOv11 – How to Use It?

As of October 10, 2024, Ultralytics has not published a YOLOv11 paper or an official architecture diagram. However, the documentation available on GitHub is sufficient to get started. According to Ultralytics, the model is less resource-intensive than its predecessors while still handling complex tasks, which makes it a reasonable choice for demanding, large-scale AI projects.

The training process includes an improved augmentation pipeline, which makes it easier for YOLOv11 to adapt to various tasks, from small projects to large-scale applications. To start using YOLOv11, install the latest version of the Ultralytics package:

pip install "ultralytics>=8.3.0"

With just a few lines of code, you can use YOLOv11 for real-time object detection and other computer vision applications. The following code loads a pre-trained YOLOv11 model and runs inference on an image:

from ultralytics import YOLO

# Load the YOLO11 model
model = YOLO("yolo11n.pt")

# Run inference on an image
results = model("path/to/image.jpg")

# Display results
results[0].show()
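
Beyond displaying the result, the detections can also be read programmatically. The sketch below counts detections per class, which is the basis of a simple crowd counter. The helpers `count_by_class` and `count_in_image` are hypothetical names introduced here. The second one assumes a detection model and relies on the `boxes.cls`, `boxes.conf`, and `names` attributes of an Ultralytics result object.

```python
from collections import Counter


def count_by_class(class_ids, confidences, names, min_conf=0.25):
    """Count detections per class name, ignoring low-confidence boxes."""
    counts = Counter()
    for class_id, conf in zip(class_ids, confidences):
        if conf >= min_conf:
            counts[names[int(class_id)]] += 1
    return dict(counts)


def count_in_image(image_path, weights="yolo11n.pt", min_conf=0.25):
    """Run a detection model on one image and count objects per class."""
    from ultralytics import YOLO  # imported lazily; requires ultralytics

    result = YOLO(weights)(image_path)[0]
    return count_by_class(
        result.boxes.cls.tolist(),   # class index of each box
        result.boxes.conf.tolist(),  # confidence of each box
        result.names,                # mapping from class index to name
        min_conf,
    )
```

For example, `count_in_image("path/to/image.jpg").get("person", 0)` would give the number of people detected in the image.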
