Medical-Pills Detection Using YOLOv11: A Proof-of-Concept Study for Pharmaceutical Automation

Research Article

Medical-Pills Detection Using YOLOv11: A Proof-of-Concept Study for Pharmaceutical Automation

Seung Jin Kim ^1*
Dong Sung Cho ²

1AI Convergence Engineering, Assist University, Seoul, Korea.

2AI Technology Management, Assist University, Seoul, Korea.

*Corresponding Author: Seung Jin Kim, AI Convergence Engineering, Assist University, Seoul, Korea.

Citation: Kim SJ, Cho DS. (2025). Medical-Pills Detection Using YOLOv11: A Proof-of-Concept Study for Pharmaceutical Automation, Clinical Case Reports and Studies, BioRes Scientia Publishers. 10(2):1-11. DOI: 10.59657/2837-2565.brs.25.252

Copyright: © 2025 Seung Jin Kim, this is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Received: April 09, 2025 | Accepted: April 25, 2025 | Published: April 30, 2025

Abstract

The pharmaceutical industry is tasked with ensuring the production and distribution of medications that meet stringent quality and safety standards, yet it grapples with significant challenges in automating critical processes such as quality control, pill sorting, and inventory management. These challenges arise from the inherent complexity of identifying medical pills, which vary widely in shape, size, color, and imprints, often requiring meticulous human intervention that is both time-consuming and error-prone. This study introduces a proof-of-concept (POC) dataset and a cutting-edge YOLOv11-based computer vision model tailored for medical-pills detection, with the overarching goal of advancing automation within pharmaceutical workflows. The dataset comprises 115 meticulously labeled images, split into 92 training and 23 validation samples, and serves as the foundation for training our YOLOv11 model, which achieves an exceptional mean Average Precision (mAP@0.5) of 0.995. We rigorously evaluate the model’s performance using a suite of analytical tools, including precision-recall curves, F1-confidence curves, confusion matrices, and bounding box visualizations, providing a comprehensive assessment of its capabilities. The results underscore the transformative potential of AI-driven solutions in pharmaceutical applications, such as automated sorting, defect detection, counterfeit identification, and real-time inventory tracking. However, we also acknowledge limitations, such as the dataset’s modest size and the controlled conditions of our experiments, which temper the generalizability of our findings. This work establishes a foundational resource for re- searchers and industry practitioners, offering both a dataset and a high-performing model to catalyze the development of scalable, efficient systems for healthcare automation, while transparently outlining areas for future improvement.

Keywords: medical-pills detection; computer vision; YOLOv11; pharmaceutical automation; object detection; artificial intelligence in healthcare

Introduction

The pharmaceutical industry stands as a pillar of modern healthcare, tasked with the monumental responsibility of producing and distributing medications that address a vast spectrum of medical conditions worldwide. This sector operates under intense scrutiny, where precision, reliability, and safety are non-negotiable, given that even minor errors can have profound consequences for patient health. Despite its critical role, many foundational processes within the industry-such as pill identification, sorting, quality assurance, and inventory management-rely heavily on manual labor. These manual approaches, while historically effective, introduce inefficiencies and vulnerabilities. For instance, human workers sorting pills may inadvertently mix different types or dosages, leading to mispackaging that could result in patients receiving incorrect medications. Similarly, visual inspections for quality control might overlook subtle defects, such as cracks or discoloration, which could compromise a pill’s efficacy or safety. These challenges are compounded by the sheer volume of production in modern pharmaceutical facilities, where millions of pills are manufactured daily, making manual oversight increasingly untenable.

The advent of automation technologies, particularly those powered by artificial intelligence (AI), presents a compelling opportunity to revolutionize these processes. Among the most promising tools in this domain is computer vision, a subfield of AI that enables machines to interpret and analyze visual data with a level of precision and consistency that often surpasses human capabilities, especially in repetitive or high- volume tasks. In the pharmaceutical context, computer vision can be harnessed to examine images of pills, identifying their presence, classifying their type, and detecting anomalies based on visual features such as shape, size, color, and imprinted markings. This capability unlocks a wide array of applications: automated sorting systems could categorize pills by type or dosage with minimal error; quality control mechanisms could flag defective units in real time; counterfeit detection algorithms could compare pills against authenticated standards to identify fakes; and digital inventory systems could track stock levels through continuous visual monitoring. By integrating such technologies, the industry could reduce labor costs, enhance operational efficiency, and, most importantly, bolster patient safety by minimizing the risk of errors.

This study takes a significant step toward realizing this vision by introducing the “Medical-Pills Detection” dataset, a proof-of-concept (POC) collection designed specifically to train and evaluate computer vision models for pill identification. The dataset consists of 115 high-resolution images, meticulously labeled with bounding boxes to denote the locations of pills, and is divided into 92 images for training and 23 for validation. These images capture pills in blister packs-a common pharmaceutical packaging format—under controlled conditions, providing a clear and consistent baseline for model development. To leverage this dataset, we employ YOLOv11, the latest iteration of the You Only Look Once (YOLO) object detection framework, renowned for its speed and accuracy in real-time applications. After training, our YOLOv11 model achieves an outstanding mean Average Precision (mAP@0.5) of 0.995, a metric that reflects its ability to accurately detect pills with high precision and recall. We assess the model’s performance through an extensive evaluation suite, including precision-recall curves that illustrate the trade-off between detection accuracy and completeness, F1-confidence curves that balance precision and recall across confidence thresholds, confusion matrices that detail classification outcomes, and bounding box visualizations that offer a qualitative view of detection accuracy. Together, these analyses affirm the model’s potential as a robust tool for pharmaceutical automation.

The objectives of this study are multifaceted and deliberately structured to address both immediate and long-term needs in the field:

Dataset Development: To curate and disseminate a publicly accessible dataset tailored for medical-pills detection, filling a critical gap in available resources and enabling other researchers to build upon our work.

Model Validation: To develop, train, and rigorously evaluate a computer vision model capable of identifying pills with high accuracy, establishing a benchmark for future efforts in this domain.

Application Exploration: To investigate the broader implications of AI-driven automation in pharmaceutical processes, identifying practical applications and outlining a roadmap for future research and deployment.

The motivation for this work is rooted in the pressing need for precision and efficiency in pharmaceutical production, a need that has grown more acute as global demand for medications continues to rise. Manual processes, while reliable in small-scale settings, struggle to keep pace with modern production volumes and the complexity of pill diversity. Moreover, the scarcity of automated solutions tailored to pill-related tasks-coupled with the lack of accessible datasets—has hindered progress in this area. By providing a POC dataset and a high-performing YOLOv11 model, this study aims to catalyze innovation, offering a starting point for researchers and developers to create scalable systems that can integrate seamlessly into pharmaceutical workflows. While the results are highly promising, we emphasize that this is a controlled study, and limitations such as the dataset’s modest size and the idealized conditions under which it was collected must be considered when interpreting our findings. These limitations are not obstacles but rather guideposts for future work, ensuring that subsequent efforts build on a realistic understanding of the current state of the art.

The structure of this paper is designed to provide a comprehensive narrative of our research process and findings. Section II offers an exhaustive review of prior work in pill detection and computer vision, contextualizing our contributions within the broader field. Section III delves into the technical details of our dataset, model architecture, training and testing setups, evaluation metrics, and ethical considerations, providing a transparent account of our methodology. Section IV presents a detailed analysis of our quantitative and qualitative results, enriched with visualizations and comparisons to prior models. Section V explores the implications of our findings, articulates the study’s limitations, and proposes detailed future directions to extend this work. Finally, Section VI synthesizes our contributions and offers actionable recommendations for advancing pharmaceutical automation. Through this structure, we aim to deliver a thorough, accessible, and forward-looking contribution to the field.

Related Work

The intersection of computer vision and pharmaceutical automation has witnessed considerable evolution over the past few decades, transitioning from rudimentary image processing techniques to sophisticated deep learning methodologies. This section provides a detailed survey of prior efforts in pill detection, categorizing them into traditional approaches, deep learning advancements, and dataset developments, while identifying persistent gaps that our study seeks to address.

Traditional Approaches to Pill Detection

In the early days of computer vision applications in pharmaceuticals, researchers relied on traditional image processing techniques to tackle pill identification. These methods were grounded in hand-crafted feature extraction and rule-based algorithms, designed to detect and classify pills based on predefined visual characteristics. For example, Smith et al. [3] developed a system that utilized Sobel edge detection to outline pill boundaries and template matching to compare these outlines against a library of known pill shapes and sizes. Their approach achieved reasonable accuracy in controlled environments-such as well-lit studios with standardized pill orientations-but faltered when confronted with real-world variables like inconsistent lighting, shadows, or rotated pills. The reliance on manually defined rules meant that any deviation from the expected conditions required extensive recalibration, limiting scalability and adaptability.

Another notable effort came from Johnson et al. [4], who advanced traditional methods by incorporating color histograms and texture analysis. Their system segmented pills from backgrounds using color thresholding, then employed a decision tree to classify them based on texture features derived from gray-level co-occurrence matrices. While this approach improved robustness over edge-based methods, it still faced challenges with generalization. Pills with similar colors or textures could be misclassified, and the system’s performance degraded under variable lighting conditions or when pills overlapped in images. These traditional methods, though pioneering, underscored a critical limitation: their dependence on human-engineered features made them brittle in dynamic settings, prompting a shift toward more flexible, data-driven techniques.

Deep Learning Advancements in Pill Detection

The introduction of deep learning, particularly convolutional neural networks (CNNs), marked a paradigm shift in pill detection, enabling models to learn complex visual features directly from raw image data without manual intervention. This adaptability has proven transformative, especially as computational power and data availability have increased. One early deep learning effort by Lee et al. [5] employed a CNN to classify pills based on color and shape features extracted from a dataset of 500 images. Their model achieved an accuracy of 85%, a significant improvement over traditional methods, demonstrating the potential of neural networks to handle variability in pill appearance. However, their focus was on classification—assigning a category to a pre-segmented pill-rather than detection, which involves both localizing and classifying objects within an image. This limitation restricted its utility in scenarios requiring spatial awareness, such as identifying multiple pills in a single frame.

To address detection needs, object detection frameworks like YOLO and Faster R-CNN have gained prominence. Zhang et al. [6] applied YOLOv3, an earlier iteration of the YOLO family, to pill detection in a controlled pharmaceutical setting. Their model achieved a mean Average Precision (mAP) of 0.90, leveraging YOLOv3’s single-stage architecture to process images quickly and detect multiple pills simultaneously. This speed made it suitable for real-time applications, though its accuracy was constrained by the model’s simpler feature extraction compared to newer versions. Concurrently, Ren et al. [2] utilized Faster R-CNN, a two-stage detector that first proposes regions of interest and then classifies them, achieving a mAP of 0.87 on a custom pill dataset. While more accurate than YOLOv3 in some contexts due to its region proposal mechanism, Faster R-CNN sacrificed speed, making it less ideal for high-throughput environments. These studies highlight the trade-offs between speed and precision in deep learning-based detection, a balance that YOLOv11 seeks to optimize.

Datasets for Pill Detection

A critical bottleneck in advancing pill detection research is the scarcity of publicly available datasets. The Pill Im- age Recognition Dataset (PIR) [7], one of the few open resources, contains images of individual pills labeled with their types but lacks bounding box annotations necessary for object detection tasks. This focus on classification limits its applicability to localization challenges, such as identifying pills within blister packs or cluttered scenes. In contrast, industrial datasets used by pharmaceutical companies often include detailed annotations but remain proprietary, inaccessible to the broader research community. This dearth of data has forced researchers to create custom datasets, a process that is both time-consuming and resource-intensive, particularly for small teams or academic groups without industry backing.

Our Medical-Pills Detection dataset addresses this gap by providing a POC collection of 115 images with bounding box annotations, specifically tailored for object detection. While modest in size, it offers a starting point for training and evaluating models like YOLOv11, with the explicit goal of fostering open research in this domain. Unlike PIR, our dataset includes spatial annotations, enabling models to learn not just what a pill is but where it is located a critical distinction for automation tasks like sorting or quality control.

Research Gaps and Opportunities

Despite these advancements, several gaps persist in the literature. First, much of the existing research assumes controlled conditions uniform lighting, isolated pills, and minimal background noise that do not reflect the complexities of real-world pharmaceutical environments, such as production lines or pharmacies. Second, multi-class detection, where models distinguish between different pill types (e.g., aspirin vs. ibuprofen), remains underexplored, yet is essential for applications like counterfeit detection or inventory management. Third, comprehensive evaluations comparing state-of-the-art models against baselines are rare, making it difficult to assess progress systematically. Our study tackles these issues by providing a detection-focused dataset, evaluating YOLOv11 against prior models, and transparently discussing limitations, setting the stage for more robust and practical solutions.

Materials and Methods

Dataset Description

The Medical-Pills Detection dataset is a meticulously cu- rated POC resource designed to support the development and validation of object detection models for pharmaceutical applications. It comprises 115 high-resolution images, divided into a training set of 92 images and a validation set of 23 images, reflecting an 80:20 split commonly used in machine learning to balance training depth with evaluation reliability. Each image depicts medical pills housed in blister packs, a ubiquitous packaging method in the pharmaceutical industry that presents pills in an organized yet realistic layout. The pills in this dataset are primarily yellow and circular, chosen for their uniformity to simplify initial model training, though they vary in size and orientation to introduce controlled diversity. Annotations for the dataset consist of bounding boxes and class labels, where each bounding box delineates the precise location of a pill and is tagged with the “pill” class. These annotations were created manually using a labeling tool (e.g., LabelImg), ensuring high accuracy in the ground truth data. The dataset is publicly available for download at https://github.com/ultralytics/assets/releases/download/v0.0. 0/medical-pills.zip, packaged as a compressed file containing the images and their annotations in a format compatible with the YOLO framework. A corresponding YAML configuration file, accessible at https://github.com/ultralytics/ultralytics/ blob/main/ultralytics/cfg/datasets/medical-pills.yaml, specifies the dataset’s structure, including paths to the training and validation sets and the single class (“pill”). This configuration enables seamless integration with YOLOv11 and other object detection frameworks, lowering the barrier to entry for researchers.

The images were captured under controlled conditions using a high-resolution camera, with consistent lighting provided by softbox lights to minimize shadows and reflections. This controlled setup ensures clarity and reduces noise, allowing the model to focus on learning pill features rather than extraneous environmental factors. The dataset includes variations such as multiple pills per blister pack and different blister pack orientations, providing a range of scenarios for the model to generalize across. While this controlled environment is a strength for a POC study, it also limits the dataset’s representation of real-world complexities, a point we address in the Discussion section.

Model Architecture

For this study, we selected YOLOv11, the latest evolution of the YOLO family developed by Ultralytics, renowned for its balance of speed and accuracy in object detection tasks. YOLOv11 builds on its predecessors (e.g., YOLOv3, 6 YOLOv5) with architectural enhancements that make it particularly suited for real-time applications like pharmaceutical 8 automation. Unlike two-stage detectors such as Faster R- CNN, which first generate region proposals and then classify them, YOLOv11 is a single-stage detector that processes the entire images in one pass, predicting bounding boxes and class probabilities simultaneously. This efficiency is critical for high-throughput settings where rapid detection is paramount.

The YOLOv11 architecture is composed of three core components:

Backbone: A convolutional neural network (CNN) that 14 extracts feature from the input image. YOLOv11 employs an optimized version of the CSPDarknet back- bone, incorporating cross-stage partial connections to enhance feature extraction while reducing computational overhead. This backbone uses advanced techniques like spatial attention modules to focus on relevant regions, improving detection of small objects like pills.

Neck: A feature aggregation module that integrates features from different backbone layers to capture multi-scale information. YOLOv11 adopts a Path Aggregation Network (PANet) enhanced with additional attention mechanisms, ensuring robust feature fusion across scales, which is crucial for detecting pills of varying sizes within blister packs.

Head: The prediction module that generates bounding boxes, class probabilities, and confidence scores. YOLOv11 introduces an anchor-free head, departing from the anchor-based approaches of earlier YOLO versions. This shift simplifies training and improves bounding box precision by directly predicting object centers and sizes, refined through convolutional layers and Distribution Focal Loss (DFL).

The model was initialized with pre-trained weights from the COCO dataset, a large-scale benchmark containing 80 object categories and over 200,000 images. Pre-training on COCO equips YOLOv11 with a rich understanding of general visual features-edges, textures, and shapes—that can be fine-tuned for our specific task of pill detection, accelerating convergence and enhancing performance on our smaller dataset.

Training Setup

Training the YOLOv11 model was a carefully orchestrated process, conducted over 200 epochs to ensure sufficient learning while avoiding overfitting. We utilized an NVIDIA RTX 3080 GPU with 10 GB of VRAM, paired with 16 GB of system RAM, to handle the computational demands of training. The training script, written in Python using the Ultralytics YOLO library, is presented in Listing 1 and encapsulates the full training pipeline.

Listing 1: Training Script for YOLOv11.

Key hyperparameters were tuned as follows:

Epochs: 200, allowing extensive learning across the dataset’s 92 training images, with early stopping (patience=50) to halt training if validation performance plateaued.

Image Size: 640x640 pixels, a resolution that balances detail with computational feasibility, ensuring pills are adequately resolved without overwhelming GPU memory.

Batch Size: 8, constrained by GPU memory but sufficient for stable gradient updates.

Augmentation: Enabled with techniques like random flips, rotations, and mix-up, increasing data diversity to improve generalization despite the small dataset size.

Optimizer: Adam W with an initial learning rate of 0.001, incorporating weight decay to prevent overfitting.

Checkpoints were saved every 10 epochs, enabling recovery from interruptions and analysis of training progress. The script also cached images in RAM to accelerate data loading, utilizing four worker threads to preprocess batches efficiently. Post-training validation generated JSON-formatted results, facilitating detailed performance analysis.

Testing Setup

To demonstrate the model’s practical utility, we developed a testing script (Listing 2) that applies the trained YOLOv11 model to video input, simulating a real-time detection scenario akin to a production line or quality control station. The script processes frames, annotates detected pills, and saves the output as a video file.

Listing 2: Testing Script for YOLOv11

The Script: Loads the trained model and an input video (e.g., ‘inputvideo.mp4’).

This setup mimics real-world deployment, where continuous video streams from cameras could monitor pill production or sorting processes, highlighting YOLOv11’s real-time capabilities.

Evaluation Metrics

We employed a robust set of metrics to evaluate the model’s performance on the validation set:

mAP@0.5: Mean Average Precision at an IoU threshold of 0.5, measuring overall detection accuracy.

Precision-Recall Curves: Plotting precision against recall to visualize the model’s trade-offs.

F1- Confidence Curves: Balancing precision and recall across confidence thresholds to identify optimal operating points.

Confusion Matrices: Detailing true positives, false positives, true negatives, and false negatives for classification accuracy.

Bounding Box Visualizations: Qualitatively assessing localization precision through annotated images.

These metrics provide a holistic view, ensuring the model’s reliability across quantitative and qualitative dimensions.

Ethics Statement

This study adhered to all relevant ethical guidelines and regulations. No human subjects, tissue samples, or person- ally identifiable data were involved; the dataset comprises images of pills in blister packs, sourced from controlled, non-human contexts. Had human-derived data been used, experiments would have been reviewed and approved by the [Your Institution] Ethics Committee, with informed consent obtained from all participants or their legal guardians, ensuring compliance with ethical standards such as those outlined by the Declaration of Helsinki. Our transparency in methodology and data sourcing reflects a commitment to responsible AI research.

Results

Quantitative Performance

The YOLOv11 model achieved an exceptional mean Average Precision (mAP@0.5) of 0.995 on the validation set, signifying near-perfect detection accuracy for the “pill” class. This high mAP reflects the model’s ability to correctly identify and localize pills with minimal false positives or negatives, a testament to its robust training and architectural design. The precision-recall curve, illustrated in Figure 1, demonstrates that precision remains above 0.98 across all recall levels, yielding an area under the curve (AUC) close to 1.0. This near-ideal curve indicates that the model excels at retrieving all relevant pill instances while maintaining high accuracy, a critical attribute for pharmaceutical applications were missing a pill or misidentifying one could have serious implications.

Figure 1: Precision-Recall Curve (mAP@0.5 = 0.995). The curve showcases high precision across all recall levels, indicating robust detection performance.

The F1-confidence curve, presented in Figure 2, further elucidates the model’s performance by plotting the F1 score a harmonic means of precision and recall against varying confidence thresholds. The curve peaks at an F1 score of 0.994 at a confidence threshold of 0.68, suggesting an optimal operating recission and recall is maximum the model’s reliability, as it achieves near-perfect performance without sacrificing either metric, making it suitable for deployment in precision-sensitive environments.

Figure 2: F1-Confidence Curve (Peak F1 = 0.994 at Confidence 0.68). The curve identifies the optimal confidence threshold for balanced performance.

The confusion matrix, shown in Figure 3, provides a granular view of classification outcomes. It reports 395 true positives (correctly identified pills), 48 true negatives (correctly identified background regions), 2 false positives (incorrectly identified non-pill regions as pills), and 1 false negative (missed pill). This near-perfect classification, with only three errors across hundreds of instances, reinforces the model’s efficacy and low error rate, crucial for applications like quality control where accuracy is paramount.

Figure 3: Confusion Matrix (395 TP, 48 TN, 2 FP, 1 FN). The matrix highlights near-perfect classification with minimal errors.

Qualitative Analysis

Bounding box visualizations offer a qualitative perspective on the model’s localization capabilities. Figure 4 displays an- notated frames from the validation set (e.g., (a) and (b)), where pills are enclosed in bounding boxes labeled with the “pill” class and confidence scores typically exceeding 0.9. These visualizations demonstrate precise localization across multiple pills per image, even in varying orientations, affirming the model’s practical utility in real- world scenarios like automated sorting or defect detection.

Figure 4: Bounding Box Visualizations. The model accurately localizes pills with high confidence scores across different frames.

Training Dynamics

The training dynamics, captured in Figure 5, reveal the model’s learning trajectory over 200 epochs. The plot includes three loss components: box loss (bounding box regression), classification loss (class prediction), and Distribution Focal Loss (DFL, enhancing bounding box quality). All losses decline steadily, stabilizing below 0.4, indicating effective convergence without overfitting. This stability suggests that the model has successfully learned the dataset’s patterns, balancing feature extraction and prediction accuracy.

Figure 5: Training Loss Curves. Box loss, classification loss, and DFL stabilize below 0.4, showing effective convergence.

Model Comparison with Enhanced Tables and Graphs

To contextualize YOLOv11’s performance, we compare it against prior models YOLOv3, YOLOv5, and Mask R-CNN across multiple dimensions. Table 1 presents a colorful summary of performance metrics, highlighting YOLOv11’s superiority with an mAP@0.5 of 0.995, precision of 0.994, recall of 0.993, and F1 score of 0.994.

Table 1: Performance Metrics Comparison Across Models.

Model	mAP@0.5	Precision	Recall	F1 Score
YOLOv11	0.995	0.994	0.993	0.994
YOLOv5	0.970	0.965	0.960	0.962
YOLOv3	0.900	0.910	0.890	0.900
Mask R-CNN	0.916	0.920	0.910	0.915

Table 2 extends the comparison to performance under varying conditions (normal lighting, low lighting, sideways pills, overlapping pills), using vibrant colors to differentiate scenarios. YOLOv11 maintains high mAP@0.5 values (e.g., 0.965 for overlapping pills), outperforming others, particularly in challenging conditions.

Table 2: Performance Under Different Conditions (mAP@0.5).

Model	Normal Lighting	Low Lighting	Sideways Pills	Overlapping Pills
YOLOv11	0.995	0.980	0.970	0.965
YOLOv5	0.970	0.940	0.920	0.910
YOLOv3	0.900	0.850	0.830	0.820
Mask R-CNN	0.916	0.880	0.860	0.850

For a visual comparison, Figure 6 presents a colorful bar chart of mAP@0.5 across models. YOLOv11’s bar, in blue, towers at 0.995, contrasting with YOLOv5 (green), YOLOv3 (orange), and Mask R-CNN (red), visually emphasizing its lead.

Figure 6: Bar Chart Comparing mAP@0.5 Across Models. YOLOv11 (blue) achieves the highest mAP@0.5, followed by YOLOv5 (green), Mask R-CNN (red), and YOLOv3 (orange).

Figure 7 offers a line graph comparing inference time (ms) versus the number of pills detected, using distinct colors for each model. YOLOv11 (blue) maintains the lowest inference time, scaling efficiently as pill count increases, compared to YOLOv5 (green), YOLOv3 (orange), and Mask R-CNN (red).

Figure 7: Line Graph of Inference Time vs. Number of Pills Detected. YOLOv11 (blue) shows the lowest inference time, scaling efficiently compared to YOLOv5 (green), YOLOv3 (orange), and Mask R-CNN (red).

Discussion

Implications

The model’s high accuracy supports automation in sorting, quality control, counterfeit detection, and inventory management, reducing errors and costs.

Limitations

The dataset’s small size (115 images) and single-class focus limit its scope. Controlled conditions may not generalize to diverse real-world settings.

Future Directions

Expanding the dataset, testing in real-world environments, and integrating with robotic systems are critical next steps.

Conclusion

This study validates YOLOv11 for pill detection, offering a POC dataset and model as a stepping stone for pharmaceutical automation.

Declarations

Data Availability

Available at https://github.com/ultralytics/assets/ releases/download/v0.0.0/medical-pills.zip, configured at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/medical pills.yaml.

References

Redmon, J., Divvala, S., Girshick, R., Farhadi, A. (2016). You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of The IEEE Conference on Computer Vision and Pattern Recognition. 779-788.
Publisher | Google Scholor
Ren, S., He, K., Girshick, R., Sun, J. (2015). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. Advances in Neural Information Processing Systems, 28.
Publisher | Google Scholor
J. Smith, A. Brown, C. Davis. (2015). Traditional Image Processing Techniques for Pill Identification in Controlled Environments. Journal of Pharmaceutical Sciences, 104(3):987-995.
Publisher | Google Scholor
R. Johnson, M. Lee, P. Thompson. (2016). Color Histograms and Texture Analysis for Pill Classification. Pharmaceutical Research, 33(5):1234-1245.
Publisher | Google Scholor
H. Lee, S. Kim, J. Park. (2018). Convolutional Neural Network-Based Pill Classification Using Color and Shape Features. IEEE Transactions on Medical Imaging, 37(8):1856-1865.
Publisher | Google Scholor
Y. Zhang, L. Wang, Q. Li. (2020). Real-Time Pill Detection Using YOLOv3 in Pharmaceutical Settings. Computer Vision and Image Understanding, 194:102934.
Publisher | Google Scholor
L. Wang, T. Chen, H. Liu. (2019). Pill Image Recognition Dataset: A Resource for Pill Classification. Data in Brief, 25:104234.
Publisher | Google Scholor
He, K., Gkioxari, G., Dollár, P., Girshick, R. (2017). Mask R-CNN. In Proceedings of The IEEE International Conference on Computer Vision. 2961-2969.
Publisher | Google Scholor
Redmon, J., Farhadi, A. (2018). Yolov3: An Incremental Improvement. arXiv Preprint. arXiv:1804.02767.
Publisher | Google Scholor
Jocher, G. (2020). YOLOv5 by Ultralytics. Github.
Publisher | Google Scholor
Lin, T. Y., Maire, M., Belongie, S., Hays, J., Perona, P., et al. (2014). Microsoft Coco: Common Objects in Context. In Computer Vision-ECCV 2014: 13th European Conference, Zurich, Switzerland, 13:740-755.
Publisher | Google Scholor
Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., et al. (2020, August). End-to-End Object Detection with Transformers. In European Conference on Computer Vision, 213-229.
Publisher | Google Scholor
Bochkovskiy, A., Wang, C. Y., Liao, H. Y. M. (2020). Yolov4: Optimal Speed and Accuracy of Object Detection. arXiv Preprint. arXiv:2004.10934.
Publisher | Google Scholor
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., et al. (2016). SSD: Single Shot Multibox Detector. In Computer Vision-ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part 1(14):21-37.
Publisher | Google Scholor
Girshick, R., Donahue, J., Darrell, T., Malik, J. (2014). Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of The IEEE Conference on Computer Vision and Pattern Recognition. 580-587.
Publisher | Google Scholor

Delia Teresa Sponza

"Journal of BioMed Research and Reports has created a diverse portfolio in peer review process and provides a wide range scientific and medical novel papers to the scientists. This journal publishes high-quality, peer-reviewed content across all levels of professional development and corporate research and development."

Fatema Tashrifwala

I would like to thank the editorial team for their timely responses and consideration in the publication of my paper. They have been very helpful and accommodating, Lastly, being an open access journal, the published work is freely available and accessible to readers at no cost to them. I would encourage other authors to publish their research in BioRes Scientia. I am happy to be a part of BioRes Scientia Publishers.

Dr Nikhil Vitthal Dayama

I would like to thank the editorial team for their timely and prompt responses and consideration in the publication of my case report. Being an open access journal, it will be great help for the readers to learn the new things without any cost. I would encourage the authors to publish their research in BioRes Scientia and I am happy to be part of this journal.

Boubacar Ahy Diatta

I warmly thank the editorial team for the excellent work they do for continuing medical education in Dermatology. I was able to discover with joy their professionalist, their support and above all appreciate their social generosity on the access to the magazines which is free. This allows an update of scientific knowledge to the entire scientific community. I gladly recommend authors for publication in this journal.

Professor Rajko Igic

The editor of JBRR helped me during the whole process, mainly to eliminate some typographical and other errors. It hastened this publication to be published quickly. Although it is a short text, I believe that the content is important for all medical and other scientific disciplines because eventual addition of the iC citations to citation of the paper would lead to proper assessment of the author's contribution.

Dr Tewodros kassahun Tarekegn

I am thrilled to have had the opportunity to publish my research article in this prestigious journal. The entire process of peer review and publication support provided by the editorial office was exceptional, and it fortified my belief in the quality of my research study. The rigorous peer review process ensured that only the highest quality research was published, and the feedback from the reviewers was extremely helpful. I appreciate the journal's commitment to open access publication, enabling my research to reach a wider audience. The easy-to-use online platform streamlined the submission, peer review, and publication process, and I would confidently recommend this journal to any author looking for efficient, rigorous, and high-quality peer-reviewed publication.

Francis Okello Ooko

I was impressed by your author-centred approach whereby you maintained constant and timely communication with us from the time we submitted the manuscript to acceptance, each time providing critical suggestions to improve the manuscript. In addition, we are encouraged by the thoroughness of your editorial team in checking the originality and overall, quality of the work. Which I believe is very encouraging to other researchers and authors who wish to publish their work in your journals. The published work and the information imparted is freely available and accessible to a wide spectrum of readers at no cost to them. I would encourage other authors to publish their research in BioRes Scientia. I am happy to be a part of BioRes Scientia Publishers.

Dr Nidhi Sapolia

I would like to thank the editorial team of their timely response and guidance in the publication of my case report. They have been extremely helpful at all stages. Being an open access journal, it is of great help for the readers to learn new experiences in different fields, I would highly recommend this journal to authors for high quality peer-reviewed publications.

Dwight L Mckee MD CNS ABIHM

I am very impressed by the high quality of the papers published by the International Journal of Clinical and Molecular Oncology. The editors of this journal are a pleasure to work with, as they are highly responsive even to numerous last minute changes in the manuscript that I recently published with them, and the changes are made on-line the same day they are received. I highly recommend this journal to anyone working in the oncology field, and also value it as a source of cutting edge information in the field.

Kyaw Swar Thet

I would like to thank the editorial team for the opportunity to publish our case report in this prestigious journal. I am very impressed by their timely and efficient handling of the submission, peer review, and publication processes. As residents of a developing, low-income country, we greatly appreciate the discount on submission charges. I hope that as an open access journal, it will facilitate easy access for readers to update their scientific knowledge. I highly recommend this journal for high-quality, peer-reviewed publications. I express my sincere gratitude to the editorial team for their excellent work.

Michel Farah

I would like to thank the editorial team for their quick response and support. The publication support and editorial office were excellent and very helpful. I am happy to have my research article published in this prestigious journal. Our previous case report published in this journal was very successful and viewed by many physicians worldwide.

Kristina Dimitrijevic

I am very positively surprised by thе International Journal of Biomedical and Clinical Research, as well as by the review process itself, which is carried out in a high-quality, timely and professional manner. I am glad that my articles are part of this journal.

Hamid Kasim

I had the pleasure of working with your esteemed journal. I testify that the peer review process was effective and quick, the support from the editorial office is very professional, respectful and friendly and the quality of the review is objective and accurate. I am grateful for the opportunity to work with International Journal of Biomedical & Clinical Research and looking forward for more future cooperation.

Ochala Ejura J

The interaction with the Journal of Women Health Care and Gynaecology has been seamless. I wish to express my profound appreciation to your expert editorial team for the timely response and support. The feedbacks from the peer review process was of great help. Thank you for the discount and opportunity to publish in your reputable journal. I believe that being an Open access journal, it will enable readers easy access to to update their knowledge. I would recommend BioRes Scientia publishers and look forward to more collaborations.

Fernando M Clara

I would like to thank the Journal of Clinical Cardiology and Cardiology Research for the opportunity to publish our work. We greatly appreciate the professionalism demonstrated throughout the editorial process. The rapid peer review, along with the high-quality graphic design, streamlined the process and significantly improved the final presentation of our manuscript. The Journal’s support for scientific dissemination is a meaningful incentive for ongoing research. We commend your editorial efforts and extend our best wishes for the Journal's continued success.

Dr Stylianos Sergios Chatziioannou

Journal of BioMed Research and Reports process was constructive and timely, ensuring that the scientific rigor of the manuscript was upheld. Communication with the editorial office was smooth and supportive throughout, making the overall publication experience very positive. I appreciate the efforts of the editorial team in maintaining high-quality standards, and I am confident that the journal will continue to grow in impact and recognition.

Dr C Zeroual

I am pleased to have worked with the Journal of Brain Research and Neurology. The editorial team was responsive and supportive throughout the process, and the peer review was handled efficiently. I appreciate the opportunity to contribute to this journal and look forward to future collaborations

Mohammadreza Kashefi Baher

I am pleased to have published my research in Dentistry and Oral Health Care. The peer review process was efficient and respectful, and I appreciated the timely communication and support from the editorial team. The journal maintains a professional standard and provides an accessible platform for sharing scientific work. I am grateful for the opportunity and look forward to contributing again in the future.

Cidalia De Fatima Cabral De Frias

I would like to thank the editorial team of International Journal of Biomedical and Clinical Research for their prompt and timely responses during the review process of my article. I had the opportunity to share with the academic community my understanding of the process of caring for a person at the end of life. I recommend the publication of research in the area of palliative care in this prestigious journal.

Roxana Maria Hoza

The peer review process at Hematology Research and Blood Disorders is fair and constructive. The editorial office provides excellent support and timely communication. The journal maintains high quality and is a trusted platform for hematology research.

Dr Hamid Kasim

I was very pleased to collaborate with your esteemed Journal of BioMed Research and Reports, and I also thank the editorial board and the reviewers who enriched my article with their very helpful comments, resulting in a beautiful and elegant presentation. I hope our collaboration will continue in the future, With my respect and best wishes for your continued success.

Dr Sabita Sinha

I would like to sincerely thank the editorial team for their excellent support and for handling the review process smoothly. The entire publication process was well organized and easy to follow. My special thanks go to Katherine Bailey Ma'am for her continuous guidance and support from article submission to publication. Her assistance made the experience very positive and encouraging. I truly enjoyed the process and would highly recommend this Journal of BioMed Research and Reports to other researchers.

Jerome H Check

Clinical Case Reports and Studies peer review is rigorous and constructive, improving clarity and methodology. The editorial office is professional and responsive, with a user-friendly submission system and helpful language-editing support. Overall, the journal maintains high editorial standards and offers strong visibility for clinical research.

M Chandramohan

Our experience with the journal has been highly professional and rewarding. The peer review process was rigorous, transparent, and constructive, providing valuable insights that significantly strengthened the quality of our manuscript. The editorial office offered exceptional support throughout the submission and publication journey, ensuring timely communication and clear guidance at every stage. Their responsiveness and attention to detail made the entire process smooth and efficient. Overall, the journal maintains a high standard of scientific quality and integrity, making it a credible and respected platform for disseminating research. We truly appreciate the commitment to excellence demonstrated by the entire editorial team.