Blog

Picture this: A radiologist stares at a chest X-ray at 3 AM in a busy emergency room. In the corner of her screen, an AI assistant highlights a tiny shadow she might have missed—a small tumor caught early enough to save a life. This is the power of medical AI, but it's only as good as the data used to train it.Behind every life-saving AI detection lies thousands of hours of painstaking work: medical data annotation. It's the crucial bridge between raw medical data and artificial intelligence that can spot diseases faster than human eyes. But here's the challenge: annotating medical data isn't just about drawing boxes around obvious features—it requires the precision of a surgeon, the knowledge of a specialist, and the attention to detail of a forensic investigator.The stakes? They couldn't be higher. A single poorly annotated data point could mean the difference between an AI system catching or missing a critical diagnosis.In this article, we explore major types of healthcare data annotation, their practical implications, and the challenges they present.Source: UnsplashMedical Data Annotation: Why Is It Special?Data annotation is becoming increasingly common across various fields, but medical data annotation presents some unique challenges. The most important idiosyncrasies of raw medical data are its amount, complexity, and privacy.The sheer amount of raw, completely unstructured healthcare records that is generated every second—from X-rays, MRIs, and patient records during diagnosis, treatment, and prevention—requires highly efficient and accurate annotation tools.The complexity of the raw material is the second biggest challenge for the industry: not only is the file format limited to the standardized one that includes multiple layers and high bit depths, but such files require a very skilled labeler, who must be both a specialized healthcare professional and also well-trained to work with data annotation tools.Given the sensitive nature of personal health information and medical data, its annotation must be handled securely and privately at all times.These unique challenges demand specialized capabilities from annotation tools. When selecting these tools, it's essential to remember that efficiency and high precision are critical priorities for each medical data annotation type discussed below.Types of Medical Labelling and Their ApplicationsEvery day, hospitals generate a tsunami of data in various formats—from crystal-clear 3D brain scans to hurried emergency room dictations. Each type of data requires its own specialized annotation approach. Let's dive into the four critical domains where annotation is revolutionizing healthcare:Images: The foundation of medical AI, from X-rays to cellular microscopyAudio: Capturing everything from heart murmurs to emergency callsVideo: Dynamic data from surgical procedures to patient monitoringText: The hidden goldmine in medical records and research papersLet's explore how each type is transforming patient care.Source: UnsplashMedical Image AnnotationWhen you hear "medical imaging," you might think of simple X-rays. But today's medical imaging spans a mind-boggling range: from atomic-level electron microscope scans to full-body 3D MRIs. Each requires its own annotation approach:For disease prevention, the most critical part of annotation is to label the object on the image as ‘normal’ or ‘abnormal.’ This annotation type is vital in detecting abnormalities and producing correct diagnoses. This is usually called image classification—as the name suggests, classifying the image into a predefined category or class.For other purposes, such as pre-surgery scanning, it is vital to locate and mark the exact object position—this type of annotation is called object detection.Another type of medical image annotation is connected to labeling all the image components presented—image segmentation. One can label each separate object as a whole (so-called instance segmentation) or go as deep as labeling each pixel with a category label (so-called semantic segmentation).Each of these types of annotation can be further divided into specific ways of annotating the data based on the technique used, for example, creating a figure around the object in the image (bounding box or polygon), locating the specific part/feature of the image (keypoints), marking objects in the picture for multiple image alignment (landmarking), or even creating a collection of points to mark the 3D coordinates of the image (point cloud).All these methods are aimed at recognizing and annotating key parts of the raw medical image.Medical Audio AnnotationWhile medical images are widespread, medical data is not limited to visuals—another important type is medical audio annotation. For example, the already mentioned emergency care field relies heavily on distinguishing keywords in inbound emergency calls to identify the issue correctly and dispatch the right team.Here we can distinguish the two most important types of data annotation: Conversation recordings (or, more generally, spoken language recordings) that include all the doctor-patient talks and dictations, and Physiological sounds (or Auscultation sounds)—heartbeats, lung sounds, bowel movements, and other varieties of sounds recorded on medical devices.In addition to disease prevention and research, similar to other types of annotation, spoken language recordings are an extremely valuable asset to train AI to correctly recognize speech and medical jargon and then compile comprehensive notes from each conversation or fill in necessary forms—extremely valuable for the efficiency and accuracy of note-taking.Source: UnsplashMedical Video AnnotationMedical video annotations include a multitude of data, from surgical procedure videos to surveillance of patient behavior. It must be mentioned that medical videos play an enormous role in teaching young medical professionals in their specialized fields.This raw data can be annotated frame-by-frame or segment-by-segment. The most common types include:Annotation of video diagnostics—similar to image annotation—can help in the first stages of treatments to produce a correct diagnosis for the patient. These videos can include videos from any in-body camera footage (colonoscopies, laparoscopic surgeries, etc.) as well as ultrasounds and echocardiograms.By labeling and annotating anomalies and pathologies in each frame of the footage, we can assist in teaching AI to detect anomalies, minimizing the manual work for the clinicians and being able to go through a much larger amount of footage in a shorter time.Annotating surgical videos by labeling them with timestamps to determine each stage of the surgery (e.g., incision, dissection, suturing, closure, etc.) or marking the object’s location in the video with a bounding box (tools, tumors, anatomical elements, etc.)—this has an enormous practical value not only in providing insights into correct surgical procedures and best field practices but also in real-time flagging errors and possible surgery complications.Patient monitoring includes a variety of data from motion analysis in rehabilitation to patient surveillance in their rooms. While rehabilitation footage is great for tracking patients’ progress toward recovery, surveillance footage can assist in greater safety for patients that pose a risk to themselves or others. Here we must emphasize the privacy of the recording and its ethical handling.Medical Text AnnotationA huge amount of data in the healthcare sector is created manually in the form of notes, records, prescriptions, research papers, and so on. This raw data can and should be labeled and annotated as well. Similarly to image and video annotation, this data can be labeled as a whole document or on the sentence/tag level.By classifying the documents according to their contents, medical specialty, or even labeling them by symptoms, test values, etc., mentioned within, one can create a huge database that can be used by ML to interpret and cross-reference, which can refine multiple medical fields, such as diagnostics, medication prescription, and medical research, to name a few.By labeling any specific information within the document, such as specific diseases, links between symptoms and medication, findings, etc., one can fine-tune the annotation and make interpretations more specific and accurate.Source: UnsplashSearching for the Right Medical Annotation ToolsWe’ve discussed all the different types of medical data annotation and highlighted its areas of usage and importance in the healthcare industry. All of the above further solidifies that choosing the correct annotation tool is a crucial decision for any medical project.CVAT data annotation software stands out as a powerful solution for medical annotation projects. It works with most file formats, and its open-source nature enables the possibility to be customized to handle DICOM or NIfTI medical imaging files. Unlike some image-only annotation tools, CVAT handles video annotation, which is great for medical projects with cine loops or surgical videos. CVAT can also handle large-scale projects and big datasets, which makes your project easily scalable.And, most importantly, CVAT has an active community and is continuously improving. There are plugins and scripts available (for automation, pre-processing, etc.), and if a needed feature isn’t there, one can modify the code.In summary, CVAT offers a strong combination of flexibility, scalability, and cost-effectiveness for medical data annotation. It may require a bit more setup, compared to some turnkey commercial solutions, but it gives you full control.ConclusionThe role of data annotation in healthcare continues to expand as AI applications become more sophisticated and widespread. The success of these applications relies on the quality and precision of annotated medical data, whether it's images, audio, video, or text. As we've explored, each type of annotation presents unique challenges and requirements.The synergy between professional medical expertise and advanced annotation tools is reshaping healthcare delivery. When implemented effectively, these tools enhance the accuracy of AI-driven diagnostics and treatments and allow healthcare professionals to focus more on patient care rather than routine data interpretation.As medical technology continues to evolve, the importance of high-quality data annotation will grow. This will make it a crucial component in the future of healthcare delivery and improved patient outcomes.Next StepsFor organizations looking to implement or improve their medical data annotation processes:Try CVAT Online to explore our flexible and customizable medical imaging annotation solution.Learn about CVAT On-prem for teams requiring additional security and control.If you're looking for expert assistance, discover our professional Annotation Services.Visit cvat.ai to learn more about CVAT's comprehensive medical annotation capabilities.

Industry Insights & Reviews

April 1, 2025

How to Create Data Labeling Specifications for Your Annotation Project: A Client's Guide (+ Free Template)

Manual data labeling can be a real slog, especially when you’re working with massive datasets. That’s why automated annotation is such a lifesaver—it speeds up the process, ensures consistency, and frees you up to focus on building smarter machine learning models. CVAT OGs know that both our platforms (SaaS and on-premises) support a number of options for automated annotation using AI models, including:‍Nuclio platformRoboflow and Hugging Face integrationCLI-based annotation on your own hardwareThese methods are used and loved by thousands of users, but because data annotation projects come in all shapes and sizes, they may not work for everyone. Nuclio functions, for example, are currently managed by the CVAT administrator and are limited to CVAT On-prem installations. Roboflow and Hugging Face support a limited range of model architectures. CLI-based annotation requires users to set up and run models only on their own machines, which can be hardware-intensive and time-consuming for some teams.‍Today, we’re excited to share that CVAT is addressing all these limitations with the launch of AI agents. ‍What is a CVAT AI agent?An AI agent in CVAT is a process (or service) that runs on your hardware or infrastructure and acts as a bridge between the CVAT platform and your AI model. Its main role is to receive auto-annotation requests from CVAT, transfer data (e.g., images) to your model for processing, retrieve the resulting annotations (e.g., object coordinates, masks, polygons), and send these results back to CVAT for automatic inclusion in your task. ‍In other words, CVAT AI agents work as a bridge between your custom model and the CVAT platform, enabling seamless integration of your model into the auto-annotation process.‍How are CVAT AI agents different from other automation methods?‍Customization and accuracy: Unlike with Roboflow, or Hugging Face integrations, you can now use your own AI models, tailored specifically to your datasets and tasks, to produce precise annotations that meet your exact training requirements.Collaboration and accessibility: Unlike CLI-based annotations, AI agents allow you to centralize your model setup and share it across your organization. Team members can access and use the models without any additional setup.‍‍Flexibility across platforms: AI agents don’t require CVAT administrator control and are available on both CVAT Online and On-prem (Enterprise, version 2.25 or later), giving you the freedom to deploy and manage your models in any environment.These features make CVAT AI agents a powerful tool for scaling your annotation processes while maintaining accuracy, collaboration, and control.‍How to annotate data with CVAT AI agentNow, let’s see how to set up automated data annotation with a custom model using a CVAT AI agent. For that, you will need:An account on a CVAT instance. In the tutorial we’ll use CVAT Online, but you can use your own CVAT On-prem instance if you wish - just substitute your instance’s URL in the commands.A CVAT task with labels from the COCO dataset (or a subset of them) and some images.You will also need to install Python and CVAT CLI to your machine.‍Refresher: CLI-based annotationLet’s first briefly review how CLI-based annotation works, since the agent-based method has a lot in common with it.‍First, you need a Python module that implements the auto-annotation function interface from the CVAT SDK. These modules will serve as bridges between CVAT and whatever deep learning framework you might use. For brevity, we will refer to such modules as native functions. ‍The CVAT SDK includes some predefined native functions (using models from torchvision), but for this article, we’ll use a custom function that uses YOLO11 from Ultralytics. ‍Here it is:‍import PIL.Image from ultralytics import YOLO import cvat_sdk.auto_annotation as cvataa _model = YOLO("yolo11n.pt") spec = cvataa.DetectionFunctionSpec( labels=[cvataa.label_spec(name, id) for id, name in _model.names.items()], ) def _yolo_to_cvat(results): for result in results: for box, label in zip(result.boxes.xyxy, result.boxes.cls): yield cvataa.rectangle(int(label.item()), [p.item() for p in box]) def detect(context, image): conf_threshold = 0.5 if context.conf_threshold is None else context.conf_threshold return list(_yolo_to_cvat(_model.predict( source=image, verbose=False, conf=conf_threshold)))‍Save it to yolo11_func.py., and then run:‍cvat-cli --server-host https://app.cvat.ai --auth "<user>:<password>" task auto-annotate <task id> --function-file yolo11_func.py --allow-unmatched-labels‍This will make the CLI download the images from your task, run the model on them and upload the resulting annotations back to the task.‍Note: long-time readers might notice a few changes since the last time we talked about CLI-based annotation on this blog. In particular, we changed the command structure of CVAT so that you have to use task auto-annotate rather than just auto-annotate. In addition, native functions can now support custom confidence thresholds, so our YOLO11 example reflects that.‍Registering the function with CVATNow, let’s see how we can integrate the same model as an agent-based function.‍An important thing to know is that the agent-based functions feature also uses native functions. In other words, if you already have a native function you’ve used with the cvat-cli task auto-annotate command, you can use the same function as an agent-based function, and vice versa. So let’s reuse the yolo11_func.py file we just created.‍First, we must let CVAT know about our function. Use the following command:‍cvat-cli --server-host https://app.cvat.ai --auth “<user>:<password>” function create-native "YOLO11" --function-file yolo11_func.py‍The string “YOLO11” here is just a name that CVAT will use for display purposes; you can use any name of your choosing.‍Now, if you open CVAT and go to the Models tab, you will see our model there, looking something like this:‍‍You can click on it and check that it has all the expected properties, such as label names. However, if you actually try to use this model for automatic annotation, it will not work. The request will stay “queued”, and after a while, it will automatically be aborted. That’s because we need to do one final step.‍Note: At no point in the process does the function itself (like the Python code or weights) get uploaded to CVAT. The only information the registration process transfers to CVAT is metadata about the function, such as the name and list of labels.‍Powering the function with an agentWe must now run an agent that will process requests for the function. Use the following command:‍cvat-cli --server-host https://app.cvat.ai --auth “<user>:<password>” function run-agent 58 --function-file yolo11_func.py‍Instead of 58, substitute the model ID you see in the CVAT UI. You can also find the same ID in the output of the function create-native command. This command starts an agent for our function, which runs indefinitely. The job of the agent is to process all incoming auto-annotation requests involving the function.‍While the agent is running, open your task in CVAT and click Actions -> Automatic annotation. You’ll be able to select the YOLO11 model and set the auto-annotation parameters, like for any other type of model CVAT supports.‍Click "Annotate." After a short delay, you should see the agent start printing messages about processing the new request. Once it’s done, CVAT should notify you that the annotation is complete. You can then examine the jobs of your task to see the new bounding boxes. The agent will keep running, ready to process more requests.‍CleanupNow that we’re done testing the function, we can remove it from CVAT. First, interrupt the agent by pressing Ctrl+C in the terminal. Second, delete the function by running the following command:‍cvat-cli --server-host https://app.cvat.ai --auth “<user>:<password>” function delete 58‍Alternatively, you can do this through the UI: find the model in the Models tab, click the ellipsis and select Delete.‍Working in an organizationIn the preceding tutorial, you added the function to your personal workspace. In this case, only you can annotate with it. Now let’s discuss what’s needed to share a function with an organization.‍First, you’ll need to add an --org parameter to all of your CLI commands:‍cvat-cli --org <your organization slug> ...‍Second, you should be aware of the permission policy when you work in an organization. A function can be…‍… added by any organization supervisor.… removed by its owner or any organization maintainer.… used to auto-annotate a task by any user that has write access to that task.These rules are the same as for Roboflow and Hugging Face functions.‍In addition, to power a function, an agent must run as that function’s owner or any organization maintainer. However, an agent must be able to access data for tasks it’s requested to process. So if you want to make it possible to use the function on any task in the organization, you should run the agent as a user with the maintainer role.‍Technical detailsThe following diagram shows the major components involved in agent-based functions. In the general case, the agent can run in a completely separate infrastructure from the CVAT server. The only requirement is that it’s able to connect to the CVAT server via the usual HTTPS port. The agent does not need to accept any incoming connections. Of course, if you run your own CVAT instance, you can run the agent in the same infrastructure, even on the same machine if you’d like.‍While so far we’ve been talking about the agent, you’re not actually limited to running one agent per function. If you’d like to be able to annotate more than one task at a time, you can run multiple agents:All annotation requests coming from the users are placed in a queue and distributed to agents on a first-come-first-serve basis. If one agent crashes or hangs while processing a request, that request will eventually be reassigned to another agent.‍What AI agents can’t do (yet)AI agents are still pretty new, so there are a few things they can’t do just yet (but don’t worry, we’re on it and will roll out updates soon): ‍Annotate just one frame,Work with skeletons,Handle videos or 3D data tasks, orSupport shapes with attributes.‍Get started with CVAT AI agentsCVAT AI agents are here to level up how teams automate data annotation. Now, you can use models trained just for your unique datasets or tasks, no matter if you’re on CVAT Online or On-Prem. This means: ‍✅ more precise annotations that are better aligned with your requirements, ✅ less manual fixing, and ✅ datasets that are ready to go for AI training or deployment. ‍And, with a centralized setup, your whole team can easily access the model, speeding up workflows and improving collaboration.‍Ready to take your automated annotation to the next level? Sign up or log in to your CVAT Online account or contact us to get CVAT with AI agents support on your server.‍

Product Updates

January 16, 2025

Announcing CVAT AI Agents: A New (and Better) Way to Automate Data Annotation using Your Own Models

IntroductionChoosing the right data annotation service is a key step in any AI or machine learning project. High-quality labeling services are essential for training algorithms and ensuring accurate predictions. CVAT (Computer Vision Annotation Tool) and Clarifai are two leading platforms offering various annotation services. These platforms cater to a wide range of users, from individual researchers to large companies.In this comparison, we’ll examine the strengths and weaknesses of both. We will focus on performance, scalability, and ease of use. We will also consider the target audience and suitability for specific industries. This will help you make the best choice for your project.‍Performance and capabilitiesCVAT is an open-source tool designed for teams that need more control and customization over their annotation workflows. It offers the following annotation types.Annotation types2D Image Annotations: Support for detailed annotations like bounding boxes, polylines, points, skeletons and polygons for more intricate data.Video Annotations: Capabilities for object tracking, recognition, and event detection in video-based tasks.3D Sensor Fusion: Provides support for annotations involving 3D sensor data, making it ideal for applications like autonomous driving, robotics, and LiDAR tasks.‍One of CVAT's key strengths is its ability to handle complex annotations, like instance and semantic segmentation with high precision. This makes it ideal for industries like healthcare, automotive, and surveillance, where detailed accuracy is very important.Clarifai is a comprehensive platform that focuses on automating data annotation processes to improve efficiency. Its main features include:‍2D Image Annotations: Efficient handling of large-scale image classification tasks using AI-driven automation, including bounding boxes and polygons.Text Classification: Support for natural language processing (NLP) initiatives, making it suitable for text-based projects.Video Annotations: Offers video object tracking to automate and simplify video analysis.Document Analysis: Named entity recognition (NER) for processing and analyzing large volumes of text efficiently.‍Clarifai is highly adaptable for different annotation tasks due to its AI tools. This makes it a good fit for industries like e-commerce, finance, and media. These industries handle a large amount of data, but the annotations are less complex.‍Ease of UseCVAT provides an easy-to-use platform that doesn't require technical expertise. Users can quickly sign up on the CVAT cloud platform and start labeling process right away. Data scientists and AI researchers value its powerful customization features. However, smaller teams or individuals without much technical knowledge can also use it effortlessly. The platform also supports complex project setups and allows for collaboration among multiple users, making it suitable for team-based projects.‍Clarifai is also designed for ease of use, requiring minimal setup. Its intuitive platform includes many automated features that help reduce manual effort. This makes it a great choice for project managers or companies looking to outsource data labeling without getting into the technical details. Teams can quickly start using the platform, even if they don’t have extensive technical knowledge in data annotation.‍‍Scalability and FlexibilityScalability is crucial for teams and organizations looking to expand their AI projects. CVAT excels in this area, primarily because it is open-source. This allows teams to enhance their annotation operations by improving infrastructure, adding custom plugins, or adjusting workflows to fit specific needs. Such flexibility is particularly beneficial for large organizations and AI research teams. These teams are involved in complex projects that require tailored workflows or intricate annotations. Examples include projects in the autonomous driving or aerospace sectors.‍‍On the other hand, Clarifai offers a simple approach to scalability. With its global workforce and AI automation, it excels in projects that require quick deployment. Companies in sectors like retail, healthcare, and marketing can easily scale their annotation needs. They can do this using Clarifai’s fully managed services. These services help reduce operational burdens. This is particularly advantageous for businesses looking for fast results without the need to establish a dedicated in-house annotation team.‍Industry-Specific SuitabilityClarifai and CVAT are versatile tools that can be applied across various industries, though they approach data annotation differently. Clarifai emphasizes automated data labeling, ideal for large datasets requiring speed and efficiency. Its AI-driven labeling is fast, yet it also supports manual annotation when needed for flexibility. On the other hand, CVAT focuses on manual labeling. This makes it better suited for tasks that demand high accuracy and human oversight. CVAT also offers automated and semi-automated annotation options. This allows CVAT to adapt to projects where repetitive or simpler tasks can be handled by AI. More complex tasks are left for human annotators.‍The decision between manual and automated annotation depends on the complexity of the data and specific project requirements. Automated annotation excels with large, straightforward datasets, whereas manual annotation is essential for more precise and intricate work. Both tools successfully cater to the unique data annotation requirements of various sectors, ensuring high-quality results across industries, including:‍HealthcareAnnotation helps analyze medical images like X-rays and MRIs. It is important for diagnosing tumors and other diseases.‍Surveillance and Security In this field, annotation is used for video tasks like event detection and facial recognition. It improves accuracy in important situations.‍Autonomous VehiclesAnnotation is key for object tracking and 3D sensor fusion. It trains models for lane detection, pedestrian tracking, and obstacle recognition.‍E-commerceAnnotation assists in classifying images and tagging products. This makes it easier to handle large data volumes and enhances user experience.‍Retail and MarketingIn these areas, annotation analyzes customer data. It helps businesses gain insights and make predictions.‍RoboticsAnnotation trains robots for tasks like object recognition and navigation. It creates reliable models for complex environments, such as automated warehouses and factories.‍Pricing ModelData Labeling ServicesA labeling service is a data annotation service used to train artificial intelligence models. Specialists manually mark objects in images or text so that the AI can learn to recognize and categorize them. This process is crucial for creating high-quality training datasets. These datasets allow AI to accurately perform tasks such as facial recognition, object detection, or text analysis. CVAT and Clarifai offer data labeling services. Below, we will review their data annotation offerings:‍CVAT· Discussion of Requirements: First, you contact the CVAT team or your contacts to discuss the details of your project. This helps them understand your specific needs and goals.· Proof of concept (POC) annotation: CVAT will request a data sample and an initial specification. This will allow CVAT to demonstrate its expertise. It will also help prepare an accurate project quote and estimate the time required to complete the project. This phase is completely free for a customer!· Team Formation: Depending on the scope and complexity of the project, CVAT may form a specialized team of annotators. This team will be responsible for carrying out the annotations according to your requirements.· Project and Task Creation: CVAT creates a project on their platform, including tasks for annotation. These tasks contain instructions and examples to guide the annotators on how to work with your data.· Data Preparation and Upload: You provide your data (images, videos, etc.), which are then uploaded into the system. CVAT supports various formats, making the upload process easier.· Annotation Process: The annotators begin working on annotating the data. CVAT offers powerful annotation tools, allowing the team to perform their tasks efficiently.· Quality Control: During and after the annotation, quality control is conducted. This may include reviewing the annotators' work and using automated tools to ensure accuracy.· Documentation: CVAT provides documentation for the project, including reports on completed work, quality metrics, and any important comments. This is useful for analysis and reporting.· Delivery of Annotated Data: Once the project is completed, you receive the annotated data in the agreed format, ready for use in your project.· Feedback and Support: The CVAT team remains in contact to gather your feedback on the process and provide support for any questions that may arise.‍Clarifai· Easy Execution: Users can effortlessly upload data in various formats to the Clarifai platform. The labeled data will be returned to the specified format for continued training, whether on Clarifai or another platform.· Expert and Flexible Workforce: The platform reduces the daily management burden of data labeling pipelines by allocating a specialized team based on expertise. A single team will manage the entire project to ensure consistency.· Quality Assurance Checkpoints: Clarifai conducts tests against data samples to ensure quality before finalizing the labeling of the complete training dataset. Users receive regular updates and transparency regarding quality metrics and turnaround times.· More Secure: The platform offers a secure environment for handling image, video, and document data. It adheres to strict security standards and data privacy principles. This allows users to select teams with background checks. The annotation takes place in secure facilities.· Flexible Pricing: Clarifai provides flat-rate pricing, making it easier to outsource data labeling needs and reduce operational overhead. Pricing scales with project growth.· Speed Time to Production: The team utilizes a state-of-the-art platform. This platform employs AI automation to expedite dataset annotation and project completion. It ensures high levels of accuracy.CVAT’s flexible pricing includes options like per-object, per-image, or hourly billing based on project demands. The only limitation for CVAT is that the project cost cannot be less than $5,000.‍Clarifai offers a more fixed project evaluation system, but there is also the option for a customized approach to the project.‍‍Suggestions for self-service on the platform.There are also plans available for independent work on the platform. Below is a comparison.CVAT‍Clarifai‍Additional Areas of ComparisonTo assist you in making an informed choice, here are five distinctions between CVAT and Clarifai:‍Integration with Existing Tools:CVAT's open-source architecture allows for seamless integration with third-party tools and custom pipelines. This makes it a suitable choice for teams with established AI ecosystems. This flexibility enables organizations to tailor their workflows to specific needs. While Clarifai also provides integration options, its emphasis on ready-to-use AI models may limit customization for teams with advanced technical skills.Project Management:CVAT offers robust project management features. These features allow team leaders to assign tasks, monitor progress, and collaborate in real time. This can be particularly beneficial for complex projects involving larger teams. Clarifai provides managed services for annotation and project management, which can streamline processes and support team coordination.Annotation Accuracy:CVAT is equipped with comprehensive annotation tools that are ideal for tasks demanding high precision, such as autonomous driving or medical imaging. Its capabilities allow for detailed data management. Clarifai utilizes AI-driven automation to enhance efficiency. This may be sufficient for many applications. However, it may face challenges with highly complex datasets.Turnaround Time:Clarifai's AI automation and distributed workforce are recognized for delivering faster turnaround times, making it suitable for projects that prioritize speed. Conversely, CVAT focuses on meticulous manual and semi-automated annotation. This ensures a high quality of results. This can be particularly important for complex datasets, even if it may take longer.Security and Data Privacy:CVAT's open-source nature allows for on-premise hosting. This grants organizations full control over data privacy. This is an essential feature for businesses handling sensitive information. Clarifai provides cloud-based solutions with strong security measures. This may appeal to companies that prioritize data security. However, it may not offer the same level of direct data control as CVAT.‍ConclusionCVAT and Clarifai are both powerful data annotation platforms, each serving different needs and applications. CVAT is well-suited for those requiring customizable, precise, and scalable solutions, particularly in sectors like robotics, autonomous driving, healthcare, and surveillance. Its open-source nature allows for easy installation and project management, especially for teams with the technical expertise to handle complex annotation tasks.‍On the other hand, Clarifai is designed for teams that value user-friendliness, automation, and rapid scalability. Its focus on AI features and managed services makes it a strong contender across various industries.‍Are you ready to make your choice? Explore both CVAT and Clarifai to determine which platform aligns best with your project's unique needs and objectives!‍

Industry Insights & Reviews

October 8, 2024

CVAT vs. Clarifai: Which Data Annotation Service Is Right for You?

CVAT, your go-to computer vision annotation tool, now supports the YOLOv8 dataset format.‍‍Version 2.17.0 of CVAT is currently live. Among the many changes and bug fixes, CVAT also introduced support for YOLOv8 datasets for all open-source, SaaS, and Enterprise customers. Starting now, you export annotated data to be compatible with YOLOv8 models.What is the YOLOv8 Dataset Format?YOLOv8, developed by Ultralytics, is the latest version of the YOLO (You Only Look Once) object detection series of models. YOLOv8 is designed for:‍‍Classification: Classifying or organizing an entire image into a set of predefined classes;‍Classification‍‍Object Detection: Detecting, locating, and identifying the class of object in the image or visual data.‍Object Detection‍‍Pose Estimation: Identifying the location and orientation of a person or object within an image by recognizing specific keypoints (also referred to as interest points).‍Pose Estimation‍‍Oriented Bounding Boxes: A step further than object detection and introduce an extra angle to locate objects more accurately in an image.Oriented Bounding Boxes‍‍Instance Segmentation: Pixel-accurate segmentation of objects or people in an image or visual data.‍Instance Segmentation‍With the help of CVAT’s data labeling and annotation tools, YOLOv8 models can be trained to perform the functions as accurately as possible.‍What are the Benefits of Using the YOLOv8 Model for Computer Vision?Ultralytics has used the knowledge and experience garnered from previous iterations of their AI models to create the latest and most advanced YOLOv8. The benefits of using YOLOv8 include, but are certainly not limited to: ‍Highly accurate object detection;Versatility when it comes to detecting multiple objects, classifying and segmenting them, and detecting keypoints within images;Efficient, as the YOLOv8 has been optimized for efficient hardware usage and doesn’t require much computing power to run;Open-source means that the YOLOv8 is always evolving and being built by a passionate community of developers and all its features are easily accessible;And a lot more that would require a much longer list than this. ‍Which Industries Can Benefit from Training YOLOv8 Models?A trained YOLOv8 model can then be used for a variety of tasks. The functionality that YOLOv8 computer vision models can provide will benefit the following industries.‍Computer vision and AI models trained to detect various objects related to automotive are the way of the future in the automotive industry. Self-driving vehicles and traffic management are just a few of the ways that YOLOv8 models will benefit the automotive industry.Automotive use case‍The YOLOv8 object detection model can also offer significant functionality for security. Thanks to highly accurate object tracking and pose estimation, YOLOv8 models can detect intrusions and monitor for unregistered activities or prohibited objects within a given area.‍Security us case‍Using computer vision in retail and logistics will improve the efficiency at which stores maintain their supply and stock. They can also use YOLOv8's powerful object detection function to detect which shelves need to be restocked to improve customer experience.‍Naturally, the robotics industry greatly benefits from AI models with accurate computer vision, as it helps significantly when it comes to problem-solving. With each advancement in computer vision, problem-solving robots get more and more sophisticated as a result.‍Robotics use case‍In construction and architecture, computer vision can identify weak support, foundational problems, and other structural errors. This can help construction crews to detect potentially disastrous errors before any serious problems occur. On top of that, visual surveillance can be paired with AI to help construction managers detect safety hazards before they take place.‍Safety hazards use case‍There are ton of functions for many other industries when it comes to Ultalytics' YOLOv8 model. For now, these are among the most popular use cases for this tech.‍Understanding the Technical Details of the YOLOv8 Dataset FormatThe YOLOv8 dataset format uses a text file for each image, where each line corresponds to one object in the image.‍Each line includes five values for detection tasks: class_id, center_x, center_y, width, and height. These coordinates are normalized to the image size, ensuring consistency across varying image dimensions.‍For tasks like pose estimation, the YOLOv8 format also includes additional keypoint coordinates. Segmentation tasks require the use of polygons or masks, represented by a series of points that define the object boundary. Additionally, oriented bounding boxes can be rotated, which helps in annotating objects not aligned with the image axes.‍Dataset StructureThe YOLOv8 dataset typically includes the following components:‍<dataset directory>/ ├── data.yaml # configuration file ├── train.txt # list of train subset image paths │ ├── images/ │ ├── train/ # directory with images for train subset │ │ ├── image1.jpg │ │ ├── image2.jpg │ │ ├── image3.jpg │ │ └── ... ├── labels/ │ ├── train/ # directory with annotations for train subset │ │ ├── image1.txt │ │ ├── image2.txt │ │ ├── image3.txt │ │ └── ...‍‍Images Folder: This folder contains the images you are training the model on. These images are referenced by the corresponding annotation files.‍Annotations: Each image has a corresponding .txt file with the same name located in the annotations folder. The file structure for detection tasks looks like this:‍# <image_name>.txt: # content depends on format # YOLOv8 Detection: # label_id - id from names field of data.yaml # cx, cy - relative coordinates of the bbox center # rw, rh - relative size of the bbox # label_id cx cy rw rh 1 0.3 0.8 0.1 0.3 2 0.7 0.2 0.3 0.1 # YOLOv8 Oriented Bounding Boxes: # xn, yn - relative coordinates of the n-th point # label_id x1 y1 x2 y2 x3 y3 x4 y4 1 0.3 0.8 0.1 0.3 0.4 0.5 0.7 0.5 2 0.7 0.2 0.3 0.1 0.4 0.5 0.5 0.6 # YOLOv8 Segmentation: # xn, yn - relative coordinates of the n-th point # label_id x1 y1 x2 y2 x3 y3 ... 1 0.3 0.8 0.1 0.3 0.4 0.5 2 0.7 0.2 0.3 0.1 0.4 0.5 0.5 0.6 0.7 0.5 # YOLOv8 Pose: # cx, cy - relative coordinates of the bbox center # rw, rh - relative size of the bbox # xn, yn - relative coordinates of the n-th point # vn - visibility of n-th point. 2 - visible, 1 - partially visible, 0 - not visible # if second value in kpt_shape is 3: # label_id cx cy rw rh x1 y1 v1 x2 y2 v2 x3 y3 v3 ... 1 0.3 0.8 0.1 0.3 0.3 0.8 2 0.1 0.3 2 0.4 0.5 2 0.0 0.0 0 0.0 0.0 0 2 0.3 0.8 0.1 0.3 0.7 0.2 2 0.3 0.1 1 0.4 0.5 0 0.5 0.6 2 0.7 0.5 2 # if second value in kpt_shape is 2: # label_id cx cy rw rh x1 y1 x2 y2 x3 y3 ... 1 0.3 0.8 0.1 0.3 0.3 0.8 0.1 0.3 0.4 0.5 0.0 0.0 0.0 0.0 2 0.3 0.8 0.1 0.3 0.7 0.2 0.3 0.1 0.4 0.5 0.5 0.6 0.7 0.5 # Note, that if there are several skeletons with different number of points, # smaller skeletons are padded with points with coordinates 0.0 0.0 and visibility = 0‍‍data.yaml: This configuration file defines the dataset structure for training. It includes paths to the images and annotation files and lists all class names. An example of a data.yaml file looks like this:‍path: ./ # dataset root dir train: train.txt # train images (relative to 'path') # YOLOv8 Pose specific field # First number is the number of points in a skeleton. # If there are several skeletons with different number of points, it is the greatest number of points # Second number defines the format of point info in annotation txt files kpt_shape: [17, 3] # Classes names: 0: person 1: bicycle 2: car # ... ‍This lightweight and modular format allows for flexibility and scalability in your machine-learning pipeline. It also means that it can undertake a wide range of computer vision tasks, including object detection, pose estimation, segmentation, and oriented bounding boxes. For more technical details and in-depth usage, you can explore the full YOLOv8 format documentation.‍How to Use the YOLOv8 Dataset Format in CVATExporting YOLOv8 DatasetsAfter completing annotations in CVAT, exporting them in a YOLOv8 format is straightforward. Here’s how you can do it:‍‍Export Your Dataset: Once your annotations are ready, CVAT allows you to export them in YOLOv8 format, ensuring they are perfectly structured for use in YOLOv8 models. This includes annotations for detection, pose, oriented bounding boxes, and segmentation tasks. For detailed instructions on exporting your dataset, you can refer to the Exporting Annotations Guide.‍Train Your YOLOv8 Model: With your annotations exported, you can now directly integrate them into Ultralytics' YOLOv8 training pipeline. The dataset will be ready to train your model for detection, pose estimation, or segmentation tasks without the need for conversion. For further guidance on training your YOLOv8 models using Python, check out the Ultralytics YOLOv8 Python Usage Guide.‍Importing YOLOv8 DatasetsIn addition to exporting datasets, CVAT also supports importing datasets that are already in the YOLOv8 format. This feature allows you to bring external datasets and annotations into CVAT for further refinement or use in different projects. You can import both annotations and images for detection, oriented bounding boxes, segmentation, and pose estimations.‍To learn more about how to import YOLOv8 datasets and annotations, follow the detailed instructions in our Dataset Import Guide.‍F.A.Q.Which CVAT users have access to YOLOv8 support?All CVAT users, including open-source, SaaS, and Enterprise, have access to annotation tools for the YOLOv8 computer vision model.‍How good is YOLOv8 object detection?A YOLOv8 computer vision model trained with data annotated through CVAT can be very accurate in identifying various objects in visual data. YOLOv8 models can identify object borders down to the pixel, making them incredibly powerful when it comes to object detection.‍What functions do YOLOv8 models perform in computer vision?As listed above, YOLOv8's functions include classification, object detection, pose estimation, oriented bounding boxes, and instance segmentation.‍Start Using YOLOv8 in CVAT Today!The additional support for YOLOv8 dataset formats is a major milestone for CVAT. All open-source, SaaS customers and Enterprise clients are welcome to try out CVAT to help you train a YOLOv8 model for all manner of computer vision uses.‍For more information, visit our YOLOv8 format documentation. ‍Not a CVAT.ai user? Click through and sign up here.‍Do not want to miss updates and news? Have any questions? Join our community:‍Facebook‍DiscordLinkedInGitterGitHub

September 17, 2024

CVAT for YOLOv8 Models: A New Era for Your Data Annotation Needs

‍We are excited to announce a new feature for our enterprise clients: we've added Security Assertion Markup Language Single Sign-On (SAML SSO) support into the CVAT platform. This addition underscores our commitment to providing secure and flexible solutions tailored to the needs of large organizations.‍What is SAML SSO and Why Does It Matter?‍SAML is a well-established and trusted SSO standard widely adopted by many companies due to its robust security features. It allows users to authenticate across multiple applications using a single set of credentials, significantly simplifying the login process while enhancing security. ‍CVAT.ai SSO Proposal‍Better User Experience: SAML SSO simplifies the login process for users by allowing them to access multiple applications with a single set of credentials. This reduces the time spent on managing various logins and enhances overall productivity.‍‍Improved Security: SAML is known for its rigorous security standards, making it the preferred choice for many large organizations. ‍We understand that every enterprise has unique requirements. That is why CVAT.ai supports both SAML and OIDC (OpenID Connect), another popular SSO protocol. Enterprises can choose the protocol that best fits their infrastructure and security policies.‍Get Started Today‍With the new SAML SSO integration, your enterprise can enjoy a more secure, streamlined, and flexible authentication process. Whether you already use CVAT or consider it part of your enterprise's workflow, this new feature ensures you have the best tools to manage security and access effectively.‍‍Not a CVAT.ai user? Click through and sign up here‍Do not want to miss updates and news? Have any questions? Join our community:‍Facebook‍DiscordLinkedInGitterGitHub‍

Product Updates

August 29, 2024

Enterprise CVAT.ai Clients Can Now Benefit from Enhanced Security with SAML SSO Integration

In a significant update for computer vision enthusiasts and professionals, the powerful Segment Anything 2 model has been integrated into the Computer Vision Annotation Tool (CVAT.ai). This cutting-edge technology, developed by Meta, improves the image segmentation speed and accuracy and streamlines the annotation process. ‍So, what's new in the SAM 2?‍SAM 2 dramatically improves over earlier methods in image annotation without prior training on 17 different datasets. It also reduces the need for human involvement by about three times, making the process much more efficient.SAM 2 performs better than its predecessor, SAM, on a suite of 23 datasets without prior training and operates six times faster.Using SAM 2 feels like real-time processing, as it can handle about 44 frames per second.Using SAM 2 for video segmentation annotation in the loop is 8.4 times quicker than manual per-frame annotation with the original SAM.‍CVAT.ai Cloud: Segment Anything Model v2 Now Available for Image Segmentation‍CVAT has integrated "Segment Anything 2" into its SaaS version, improving the platform's capabilities for image segmentation.Integrating Meta AI's advanced machine learning models transforms CVAT into a more powerful tool for various users, ranging from academic researchers to industry professionals. This integration highlights a mutual commitment to advancing the field of computer vision. For now, in CVAT.ai, SAM 2 works only for images today, but video support will be added soon!We've Added Bounding Box Input‍The public version of CVAT.ai now supports optional bounding box input for Segment Anything 2. This feature allows users to define areas to annotate more quickly and accurately, enhancing the efficiency of model training processes for various applications.‍‍CVAT.ai Enterprise Edition: Added Segment Anything Model v2 CVAT has stepped up its game for Enterprise users by integrating Segment Anything 2 interactor support. This edition is tailored to meet the high demands of corporate environments where precision and scalability are critical. Enterprises can leverage this feature to handle complex segmentation tasks more effectively, ensuring higher accuracy and productivity in machine learning projects.‍‍‍Not a CVAT.ai user? Click through and sign up here‍Do not want to miss updates and news? Have any questions? Join our community:‍Facebook‍DiscordLinkedInGitterGitHub‍

Product Updates

August 15, 2024

Meta's SAM 2 is Now Available in CVAT Online for Image Segmentation

In the dynamic world of computer vision, staying current with technology advancements is not just beneficial—it's critical. This is particularly true for organizations that use self-hosted solutions for the Computer Vision Annotation Tool (CVAT.ai). ‍Regular updates to such a tool are essential for several reasons: security, improved functionality, ensuring compatibility, and maintaining operational efficiency. This article explores why regularly updating your self-hosted CVAT.ai solution is crucial for maintaining a competitive edge and operational reliability.‍This article is divided into two parts: the first addresses 'why' regular updates are necessary, and the second explains 'how' to implement these updates effectively.‍Why is it Necessary to Update CVAT.ai Regularly?‍Improved Security: One of the most compelling reasons to regularly update your self-hosted CVAT is to enhance security. Although the latest version of CVAT.ai is secure, the threat landscape constantly evolves. New vulnerabilities are discovered daily, and the CVAT.ai Team releases patches to mitigate these risks. By staying updated, you safeguard your system against vulnerabilities that malicious actors could otherwise exploit. Regular updates are crucial for maintaining the integrity of your data and ensuring the privacy of the information processed by CVAT.‍Access to Latest Features: CVAT is continuously improved by a community of developers who add new functionalities and enhancements. These updates can include everything from improved annotation algorithms, support for new formats, and enhanced user interfaces to integration capabilities with other tools and platforms. ‍Compatibility and Integration: As your IT environment evolves, new versions of dependent software and hardware are introduced. Regularly updating CVAT ensures compatibility with other software tools and infrastructure changes. For example, updates may be needed for CVAT.ai to operate smoothly with newer versions of browsers, operating systems, or integrations with third-party APIs and services. Maintaining an updated system prevents disruptions caused by compatibility issues, which can be costly and time-consuming to resolve after the fact.‍Operational Reliability: Regular updates introduce new features and improvements, including optimizations that enhance CVAT's performance and stability. These optimizations can lead to faster load times, improved response times, and more efficient data processing, enhancing the system's overall reliability. For businesses relying heavily on computer vision technologies, operational reliability is non-negotiable.‍How to Update CVAT?‍Before we delve into the procedure, it’s important to note that the steps described here apply only to standard CVAT.ai standard public images.‍If you have created a custom image that we need to be aware of, we assume you are technically proficient and can handle the necessary updates tailored to your image.‍Step 1: Back Up Your Data‍Before making any changes to your CVAT installation, it's essential to back up your data. This ensures you can restore your system to its previous state if something goes wrong during the update.For more information, see CVAT.ai Backup Guide.‍Step 2: Stop the Old Version‍You need to stop the currently running version of the application to avoid potential conflicts.Use the Docker compose command to stop the running CVAT.ai container.‍Step 3: Pull Updates from Repository‍Once the system is halted, you can safely update the software by pulling the latest changes from the CVAT GitHub repository. You must download the entire source code, not just the Docker Compose configuration file.To see if the new version was released and to check the latest changes, use CVAT.ai Changelog.You must also check and update the additional components at this stage.‍Step 4: Handle Personal Customizations‍If you have custom configurations, such as a database managed outside Docker, you must ensure these are compatible with the new version. Review your configurations and make necessary adjustments to ensure they work with the new version of CVAT. In some cases, you need to build images locally; see this Guide for details.‍Step 5: Run the New Version‍After updating the software and adjusting your customizations, you can start the new version of CVAT.Start CVAT container: Use Docker commands to run the new CVAT containers; see the Upgrade Guide for details.‍Step 6: Manual Updates Where Needed‍Sometimes, you may need to update custom external components or manually handle migration scripts.‍And that's it!You now have new updates CVAT.ai with all necessary security improvements and features!‍Looks Too Complicated?‍Updating and managing CVAT can sometimes feel complex, mainly when you're focused on annotating and training models for your work or research. If you'd prefer to leave the sysadmin and DevOps tasks to someone else, CVAT offers installation support and help managing Enterprise self-hosted solutions. Explore our enterprise proposals and plans to find the right level of support for your needs. Alternatively, consider using our online version—it's always up-to-date and secure, so you can focus solely on annotating without hassle.‍‍Not a CVAT.ai user? Click through and sign up here‍Do not want to miss updates and news? Have any questions? Join our community:‍Facebook‍DiscordLinkedInGitterGitHub‍

August 8, 2024

Why is it Essential to Keep CVAT Updated?

Computer Vision Annotation Tool (CVAT) was started by Intel in 2017 and launched publicly on GitHub in the middle of 2018. In 2022, the platform became a core IP of an independent CVAT.ai Corporation, which we consider our founding year. With over seven years of experience, CVAT.ai has embarked on a mission to transform the field of data annotation and image labeling. We are proud of our remarkable journey and the milestones we have achieved.‍Our platform has become a cornerstone for data scientists, machine learning engineers, researchers, and students striving for excellence in artificial intelligence. ‍An anniversary is more than just a date; it symbolizes our growth, achievements, and the vibrant community we have fostered.‍The following post will outline our achievements from last year and revisit the company's history!Best Moments of The Year‍There were some ups and downs, but we are here to celebrate the results of our efforts. No hard feelings—lessons were learned and will not be forgotten. Today, we focus on the good parts and celebrate the fruits of our hard work:‍September 2023: We've reached 10,000 stars on GitHub, and we're still going strong—today, we have nearly 12,000 stars! We want to thank every one of you for your support. We also welcome stars as a gift, so if you'd like to cheer us up and help make our data labeling tool even more popular, please visit our GitHub and give us a star.November 2023: CVAT.ai plays a crucial role in the crowdsourcing annotation of Computer Vision datasets; therefore, in collaboration with Human Protocol, we have successfully launched the crowdsourcing data annotation project in several iterations:some textFebruary 2023: We’ve aired the first experiment in Crowdsourcing Annotation with CVAT and Human Protocol.November 2023: We have continued to push forward, and through a combined effort with our Human Protocol partners, we have made the integration more user-friendly for annotators and clients whose data needs to be annotated.November 2023: With Human Protocol, we warmly welcomed speakers at the Newconomics 2023 Conference.This initiative makes data annotation more affordable for AI companies needing annotated data. We are continuing to collaborate with Human Protocol to unlock and democratize AI.‍February 2024: CVAT.ai joined Google Summer of Code 2023, and we are still actively working on the project, which we consider a success!April 2024: We've introduced Annual Plans, helping our loyal and devoted users save up to 30% on data annotation tools. We have maintained transparent pricing, which significantly aids in budget planning!May 2024: The CVAT.ai Labeling Service is stellar and thriving. We have several hundred annotators who work across various fields, consistently meeting deadlines and maintaining high-quality standards. Our client base includes large enterprises in retail and other sectors, featuring customers from the top 100 enterprises worldwide. Their satisfaction with our services brings us great joy.May 2024: CVAT.ai was recognized as a top-choice data annotation tool at the Embedded Vision Summit 2024 (EVS 2024).‍Looking Forward‍As we celebrate this milestone, we are more committed than ever to pushing the boundaries of what CVAT.ai can achieve. ‍We extend our heartfelt thanks to our users, contributors, and partners who have been part of this incredible journey. Your support and collaboration have been instrumental in our success.‍Here's to more years of innovation, growth, and success with CVAT.ai!‍Stay connected with us, be curious, keep annotating!‍‍Not a CVAT.ai user? Click through and sign up here‍Do not want to miss updates and news? Have any questions? Join our community:‍Facebook‍DiscordLinkedInGitterGitHub‍

Company News

August 1, 2024

Video Annotation Guide (Applications, Techniques & Best Practices)

Save Time,
Annotate Better

Subscribe to the CVAT Newsletter

Product & Services

Company

Resources

Blog

Video Annotation Guide (Applications, Techniques & Best Practices)

Medical Data Annotation: Improving AI Accuracy in Healthcare

CVAT AI agents: What's new?

CVAT Now Supports Video Annotation with SAM 2

Introduction to Image Annotation for Computer Vision and AI Model Training

How to Create Data Labeling Specifications for Your Annotation Project: A Client's Guide (+ Free Template)

Announcing CVAT AI Agents: A New (and Better) Way to Automate Data Annotation using Your Own Models

CVAT vs. Clarifai: Which Data Annotation Service Is Right for You?

CVAT for YOLOv8 Models: A New Era for Your Data Annotation Needs

Enterprise CVAT.ai Clients Can Now Benefit from Enhanced Security with SAML SSO Integration

Meta's SAM 2 is Now Available in CVAT Online for Image Segmentation

Why is it Essential to Keep CVAT Updated?

CVAT.ai Birthday is Here: See Our Achievements in the Field of Data Annotation and Image Labeling

Save Time, Annotate Better

Subscribe to the CVAT Newsletter

Product & Services

Company

Resources

Save Time,
Annotate Better