YOLO Optimization
Datasets
One key factor in optimizing model accuracy is the dataset. A dataset must be clean and well prepared before it is fed into training.
Good data = good model, and bad data = bad model.
Now, what counts as good data and what counts as bad data?
Characteristics of Good Data
One of the most important qualities of data is accuracy. Especially when working on a computer vision model, properly labelled data is essential for the model to learn correctly. The next quality is high image quality: resolution does not matter as much as you might expect in computer vision, but the subject must be clearly and visually identifiable for the data to be considered high quality.
(insert comparison of high-quality and low-quality data: a motorcycle with motion blur)
Bad data, meanwhile, is the exact opposite of good data: inaccurate labels and subjects that are hard to identify.
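To make "properly labelled" concrete, here is a minimal sketch that draws the boxes from a YOLO-format label file onto its image so you can eyeball whether the annotations actually line up with the subjects. The file paths are illustrative placeholders.

```python
# Spot-check label accuracy: draw YOLO-format boxes
# (class x_center y_center width height, all normalized to 0-1) on an image.
# File paths below are illustrative placeholders.
import cv2  # pip install opencv-python

img = cv2.imread("datasets/train/images/motorcycle_001.jpg")
h, w = img.shape[:2]

with open("datasets/train/labels/motorcycle_001.txt") as f:
    for line in f:
        cls, xc, yc, bw, bh = line.split()
        # Convert normalized center/size to pixel corner coordinates
        xc, yc, bw, bh = float(xc) * w, float(yc) * h, float(bw) * w, float(bh) * h
        x1, y1 = int(xc - bw / 2), int(yc - bh / 2)
        x2, y2 = int(xc + bw / 2), int(yc + bh / 2)
        cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)

cv2.imwrite("labeled_check.jpg", img)  # open this file to verify the boxes
```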
Training
This type of training is what we call transfer learning: when training on our custom dataset, we start from YOLO's pretrained weights, which lets the model recognize some objects faster and makes training more efficient. When training with YOLO, there are several parameters that can be set to get a better and more accurate model. Here is a list of those parameters (a Python sketch using them follows the table):
| Parameter | Description |
| ---------- | ----------- |
| `model` | Specifies the model used to train your custom dataset. This lets training start from YOLO pretrained weights, i.e. a `.pt` file. |
| `data` | The path to the dataset configuration file, commonly named `data.yaml`. It contains the dataset parameters, including the class names, the number of classes, and the training and validation data paths. |
| `epochs` | The number of full passes over the dataset. I think of each epoch as one pass to review everything. More epochs does not automatically mean a better or more accurate model. |
| `patience` | The number of epochs to wait when there are no further improvements. Training stops automatically if the model's performance stops improving. |
| `imgsz` | The training image size. All images default to 640, but if your GPU cannot support that, you can train at a lower resolution. |
| `pretrained` | A boolean, either true or false. When enabled, training loads from specific weights, improving training efficiency and model performance. |
There are many more parameters listed in the official YOLO documentation, which can be found here.
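As a sketch of how these parameters fit together, here is an equivalent training run through the Ultralytics Python API. The dataset path is an illustrative placeholder.

```python
from ultralytics import YOLO  # pip install ultralytics

# Start from pretrained YOLO11 weights (transfer learning); the checkpoint
# downloads automatically on first use.
model = YOLO("yolo11x.pt")

# Train on a custom dataset. data.yaml holds the class names, number of
# classes, and train/val paths; "path/to/data.yaml" is a placeholder.
model.train(
    data="path/to/data.yaml",
    epochs=150,       # full passes over the dataset
    patience=10,      # early stop after 10 epochs with no improvement
    imgsz=416,        # below the 640 default to fit limited GPU VRAM
    pretrained=True,  # load from pretrained weights
)
```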
Now, how do we increase model accuracy with these parameters? I normally use Google Colab to train, and I have learned to work with limited resources, so I try to train efficiently and accurately at the same time. For training I use this command:
```bash
!yolo task=detect mode=train pretrained=True model=yolo11x.pt data={dataset.location}/data.yaml epochs=150 imgsz=416 patience=10
```
Let's break down the settings we have. For the model we used `yolo11x`, the largest and most accurate of the YOLO11 models, to get more accurate and better predictions. Next, `epochs=150` and `patience=10` give the model a good balance against overfitting, or in simpler words training too much, because more epochs does not mean a more accurate model. Lastly, `imgsz=416` was adjusted down to accommodate the limitations of the T4 GPU on Google Colab: lowering the resolution reduces GPU VRAM usage and lets us train with fewer resources.
Model Output
Now we have the model output, but how can we determine whether the model is accurate enough? We measure it with `mAP50` and `mAP50-95`. mAP stands for mean Average Precision: `mAP50` is the model's accuracy at an IoU threshold of 0.5, while `mAP50-95` averages it across thresholds from 0.5 to 0.95. The two always give different outputs, with `mAP50` always coming out higher than `mAP50-95`, and we are targeting at least 0.80 to 0.95 to say that we have an accurate model.
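To read these numbers off a finished run, here is a minimal sketch using the Ultralytics validation API. The weights and dataset paths are illustrative placeholders.

```python
from ultralytics import YOLO

# Load the best checkpoint from a finished training run (path is illustrative)
model = YOLO("runs/detect/train/weights/best.pt")

# Run validation and print the two accuracy metrics discussed above
metrics = model.val(data="path/to/data.yaml")
print(metrics.box.map50)  # mAP at IoU threshold 0.50
print(metrics.box.map)    # mAP averaged over IoU 0.50 to 0.95
```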
But sometimes we don't get that. An average training run reaches around 70 to 75 percent accuracy, and this can be improved if the dataset is properly fitted. I use a tool named ClearML to track my training, which lets me visualize the model's accuracy over time and adjust the training parameters accordingly, giving me a more accurate model.
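As a minimal sketch of that tracking setup, assuming the `clearml` Python package is installed and initialised (`pip install clearml`, then `clearml-init` to store credentials), Ultralytics logs training runs to ClearML automatically; the project and task names below are illustrative placeholders.

```python
# Assumes `pip install clearml ultralytics` and that `clearml-init` has been
# run. Once ClearML is configured, Ultralytics logs each training run
# (losses, mAP per epoch) to ClearML with no extra code.
from clearml import Task
from ultralytics import YOLO

# Optional: name the experiment yourself; names here are placeholders.
task = Task.init(project_name="yolo-optimization", task_name="yolo11x-imgsz416")

model = YOLO("yolo11x.pt")
model.train(data="path/to/data.yaml", epochs=150, imgsz=416, patience=10)
# The accuracy curves then appear in the ClearML web UI for comparison
# across runs, which is how the parameters above can be tuned over time.
```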