This option allows you to use more samples from the under-represented concepts in your view.
Consider a view with k concepts. Each region can have:
one label in the case of classification/detection
several labels in the case of tagging
The goal of balancing is to tend toward a uniform representation of each label across all regions. The algorithm uses loss-based balancing to do the following (sketched in code after these steps):
Establish the initial label distribution, e.g. {'cat': 54, 'dog': 13, 'horse': 79}
Compute the entropy of that distribution, which measures how uniform it is
Compute a "score" for each data point, which is its individual loss. It represents how prevalent the concepts attached to that data point are in the dataset.
Obtain a list of samples sorted by their prevalence, select the least prevalent, and add them to the dataset. Then update the losses with the newly added data points.
Repeat this operation until the entropy no longer increases, which means we have done our best at maximizing uniformity.
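Here is a minimal sketch of this loop in Python, under simplifying assumptions: the per-sample score is approximated by raw label counts rather than the actual individual loss, and the function names (entropy, balance) are illustrative only, not the platform's API. The sketch also encodes the two efficiency rules described below.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy (in nats) of a label distribution; maximal when uniform."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values() if c > 0)

def balance(samples, max_growth=2.0):
    """Oversample under-represented samples until entropy stops increasing.

    `samples` is a list of label lists, e.g. [["cat"], ["dog", "horse"], ...].
    Prevalence is approximated by raw label counts here; the actual algorithm
    scores each data point by its individual loss.
    """
    counts = Counter(label for labels in samples for label in labels)
    dataset = list(samples)
    limit = int(max_growth * len(samples))  # rule: at most twice the original size
    used = set()                            # rule: avoid re-using the same sample
    current = entropy(counts)

    while len(dataset) < limit:
        candidates = [i for i in range(len(samples)) if i not in used]
        if not candidates:
            break
        # The sample whose labels are least prevalent is the best to duplicate.
        best = min(candidates, key=lambda i: sum(counts[l] for l in samples[i]))
        trial = counts.copy()
        trial.update(samples[best])
        new = entropy(trial)
        if new <= current:  # duplication no longer improves uniformity: stop
            break
        counts, current = trial, new
        dataset.append(samples[best])
        used.add(best)
    return dataset
```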
Additionally, the following rules are used to keep balancing efficient:
The dataset will not expand to more than twice its original size
The algorithm avoids re-using the same samples to balance out the entropy, to prevent overfitting on them.
It's possible that the dataset will not be perfectly balanced in the end. Consider the case where the distribution is {A -> 55, B -> 41, C -> 2}, with class C represented in only two regions. As per the second rule above, we cannot duplicate those samples 53 times to match the cardinality of A. When we stop duplicating samples because none of the available ones brings the dataset closer to balance, the entropy stops increasing, and that is the signal to stop.
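Using the illustrative entropy() helper from the sketch above, the effect can be checked by hand: duplicating each of the two C regions once raises the entropy, after which the re-use rule blocks any further duplication of those regions.

```python
# Label counts before balancing and after duplicating each C region once:
print(entropy({'A': 55, 'B': 41, 'C': 2}))  # ~0.768 nats
print(entropy({'A': 55, 'B': 41, 'C': 4}))  # ~0.823 nats
```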
This applies to sub views of detection views
You can train on a dataset composed of images that are regions, or crops, of a parent "Detection" view. This sub view can be any kind of task. Typically, when training on a view whose parent is a detection view, we crop the region out of the original image using the coordinates from the parent view. This parameter expands that crop: the sample region is enlarged from the original image by the percentage given in the parameter.
This can be useful if important elements of context are situated next to the bounding box.
Example: you are tasked with detecting animals. You create a first view to detect any animal, and a second one to identify the kind of animal.
Your parent view correctly detects animal instances in a bounding box. Next, you need to predict which kind of animal it is. For that task, the tail could be useful; unfortunately, your bounding box did not include the tail of the animal.
Adding a crop margin will allow the training engine to take a larger crop from the image and include the tail.
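To make the mechanics concrete, here is a minimal sketch of how such a margin expansion could be computed. The expand_crop function, its signature, and the 10% margin value are hypothetical illustrations, not the training engine's actual API.

```python
def expand_crop(box, margin, image_width, image_height):
    """Expand a bounding box (x, y, w, h) by `margin` (a fraction, e.g. 0.10
    for 10%) on each side, clamped to the bounds of the original image."""
    x, y, w, h = box
    dx, dy = w * margin, h * margin
    x0 = max(0, int(x - dx))
    y0 = max(0, int(y - dy))
    x1 = min(image_width, int(x + w + dx))
    y1 = min(image_height, int(y + h + dy))
    return (x0, y0, x1 - x0, y1 - y0)

# A 10% margin around a 200x100 box at (50, 40) in a 640x480 image:
print(expand_crop((50, 40, 200, 100), 0.10, 640, 480))  # -> (30, 30, 240, 120)
```

Note that the expanded crop is clamped to the image bounds, so boxes near the image border are enlarged only as far as the original image allows.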