An audit goes through the following steps:
When you launch an audit, a neural network is trained on your dataset. The training parameters are:
Classification and tagging: MobileNet v2 with learning rate 0.01 and 1000 iterations
Detection: SSD MobileNet v2 with learning rate 0.004 and 1000 iterations
Once the model is trained, we compute predictions for every image in the dataset, in both the training set and the validation set.
For classification and tagging views, this gives a score for each concept the model has learned, along with thresholds determined from the validation set.
For detection views, this similarly gives one or more boxes, along with thresholds determined from the validation set.
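As an illustration of how scores and thresholds could combine into predicted concepts, here is a minimal Python sketch. The function name `predicted_concepts`, the per-concept threshold dictionary, the `>=` comparison, and the rule of keeping only the top concept for classification are all assumptions made for the example; the text above does not specify how the thresholds are applied.

```python
def predicted_concepts(scores, thresholds, single_label=False):
    """Turn per-concept scores into a set of predicted concepts.

    scores:      dict mapping concept name -> model score
    thresholds:  dict mapping concept name -> threshold (assumed per concept)
    single_label: True for classification (at most one concept kept),
                  False for tagging (every concept above its threshold kept).
    """
    # Keep only the concepts whose score reaches their threshold (assumed rule).
    kept = {c: s for c, s in scores.items() if s >= thresholds[c]}
    if single_label and kept:
        # Classification: keep the single best-scoring concept.
        best = max(kept, key=kept.get)
        return {best}
    return set(kept)


scores = {"cat": 0.8, "dog": 0.6, "bird": 0.1}
thresholds = {"cat": 0.5, "dog": 0.5, "bird": 0.5}
tagging_result = predicted_concepts(scores, thresholds)
classification_result = predicted_concepts(scores, thresholds, single_label=True)
```

With these hypothetical values, tagging keeps both "cat" and "dog", while classification keeps only "cat".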
For classification and tagging views, matching annotations and predictions is straightforward: we compare the predicted concept with the annotated one, which gives a list of potential errors.
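The comparison described above can be sketched in a few lines of Python. The data layout (dictionaries mapping an image identifier to a set of concepts) and the name `find_potential_errors` are hypothetical and only serve the example.

```python
def find_potential_errors(annotations, predictions):
    """Return images whose predicted concepts differ from the annotated ones.

    annotations, predictions: dict mapping image id -> set of concept names
    (a hypothetical layout chosen for this sketch).
    """
    errors = []
    for image_id, annotated in annotations.items():
        # An image without any prediction is treated as an empty prediction.
        predicted = predictions.get(image_id, set())
        if predicted != annotated:
            errors.append(
                {"image": image_id, "annotated": annotated, "predicted": predicted}
            )
    return errors


annotations = {"a.jpg": {"cat"}, "b.jpg": {"dog"}}
predictions = {"a.jpg": {"cat"}, "b.jpg": {"cat"}}
errors = find_potential_errors(annotations, predictions)
```

Here only "b.jpg" is reported, since it is annotated "dog" but predicted "cat".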
For detection views, we work at the level of individual boxes rather than the image as a whole. For each pair of predicted and annotated boxes, we compute the IoU (Intersection over Union) and consider the pair a match when the value is higher than 0.3. A dedicated algorithm then determines the best possible combination of matches.
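To make the box matching concrete, here is a minimal Python sketch of the IoU computation with the 0.3 threshold. The matching strategy shown is a simple greedy pass (highest IoU first), used here as a stand-in: the actual matching algorithm is not described above and may differ. The box format `(x_min, y_min, x_max, y_max)` and the function names are assumptions, and concept labels are ignored for simplicity.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_boxes(annotated, predicted, threshold=0.3):
    """Greedy one-to-one matching (an assumed stand-in for the real algorithm).

    Returns (matches, unmatched_annotated, unmatched_predicted), all as indices.
    """
    # Score every annotated/predicted pair, best IoU first.
    pairs = sorted(
        ((iou(a, p), i, j)
         for i, a in enumerate(annotated)
         for j, p in enumerate(predicted)),
        reverse=True,
    )
    matches, used_a, used_p = [], set(), set()
    for score, i, j in pairs:
        if score <= threshold:  # a match requires IoU strictly higher than 0.3
            break
        if i in used_a or j in used_p:
            continue  # each box can take part in at most one match
        matches.append((i, j))
        used_a.add(i)
        used_p.add(j)
    unmatched_a = [i for i in range(len(annotated)) if i not in used_a]
    unmatched_p = [j for j in range(len(predicted)) if j not in used_p]
    return matches, unmatched_a, unmatched_p


annotated = [(0, 0, 10, 10), (20, 20, 30, 30)]
predicted = [(1, 1, 11, 11), (50, 50, 60, 60)]
matches, missing, extra = match_boxes(annotated, predicted)
```

In this example the first annotated box matches the first predicted box (IoU of 81/119, about 0.68), while the second annotated box and the second predicted box remain unmatched and would both be potential errors. A greedy pass does not always find the globally best combination; an optimal assignment method could be substituted if that matters.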
Whatever the type of view, at the end of this third step we have a set of potential errors: for tagging and classification, images whose predicted and annotated concepts differ; for detection, boxes (annotated or predicted) for which there was no match.
These potential errors are then filtered against the review history: errors that have already been corrected are used to avoid raising identical errors again.
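One way to picture this filtering is the Python sketch below. It assumes that an error can be identified by a hashable tuple `(image id, annotated concept, predicted concept)` and that the review history is a collection of such tuples; both the representation and the name `filter_by_history` are hypothetical, since the exact criterion for "identical errors" is not detailed here.

```python
def filter_by_history(potential_errors, reviewed_errors):
    """Drop potential errors that were already handled in a past review.

    Both arguments are iterables of hashable error keys, for example
    (image_id, annotated_concept, predicted_concept) tuples (assumed format).
    """
    reviewed = set(reviewed_errors)
    return [error for error in potential_errors if error not in reviewed]


potential = [("a.jpg", "cat", "dog"), ("b.jpg", "dog", "cat")]
history = [("a.jpg", "cat", "dog")]
remaining = filter_by_history(potential, history)
```

With this hypothetical history, only the error on "b.jpg" is kept, because the one on "a.jpg" has already been reviewed.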