January 2025
We moved training from Celery to an Argo-based job system, which makes the training process more stable, observable and robust. It also unlocks new features:
Stopping a training: you can now stop a training directly from the model training page.
Observability: you can now download the logs of every training. The log file contains information about the training (parameters, optimizer, dataset options and so on) as well as the error messages if the training failed.
We are introducing a new Job management page in the Drive part of the platform. It lets you manage trainings, imports and exports at the Organisation level, gives better visibility over all running and past jobs, and lets users monitor every job and download its logs.
Our system now supports batch inference, which significantly reduces response time.
How it works: when the same model is applied several times in a row to the same image, we no longer run inference on each part of the image one by one. Instead, we run inference in a single batch over the different crops of the image. This is particularly useful when classifying multiple objects in one image.
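To make the idea concrete, here is a minimal sketch of batching crops from one image into a single model call. It is an illustration only, not the platform's implementation: the names `crop`, `classify_crops`, `predict_batch`, `CountingModel`, the box convention and the `batch_size` default are all assumptions made for this example, and the "model" is a toy stand-in that labels a crop by its mean pixel value.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right); hypothetical convention
Image = List[List[int]]          # toy grayscale image stored as nested lists


def crop(image: Image, box: Box) -> Image:
    """Cut one rectangular region out of the image."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def classify_crops(
    image: Image,
    boxes: Sequence[Box],
    predict_batch: Callable[[List[Image]], List[str]],
    batch_size: int = 8,
) -> List[str]:
    """Classify every crop, sending up to batch_size crops per model call."""
    crops = [crop(image, box) for box in boxes]
    labels: List[str] = []
    for start in range(0, len(crops), batch_size):
        labels.extend(predict_batch(crops[start:start + batch_size]))
    return labels


class CountingModel:
    """Hypothetical stand-in model that also counts how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, batch: List[Image]) -> List[str]:
        self.calls += 1
        results = []
        for item in batch:
            pixels = [p for row in item for p in row]
            mean = sum(pixels) / len(pixels)
            results.append("bright" if mean > 127 else "dark")
        return results


# Toy image: left half dark (0), right half bright (255).
image = [[0] * 4 + [255] * 4 for _ in range(4)]
boxes = [(0, 0, 4, 4), (0, 4, 4, 8), (0, 0, 2, 2)]

model = CountingModel()
labels = classify_crops(image, boxes, model)
```

With three crops and the default batch size, the stand-in model is called once instead of three times. With a real model, that is where the time is saved, because the fixed cost of each call is paid once per batch rather than once per crop.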