Choosing the right architecture

Characteristics of the architectures

Depending on your needs, you can choose any of the provided architectures for your training.

Network size

  • EfficientNets are bigger than the others, and the size increases as their identifying number goes up.

  • The other ResNets (those besides ResNet50), YOLO, Inception and VGG are intermediate-sized networks.

  • MobileNets and ResNet50 are relatively small networks.

Inference speed

The size of an architecture directly affects how quickly it makes inferences, so choose a network that is appropriate for your needs.

If each prediction must be executed quickly, favor a smaller, more efficient model. A signal takes longer to propagate through a larger architecture, which increases the inference time. High inference times add up if your workflow runs numerous models in succession.
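
As a rough illustration of how inference times accumulate, the sketch below sums per-model latencies for models run one after another. The function names and the latency figures are hypothetical placeholders, not measurements of any of the provided architectures.

```python
def pipeline_latency_ms(model_latencies_ms):
    """Time for one input to pass through every model, run in succession."""
    return sum(model_latencies_ms)


def max_inputs_per_second(model_latencies_ms):
    """Upper bound on throughput when inputs are processed one at a time."""
    return 1000.0 / pipeline_latency_ms(model_latencies_ms)


# Hypothetical per-model inference times, in milliseconds.
latencies_ms = [40, 25, 60]
total_ms = pipeline_latency_ms(latencies_ms)
print(f"{total_ms} ms per input, at most {max_inputs_per_second(latencies_ms):.1f} inputs/s")
```

Swapping a single slow model for a faster one lowers the total by exactly the difference in their latencies, so the slowest model in the chain is usually the first one worth reconsidering.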

Storage

A bigger model has more neurons and therefore more weight values to save. If you export your model to a device with restricted memory, a bigger model will require more RAM.
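
To get a feel for the orders of magnitude, the sketch below estimates the size of the stored weights from the number of weight values. It assumes 32-bit floating-point weights (4 bytes each); the weight counts are hypothetical and only show how the size scales.

```python
def weights_size_mib(num_weights, bytes_per_weight=4):
    """Approximate size of the stored weights, in MiB.

    bytes_per_weight=4 assumes 32-bit floats; this is an assumption,
    not a statement about the actual export format.
    """
    return num_weights * bytes_per_weight / (1024 ** 2)


# Hypothetical weight counts, chosen only to show the scaling.
for label, num_weights in [("small", 4_000_000), ("large", 60_000_000)]:
    print(f"{label}: about {weights_size_mib(num_weights):.0f} MiB of weights")
```

Memory used at inference time is typically higher than this figure, because intermediate activations also need space.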

Precision

Bigger backbones are generally more accurate, at the cost of training and inference speed. However, research developments are constantly trying to bring down inference time while increasing precision.

Training speed

The number of steps you set in the interface is the number of forward passes. If you select an architecture with a batch size greater than 1, each step will take longer, but each step will also process more images. The end result should not vary much, but you need to take this into account when comparing two trainings with different batch sizes.

A batch size greater than 1 allows the training engine to make use of the parallelisation capabilities of the GPU.
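
The relationship between steps, batch size and images processed can be sketched as follows. The sketch assumes that one step processes exactly one batch; the function names and the dataset size are hypothetical.

```python
def images_processed(steps, batch_size):
    """Total number of images seen during training."""
    return steps * batch_size


def epochs_covered(steps, batch_size, dataset_size):
    """How many passes over the dataset the training amounts to."""
    return images_processed(steps, batch_size) / dataset_size


def steps_for_same_images(steps, batch_size, other_batch_size):
    """Steps needed at other_batch_size to see at least as many images."""
    total = images_processed(steps, batch_size)
    return -(-total // other_batch_size)  # ceiling division


# 1000 steps at batch size 8 on a hypothetical dataset of 2000 images.
print(images_processed(1000, 8))
print(epochs_covered(1000, 8, 2000))
print(steps_for_same_images(1000, 8, 1))
```

When comparing two trainings, comparing the number of images processed (or epochs covered) rather than the raw step count puts architectures with different batch sizes on the same footing.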

Image size

Each architecture has an input size, which is the size all your images will be resized to before passing through the backbone. If your dataset consists mainly of small images, consider a small-input backbone.

Try to roughly match the recommended input size of the backbone, which you can find for each architecture in the table at Available architectures, to the average image size of your dataset.
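
One possible way to do this matching is sketched below. It takes the "average size" to be the mean of each image's longer side, which is only one convention among several, and the backbone names and input sizes are placeholders rather than values from the Available architectures table.

```python
from statistics import mean


def closest_input_size(image_sizes, backbone_input_sizes):
    """Return the backbone whose input size is nearest the average image size.

    image_sizes: list of (width, height) tuples.
    backbone_input_sizes: dict mapping a backbone name to its square input side.
    The average is the mean of each image's longer side (an assumption).
    """
    average_side = mean(max(width, height) for width, height in image_sizes)
    return min(
        backbone_input_sizes,
        key=lambda name: abs(backbone_input_sizes[name] - average_side),
    )


# Placeholder values; take the real ones from the Available architectures table.
input_sizes = {"backbone_a": 224, "backbone_b": 512}
dataset = [(200, 180), (260, 240), (230, 230)]
print(closest_input_size(dataset, input_sizes))
```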

How to choose?

First check whether your use case has constraints that fall under the characteristics above. Then, to find the best model, you might need to test several configurations and different parameters.

Here is a simple guide to give you a basic idea of how to conduct such experiments:

  • Test a few architectures that fit your hard constraints, with a low number of iterations (1 epoch), and see which ones come out on top.

  • Take the best-performing ones and try changing some of the parameters to see if some bring better results than others (see the options sections).

  • Take the best-performing architecture and rerun it with a larger number of iterations to allow it to fine-tune to your dataset.
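
The first step of this guide can be sketched as a small ranking loop. Everything here is hypothetical: `train_and_evaluate` stands for whatever you use to launch a short training and read back a score, and the architecture names and scores are made up so that the sketch runs on its own.

```python
def rank_architectures(candidates, train_and_evaluate, epochs=1):
    """Train each candidate briefly and return the names sorted best first.

    train_and_evaluate is a callable you supply (hypothetical here): it takes
    an architecture name and a number of epochs, and returns a score where
    higher is better.
    """
    scores = {name: train_and_evaluate(name, epochs) for name in candidates}
    return sorted(scores, key=scores.get, reverse=True)


def fake_train_and_evaluate(name, epochs):
    """Stand-in that returns made-up scores instead of running a training."""
    made_up_scores = {"arch_a": 0.71, "arch_b": 0.64, "arch_c": 0.78}
    return made_up_scores[name]


ranking = rank_architectures(["arch_a", "arch_b", "arch_c"], fake_train_and_evaluate)
shortlist = ranking[:2]  # keep the best performers for parameter tuning
print(shortlist)
```

The same function can be reused for the later steps by passing a larger `epochs` value for the shortlisted architectures.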