Neural networks explained
What is deep learning?
At its core, a deep learning application is a neural network: an algorithm feeds it data and adjusts its internal operations. By gradually reducing the loss as the model is being trained, this process drives the network towards a good final model.
The smallest component is the neuron: it takes the outputs of neurons from the previous layer as its inputs, and transmits its own output to neurons in the next layer. Neurons are organized into layers, which are collections of neurons, and layers are joined together by various kinds of connections (fully-connected, convolutions, recurrent connections, ...).
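To make this concrete, here is a minimal sketch of one fully-connected layer in NumPy: each neuron in the layer computes a weighted sum of all the neurons feeding into it, plus a bias, followed by a non-linearity (ReLU here). The sizes and names are illustrative, not taken from any particular network.

```python
import numpy as np

def dense_layer(x, weights, biases):
    """One fully-connected layer: x has shape (n_in,),
    weights (n_out, n_in), biases (n_out,)."""
    z = weights @ x + biases      # each neuron's weighted sum of its inputs
    return np.maximum(z, 0.0)     # ReLU non-linearity

rng = np.random.default_rng(0)
x = rng.normal(size=4)            # activations of 4 neurons in the previous layer
w1 = rng.normal(size=(3, 4))      # a layer of 3 neurons, each connected to all 4 inputs
b1 = np.zeros(3)
h = dense_layer(x, w1, b1)        # 3 activations, passed on to the next layer
print(h.shape)                    # (3,)
```

Stacking several such calls, each taking the previous output as its input, gives the chain of layers described above.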
When you "feed" an image to the neural network, the raw pixel values are dispatched to the neurons of the first layer, then passed from layer to layer until the end of the backbone. This is the forward pass. At the very end sits a loss operation: it compares the known truth (the annotation you provided) to the prediction that percolated through the network.
Then comes back-propagation. If the prediction does not match the annotation, the loss sends an error signal back through all the neurons in reverse order. Each weight is slightly adjusted so that, next time, a similar image generates a prediction closer to the truth.
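The whole forward pass / loss / back-propagation loop can be sketched on a toy problem: a single linear neuron trained with a squared-error loss. The gradient formula here is derived by hand for this simple case; real frameworks compute it automatically for every layer. The data and target weights are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(32, 2))       # 32 toy "images", each with 2 features
true_w = np.array([2.0, -1.0])     # the hidden rule the network should learn
y = X @ true_w                     # the "annotations" for this toy task

w = np.zeros(2)                    # the weights to be trained
lr = 0.1                           # learning rate: how much each update moves the weights
for _ in range(200):
    pred = X @ w                   # forward pass
    err = pred - y
    loss = np.mean(err ** 2)       # loss operation: compare prediction to annotation
    grad = 2 * X.T @ err / len(X)  # back-propagation: signal sent back to the weights
    w -= lr * grad                 # each weight slightly modified

print(np.round(w, 2))              # ends up very close to true_w
```

After 200 such updates the loss has shrunk and the learned weights match the hidden rule, which is exactly the convergence described above, in miniature.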
This series of layers, from the data feeder to the loss operation, is what we call the backbone.
What is an "architecture"?
An architecture is simply a specific design of backbone.
Many alternative architectures have been developed over the years. Since LeNet and its 5 layers (1998), researchers have experimented with networks containing many more layers, larger layers, new operations, and new kinds of connections. Some of the best modern architecture designs are available through the Deepomatic training engine.
What is a "meta-architecture"?
A backbone is sufficient for classification and tagging tasks, which only need to "digest" a single image as a whole. For a detection task, however, we must also predict bounding box positions. This requires a detection algorithm that can refine the bounding boxes in the predictions. These detection algorithms (EfficientDet, Faster-RCNN, YOLO, and others) employ a number of different techniques, such as region proposals, non-maximum suppression (NMS), etc. Because they all have a backbone at their core, we refer to them as meta-architectures.
A few of them can switch out their backbones (for instance, SSD works with Inception v2 as well as MobileNet).
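The relationship between a meta-architecture and its interchangeable backbone can be sketched as plain object composition. The class and method names below are invented for illustration and do not correspond to any real library API; they only show the idea that the same detection logic wraps different feature extractors.

```python
class MobileNetBackbone:
    """Stand-in for a MobileNet feature extractor (hypothetical)."""
    def extract_features(self, image):
        return f"mobilenet-features({image})"

class InceptionV2Backbone:
    """Stand-in for an Inception v2 feature extractor (hypothetical)."""
    def extract_features(self, image):
        return f"inception-features({image})"

class SSDDetector:
    """Meta-architecture: detection logic around a pluggable backbone."""
    def __init__(self, backbone):
        self.backbone = backbone

    def detect(self, image):
        features = self.backbone.extract_features(image)
        # ...anchor generation, box regression and NMS would happen here...
        return f"boxes-from({features})"

# The same detection head runs unchanged with either backbone:
print(SSDDetector(MobileNetBackbone()).detect("img"))
print(SSDDetector(InceptionV2Backbone()).detect("img"))
```

Swapping the backbone trades accuracy for speed without touching the detection logic, which is why frameworks expose the two as separate choices.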