# Data Augmentation

Data augmentation applies random modifications to the images in your dataset, artificially widening the range of data your model sees during training. Generating new images from the existing ones adds variation to your dataset and can be an excellent way to boost performance.

Each operation has a probability attached: whenever an image is drawn from your dataset, there is an x% chance that each activated technique is applied to it. The draw is repeated every time the image is loaded, so the same image may be transformed differently (or left untouched) from one epoch to the next.

* All augmentations can be used
* Each augmentation can be added only once
* The order in which you add them can matter
* For YOLO architectures, data augmentation is not available
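
The per-operation probability and the effect of ordering can be illustrated with a small sketch. This is not the platform's implementation: `apply_pipeline`, the toy 2x2 "image" and the two transforms are hypothetical names chosen only for illustration.

```python
import random

def apply_pipeline(image, steps, rng):
    """Run (probability, transform) steps in the order they were added."""
    for probability, transform in steps:
        # One independent draw per step, repeated on every call.
        if rng.random() < probability:
            image = transform(image)
    return image

def horizontal_flip(img):
    # Mirror each row left to right.
    return [row[::-1] for row in img]

def rotate_90_ccw(img):
    # Transpose, then reverse the row order: a counter-clockwise quarter turn.
    return [list(col) for col in zip(*img)][::-1]

image = [["a", "b"], ["c", "d"]]  # toy image: a 2x2 grid of pixel labels
rng = random.Random(0)

# Drawing the same image many times yields several distinct versions.
steps = [(0.5, horizontal_flip), (0.5, rotate_90_ccw)]
seen = {str(apply_pipeline(image, steps, rng)) for _ in range(200)}
print(len(seen), "distinct versions of the same image")

# With both steps forced (probability 1.0), swapping the order changes the result.
flip_then_rotate = apply_pipeline(
    image, [(1.0, horizontal_flip), (1.0, rotate_90_ccw)], rng)
rotate_then_flip = apply_pipeline(
    image, [(1.0, rotate_90_ccw), (1.0, horizontal_flip)], rng)
print(flip_then_rotate != rotate_then_flip)
```

Not every pair of operations is order-sensitive (two flips commute, for example), which is why the order "can" matter rather than always matters.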

## Horizontal flip

The image is flipped horizontally with a 50% chance: it is mirrored left to right, about the vertical axis through its center.

<figure><img src="/files/0eUVrq0bL3P2JZkWMsQa" alt=""><figcaption></figcaption></figure>

{% hint style="warning" %}
If your task depends on telling left-sided objects from right-sided ones, this augmentation is strongly discouraged.
{% endhint %}

## Vertical flip

The image is flipped vertically with a 50% chance: it is mirrored top to bottom, about the horizontal axis through its center.

<figure><img src="/files/NEYb4CvITzVRW9Q1Id4s" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
This is not the same as a 180° rotation, which also mirrors the image left to right.
{% endhint %}

## 90° rotation

Rotates the image by 90° counter-clockwise with a 50% chance. Combine it with 'Random Horizontal Flip' and 'Random Vertical Flip' to get rotations and symmetries in all directions.

<figure><img src="/files/dOdWSoRPTfDWnbmXUrsI" alt=""><figcaption></figcaption></figure>
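
As a rough illustration of how the two flips and the 90° rotation combine, the following NumPy sketch enumerates every on/off combination of the three operations. The helper `all_variants` and the 3x3 test image are assumptions made for this example, not part of the platform.

```python
import itertools
import numpy as np

image = np.arange(9).reshape(3, 3)  # small image with no symmetry of its own

vertical = np.flipud(image)          # vertical flip: mirror top to bottom
rotated_180 = np.rot90(image, k=2)   # two counter-clockwise quarter turns

# A vertical flip alone is not a 180° rotation...
print(np.array_equal(vertical, rotated_180))
# ...but a vertical flip plus a horizontal flip is.
print(np.array_equal(np.flipud(np.fliplr(image)), rotated_180))

def all_variants(img):
    """Collect the distinct results of every on/off combination of the 3 ops."""
    seen = set()
    for use_h, use_v, use_r in itertools.product([False, True], repeat=3):
        out = img
        if use_h:
            out = np.fliplr(out)
        if use_v:
            out = np.flipud(out)
        if use_r:
            out = np.rot90(out)  # counter-clockwise by default
        seen.add((out.shape, out.tobytes()))
    return seen

print(len(all_variants(image)), "distinct orientations")
```

The eight on/off combinations cover the four rotations and the four mirror images of a square picture, which is why enabling all three operations gives symmetries in every direction.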

## Modify brightness

Randomly changes the image brightness with a 20% chance. A single random number, uniformly sampled from \[-max\_delta, max\_delta], is added to every RGB channel of every pixel. Output values are clipped to the range 0 to 1.

{% hint style="warning" %}
The max\_delta parameter sets how much brighter or dimmer your image can get.

A value of 0 therefore does not dim the image; it prevents any change.
{% endhint %}

<figure><img src="/files/x77uOz5otHD4nQGb1j4F" alt=""><figcaption></figcaption></figure>
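
The brightness rule described above can be sketched in a few lines of NumPy. This is a minimal sketch, assuming images are float arrays of shape (height, width, 3) with values in 0 to 1; the function name `random_brightness` and its `probability` argument are illustrative, not the platform's API.

```python
import numpy as np

def random_brightness(image, max_delta, rng, probability=0.2):
    """Shift all pixels by one random delta, with the given probability."""
    if rng.random() >= probability:
        return image  # most of the time the image is left untouched
    # One delta for the whole image, shared by all channels.
    delta = rng.uniform(-max_delta, max_delta)
    # Saturate the result so it stays in the valid 0..1 range.
    return np.clip(image + delta, 0.0, 1.0)

rng = np.random.default_rng(0)
image = np.full((4, 4, 3), 0.5)  # uniform mid-gray test image

# probability=1.0 forces the operation so the effect is visible.
shifted = random_brightness(image, max_delta=0.3, rng=rng, probability=1.0)
unchanged = random_brightness(image, max_delta=0.0, rng=rng, probability=1.0)
print(shifted[0, 0], unchanged[0, 0])
```

With `max_delta=0.0` the sampled delta is always zero, which matches the note above: a value of 0 disables the change rather than dimming the image.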

## Modify contrast

Randomly scales contrast by a factor in \[min\_delta, max\_delta]. A single 'contrast\_factor' is uniformly sampled from that range for the whole image. Then, for each RGB channel, the operation computes the mean of the image pixels in the channel and adjusts each component x of each pixel to '(x - mean) \* contrast\_factor + mean'.

{% hint style="info" %}
This operation has no probability factor: when activated, it is applied every time.
{% endhint %}

<figure><img src="/files/QE65BVLk16Qtnn5XovqR" alt=""><figcaption></figcaption></figure>
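
The contrast formula above translates almost directly into NumPy. The sketch below assumes float images of shape (height, width, 3); `random_contrast` is a hypothetical name, and no clipping is applied because the description does not mention any.

```python
import numpy as np

def random_contrast(image, min_delta, max_delta, rng):
    """Scale each channel around its own mean by one shared random factor."""
    contrast_factor = rng.uniform(min_delta, max_delta)
    # Per-channel mean over all pixels, kept broadcastable against the image.
    mean = image.mean(axis=(0, 1), keepdims=True)
    return (image - mean) * contrast_factor + mean

rng = np.random.default_rng(0)
image = rng.random((4, 5, 3))  # random test image with values in 0..1

stretched = random_contrast(image, 0.5, 1.5, rng)
identity = random_contrast(image, 1.0, 1.0, rng)   # factor 1 leaves the image as is
flattened = random_contrast(image, 0.0, 0.0, rng)  # factor 0 collapses to the mean
print(stretched.mean(axis=(0, 1)), image.mean(axis=(0, 1)))
```

Because every pixel is moved toward or away from the channel mean, the mean of each channel is preserved: a factor below 1 flattens the image, and a factor above 1 exaggerates differences.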

## Add gaussian noise

Modifies a random patch of the image by adding Gaussian noise to its pixel values, which are normalized between 0 and 1.

{% hint style="info" %}
This operation has no probability factor: when activated, it is applied every time.
{% endhint %}

<figure><img src="/files/XzW1KrWFDX7IQorG4vr4" alt=""><figcaption></figcaption></figure>
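
One possible reading of this operation is sketched below in NumPy. The patch size, the noise standard deviation and the final clipping to 0..1 are all assumptions made for illustration (this page does not specify them), and `add_patch_noise` is a hypothetical name.

```python
import numpy as np

def add_patch_noise(image, patch_size, stddev, rng):
    """Add zero-mean Gaussian noise to one randomly placed square patch."""
    height, width = image.shape[:2]
    patch_h = min(patch_size, height)
    patch_w = min(patch_size, width)
    # Top-left corner chosen so the patch fits inside the image.
    top = rng.integers(0, height - patch_h + 1)
    left = rng.integers(0, width - patch_w + 1)

    out = image.copy()  # leave the input image untouched
    patch = out[top:top + patch_h, left:left + patch_w]
    noise = rng.normal(0.0, stddev, size=patch.shape)
    # Assumption: keep the noisy pixels inside the normalized 0..1 range.
    out[top:top + patch_h, left:left + patch_w] = np.clip(patch + noise, 0.0, 1.0)
    return out

rng = np.random.default_rng(0)
image = np.full((8, 8, 3), 0.5)  # uniform mid-gray test image
noisy = add_patch_noise(image, patch_size=4, stddev=0.1, rng=rng)

changed_pixels = np.any(noisy != image, axis=-1)
print(int(changed_pixels.sum()), "pixels changed out of", changed_pixels.size)
```

Only the pixels inside the patch differ from the input; the rest of the image is passed through unchanged.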

## AutoAugment

AutoAugment is a method that learns the best augmentations to apply during training, published in the 2018 paper ["AutoAugment: Learning Augmentation Policies from Data"](https://arxiv.org/abs/1805.09501). It is a mix of random image translations, color histogram equalizations, "graying" patches of the image, sharpness adjustments, image shearing, image rotations and color balance adjustments.

The original paper proposes a policy (a series of transformations) that performed best on ImageNet in its benchmarks, in a classification task. **It is the one called "Original AutoAugment policy" when you select AutoAugment as your preprocessing.**

[Another paper](https://arxiv.org/pdf/1906.11172.pdf) did similar work in a **detection** setting and released several policies that were found to work best when training on the COCO dataset. We selected the v1 and v3 versions (v3 had the best results in our own research), which you can select when working in a detection view, under the following, more descriptive names:

* v1: **"More color & BBoxes vertical translate transformations"**
* v3: **"More contrast & vertical translate transformations"**

{% hint style="success" %}
We found that AutoAugment gave better results on detection tasks and/or with smaller architectures such as MobileNet or ResNet-50.
{% endhint %}

{% hint style="warning" %}
You may want to disable other data augmentations so that they do not interfere with it.
{% endhint %}

{% hint style="info" %}
This operation has no probability factor: when activated, it is applied every time.
{% endhint %}

