Studio Import file format

This document describes the text file format that the platform accepts for importing many items of data at once. The sections below walk through the structure of the format with detailed examples.

File structure

The text file contains a set of JSON objects:

  • the first object is a header allowing you to define the tree structure of the views composing your project

  • the other objects correspond to all the images you wish to import, one per image.

This structure allows the upload of very large quantities of data by streaming.
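Because every object is written on its own line, a file in this format can be processed one line at a time without loading it whole. The following is a minimal Python sketch of such a reader, not the platform's own implementation: the function name `iter_import_file` is hypothetical, and recognizing the header by the presence of a "views" key is an assumption.

```python
import io
import json


def iter_import_file(lines):
    """Yield (kind, obj) pairs from an iterable of text lines.

    kind is "header" when the first object has a "views" key
    (assumption), otherwise "image". Blank lines are skipped.
    """
    first = True
    for line in lines:
        line = line.strip()
        if not line:
            continue
        obj = json.loads(line)
        if first and "views" in obj:
            yield "header", obj
        else:
            yield "image", obj
        first = False


# An in-memory stand-in for an import file: one header, one image.
sample = io.StringIO(
    '{"name": "Bulk items", "splits": ["val", "train"], "views": []}\n'
    '{"data": [{"url": "/url/"}], "metadata": {}, "splits": ["train"]}\n'
)
records = list(iter_import_file(sample))
```

Passing an open file object instead of `sample` reads a real file lazily, since iterating over a file yields one line at a time.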

A header is a JSON object that allows you to define your views and the concepts attached to each of them. It is also in the header that you indicate the tree structure of your views, and how the child and parent views are linked together.

Consider the tree structure of the example project below: we would like to detect items (if any exist) in a given image and then classify them by their type of material. The following example shows how to construct a valid header line and build the corresponding views:

Header json
{
  "name": "Bulk items",
  "real_name": "Bulk items",
  "splits": ["val", "train"],
  "views": [
    {
      "name": "Item or not",
      "real_name": "Item or not",
      "type": "classification",
      "concepts": [{"name": "With Item"}, {"name": "Without Item"}],
      "conditions": [],
      "children": [
        {
          "name": "Item Detector",
          "real_name": "Item Detector",
          "type": "detection",
          "concepts": [{"name": "Item"}],
          "conditions": [[{"name": "With Item"}]],
          "children": [
            {
              "name": "Item material",
              "real_name": "Item material",
              "type": "classification",
              "concepts": [{"name": "Glass"}, {"name": "Metal"}, {"name": "Wood"}],
              "conditions": []
            }
          ]
        }
      ]
    }
  ]
}

Note that the header must be written on a single line in your file (the example above is spread over several lines for readability only). From this line, the platform builds all the views and concepts you have specified, according to the conditions you have indicated.
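One way to satisfy the single-line requirement is to build the header as a data structure and let a JSON serializer write it. A minimal Python sketch, using a shortened version of the example header (only the root view is kept here for brevity):

```python
import json

# Shortened header: the root view of the example, without its children.
header = {
    "name": "Bulk items",
    "splits": ["val", "train"],
    "views": [
        {
            "name": "Item or not",
            "type": "classification",
            "concepts": [{"name": "With Item"}, {"name": "Without Item"}],
            "conditions": [],
            "children": [],
        }
    ],
}

# json.dumps without an indent argument emits no newlines,
# so the whole header fits on one line of the import file.
header_line = json.dumps(header)
```

Writing `header_line` followed by a newline as the first line of the file keeps the header separate from the image lines that follow.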

Here are the fields you need to specify:

  • name : the name of your project

  • splits : the list of splits used in the project. Only train and val are supported for now.

  • views : each entry in the views list defines a root view and contains the following fields:

    • name : the name of your view

    • type : the type of your view, one of classification, tagging or detection (see the documentation on creating views to learn more)

    • concepts : each entry in the concepts list defines a concept and contains the following field:

      • name: the name of your concept

    • conditions : a list of lists of concepts that expresses AND and OR conditions. Each inner list of concepts is an AND condition, and the outer list combines these inner lists with OR.

    • children: each entry in the children list defines a child view and contains the same fields as defined in a root view.
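The AND/OR reading of conditions can be illustrated with a short Python sketch. It is an illustration only: the function `conditions_met` is hypothetical, and treating an empty conditions list as "always satisfied" is an assumption based on the root view of the example header.

```python
def conditions_met(conditions, parent_concepts):
    """Return True if at least one inner list (an AND group) is fully
    contained in the concepts annotated on the parent; the outer list
    acts as an OR over the groups."""
    if not conditions:
        return True  # assumption: no conditions means no restriction
    present = set(parent_concepts)
    return any(
        all(concept["name"] in present for concept in and_group)
        for and_group in conditions
    )


# Condition of the "Item Detector" view in the example header.
item_detector_conditions = [[{"name": "With Item"}]]

# Hypothetical condition: ("Glass" AND "Item") OR "Wood".
glass_item_or_wood = [[{"name": "Glass"}, {"name": "Item"}], [{"name": "Wood"}]]
```

With these definitions, `conditions_met(glass_item_or_wood, ["Wood"])` holds because the second group is satisfied, while `["Glass"]` alone satisfies neither group.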

Concepts must be unique within a project. It is not possible to have several concepts with the same name in separate views.
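The uniqueness rule can be checked before upload by walking the view tree of a header. A minimal Python sketch; the helper `duplicate_concepts` is hypothetical and the header below is deliberately invalid to show a detected duplicate:

```python
from collections import Counter


def duplicate_concepts(header):
    """Return the sorted concept names that occur more than once
    anywhere in the header's view tree."""
    counts = Counter()
    stack = list(header.get("views", []))
    while stack:
        view = stack.pop()
        counts.update(c["name"] for c in view.get("concepts", []))
        stack.extend(view.get("children", []))
    return sorted(name for name, n in counts.items() if n > 1)


# Invalid on purpose: "Item" appears in both the root and the child view.
bad_header = {
    "views": [
        {
            "name": "Item or not",
            "concepts": [{"name": "Item"}, {"name": "Without Item"}],
            "children": [
                {"name": "Item Detector", "concepts": [{"name": "Item"}]}
            ],
        }
    ]
}
duplicates = duplicate_concepts(bad_header)
```

An empty result means every concept name is used in exactly one view.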

If the views of your project have already been created, the header line is optional.

Images

Each image you want to add to your project is a JSON object that must be written on a single line in your file with the following structure:

Image
{
  "data": [{"url": "/url/"}],
  "metadata": {},
  "splits": ["train"],
  "annotations_pack": [
    {
      "view": "Item or not",
      "annotations": [
        {
          "concept_name": ["With Item"],
          "region": null,
          "children": [
            {
              "view": "Item Detector",
              "annotations": [
                {
                  "concept_name": ["Item"],
                  "region": {
                    "bbox": {
                      "xmin": 0.14237532448083068,
                      "ymin": 0.2097216707202064,
                      "xmax": 0.7449424670527156,
                      "ymax": 0.9198461929815852
                    }
                  },
                  "children": [
                    {
                      "view": "Item material",
                      "annotations": [
                        {
                          "concept_name": ["Glass"],
                          "region": null,
                          "children": []
                        }
                      ]
                    }
                  ]
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}

Here are the fields you need to specify about the images:

  • data : a list whose entries are objects containing the url field, which you must specify. For now, add only one image per line of your text file.

  • metadata : a dictionary (a JSON object, as in the example above) in which you can attach metadata to your image. The metadata are displayed in the information popup in Studio.

  • splits: a list of splits to which the image belongs. You should choose between train and val for now.
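The three fields above are enough to write an image line. A minimal Python sketch under stated assumptions: the helper `image_line` and the metadata key "source" are hypothetical, and omitting annotations_pack when there are no annotations is assumed to be accepted, as the wording of this document suggests.

```python
import json

ALLOWED_SPLITS = {"train", "val"}


def image_line(url, splits, metadata=None):
    """Serialize one image record (without annotations) as a single JSON line."""
    unknown = set(splits) - ALLOWED_SPLITS
    if unknown:
        raise ValueError(f"unsupported splits: {sorted(unknown)}")
    record = {
        "data": [{"url": url}],      # one image per line
        "metadata": metadata or {},  # empty dictionary when nothing is given
        "splits": list(splits),
    }
    return json.dumps(record)


line = image_line("/url/", ["train"], {"source": "camera-1"})
```

Rejecting unknown splits locally gives an early error instead of a failed import.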

Some fields are set automatically by the platform: File name, Created, Creator, and Last annotator.

If you want to import annotations together with the images, specify them for each image with the following fields, keeping the tree structure of the created views:

  • annotations_pack: contains all the annotation information, following the hierarchy of the views specified in the header. Each entry corresponds to a specific view and its annotations, and requires the following fields:

    • view: the name of the view to which the annotations belong

    • annotations: each entry in the annotations list corresponds to a region, with all the information attached to it:

      • concept_name: the list of concepts attached to the region for this view. In a classification view the list can contain only one concept, whereas in a tagging view it can contain several. If the annotation has no concept, leave the list empty ("concept_name": []).

      • region: the region itself, described as a bounding box bbox with the coordinates xmin, ymin, xmax and ymax. In classification and tagging views the region can be set to null, because the region is then considered to be the whole image.

      • children: each entry in the children list corresponds to a child view and contains the same fields (view and annotations) for that child view.
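The nesting of views, annotations and children is easier to get right with small builder functions than by hand. A minimal Python sketch that rebuilds the annotations_pack of the example image; the helpers `annotation`, `bbox` and `view_pack` are hypothetical, and the coordinates are shortened from the example values.

```python
import json


def annotation(concepts, region=None, children=None):
    """One entry of an "annotations" list."""
    return {
        "concept_name": list(concepts),
        "region": region,  # None is serialized as JSON null
        "children": children or [],
    }


def bbox(xmin, ymin, xmax, ymax):
    """A region described by its bounding box."""
    return {"bbox": {"xmin": xmin, "ymin": ymin, "xmax": xmax, "ymax": ymax}}


def view_pack(view, annotations):
    """One entry of "annotations_pack" or of a "children" list."""
    return {"view": view, "annotations": annotations}


# Build the tree from the innermost view outwards.
material = view_pack("Item material", [annotation(["Glass"])])
detected = view_pack(
    "Item Detector",
    [annotation(["Item"], bbox(0.14, 0.21, 0.74, 0.92), [material])],
)
annotations_pack = [
    view_pack("Item or not", [annotation(["With Item"], children=[detected])])
]
pack_json = json.dumps(annotations_pack)
```

The resulting list can be stored under the "annotations_pack" key of an image record before the record is serialized on a single line.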

Notice the difference between classification and tagging views! In the classification views, the concept_name list can only contain one concept whereas, in tagging views, multiple concepts can be included as annotated tags in the list.

Example for a tagging view:

"concept_name": ["concept1", "concept2"]
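This rule can be checked before import. A minimal Python sketch; the function `check_concept_names` is hypothetical, and leaving detection views unrestricted is an assumption, since the rule is only stated here for classification and tagging views.

```python
def check_concept_names(view_type, concept_name):
    """Return an error message, or None if the list is acceptable."""
    # An empty list is allowed: it marks an annotation without concept.
    if view_type == "classification" and len(concept_name) > 1:
        return "a classification view accepts at most one concept"
    return None


tagging_result = check_concept_names("tagging", ["concept1", "concept2"])
classification_result = check_concept_names(
    "classification", ["concept1", "concept2"]
)
```

Here `tagging_result` is None because a tagging view accepts several concepts, while `classification_result` holds the error message.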
