...

ImageNet is a large visual database containing millions of images, designed for computer vision research. It is one of the first and most popular image databases for deep learning. It is widely used to train large-scale deep neural networks, because its huge collection of images makes it possible for the networks to learn a wide variety of image features. Deep neural networks today can have more than 100 layers and run on multiple GPUs. Such a network can easily process thousands of images per second, which is why a large image dataset is necessary. Once a model is trained, it can be used for image classification and object detection.

Fig. 1 shows some examples from the ImageNet dataset, including various categories and subcategories and how they are linked together. The images are collected from the internet and painstakingly categorised by researchers at the Stanford Vision Lab. An image can fall under several different categories and subcategories. Fig. 1 shows two such chains of categories.

...

All the subcategories are taken from WordNet (https://wordnet.princeton.edu/), a Princeton project. It is a large lexical database of English words, categorised and linked together by semantic relationships. Words of similar meaning are grouped into sets called synsets. Nouns, verbs, adjectives, and adverbs are each grouped into synsets of cognitive synonyms. Each synset expresses a distinct concept and is linked to other synsets through conceptual-semantic and lexical relations. ImageNet uses only a subset of the WordNet dataset.
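The idea of a chain of linked categories can be sketched with a small, hand-written hierarchy. This is a minimal illustration only: the `HYPERNYM` map and the `category_chain` helper are hypothetical names, and the entries are made up for the example rather than read from the real WordNet database.

```python
# Toy child -> parent map. These entries are illustrative only and are
# NOT taken from the actual WordNet or ImageNet data.
HYPERNYM = {
    "Egyptian cat": "domestic cat",
    "tabby cat": "domestic cat",
    "domestic cat": "cat",
    "cat": "feline",
    "feline": "carnivore",
    "carnivore": "mammal",
    "mammal": "animal",
}


def category_chain(name, hypernym=HYPERNYM):
    """Return the list of categories from `name` up to the root of the toy hierarchy."""
    chain = [name]
    # Keep following parent links until a category with no parent is reached.
    while chain[-1] in hypernym:
        chain.append(hypernym[chain[-1]])
    return chain


print(" -> ".join(category_chain("Egyptian cat")))
```

Each step moves from a specific category to a more general one, which is the kind of relationship that links synsets into a hierarchy.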

...

The hierarchical nature of this dataset makes the task of image classification and object detection much easier. A deep learning model trained with this dataset has the option of choosing among several subcategories when it is not sure about the subclass. The model can be tuned to select a category it predicts with higher confidence rather than one it predicts with lower confidence during the classification process.
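One possible way to use the hierarchy is to fall back to a broader category when no single subcategory is predicted confidently. The sketch below is an assumption about how such tuning could look, not the method used by any particular model. The `PARENT` map, the probabilities, the `threshold` value, and all function names are hypothetical.

```python
# Hypothetical child -> parent map, for illustration only.
PARENT = {
    "Egyptian cat": "domestic cat",
    "tabby cat": "domestic cat",
    "tiger cat": "domestic cat",
    "golden retriever": "dog",
    "domestic cat": "animal",
    "dog": "animal",
}


def node_scores(leaf_probs, parent=PARENT):
    """Add each leaf probability to the leaf itself and to all of its ancestors."""
    scores = {}
    for leaf, prob in leaf_probs.items():
        node = leaf
        while node is not None:
            scores[node] = scores.get(node, 0.0) + prob
            node = parent.get(node)  # None once the root is passed
    return scores


def depth(node, parent=PARENT):
    """Number of parent links between `node` and the root."""
    d = 0
    while node in parent:
        node = parent[node]
        d += 1
    return d


def pick_category(leaf_probs, threshold=0.6, parent=PARENT):
    """Return the most specific category whose summed score reaches `threshold`.

    Returns None if no category reaches the threshold.
    """
    scores = node_scores(leaf_probs, parent)
    confident = [n for n, s in scores.items() if s >= threshold]
    # Prefer deeper (more specific) categories, then higher scores.
    return max(confident, key=lambda n: (depth(n, parent), scores[n]), default=None)


# Made-up model outputs: no single cat breed is confident on its own.
uncertain = {"Egyptian cat": 0.45, "tabby cat": 0.35, "tiger cat": 0.10,
             "golden retriever": 0.10}
print(pick_category(uncertain))
```

In this toy case no single breed reaches the threshold, so the function reports the broader "domestic cat" category, whose combined score does.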

The following three images show the usefulness of the ImageNet dataset. Fig. 2 shows an image of a cat and how it is classified using the hierarchical structure. It shows that an image can belong to more than one path in the hierarchy. Based on the prediction, the image is more likely to be an Egyptian cat than a Tabby cat, and less likely still to be a Tiger cat. Next, Fig. 3 shows an image of a dog, which has a 65% chance of being a Clumber Spaniel. Compare this result with that of Fig. 4, where there is a 96% chance that the image is of a Golden Retriever. The difference is that Fig. 4 focuses more on the face, whereas Fig. 3 covers a wider angle that includes other pictorial information besides the face, which is why its prediction is less confident. When an image is more focused on the important features, the trained model predicts with higher confidence. This observation gives rise to the concept of the bounding box, which is discussed next.

...

ImageNet has played an important part in the development of modern computer vision algorithms. It has many different subcategories of images, all stored in a logical hierarchical structure. Before ImageNet, datasets and neural networks were largely segregated. One dataset and algorithm were designed for dog images, while another algorithm was developed for vehicle image recognition. An algorithm would work only with a particular dataset, and datasets were not interchangeable. ImageNet changed all that by combining data from different synsets. That is why it is possible to use the same neural network to detect different types of objects.

ImageNet introduces diversity to the image collection. Each subcategory contains images of the same object or animal under different circumstances, including different camera angles, lighting conditions, and backgrounds. Models trained with such diverse datasets are more robust, and because of this, ImageNet is a great benchmark for measuring the performance of neural network models.

Usage on NCI

At NCI, ImageNet is used to train models on multiple GPUs and nodes. A tutorial on training deep learning models using ImageNet and ResNet on Gadi can be found at the following link: https://opus.nci.org.au/x/UAC7CQ. The ImageNet dataset is also part of NCI's AI-DL data collection and is already available on the gdata filesystem.
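Before starting a training run, it can be useful to check how many images each synset contains. The sketch below assumes the common layout in which every synset has its own subdirectory named by its WordNet ID (for example `train/n01440764/*.JPEG`). The actual layout and path of the copy on gdata are not stated here, so the directory argument and the `images_per_synset` helper are hypothetical.

```python
from collections import Counter
from pathlib import Path


def images_per_synset(train_dir, suffixes=(".jpeg", ".jpg")):
    """Count image files in each synset subdirectory of `train_dir`.

    Assumes one subdirectory per synset; files with other suffixes are ignored.
    """
    counts = Counter()
    for synset_dir in sorted(Path(train_dir).iterdir()):
        if synset_dir.is_dir():
            counts[synset_dir.name] = sum(
                1
                for f in synset_dir.iterdir()
                if f.is_file() and f.suffix.lower() in suffixes
            )
    return counts


# Example call; replace the placeholder with the real dataset location:
# counts = images_per_synset("/path/to/imagenet/train")
# print(counts.most_common(5))
```

The function uses only the Python standard library, so it can be run on a login node without loading a deep learning framework.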

Use in research and the ImageNet Challenge

The ImageNet data labelling is done through crowdsourcing. Numerous volunteers have participated in classifying the images into various subcategories, and their classifications are cross-checked by other volunteers. Hundreds of thousands of man-hours have been spent creating and validating this very large dataset. Thus, by using ImageNet, a researcher can save many months' worth of work that would otherwise have been spent preparing a dataset.

The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) was an annual competition held between 2010 and 2017. Each year a new benchmark was prepared for the challenge, containing about 1000 synsets and over a million images. Those benchmarks are now part of the larger ImageNet database, which can be freely downloaded and used for research projects.

...