The MNIST (Modified National Institute of Standards and Technology) database of handwritten digits is one of the most widely studied datasets in machine learning. It has a training set of 60,000 examples and a test set of 10,000 examples. The digits have been size-normalized and centered in fixed-size images of 28×28 pixels.
A sample image from the MNIST test dataset is shown below.
We provide a copy of the MNIST dataset as part of the NCI AI/ML data collection under the project wb00. Please join the project to access this dataset.
The directory tree for the MNIST dataset is listed below:
$ tree /g/data/wb00/MNIST/
/g/data/wb00/MNIST/
├── npz    # To feed into the TensorFlow function tf.keras.datasets.mnist.load_data(path="/g/data/wb00/MNIST/npz/mnist.npz")
│   └── mnist.npz
└── raw    # To feed into the PyTorch function datasets.MNIST("/g/data/wb00",...)
    ├── t10k-images-idx3-ubyte     # test set images
    ├── t10k-labels-idx1-ubyte     # test set labels
    ├── train-images-idx3-ubyte    # training set images
    └── train-labels-idx1-ubyte    # training set labels

2 directories, 5 files
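The files under raw use the IDX binary format, so they can also be inspected without TensorFlow or PyTorch. The following is a minimal sketch using only the Python standard library. The helper names `parse_idx` and `read_idx` are hypothetical (they are not part of any NCI or framework API), and the sketch assumes the files are uncompressed and store unsigned bytes, as is standard for MNIST.

```python
import struct
from pathlib import Path

# Location taken from the directory tree above; adjust if your path differs.
RAW_DIR = Path("/g/data/wb00/MNIST/raw")


def parse_idx(buf: bytes):
    """Parse an unsigned-byte IDX buffer and return (dims, data).

    Hypothetical helper for illustration only.
    """
    # Header: two zero bytes, a data-type code, and the number of dimensions.
    zero, dtype_code, ndim = struct.unpack(">HBB", buf[:4])
    if zero != 0 or dtype_code != 0x08:
        raise ValueError("not an unsigned-byte IDX buffer")
    # Each dimension size is a 4-byte big-endian unsigned integer.
    dims = struct.unpack(f">{ndim}I", buf[4:4 + 4 * ndim])
    data = buf[4 + 4 * ndim:]
    # Check that the payload length matches the sizes given in the header.
    expected = 1
    for d in dims:
        expected *= d
    if len(data) != expected:
        raise ValueError("payload size does not match header")
    return dims, data


def read_idx(path):
    """Read an IDX file from disk and parse it."""
    return parse_idx(Path(path).read_bytes())


if __name__ == "__main__":
    dims, pixels = read_idx(RAW_DIR / "t10k-images-idx3-ubyte")
    _, labels = read_idx(RAW_DIR / "t10k-labels-idx1-ubyte")
    print(dims)        # (number of images, rows, columns)
    print(labels[0])   # label of the first test image
```

Given the sizes stated above, the test image file should report dimensions of (10000, 28, 28). The returned pixel data is a flat bytes object in row-major order, so image i occupies bytes i*784 to (i+1)*784.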
You can find examples of accessing the MNIST dataset in the NCI-ai-ml environment example space, i.e. "${NCI_AI_ML_ROOT}/examples".