Image Source: Tensorflow
EMNIST dataset is handwritten characters and digits converted 28×28 pixel image format like MNIST dataset.
EMNIST dataset is widely used in Convolutional Neural Networks and computer vision. The EMNIST dataset is a subset derived from NIST Special Database.
EMNIST contains three different types with count Digits 10, Uppercase 26, and Lowercase 26 by class. However, it includes two different types with count Digits 10 and Letters 37 by merge. The letters C, I, J, K, L, M, O, P, S, U, V, W, X, Y, and Z are merged as suggested by NIST.
EMNIST dataset summary as below:
- EMNIST ByClass: 814,255 characters. 62 unbalanced classes.
- EMNIST ByMerge: 814,255 characters. 47 unbalanced classes.
- EMNIST Balanced: 131,600 characters. 47 balanced classes.
- EMNIST Letters: 145,600 characters. 26 balanced classes.
- EMNIST Digits: 280,000 characters. 10 balanced classes.
- EMNIST MNIST: 70,000 characters. 10 balanced classes.
Source: NIST
Research Paper: EMNIST: an extension of MNIST to handwritten letters
Tensorflow has a prebuilt library to download and work on Image classification projects.
MNIST dataset will be useful in the following ERP projects:
- Handwritten customer purchase order convert into Sales Order
- Handwritten customer inquiry convert into Quotation
- Handwritten stock convert and post into a stock database
- Handwritten email converts and sends characters typed email
We will work on a few projects EMINST ERP projects in future blogs.
Leave A Comment