Computer Vision
Optical Character Recognition (OCR)
Athulya Menon
The volume of data around us has grown enormously since people began using computers and other digital media at an unprecedented scale. Much of this digital information must be processed correctly before machines can read and interpret it.
Machines don't see the world the way we do, but they are being trained to do so thanks to advances in Machine Learning, Deep Learning, and Artificial Intelligence, all of which help make machines a little more human. In this project, we look at how to apply these techniques to build an optical character recognition tool.
Project Description
Optical Character Recognition, better known as OCR, is a technique for reading text from documents supplied to a machine as images, such as a photograph, a scanned PDF or a screenshot. An OCR tool converts the text shown in such images into machine-encoded text that can be opened in a text editor. You can then edit, search and combine text that originally existed only as a picture.
This project aims to create a tool which, when supplied with an input image, will be able to extract letters, digits, and symbols from it. The process is easier to implement for printed text, which is more regular and therefore easier to analyze, but the same system can be built for handwritten notes as well.
Project Implementation
- OCR involves the automatic conversion of images into machine-encoded text that can be used in machine processes.
- This machine-encoded text may then be used for machine translation, text-to-speech conversion or text mining.
- Use an optical scanner to capture a digital image of the required document.
- Next, convert the multilevel, multi-color image into a grayscale image so that it can be binarized.
- Perform thresholding on the image using pre-defined values to convert it into black and white, which also reduces storage space.
- With a fixed threshold, gray levels below a particular value are set to black and those above it are set to white.
- The presence of noise causes poor recognition rates.
- Hence, noise is eliminated using a pre-processor or smoothing filter. We have already discussed such a filter in an earlier article on image enrichment and the principles of filtering.
- Once the document has been properly binarized, a top-down segmentation is done.
- The document is analyzed line by line, and words are extracted.
- The words are segmented into characters.
- The segmentation relies on connected-component extraction and an estimate of the average character height.
- A block-based Hough transform detects potential text lines.
- The K-Means clustering algorithm is used to group similar characters into compact clusters.
- All characters must then be grouped into k clusters.
- Convert these clusters into classes, each with a unique ASCII label.
- Now the system must be trained using a fixed set of data.
- Characters are then identified and extracted by converting them into feature vectors.
- To approach 100% accuracy, a lot of fine-tuning and training is required.
- The pre-processing stage must also be improved for the tool to be effective.
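The pre-processing and segmentation steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function names (`to_gray`, `binarize`, `median_filter`, `runs`, `segment`), the threshold of 128, the 3x3 median filter and the projection-profile segmentation are all assumptions chosen for clarity. It uses only the standard library and represents an image as a list of rows; a real project would more likely rely on an image-processing library.

```python
from statistics import median

def to_gray(rgb_image):
    """Convert rows of (r, g, b) tuples into rows of 0-255 gray levels."""
    # ITU-R BT.601 luma weights, a common choice for grayscale conversion.
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb_image]

def binarize(gray, threshold=128):
    """Fixed threshold: 1 marks ink (dark pixel), 0 marks background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def median_filter(binary):
    """3x3 median filter to remove speckle noise; border pixels are copied."""
    h, w = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [binary[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = int(median(window))
    return out

def runs(profile):
    """Return (start, end) pairs covering consecutive non-zero entries."""
    spans, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans

def segment(binary):
    """Top-down segmentation: row profile -> lines, column profile -> glyphs.

    Returns bounding boxes as (top, bottom, left, right), end-exclusive.
    """
    boxes = []
    for top, bottom in runs([sum(row) for row in binary]):
        line = binary[top:bottom]
        column_profile = [sum(col) for col in zip(*line)]
        for left, right in runs(column_profile):
            boxes.append((top, bottom, left, right))
    return boxes
```

Calling `segment(median_filter(binarize(to_gray(image))))` returns one bounding box per separated blob of ink. Note that a 3x3 median filter also erases strokes that are only one pixel wide, so the filter size has to suit the scan resolution. The column-profile split separates characters only when blank columns lie between them, and words would need an extra step that looks at gap widths.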
Concepts Used
- Data Acquisition
- Feature Extraction
- Segmentation
- Machine Learning
- MATLAB or Octave
- Java or Python programming
- Image processing
- Natural language processing
- Artificial Intelligence
Components of an OCR
- Digitizing
- Analog to Digital Scanner
- Data extraction
- Segmentation
- Pre-processor: filtering, increasing clarity
- Feature Extraction
- Comparison with known objects
- Learning phase modules
- Reconstruction of words
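To make the feature extraction, learning and comparison components more concrete, here is a small Python sketch that turns a segmented character into a feature vector, groups vectors with K-Means, and assigns each cluster an ASCII label. Everything in it is an assumption made for illustration: the zone-density features, the helper names (`zone_features`, `kmeans`, `label_clusters`, `classify`) and the use of a few labelled examples to name the clusters are not taken from the project kit.

```python
import random

def zone_features(glyph, grid=2):
    """Feature vector: ink density in each cell of a grid x grid partition.

    `glyph` is a binary image (rows of 0/1) at least `grid` pixels per side.
    """
    h, w = len(glyph), len(glyph[0])
    features = []
    for gy in range(grid):
        for gx in range(grid):
            rows = range(gy * h // grid, (gy + 1) * h // grid)
            cols = range(gx * w // grid, (gx + 1) * w // grid)
            cells = [glyph[y][x] for y in rows for x in cols]
            features.append(sum(cells) / len(cells))
    return features

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vector, centroids):
    """Index of the centroid closest to `vector`."""
    return min(range(len(centroids)),
               key=lambda i: sq_dist(vector, centroids[i]))

def kmeans(vectors, k, iterations=20, seed=0):
    """Plain K-Means: assign each vector to a centroid, then re-average."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest(v, centroids)].append(v)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(d) / len(members) for d in zip(*members)]
    return centroids

def label_clusters(centroids, labelled_glyphs):
    """Give each cluster the ASCII label of a known glyph that falls in it."""
    return {nearest(zone_features(glyph), centroids): char
            for char, glyph in labelled_glyphs}

def classify(glyph, centroids, labels):
    """Compare an unknown glyph with the learned clusters; '?' if unlabelled."""
    return labels.get(nearest(zone_features(glyph), centroids), "?")
```

In use, the feature vectors of all segmented characters would be passed to `kmeans`, a handful of known characters would be passed to `label_clusters`, and `classify` would then be called on each new character. Four zone densities are far too coarse for a full alphabet, so a working tool would need richer features and much more training data.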
Skyfi Labs • Published: 2019-10-03 • Last Updated: 2022-03-22