Hello friends, how are you? Today, in this post "Extract text from image in Python", I am going to teach you how to extract text from an image using three lines of code. For a human being it is very easy to read text from an image, but for a computer it is a very difficult task. In Python we can do some very difficult tasks with a few simple lines of code. If you want to create something new and interesting in Python, then this post is for you.
Now I am going to explain everything step by step.
Step 1: Install Library (pytesseract)
This library is an Optical Character Recognition (OCR) tool for Python, used to read and recognize the text embedded in images. It is a wrapper around the Tesseract OCR engine, which is why the Tesseract application also has to be installed in Step 2. It is a free library and it can read common image types, for example png, jpeg, tiff, gif etc. At the time of writing it supports Python 2.7 or Python 3.6+; newer releases may require a more recent Python 3 version. To install this library, open a command window, type the following command and press Enter.
pip install pytesseract
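If you want to confirm that the install worked before moving on, a small check like the one below can help. This is only a sketch: the helper name is_installed is my own and is not part of pytesseract, and it only checks that Python can find the module, not that the Tesseract application itself is installed.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python's import system can find the named module."""
    return importlib.util.find_spec(module_name) is not None


# Prints True when "pip install pytesseract" succeeded for this interpreter.
print("pytesseract installed:", is_installed("pytesseract"))
```

If this prints False, the library was probably installed for a different Python interpreter than the one running the script.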
Step 2: Download & Install Tesseract Application
After installing the library, you need to download the Tesseract application. Just click here to download this application to your system.
When you visit the link, you will see two Tesseract exe files, one for 32-bit OS and another for 64-bit OS. If your system is 32-bit, download the 32-bit exe; otherwise download the 64-bit exe. After downloading the Tesseract exe, install it on your system. It is very simple to install.
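Once the installer finishes, you may want to check where Tesseract ended up, because that path is needed later in the code. The sketch below is one possible way to do it. The helper find_tesseract and the constant DEFAULT_WINDOWS_PATH are names I made up for illustration, and the default path assumes the standard Windows install folder with the .exe extension; adjust it if you chose another directory.

```python
import shutil
from pathlib import Path

# Assumed default install location on Windows (change it if yours differs).
DEFAULT_WINDOWS_PATH = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(default=DEFAULT_WINDOWS_PATH):
    """Return the path of the Tesseract executable, or None if not found."""
    # First look for "tesseract" on the system PATH.
    found = shutil.which("tesseract")
    if found:
        return found
    # Otherwise fall back to the assumed default install location.
    if Path(default).is_file():
        return default
    return None


print("Tesseract executable:", find_tesseract())
```

If this prints None, Tesseract is either not installed or installed in a folder other than the one assumed here.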
Step 3: Python code
This program contains only three lines of code, so I am going to explain it line by line.
Line 1: import pytesseract - This is the first line of the program. Here I have imported the pytesseract library, which is needed to get the text from the image.
Line 2: pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract' - In this line you have to give the path where the Tesseract application is installed. Here I have given the default path, so if you did not change the directory when installing Tesseract, just type the same as above.
Line 3: print(pytesseract.image_to_string(r'intro.png')) - In this line the predefined function image_to_string of the pytesseract module is used to get the text from the image, and the result is printed using the Python print function. intro.png is an image stored in the same folder as the Python file. If your image is in another location or directory, you have to pass the complete path of the image. I suggest you store the image in the same folder. First I am going to show you the image that I have used in this program.
Here is the complete code of this program. You can type this code into your Python file, or you can copy it for your personal use.
#import needed library
import pytesseract

#Installation path of Tesseract Application
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract'

#printing content of the image
#here intro.png is an image stored in current directory
print(pytesseract.image_to_string(r'intro.png'))
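If you plan to reuse this code for more than one image, you could wrap the same calls in a small function. The version below is just a sketch under a few assumptions: the function name extract_text and its parameters are my own, and the check for a missing file is an extra I added so that a wrong image path gives a clear error instead of a confusing one from Tesseract.

```python
from pathlib import Path


def extract_text(image_path, tesseract_cmd=None):
    """Return the text that Tesseract recognizes in the given image file."""
    path = Path(image_path)
    # Fail early with a clear message if the image path is wrong.
    if not path.is_file():
        raise FileNotFoundError(f"Image not found: {path}")

    # Imported here so the path check above works even before
    # pytesseract is installed.
    import pytesseract

    # Only override the Tesseract location when the caller gives one.
    if tesseract_cmd:
        pytesseract.pytesseract.tesseract_cmd = tesseract_cmd

    return pytesseract.image_to_string(str(path))
```

You would then call it as print(extract_text('intro.png', r'C:\Program Files\Tesseract-OCR\tesseract')), which does the same job as the three lines above.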
Step 4: Run Program
When you run this code, you will get a screen like the one below.
Here you can compare the output text with the text in the image. For this image it is almost 100% correct. I hope you can now extract text from an image.
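If you want a number instead of comparing by eye, one simple option is to type the expected text by hand and compare it with the OCR output using Python's built-in difflib module. This is only a rough sketch: the helper names normalize and similarity are my own, and the score measures character-level similarity, which is not an official OCR accuracy metric.

```python
import difflib


def normalize(text):
    """Collapse all runs of whitespace so line breaks do not affect the score."""
    return " ".join(text.split())


def similarity(expected, ocr_output):
    """Return a similarity ratio between 0.0 and 1.0 for the two texts."""
    matcher = difflib.SequenceMatcher(None, normalize(expected), normalize(ocr_output))
    return matcher.ratio()


# Example with made-up strings: the OCR read the letter "o" as the digit "0".
score = similarity("Hello world", "Hell0 world\n")
print(f"Similarity: {score:.0%}")
```

In a real check, the second argument would be the string returned by pytesseract.image_to_string for your image.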
Request: If you found this post helpful, let me know in the comments and share it with your friends.
If you want to ask a question or make a suggestion, type it in the comment box so that we can do something new for you all.
If you have not subscribed to my website, please subscribe. Try to learn something new and teach something new to others.
Thanks.😊