Tesseract OCR is a package that includes an OCR engine (libtesseract) and a command-line program (tesseract). Since version 4, it includes a neural network (LSTM) based OCR engine focused on line recognition. The legacy Tesseract OCR engine is still supported, but it requires traineddata files that contain legacy models. Tesseract can recognize over 100 languages and can be trained to recognize others. The project does not include a GUI application. Tesseract is licensed under the Apache License, Version 2.0, and developers can use the libtesseract C or C++ API to build their own applications. Support is available through mailing lists, documentation, and GitHub.
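As a rough illustration of the command-line program described above, the following Python sketch builds and runs a single `tesseract` invocation. It assumes the `tesseract` executable is on your PATH and that the requested language data is installed. The helper names `build_tesseract_command` and `run_ocr` and the file name `page.png` are hypothetical and chosen only for this example. Passing `stdout` as the output base asks the CLI to print the recognized text instead of writing a file.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="eng", oem=1, output_base="stdout"):
    """Return the argument list for one tesseract CLI call (hypothetical helper).

    oem selects the engine mode: 0 = legacy only, 1 = LSTM only,
    2 = legacy + LSTM, 3 = default based on what is available.
    """
    if oem not in (0, 1, 2, 3):
        raise ValueError("oem must be 0, 1, 2, or 3")
    return ["tesseract", image_path, output_base, "-l", lang, "--oem", str(oem)]


def run_ocr(image_path, lang="eng", oem=1):
    """Run tesseract on an image and return the recognized text."""
    # Assumption: the tesseract binary is installed and discoverable on PATH.
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang, oem),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tesseract exits non-zero
    )
    return result.stdout


if __name__ == "__main__":
    # Only show the command here; call run_ocr("page.png") to actually run OCR.
    print(" ".join(build_tesseract_command("page.png")))
```

Separating command construction from execution keeps the sketch easy to inspect without an image at hand. Selecting `oem=0` would only work with traineddata files that include legacy models.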

Copy the tea one-liner above into your terminal to install; tea will interpret the documentation and take care of any dependencies.