#
# spec file for package python-pytesseract
#
# Copyright (c) 2018 SUSE LINUX GmbH, Nuernberg, Germany.
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
# upon. The license for this file, and modifications and additions to the
# file, is the same license as for the pristine package itself (unless the
# license for the pristine package is not an Open Source License, in which
# case the license is the MIT License). An "Open Source License" is a
# license that conforms to the Open Source Definition (Version 1.9)
# published by the Open Source Initiative.

# Please submit bugfixes or comments via http://bugs.opensuse.org/
#


%{?!python_module:%define python_module() python-%{**} python3-%{**}}
Name:           python-pytesseract
Version:        0.2.0
Release:        2.1
Summary:        Python wrapper for Google's Tesseract-OCR
License:        GPL-3.0-only
Group:          Development/Languages/Python
Url:            https://github.com/madmaze/python-tesseract
Source:         https://files.pythonhosted.org/packages/source/p/pytesseract/pytesseract-%{version}.tar.gz
Source10:       https://raw.githubusercontent.com/madmaze/pytesseract/v%{version}/LICENSE
BuildRequires:  %{python_module devel}
BuildRequires:  %{python_module setuptools}
BuildRequires:  fdupes
BuildRequires:  python-rpm-macros
# SECTION test requirements
BuildRequires:  %{python_module Pillow}
BuildRequires:  tesseract-traineddata-deu
BuildRequires:  tesseract-traineddata-eng
BuildRequires:  pkgconfig(tesseract)
# /SECTION
Requires:       python-Pillow
Requires:       tesseract-traineddata-deu
Requires:       tesseract-traineddata-eng
Requires:       pkgconfig(tesseract)
BuildArch:      noarch

%python_subpackages

%description
Python-tesseract is an optical character recognition (OCR) tool for Python,
that is, it will recognize and "read" the text embedded in images.
Python-tesseract is a wrapper for Google's Tesseract-OCR Engine.
It can be used as a stand-alone invocation script to tesseract, as it can
read all image types supported by the Python Imaging Library, including
JPEG, PNG, GIF, BMP, TIFF, and others, whereas tesseract-ocr, by default,
only supports TIFF and BMP. Additionally, if used as a script,
python-tesseract will print the recognized text instead of writing it to a
file. There is no support for confidence estimates or bounding box data.

%prep
%setup -q -n pytesseract-%{version}
sed -i -e '/^#!\//, 1d' src/pytesseract.py
cp %{SOURCE10} .

%build
%python_build

%install
%python_install
%python_clone -a %{buildroot}%{_bindir}/pytesseract
%python_expand %fdupes %{buildroot}%{$python_sitelib}

%post
%python_install_alternative pytesseract

%postun
%python_uninstall_alternative pytesseract

%files %{python_files}
%doc README.rst
%license LICENSE
%python_alternative %{_bindir}/pytesseract
%{python_sitelib}/*

%changelog
* Sun Jul 29 2018 jengelh@inai.de
- Fix some grammar issues and replace future plans by current state.
* Thu May 24 2018 toddrme2178@gmail.com
- Update to 0.2.0
  * Convert image to RGB mode in order to save as PNG
- Update to 0.1.9
  * Preserve source image extension and metadata info
  * Don't delete every file in current directory if the temp_name
    is not populated
  * Remove enum dependency, fix bug with missing text in last line
  * Support for different output types
  * Added verbose option that returns detailed output from
    tesseract run
- Update to 0.1.8
  * Add initial support for numpy arrays/opencv images
  * Improved method to discard alpha channel
  * Add optional nice argument for running tesseract with
    different priority
  * Fix Python 3 byte string bug
- Spec file cleanups
* Wed Oct 18 2017 toddrme2178@gmail.com
- Implement single-spec version
- Update to version 0.1.7
  * No changelog
* Tue Jun 10 2014 jnweiger@gmail.com
- Pull from PyPI. Needed by testipy