#
# spec file for package python-pytesseract
#
# Copyright (c) 2020 SUSE LLC
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
# upon. The license for this file, and modifications and additions to the
# file, is the same license as for the pristine package itself (unless the
# license for the pristine package is not an Open Source License, in which
# case the license is the MIT License). An "Open Source License" is a
# license that conforms to the Open Source Definition (Version 1.9)
# published by the Open Source Initiative.

# Please submit bugfixes or comments via https://bugs.opensuse.org/
#

%{?!python_module:%define python_module() python-%{**} python3-%{**}}
Name:           python-pytesseract
Version:        0.3.4
Release:        1.1
Summary:        Python wrapper for Google's Tesseract-OCR
License:        GPL-3.0-only
Group:          Development/Languages/Python
URL:            https://github.com/madmaze/python-tesseract
# https://github.com/madmaze/pytesseract/issues/262
Source:         https://github.com/madmaze/pytesseract/archive/v%{version}.tar.gz
BuildRequires:  %{python_module setuptools}
BuildRequires:  fdupes
BuildRequires:  python-rpm-macros
Requires:       python-Pillow
Requires:       python-setuptools
Requires:       tesseract-traineddata-deu
Requires:       tesseract-traineddata-eng
Requires:       pkgconfig(tesseract)
BuildArch:      noarch
# SECTION test requirements
BuildRequires:  %{python_module Pillow}
BuildRequires:  %{python_module pytest}
BuildRequires:  tesseract-ocr-traineddata-orientation_and_script_detection
BuildRequires:  tesseract-traineddata-deu
BuildRequires:  tesseract-traineddata-eng
BuildRequires:  tesseract-traineddata-fra
BuildRequires:  pkgconfig(tesseract)
# /SECTION
Requires(post): update-alternatives
Requires(postun): update-alternatives
%python_subpackages

%description
Python-tesseract is an optical character recognition (OCR) tool for Python,
that is, it will recognize and "read" the text embedded in images.

Python-tesseract is a wrapper for Google's Tesseract-OCR Engine. It can
be used as a stand-alone invocation script to tesseract, as it can read
all image types supported by the Python Imaging Library, including JPEG,
PNG, GIF, BMP, TIFF, and others, whereas tesseract-ocr, by default, only
supports TIFF and BMP. Additionally, if used as a script, python-tesseract
will print the recognized text instead of writing it to a file. There is
no support for confidence estimates, and bounding box data is planned for
future releases.

%prep
%setup -q -n pytesseract-%{version}
sed -i -e '/^#!\//, 1d' src/pytesseract.py
# by eisfair
sed -i 's/--strict-markers//' tox.ini

%build
%python_build

%install
%python_install
%python_clone -a %{buildroot}%{_bindir}/pytesseract
%python_expand %fdupes %{buildroot}%{$python_sitelib}

%check
%pytest

%post
%python_install_alternative pytesseract

%postun
%python_uninstall_alternative pytesseract

%files %{python_files}
%doc README.rst
%license LICENSE
%python_alternative %{_bindir}/pytesseract
%{python_sitelib}/*

%changelog
* Tue May 5 2020 Matej Cepl
- Update to 0.3.4:
  - Support for WebP images
  - Support for python 3.8 (CI testing)
  - Improved cli error reporting
- Don't use %%python3_only command, but properly use alternatives.
* Mon Mar 23 2020 pgajdos@suse.com
- version update to 0.3.3
  * no upstream changelog
* Tue Sep 10 2019 Tomáš Chvátal
- Update to 0.3.0:
  * no upstream changelog
* Mon Jul 22 2019 Tomáš Chvátal
- Update to 0.2.7:
  * no upstream changelog
* Tue May 14 2019 John Jolly
- Update to 0.2.6
  + No upstream changelog
* Tue Dec 4 2018 Matej Cepl
- Remove superfluous devel dependency for noarch package
* Sun Jul 29 2018 jengelh@inai.de
- Fix some grammar issues and replace future plans by current state.
* Thu May 24 2018 toddrme2178@gmail.com
- Update to 0.2.0
  * Convert image to RGB mode in order to save as PNG
- Update to 0.1.9
  * Preserve source image extension and metadata info
  * Don't delete every file in current directory if the temp_name is not populated
  * Remove enum dependency, fix bug with missing text in last line
  * Support for different output types
  * Added verbose option that returns detailed output from tesseract run
- Update to 0.1.8
  * Add initial support for numpy arrays/opencv images
  * Improved method to discard alpha channel
  * Add optional nice argument for running tesseract with different priority
  * fix python 3 byte string bug
- spec file cleanups
* Wed Oct 18 2017 toddrme2178@gmail.com
- Implement single-spec version
- Update to version 0.1.7
  * No changelog
* Tue Jun 10 2014 jnweiger@gmail.com
- pull from pypi. needed by testipy