#
# spec file for package python-tesserocr
#
# Copyright (c) 2020 SUSE LLC
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
# upon. The license for this file, and modifications and additions to the
# file, is the same license as for the pristine package itself (unless the
# license for the pristine package is not an Open Source License, in which
# case the license is the MIT License). An "Open Source License" is a
# license that conforms to the Open Source Definition (Version 1.9)
# published by the Open Source Initiative.

# Please submit bugfixes or comments via https://bugs.opensuse.org/
#

%{?!python_module:%define python_module() python-%{**} python3-%{**}}
Name:           python-tesserocr
Version:        2.5.1
Release:        1.2
Summary:        A Python wrapper around tesseract-ocr
License:        MIT
Group:          Development/Languages/Python
URL:            https://github.com/sirfz/tesserocr
Source:         https://files.pythonhosted.org/packages/source/t/tesserocr/tesserocr-%{version}.tar.gz
BuildRequires:  %{python_module Cython}
BuildRequires:  %{python_module Pillow}
BuildRequires:  %{python_module devel}
BuildRequires:  %{python_module pytest}
BuildRequires:  %{python_module setuptools}
BuildRequires:  %{python_module six}
BuildRequires:  gcc-c++
BuildRequires:  pkgconfig
BuildRequires:  python-rpm-macros
BuildRequires:  tesseract-ocr-traineddata-english
BuildRequires:  tesseract-ocr-traineddata-orientation_and_script_detection
BuildRequires:  pkgconfig(tesseract)
Requires:       tesseract-ocr-traineddata-english
Requires:       tesseract-ocr-traineddata-orientation_and_script_detection
Recommends:     python-Pillow
%python_subpackages

%description
A wrapper around the tesseract-ocr API for Optical Character Recognition (OCR).

tesserocr integrates directly with Tesseract's C++ API using Cython
which allows for Pythonic source code.
It enables real concurrent execution when used with Python's threading
module by releasing the GIL while processing an image in tesseract.

%prep
%setup -q -n tesserocr-%{version}

%build
%python_build

%install
%python_install

%check
%python_exec setup.py develop --user
# test_LSTM_choices failure: https://github.com/sirfz/tesserocr/issues/214
%pytest -k 'not test_LSTM_choices'

%files %{python_files}
%license LICENSE
%doc README.rst
%{python_sitearch}/*

%changelog
* Thu Mar 26 2020 pgajdos@suse.com
- version update to 2.5.1
  * Fix order of linker arguments (#211)
  * Fix memory leaks in GetComponentImages (#213)
* Mon Jan 13 2020 pgajdos@suse.com
- disable test_LSTM_choices temporarily
  https://github.com/sirfz/tesserocr/issues/214
* Tue Nov 26 2019 Martin Herkt <9+suse@cirno.systems>
- Update to version 2.5.0
  * Support for RowAttributes method in LTRResultIterator
  * SetImage: use PNG instead of JPEG fallback
  * Replace STRING::string() by c_str()
  * Don't use assignment operator for TessBaseAPI
* Fri Aug 23 2019 Martin Herkt <9+suse@cirno.systems>
- Update to version 2.4.1
  * fix pixa_to_list python3 segfault
  * fix BlockPolygon python3 segfault
* Thu Dec 6 2018 Jan Engelhardt
- Trim bias and filler wording.
* Wed Dec 5 2018 Martin Herkt <9+suse@cirno.systems>
- Update to version 2.4.0
  Tesseract v4 new API methods supported:
  * GetBestLSTMSymbolChoices
  * BlanksBeforeWord
* Mon Aug 13 2018 9+suse@cirno.systems
- Update to version 2.3.1
  * Python 3.7 support release
* Thu Aug 2 2018 tchvatal@suse.com
- Ensure we require some of the tesseract data so we can do at least
  some basic ocr operations
* Thu Aug 2 2018 tchvatal@suse.com
- Drop unused bcond
* Tue Jun 26 2018 9+suse@cirno.systems
- Run tests
- Use %%license macro
- Update to version 2.3.0
  * Support for Tesseract 4
    + New OCR engines LSTM_ONLY and TESSERACT_LSTM_COMBINED
    + New default tessdata path handling
  * Fixed compilation against Tesseract v3.05.02 which required c++11
  * Fallback to 'eng' as default language when default language
    returned by the API is empty
* Fri Aug 11 2017 9@cirno.systems
- Add doc files
* Sun Aug 6 2017 9@cirno.systems
- Switch to PyPI source URL
- Add Pillow (PIL) to Recommends
* Wed Jul 26 2017 9@cirno.systems
- v2.2.2
  * Support timeout in Recognize API methods
  * Fixed typo in _Enum initialization error message formatting
  * Display tessdata path in init exception message
  * Fixed version check in Python 3 when reading the version number
    from the tesseract executable
* Thu Jun 1 2017 9@cirno.systems
- v2.2.1
  * Fixed setup bug that affects gcc versions with no -std=c++11
    option support (which should be required by tesseract 4.0+ and
    not older versions).
* Sun May 28 2017 9@cirno.systems
- v2.2.0
  * Improved setup script
  * Tesseract 4.0 support:
    - Two new OEM enums: OEM.LSTM_ONLY and OEM.TESSERACT_LSTM_COMBINED
      (tesseract 4.0+)
    - Two new API methods: GetTSVText and DetectOrientationScript
      (tesseract 4.0+)
    - PyTessBaseApi.__init__ now accepts a new attribute oem (OCR
      engine mode: OEM.DEFAULT by default).
    - file_to_text and image_to_text functions now also accept the
      oem attribute as above.
  * Fixed segfault on API Init* failure
  * Fixed segfault when pixa_to_list returns NULL
  * Documentation fixes and other minor improvements
* Sat Apr 1 2017 9@cirno.systems
- Initial commit