#
# spec file for package python-tesserocr
#
# Copyright (c) 2025 SUSE LLC
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
# upon. The license for this file, and modifications and additions to the
# file, is the same license as for the pristine package itself (unless the
# license for the pristine package is not an Open Source License, in which
# case the license is the MIT License). An "Open Source License" is a
# license that conforms to the Open Source Definition (Version 1.9)
# published by the Open Source Initiative.

# Please submit bugfixes or comments via https://bugs.opensuse.org/
#


%{?sle15_python_module_pythons}
Name:           python-tesserocr
Version:        2.8.0
Release:        1.1
Summary:        A Python wrapper around tesseract-ocr
License:        MIT
Group:          Development/Languages/Python
URL:            https://github.com/sirfz/tesserocr
Source:         https://files.pythonhosted.org/packages/source/t/tesserocr/tesserocr-%{version}.tar.gz
BuildRequires:  %{python_module Cython}
BuildRequires:  %{python_module Pillow}
BuildRequires:  %{python_module devel}
BuildRequires:  %{python_module pip}
BuildRequires:  %{python_module pytest}
BuildRequires:  %{python_module setuptools}
BuildRequires:  %{python_module wheel}
BuildRequires:  fdupes
BuildRequires:  gcc-c++
BuildRequires:  pkgconfig
BuildRequires:  python-rpm-macros
BuildRequires:  tesseract-ocr-traineddata-english
BuildRequires:  tesseract-ocr-traineddata-orientation_and_script_detection
BuildRequires:  pkgconfig(libcurl)
BuildRequires:  pkgconfig(tesseract)
Requires:       tesseract-ocr-traineddata-english
Requires:       tesseract-ocr-traineddata-orientation_and_script_detection
Recommends:     python-Pillow
%python_subpackages

%description
A wrapper around the tesseract-ocr API for Optical Character
Recognition (OCR).

tesserocr integrates directly with Tesseract's C++ API using Cython,
which allows for Pythonic source code. It enables real concurrent
execution when used with Python's threading module by releasing the
GIL while Tesseract processes an image.

%prep
%setup -q -n tesserocr-%{version}

%build
%pyproject_wheel

%install
%pyproject_install
%fdupes %{buildroot}

%check
export TESSDATA_PREFIX=/usr/share/tessdata
%python_exec setup.py develop --user
# test_LSTM_choices failure: https://github.com/sirfz/tesserocr/issues/214
# test_detect_os and test_init failures: https://github.com/sirfz/tesserocr/issues/295
donttest="test_LSTM_choices"
donttest+=" or test_detect_os"
donttest+=" or test_init"
%pytest -k "not ($donttest)"

%files %{python_files}
%license LICENSE
%doc README.rst
%{python_sitearch}/tesserocr*

%changelog
* Mon Mar 10 2025 John Paul Adrian Glaubitz <adrian.glaubitz@suse.com>
- Update to 2.8.0
  * Build Python 3.13 wheels by @nijel in (#357)
  * chore(ci): Modernize wheel builds by @nijel in (#362)
- Switch build system from setuptools to pyproject.toml
  * Add python-pip and python-wheel to BuildRequires
  * Replace %%python_build with %%pyproject_wheel
  * Replace %%python_install with %%pyproject_install
- Use Python 3.11 on SLE-15 by default
* Mon Sep 23 2024 Dirk Müller <dmueller@suse.com>
- update to 2.7.1:
  * bugfix: `set_leptonica_log_level` expects int
  * revert: disable tesseract's logging by default
* Sun Apr 28 2024 Mia Herkt <mia@0x0.st>
- Update to 2.7.0:
  * Allow passing configs/variables on initialization
    gh#sirfz/tesserocr#349
  * Stub file for completion
    gh#sirfz/tesserocr#350
  * Expose leptonica's log level setting via
    set_leptonica_log_level function
  * Keep tesseract's default debug_file setting
* Tue Apr  2 2024 Dirk Müller <dmueller@suse.com>
- update to 2.6.3:
  * Clarified the comments for tessdata path
  * skip unit test for GetComponentImages if Pillow is missing
  * Build with C++17 for Tesseract>=5.3.4
* Tue Nov  7 2023 Mia Herkt <mia@0x0.st>
- Update to 2.6.2 (no user-facing changes)
* Sun Jun 11 2023 Mia Herkt <mia@0x0.st>
- Fix build: Add libcurl dependency
* Fri Mar 17 2023 Mia Herkt <mia@0x0.st>
- Update to 2.6.0
  * _pix_to_image now works with binary images
    gh#sirfz/tesserocr#274
  * SetImage with alpha channels support
    gh#sirfz/tesserocr#280
  * Leptonica 1.83.0 support
    gh#sirfz/tesserocr#306
  * Pointsize should be returned even if fontname doesn't exist
    gh#sirfz/tesserocr#308
  * Added Python 3.10, 3.11 setup classifiers
- Drop 1441bec703cf68161acce5e85907ddd71c47fdc3.patch
* Mon Feb 27 2023 Daniel Garcia <daniel.garcia@suse.com>
- Disable currently broken tests test_LSTM_choices, test_detect_os
  and test_init, gh#sirfz/tesserocr#295
* Sat Jan 14 2023 Hans-Peter Jansen <hpj@urpla.net>
- Apply 1441bec703cf68161acce5e85907ddd71c47fdc3.patch from upstream
  project in order to build with Leptonica 1.83.0
- Make tests work again
* Fri Nov 11 2022 pgajdos@suse.com
- silence rpmlint
* Fri Nov 11 2022 pgajdos@suse.com
- python-six is not required
* Wed Jun 23 2021 Mia Herkt <mia@0x0.st>
- Update to 2.5.2
  * Support new Tesseract 5 API (gh#sirfz/tesserocr#242)
  * GetBestLSTMSymbolChoices crash fix (gh#sirfz/tesserocr#241)
  * Fallback to BMP instead of PNG
  * Create pix from a BMP image bytes (gh#sirfz/tesserocr#156)
* Thu Mar 26 2020 pgajdos@suse.com
- version update to 2.5.1
  * Fix order of linker arguments (#211)
  * Fix memory leaks in GetComponentImages (#213)
* Mon Jan 13 2020 pgajdos@suse.com
- disable test_LSTM_choices temporarily
  https://github.com/sirfz/tesserocr/issues/214
* Tue Nov 26 2019 Martin Herkt <9+suse@cirno.systems>
- Update to version 2.5.0
  * Support for RowAttributes method in LTRResultIterator
  * SetImage: use PNG instead of JPEG fallback
  * Replace STRING::string() by c_str()
  * Don't use assignment operator for TessBaseAPI
* Fri Aug 23 2019 Martin Herkt <9+suse@cirno.systems>
- Update to version 2.4.1
  * fix pixa_to_list python3 segfault
  * fix BlockPolygon python3 segfault
* Thu Dec  6 2018 Jan Engelhardt <jengelh@inai.de>
- Trim bias and filler wording.
* Wed Dec  5 2018 Martin Herkt <9+suse@cirno.systems>
- Update to version 2.4.0
  Tesseract v4 new API methods supported:
  * GetBestLSTMSymbolChoices
  * BlanksBeforeWord
* Mon Aug 13 2018 9+suse@cirno.systems
- Update to version 2.3.1
  * Python 3.7 support release
* Thu Aug  2 2018 tchvatal@suse.com
- Ensure we require some of the tesseract data so we can do at
  least some basic OCR operations
* Thu Aug  2 2018 tchvatal@suse.com
- Drop unused bcond
* Tue Jun 26 2018 9+suse@cirno.systems
- Run tests
- Use %%license macro
- Update to version 2.3.0
  * Support for Tesseract 4
    + New OCR engines LSTM_ONLY and TESSERACT_LSTM_COMBINED
    + New default tessdata path handling
  * Fixed compilation against Tesseract v3.05.02 which required
    c++11
  * Fallback to 'eng' as default language when default language
    returned by the API is empty
* Fri Aug 11 2017 9@cirno.systems
- Add doc files
* Sun Aug  6 2017 9@cirno.systems
- Switch to PyPI source URL
- Add Pillow (PIL) to Recommends
* Wed Jul 26 2017 9@cirno.systems
- v2.2.2
  * Support timeout in Recognize API methods
  * Fixed typo in _Enum initialization error message formatting
  * Display tessdata path in init exception message
  * Fixed version check in Python 3 when reading the version number
    from the tesseract executable
* Thu Jun  1 2017 9@cirno.systems
- v2.2.1
  * Fixed setup bug that affects gcc versions with no -std=c++11
    option support (which should be required by tesseract 4.0+ and
    not older versions).
* Sun May 28 2017 9@cirno.systems
- v2.2.0
  * Improved setup script
  * Tesseract 4.0 support:
    + Two new OEM enums: OEM.LSTM_ONLY and OEM.TESSERACT_LSTM_COMBINED
      (tesseract 4.0+)
    + Two new API methods: GetTSVText and DetectOrientationScript
      (tesseract 4.0+)
    + PyTessBaseAPI.__init__ now accepts a new attribute oem (OCR
      engine mode: OEM.DEFAULT by default).
    + file_to_text and image_to_text functions now also accept the
      oem attribute as above.
  * Fixed segfault on API Init* failure
  * Fixed segfault when pixa_to_list returns NULL
  * Documentation fixes and other minor improvements
* Sat Apr  1 2017 9@cirno.systems
- Initial commit