#
# spec file for package tesseract
#
# Copyright (c) 2012 SUSE LINUX Products GmbH, Nuernberg, Germany.
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
# upon. The license for this file, and modifications and additions to the
# file, is the same license as for the pristine package itself (unless the
# license for the pristine package is not an Open Source License, in which
# case the license is the MIT License). An "Open Source License" is a
# license that conforms to the Open Source Definition (Version 1.9)
# published by the Open Source Initiative.

# Please submit bugfixes or comments via http://bugs.opensuse.org/
#

%define major 3

Name:           tesseract
#Version:        3.02.02
Version:        3.04.00
Release:        1.174.lk
License:        Apache-2.0
Summary:        Open Source OCR Engine
Url:            https://github.com/tesseract-ocr
Group:          Productivity/Graphics/Other
#Source0:        https://github.com/tesseract-ocr/tesseract/archive/3.04.00.tar.gz
Source0:        http://tesseract-ocr.googlecode.com/files/%{name}-ocr-%{version}.tar.gz
BuildRequires:  gcc-c++
BuildRequires:  libleptonica-devel
BuildRequires:  libtiff-devel
BuildRequires:  libtool
BuildRequires:  pkg-config
Provides:       ocr-engine
BuildRoot:      %{_tmppath}/%{name}-%{version}-build

%description
A commercial quality OCR engine originally developed at HP between 1985 and
1995. In 1995, this engine was among the top 3 evaluated by UNLV. It was
open-sourced by HP and UNLV in 2005. From 2007 it is developed by Google.

%package -n libtesseract%{major}
Summary:        Tesseract Shared Libraries
Group:          System/Libraries

%description -n libtesseract%{major}
Shared libraries for the Tesseract Open Source OCR Engine.

%package -n libtesseract-devel
Summary:        Tesseract Development files
Group:          Development/Libraries/C and C++
Requires:       libtesseract%{major} = %{version}
Provides:       tesseract-devel = %{version}
Obsoletes:      tesseract-devel < %{version}

%description -n libtesseract-devel
Development files for the Tesseract Open Source OCR Engine.

%prep
%setup -qn %{name}-ocr

%build
export CXXFLAGS="%{optflags} -fno-strict-aliasing -fPIC"
%configure --disable-static
make %{?_smp_mflags}

%install
%make_install
rm -f %{buildroot}%{_libdir}/*.la

%post -n libtesseract%{major} -p /sbin/ldconfig

%postun -n libtesseract%{major} -p /sbin/ldconfig

%files
%defattr(-,root,root,-)
%doc AUTHORS ChangeLog COPYING NEWS README
%{_bindir}/*
%dir %{_datadir}/tessdata/
#%{_datadir}/tessdata/configs/
#%{_datadir}/tessdata/tessconfigs/
%{_datadir}/tessdata/*
%doc %{_mandir}/man?/*

%files -n libtesseract%{major}
%defattr(-,root,root,-)
%{_libdir}/libtesseract*.so.*

%files -n libtesseract-devel
%defattr(-,root,root,-)
%{_includedir}/%{name}/
%{_libdir}/libtesseract*.so
%{_libdir}/pkgconfig/*.pc

%changelog
* Tue Nov 6 2012 lazy.kent@opensuse.org
- Update to 3.02.02.
  * Moved ResultIterator/PageIterator to ccmain.
  * Added Right-to-left/Bidi capability in the output iterators for
    Hebrew/Arabic.
  * Added paragraph detection in layout analysis/post OCR.
  * Fixed inconsistent xheight during training and over-chopping.
  * Added simultaneous multi-language capability.
  * Refactored top-level word recognition module.
  * Added experimental equation detector.
  * Improved handling of resolution from input images.
  * Blamer module added for error analysis.
  * Tidied up constraints on control parameters.
  * Added support for ShapeTable in classifier and training.
  * Fixed training leaks and randomness.
  * Major improvements to layout analysis for better image detection,
    diacritic detection, better textline finding, better tabstop finding.
  * Improved line detection and removal.
  * Added fixed pitch chopper for CJK.
  * Added UNICHARSET to WERD_CHOICE to make multi-language handling easier.
  * Fixed problems with internally scaled images.
  * Added page and bbox to string in tr files to identify source of
    training data better.
  * Fixes to Hindi Shiroreka splitter.
  * Added word bigram correction.
  * Reduced stack memory consumption and eliminated some ugly typedefs.
  * Added new uniform classifier API.
  * Added new training error counter.
  * Many other fixes, including the way in which the chopper finds chops
    and messes with the outline while it does so.
- Drop "gcc47" patch (no need).
- Build requires pkg-config.
* Thu Jun 28 2012 lazy.kent@opensuse.org
- Split off traineddata packages.
* Fri Apr 27 2012 lazy.kent@opensuse.org
- Patch to fix compilation with GCC 4.7.
- Split off shared libraries package.
- tesseract-devel renamed to libtesseract-devel.
- Removed check for unsupported openSUSE versions.
- Use make_install macro.
- Disable build static library.
* Sun Dec 4 2011 lazy.kent@opensuse.org
- Build requires libtool.
* Mon Nov 14 2011 lazy.kent@opensuse.org
- Update to 3.01.
- Dropped "nonvoid" patch (fixed upstream).
- Provides ocr-engine.
- Install man pages.
- spec clean up and formatting.
* Sun Jul 10 2011 lazy.kent@opensuse.org
- replaced liblept2-devel with liblept-devel in build dependencies
* Mon Jun 27 2011 lazy.kent@opensuse.org
- build against leptonica library
- build requires liblept2-devel
* Sat Oct 30 2010 lazy.kent.suse@gmail.com
- use makeinstall macro to avoid error building in oS 11.1
- build traineddata packages noarch for oS > 11.1 only
* Mon Oct 25 2010 prusnak@opensuse.org
- fixed missing returns in nonvoid functions (nonvoid.patch)
- added missing post/postun scripts calling ldconfig
* Thu Sep 23 2010 michal.smrz@opensuse.cz
- update to tesseract-3.00
- added plenty of new supported languages
- created tesseract-package-creator.py which will, hopefully,
  make future updates easier
* Fri Jul 10 2009 puzel@novell.com
- update to tesseract-2.04
  * Integrated bug fixes and patches and misc changes for portability.
  * Integrated a patch to remove some of the "access" macros.
  * Removed dependence on lua from the viewer, speeding it up dramatically.
  * Fixed the viewer so it compiles and runs properly!