01-05-2016, 05:28 PM   #1223
timofonic
Zealot
Posts: 123
Karma: 18554
Join Date: Jan 2008
Location: Spain
Device: Onyx Boox M96+
@willus

Is it possible to use other OCR engines, even commercial ones?

I tried to find a repository where k2pdfopt is hosted, but couldn't find one.

Do you use a version control system for the source code? It would be ideal to have it on GitHub.

Here's an outdated mirror, but obviously all the history is lost.
https://github.com/JohannesBuchner/k2pdfopt

Are you planning to update your software to the latest Tesseract?


3.04.00 was released on Jul 11, 2015:
https://github.com/tesseract-ocr/tes...901f361ecd7e90

Here's the langdata:
https://github.com/tesseract-ocr/lan...f3bf238ee8903d

The latest MuPDF is 1.8.1-ios, from 15 days ago. I'm not sure whether you have updated it:
https://github.com/ArtifexSoftware/m...61f3815a92a375

The latest Leptonica is 1.72.

Here's an unofficial GitHub mirror:
https://github.com/egorpugin/leptonica
Version notes: http://leptonica.com/source/version-notes.html

The official version isn't hosted in a repository; the code is here:
http://leptonica.com/source/leptonica-1.72.tar.gz


I'm writing this because the KOreader project is unable to move to newer Tesseract and Leptonica versions until they are updated in your project:
https://github.com/koreader/koreader-base/issues/361

They seem to use a wrapper around k2pdfopt to make it a library, or something like that:
https://github.com/koreader/libk2pdfopt


NOTE: I'm not part of the KOreader team and I'm not a developer at all, just a user of the software.


Here's a historical reference showing why using GitHub would make it a lot easier for other projects to deploy k2pdfopt, and would open the door to contributions from other developers.

Quote:
Originally Posted by chrox
Yes. The koreader team would always like to keep up with the latest k2pdfopt code. On each k2pdfopt version release we will make a fresh codebase override in libk2pdfopt with the latest k2pdfopt source. The resulting commit in libk2pdfopt should be something like "fresh k2pdfopt v2.12" (d988c98ae7ec889fecc0bc8b5cae051f7b573988). Then we merge this branch with our latest branch. Conflicts, if there are any, will be fixed in the immediately following commit, like "update k2pdfopt to version v2.12" (e36230b59ba92e9da5ed261253acd805eba5baa9). So the patches can be obtained with these commands:

Code:
git clone https://github.com/koreader/libk2pdfopt.git
cd libk2pdfopt
git diff d988c98ae7ec889fecc0bc8b5cae051f7b573988 k2pdfoptlib/
This diffs all changes made in k2pdfoptlib since the fresh k2pdfopt v2.12 source.
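
For anyone who wants to see why diffing against the "fresh" commit yields exactly the local patches, here is a minimal sketch that can be run offline in a throwaway repository. It is not the actual libk2pdfopt history: the file name, its contents, the commit messages, and the patch file name are all made up for illustration, and the merge step is left out to keep it short.

```shell
#!/bin/sh
# Sketch only: every name and file content below is hypothetical.
set -e

# Work in a temporary directory so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# 1. Commit the untouched upstream source (the "fresh" import).
mkdir k2pdfoptlib
echo "upstream line" > k2pdfoptlib/file.c
git add k2pdfoptlib
git commit -q -m "fresh upstream import (hypothetical)"
fresh=$(git rev-parse HEAD)

# 2. Commit a local change on top of it.
echo "local patch line" >> k2pdfoptlib/file.c
git commit -q -a -m "local change (hypothetical)"

# 3. Everything that differs from the fresh import is the local patch set.
git diff "$fresh" -- k2pdfoptlib/ > local-changes.patch
git diff --stat "$fresh" -- k2pdfoptlib/
```

The resulting local-changes.patch contains only the line added in step 2, which is the same idea as diffing against d988c98ae7ec889fecc0bc8b5cae051f7b573988 in the real repository.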

Last edited by timofonic; 01-05-2016 at 05:46 PM.