I feel it's about time to give something back to this forum, which has helped me on countless occasions.
py-BookBundler is a Python Flask web app that lets publishers bundle ebooks with paper books, provided that the user can upload a picture of a page. The uploaded picture is OCR-ed with tesseract-ocr and, if the result matches the target page entered by the operator, a download page is shown. The algorithm in matching.py tries to guess whether source and destination are similar enough, since a byte-perfect match is pure utopia with OCR output.
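To give an idea of what "similar enough" can mean, here is a minimal sketch of fuzzy text matching using only the standard library. This is not the code in matching.py: the function names, the normalization step and the 0.8 threshold are all my assumptions for illustration, and `difflib.SequenceMatcher` is simply one convenient way to score similarity.

```python
import difflib
import re


def normalize(text):
    """Lowercase the text and keep only word characters, joined by single spaces."""
    return " ".join(re.findall(r"\w+", text.lower()))


def similarity(ocr_text, target_text):
    """Return a score in [0, 1]; 1.0 means the normalized texts are identical."""
    a, b = normalize(ocr_text), normalize(target_text)
    if not a or not b:
        # Treat an empty OCR result (or empty target) as no match at all.
        return 0.0
    return difflib.SequenceMatcher(None, a, b).ratio()


def pages_match(ocr_text, target_text, threshold=0.8):
    """Hypothetical decision rule: accept when the score reaches the threshold."""
    return similarity(ocr_text, target_text) >= threshold
```

Typical OCR slips such as "l" for "i" or "rn" for "m" lower the score only slightly, so a noisy scan of the right page should still pass while an unrelated page should not. The threshold would need tuning against real scans.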
The app exposes a RESTful API:
- GET /: returns a list of available publications.
- GET /book/<isbn>: presents the user with a form to upload a picture of the target page.
- POST /book/<isbn>: invoked by the form, takes the picture and runs the OCR on it.
- GET /new [basic-auth]: presents the operator with a form to enter data about a new publication.
- POST /new/<isbn> [basic-auth]: creates the new publication.
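For the basic-auth endpoints, a client could build the request roughly like this. It is only a sketch under stated assumptions: the base URL (Flask's default development address), the credentials, the form field names and the URL-encoded body are all hypothetical, so check the source for what the app really expects. The code builds the request without sending it.

```python
import base64
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:5000"  # assumption: Flask's default development address


def basic_auth_header(username, password):
    """Encode credentials as an HTTP Basic Authorization header value."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return "Basic " + token


def new_publication_request(isbn, fields, username, password):
    """Build (but do not send) an operator request for POST /new/<isbn>."""
    # Assumption: the form data is sent URL-encoded; the field names are made up.
    body = urllib.parse.urlencode(fields).encode("utf-8")
    request = urllib.request.Request(
        f"{BASE_URL}/new/{urllib.parse.quote(isbn)}", data=body, method="POST"
    )
    request.add_header("Authorization", basic_auth_header(username, password))
    return request
```

Passing the returned object to `urllib.request.urlopen` would actually send it to a running instance.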
The full source code is available on GitHub under the MIT license. Any form of feedback or contribution is appreciated and encouraged.