#1 |
|
Junior Member
Posts: 3
Karma: 10
Join Date: Jun 2026
Device: Kobo Clara HD
|
Hi all, this is a program I have been working on since early last year.
Problem: you get a scan from your physical scanner or from Archive.org; the file is 500+ MB and the scanned pages are yellowed and worn, so the contrast looks wrong on a black-and-white e-ink reader.

Solution: intelligent binarization of such raster scans and re-encoding to a 1-bit fax format, for a 90% file size reduction at a resolution you select. Final output can be PDF or DjVu, and EPUB is also possible (EPUB is actually processed differently, using intensive OCR). I made the program to work directly with my Kobo Clara HD + KoReader, which supports PDF/JBIG2 and DjVu.

Features: GUI and a separate command line interface. Page range selection, center/crop margins, or Reflow (similar to K2PDFOPT but much faster). 2 modes of OCR available. Many specialized debug options in the CLI. Custom ONNXRuntime engine; custom JBIG2, JP2, and DjVu encoders. Everything runs fast and automatically, unlike ScanTailor et al. Open source.

Let me know what you think here, or email read@legeapp.com with bug reports or suggestions.

www.legeapp.com
https://apps.microsoft.com/detail/9N...&ocid=pdpshare
https://github.com/LegeApp/Lege
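For anyone wondering what binarization and 1-bit packing look like in principle, here is a minimal sketch in plain Python. It is not Lege's algorithm: it assumes a simple global Otsu threshold on 8-bit grayscale values, and the function names (`otsu_threshold`, `binarize`, `pack_bits`) are made up for illustration.

```python
def otsu_threshold(pixels):
    """Return the 0-255 threshold that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark = 0    # number of pixels at or below the candidate threshold
    sum_dark = 0  # sum of their intensities
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        if w_dark == 0:
            continue          # no dark pixels yet
        w_light = total - w_dark
        if w_light == 0:
            break             # everything is already in the dark class
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        var_between = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold):
    """Map each pixel to 1 (ink) if it is at or below the threshold, else 0 (paper)."""
    return [1 if p <= threshold else 0 for p in pixels]


def pack_bits(bits):
    """Pack a list of 0/1 values into bytes, most significant bit first."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        chunk = bits[i:i + 8]
        byte = 0
        for b in chunk:
            byte = (byte << 1) | b
        byte <<= 8 - len(chunk)  # pad a short final chunk with zero bits
        out.append(byte)
    return bytes(out)


# Hypothetical 64-pixel strip: yellowed paper (~200) followed by ink (~60).
page = [200] * 56 + [60] * 8
threshold = otsu_threshold(page)
packed = pack_bits(binarize(page, threshold))
print(threshold, len(page), len(packed))
```

Packing alone shrinks 8-bit grayscale by a factor of 8 before any compression; a bilevel encoder such as JBIG2 compresses the packed bitmap further. A global threshold like this tends to struggle with unevenly lit or stained pages, which is presumably where a smarter, locally adaptive method earns its keep.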
|
|
|
|
|
#2 |
|
Wizard
Posts: 1,825
Karma: 731691
Join Date: Oct 2014
Location: Antwerp
Device: Kobo Aura H2O, Kobo Libra 2
|
That's pretty neat. Is the Claude-assisted code any good or total spaghetti?
|
|
|
|
|
|
#3 |
|
Resident Curmudgeon
Posts: 83,899
Karma: 153649587
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Why does the Lege side of the sample in the Microsoft link look like a bad photocopy?
|
|
|
|
|
|
#4 |
|
Wizard
Posts: 1,612
Karma: 5000564
Join Date: Feb 2012
Location: Cape Canaveral
Device: Kindle Scribe
|
Err, because it is a bad photocopy? This is the whole point of the app.
|
|
|
|
|
|
#5 |
|
Weirdo
Posts: 1,129
Karma: 12503116
Join Date: Nov 2019
Location: Wuppertal, Germany
Device: Kobo Sage, Kobo Libra 2, reMarkable PaperPro
|
|
|
|
|
|
|
#6 |
|
Resident Curmudgeon
Posts: 83,899
Karma: 153649587
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
|
|
|
|
|
|
#7 |
|
Weirdo
Posts: 1,129
Karma: 12503116
Join Date: Nov 2019
Location: Wuppertal, Germany
Device: Kobo Sage, Kobo Libra 2, reMarkable PaperPro
|
They failed at basic testing and just assumed that I had a required package installed in a fixed directory. Sigh.
|
|
|
|
|
|
#8 |
|
Weirdo
Posts: 1,129
Karma: 12503116
Join Date: Nov 2019
Location: Wuppertal, Germany
Device: Kobo Sage, Kobo Libra 2, reMarkable PaperPro
|
And another dependency that they assumed I had installed: rdf. Really sloppy.
|
|
|
|
|
|
#9 |
|
Weirdo
Posts: 1,129
Karma: 12503116
Join Date: Nov 2019
Location: Wuppertal, Germany
Device: Kobo Sage, Kobo Libra 2, reMarkable PaperPro
|
|
|
|
|
|
|
#10 |
|
Junior Member
Posts: 3
Karma: 10
Join Date: Jun 2026
Device: Kobo Clara HD
|
I did use LLMs to make it, but the releases are solid and none of it is sloppy. It is free, open-source software; if you want to help, let me know. Otherwise, download a release, use it like any other program, and leave the source code to developers.
||||
|
|
|
|
|
#11 |
|
Junior Member
Posts: 3
Karma: 10
Join Date: Jun 2026
Device: Kobo Clara HD
|
Hi all, the GitHub repo now builds as-is after a clone. However, if you modify the binary for your own purposes, you still need the files in the Release zips to get the program to work: the ONNX files, the pdfium library, and other files are required at runtime. Otherwise there is no reason to build the binaries from source, since the most recent versions are included in the Releases.

Continue to let me know what you think. I use the program to prepare my own e-ink files and it works great for me at this point, but there's always some improvement or change that could be made. I will update soon for macOS clone-and-build compatibility; currently it only supports Windows and Linux, but macOS should be a quick tweak (pdfium library detection).
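For reference, per-platform library detection of the kind mentioned above could look roughly like this. This is only a sketch, not how Lege does it: the file names are the conventional ones for prebuilt pdfium shared libraries, and the helper names (`pdfium_filename`, `find_pdfium`) and the search directories are hypothetical.

```python
import platform
from pathlib import Path

# Conventional shared-library file names per OS (an assumption about what
# the build expects; adjust to whatever the Release zips actually ship).
PDFIUM_NAMES = {
    "Windows": "pdfium.dll",
    "Linux": "libpdfium.so",
    "Darwin": "libpdfium.dylib",  # macOS
}


def pdfium_filename(system=None):
    """Return the expected pdfium file name for the given or current OS."""
    system = system or platform.system()
    try:
        return PDFIUM_NAMES[system]
    except KeyError:
        raise RuntimeError(f"unsupported platform: {system}") from None


def find_pdfium(search_dirs, system=None):
    """Return the first existing pdfium library in search_dirs, or None."""
    name = pdfium_filename(system)
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None


# Example: look next to the script first, then in a hypothetical "libs" folder.
here = Path.cwd()
print(find_pdfium([here, here / "libs"]))
```

Searching a list of candidate directories instead of one hard-coded path also avoids the fixed-directory assumption that came up earlier in the thread.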