07-19-2023, 06:00 PM | #16 | |
the rook, bossing Never.
Posts: 11,729
Karma: 87663463
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
|
Quote:
More modern dedicated scanners based on cameras have built-in lighting, lasers, etc. to ensure de-skewing and even contrast. They are better value for A3 and are needed for books you can't cut up. PNG is typically one image per page. The TIFF format, and an animated PNG format equivalent to GIF, can hold an entire book in one file. Both do lossless compression and will compress white space or solid black almost completely, so good illumination is important. Last edited by Quoth; 07-19-2023 at 06:03 PM. |
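To illustrate the point about illumination and lossless compression, here is a rough sketch using Python's standard zlib module, which implements DEFLATE, the algorithm PNG uses. The page size, the gray range, and the function name are assumptions chosen for the demo, and a real PNG encoder also filters each row before compressing, so treat the numbers as an approximation only.

```python
import random
import zlib

WIDTH, HEIGHT = 600, 800  # hypothetical page size in pixels, 8-bit grayscale


def compressed_size(pixels: bytes) -> int:
    """Return the DEFLATE-compressed size of raw pixel data, in bytes."""
    return len(zlib.compress(pixels, 9))


# Evenly lit page: every background pixel is the same pure white.
clean_page = bytes([255]) * (WIDTH * HEIGHT)

# Unevenly lit page: the "white" background wanders between light grays.
rng = random.Random(0)  # fixed seed so the run is repeatable
noisy_page = bytes(rng.randint(200, 255) for _ in range(WIDTH * HEIGHT))

clean_size = compressed_size(clean_page)
noisy_size = compressed_size(noisy_page)

print(f"raw size:   {WIDTH * HEIGHT} bytes")
print(f"clean page: {clean_size} bytes")
print(f"noisy page: {noisy_size} bytes")
```

The uniform page shrinks to a tiny fraction of its raw size, while the noisy background barely compresses at all, which is why an evenly lit scan gives much smaller files.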
|
07-19-2023, 09:58 PM | #17 | |||
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
To Download Scan Tailor Advanced: On the right side, you should see "Releases". That leads you to this page with a list of EXEs: You might need to expand the spot where it says "Assets". - - - If by "how to use it", you meant Tutorials... then yeah... there aren't many good step-by-step explanations of Scan Tailor out there. Back when I responded to anonlivros's fantastic: I contacted him + showed him how to use it over screenshare, answering any questions he had live, but I never formalized the instructions/tips anywhere. Since then, I explained a little more detail on the overall process: but again, nothing specific on tricks I've learned inside of Scan Tailor Advanced. I'd just suggest poking around. - - - Side Note: Perhaps one of these days, I'll finally write down and formalize this Scan Tailor stuff. Recently, my time has been focused less on MR... and more on helping LibreOffice. Within the past 2 years, I've written nearly 1000 posts about all sorts of random LibreOffice questions!!! Like the ultimate:
I've been refining all my Documentation/Technical Writing skills—just haven't turned them back towards MobileRead/Sigil/Calibre and all my favorite ebook tools... yet! - - - Quote:
I guess Finereader 16 jumped ship to that horrible yearly subscription fee nonsense. Finereader 15 is then the last version that is standalone, so I'd recommend seeing if you can get a copy of that. - - - Personally, I still use Finereader 12, which is the version I purchased at the time. Finereader 13->14 introduced a few minor features that I didn't feel were huge enhancements. Finereader 15 introduced a lot more PDF + PDF comparison stuff, so I was tempted to upgrade, just never got around to purchasing it. Quote:
If your original scan/PDF is fine, then you can just feed that right into your OCR. But if you did things like:
Scan Tailor can help clean that type of stuff up, so when you DO feed it into OCR, the OCR has a much easier time and can be more accurate. Last edited by Tex2002ans; 07-19-2023 at 10:19 PM. |
|||
07-20-2023, 06:45 AM | #18 | |
Connoisseur
Posts: 66
Karma: 10
Join Date: Jul 2023
Device: None
|
From print to ePub - how I did it
Quote:
WV-Mike |
|
07-20-2023, 06:53 AM | #19 | |
Connoisseur
Posts: 66
Karma: 10
Join Date: Jul 2023
Device: None
|
Quote:
Anyone out there know of one which is available? Thanks, WV-Mike |
|
07-20-2023, 08:13 AM | #20 |
A Hairy Wizard
Posts: 3,120
Karma: 18727091
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
|
The only advice you'll get here on MR about getting a KEY is to legally purchase the software. No discussion of pirating is allowed.
If you are trying to follow tex's advice on obtaining Finereader 15 then I would just avail yourself of Mr Google or one of his cousins. They do fine work. (see what I did there?) |
07-20-2023, 09:31 AM | #21 | |
Connoisseur
Posts: 66
Karma: 10
Join Date: Jul 2023
Device: None
|
Quote:
Thanks, WV-Mike |
|
07-21-2023, 02:11 PM | #22 |
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
I randomly stumbled upon this video today.
I thought it did a pretty great job showing off the basics of Scan Tailor Advanced's steps: He showed off how to split pages, reorient, add a box around the content, etc. Even I learned a little something: I had no idea "Fill Zones" even existed—so I'll be using that in the future. Towards the end, he even quickly showed "the equivalent steps" using Finereader, and you can see how much better/easier Scan Tailor is for cleaning up scans:
where:
- - - Note: I wouldn't follow a lot of his advice on "low DPI"... or how he exports images out of PDF (using low quality JPGs is going to introduce a lot more errors). But overall, I thought the Scan Tailor parts were a great beginner intro. The rest of his video, you can take with a huge grain of salt. Last edited by Tex2002ans; 07-21-2023 at 03:42 PM. |
07-22-2023, 02:28 AM | #23 |
Wizard
Posts: 1,175
Karma: 4949904
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
|
That is a great video @Tex2002ans, really helpful.
Today I decided to create a better workflow with this new information. Firstly, I had to do something about my prehistoric scanner. The interface is non-existent. Yep, when Adobe Flash Player was removed from Windows, I lost access to the scanner's GUI. Up until now I've been using the WIA function in Photoshop. Not ideal.
1. So hunting around for new WIA-compliant scanner software, I found this... https://www.naps2.com/ Simple and easy to use. The BEST feature is that it can batch scan. You enter how many scans to make, how many seconds between scans (6 sec in my case) and press Start. All you need to worry about is turning pages in those 6 seconds. In a matter of a few minutes 15 scans have been completed (30 book pages).
2. Once those scans are created, then it's time for Scan Tailor Advanced. It is very quick and simple with all the batch processes. In a few minutes 30 pages are turned into OCR-ready TIFF images.
3. Then onto gImage Reader... https://github.com/manisandro/gImageReader/ Batch OCR the 30 pages.
4. Next comes LibreOffice and the OCR text is copied across. This is where it becomes quite time-consuming: fixing all those little OCR errors. Then marking the chapter headings. Once done, export to epub.
Of course, the ebook needs a bit of work for a good quality final product, but my main concern was the OCR side. I have previously tried to scan pages from books, but it was a very frustrating experience, and I spent close to three hours to scan 20 pages and add them to an ebook. I realise now I attempted this without the right knowledge and tools. So thanks for all the great pointers!! If there is anything in my workflow that could be improved, please let me know. |
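Some of the step 4 cleanup could be scripted before the text is pasted into LibreOffice. The sketch below is only an illustration, assuming the OCR output was saved as plain text. The function name and the three rules are hypothetical and not part of gImage Reader or LibreOffice. Note that the hyphen rule will also merge genuine compounds that happen to break at a line end (for example "well-" / "known"), so the result still needs a proofread.

```python
import re


def clean_ocr_text(text: str) -> str:
    """Apply a few conservative cleanups to raw OCR output."""
    # Rejoin words hyphenated across a line break: "jum-\nped" -> "jumped".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Turn single line breaks inside a paragraph into spaces,
    # but leave blank lines (paragraph breaks) untouched.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces or tabs left behind by the page layout.
    text = re.sub(r"[ \t]{2,}", " ", text)
    return text.strip()


sample = (
    "The quick brown fox jum-\nped over the  lazy\ndog.\n\n"
    "A new para-\ngraph starts here."
)
print(clean_ocr_text(sample))
```

Running the rules in this order matters: the hyphen rule has to see the line break before the second rule turns it into a space.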
07-22-2023, 03:38 AM | #24 | ||
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
It:
I use it when I need to create a rough PDF from an actual scanner, or, to quickly crop/edit photos taken from a camera. Like if my family gives me a small/short document to scan, I just use NAPS2 instead of busting out the full-blown editing + OCR tools! Quote:
Edit in Word/LibreOffice (DOCX) or Sigil/Calibre (EPUB)? In the DOCX stage, if that's where you prefer to do your edits... LibreOffice has Regular Expressions, so if you know how to master those, you can do lots of mass corrections in there. LibreOffice's Regex is SO MUCH better than Word's Wildcards... but it still has limitations. So... Personally, I do all edits in Sigil/Calibre, because you have full access to:
And since you're working directly in HTML, nothing can hide from you. For more on Regex + Spellcheck Lists, and even how to take advantage of some of this stuff in LibreOffice... see my post in:
If you follow the pyramid of links, it'll:
My Current PDF->EPUB Workflow I settled on:
where:
This gives me extremely clean HTML code—with almost all the trash removed—so when I begin editing EPUB, I can focus purely on:
Cutting down on all the wasted in-between cleanup/repairing time drastically. - - - Side Note: Sadly, Toxaris's EPUB Tools is now abandoned + will not be getting support (or the much-anticipated version 2 release). I did recover and share one of the final versions of EPUBTools (v1.27.1) in: You could also still read Toxaris's original "EPUBTools" MobileRead thread or visit his (now-dead) website via Archive.org:
The instant I finally gave in and began using this, it fully converted me. It was just SO MUCH BETTER than the manual cleanup I was doing before. And the "Dialogue Checker" alone is the best dang thing since sliced bread: To even APPROXIMATE that same type of "find the mismatching quotation marks" functionality... this is the kind of steps + Regexes you'd need to use: and that still doesn't even get close to what Toxaris solved with his amazing cleanup tool. - - - Side Note #2: If you want more random EPUB productivity tips, also see my posts in:
Last edited by Tex2002ans; 07-23-2023 at 01:48 AM. |
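As a rough illustration of the "find the mismatching quotation marks" idea mentioned above, here is a minimal sketch. It is not how EPUBTools's Dialogue Checker works (its internals are not described in this thread). The function name and sample paragraphs are hypothetical, and it only checks curly double quotes within one paragraph at a time. Dialogue that legitimately runs across several paragraphs without a closing quote will be flagged too, so the output is a list of places to look at, not a list of errors.

```python
OPEN_QUOTE = "\u201c"   # left curly double quote
CLOSE_QUOTE = "\u201d"  # right curly double quote


def has_mismatched_quotes(paragraph: str) -> bool:
    """Return True if curly double quotes in the paragraph do not pair up."""
    inside = False  # are we currently inside an opened quotation?
    for char in paragraph:
        if char == OPEN_QUOTE:
            if inside:
                return True  # second opener before a closer
            inside = True
        elif char == CLOSE_QUOTE:
            if not inside:
                return True  # closer with no opener
            inside = False
    return inside  # True if a quotation was left open


paragraphs = [
    "\u201cHello,\u201d she said.",
    "\u201cWhere are you going? he asked.",
    "He paused. Nowhere,\u201d she replied.",
]

# Collect the indexes of paragraphs that need a manual look.
flagged = [i for i, p in enumerate(paragraphs) if has_mismatched_quotes(p)]
print(flagged)
```

Here the second and third paragraphs are flagged: one never closes its quotation and the other closes one that was never opened.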
||
08-22-2023, 12:56 PM | #25 |
Enthusiast
Posts: 40
Karma: 10
Join Date: Jul 2023
Device: none
|
Have you thought about assembling your posts into an epub or a wiki? I clicked on a link which led me to a post with several more interesting links, which . . . until I felt I was in a maze of twisty (not so) little forum posts, all different.
|
08-22-2023, 01:36 PM | #26 | ||
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
My blog, Digital Slug—where I'll be collecting + reorganizing all my ebook knowledge—will be coming. - - - Like I said in Post #17, the past two years, I've mostly been focusing on LibreOffice tutorials/info + really boosting my Technical Writing skills. When the blog eventually comes, I'll also be rewriting the ebook information in a much more easy-to-digest form. So instead of having the knowledge spread across 2000+ MobileRead posts and 1100+ Reddit posts, it'll be gathered in one location—the blog! Quote:
The frustrating thing is the titles of many of the MobileRead/LibreOffice topics have nothing to do with the underlying answers. So while I know a given answer is buried in the replies, most others might not. (For example, there might be some godly answer about italics/emphasis, but it was a side-discussion happening while answering Questions A, B, and C.) Part of what I'm aiming to do with the blog is gathering/pulling out all that info I've written about over the years, making it much easier to read and search through. Then, I'd bring it all up to the latest standards / best practices too! So if you came across some older MR post from 2016, I probably came up with much better ways/explanations since then! Last edited by Tex2002ans; 08-22-2023 at 01:38 PM. |
||
08-25-2023, 01:54 AM | #27 | |
Guru
Posts: 668
Karma: 929286
Join Date: Apr 2014
Device: PW-3, iPad, Android phone
|
Quote:
Such as adding to https://wiki.mobileread.com/wiki/Dig...ooks_to_Ebooks |
|