07-18-2023, 11:43 PM   #3
Tex2002ans
Quote:
Originally Posted by WV-Mike
I recently made my first eBook using Sigil.
Fantastic! Congrats.

And welcome to the forum.

Quote:
Originally Posted by WV-Mike
From print to ePub - how I did it.

[...]

What would make this process simpler and more efficient?
Boy, oh boy... Well, you've come to the right place.

I've been writing about this stuff extensively since 2012.

For some of the most recent topics, see:

and, just last week, I wrote an even bigger summary here which linked to even more of the previous threads:

That should hold you over on all OCRing + PDF->EPUB + DOCX->EPUB info for... oh, about 100 years.

Quote:
Originally Posted by JSWolf
It will add all kinds of errors and Acrobat 6 is an old version and may not OCR all that well. Get a good OCR program and use that instead.
Yes, exactly.

I looked up the date, and it looks like Adobe Acrobat 6 was from 2003! My gods, there have been multiple GENERATIONAL leaps in OCR quality since then.

Getting much more accurate OCR is one of the biggest and most important steps you can take, because EVERY further stage depends on how clean your initial text is.
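
To put some rough numbers on that, here's a quick back-of-the-envelope sketch in Python. The book length and the accuracy percentages are made-up illustrative figures (not measurements of any particular OCR program), and `expected_errors` is just a hypothetical helper name:

```python
def expected_errors(total_chars: int, accuracy: float) -> int:
    """Rough count of wrong characters left over after OCR.

    `accuracy` is the assumed per-character accuracy, between 0 and 1.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return round(total_chars * (1.0 - accuracy))


# Hypothetical novel-length book (assumption, for illustration only).
BOOK_CHARS = 500_000

for accuracy in (0.98, 0.995, 0.999):
    errors = expected_errors(BOOK_CHARS, accuracy)
    print(f"{accuracy:.1%} accurate -> ~{errors:,} characters to fix by hand")
```

Even a small bump in accuracy knocks thousands of corrections off your proofreading pile, and that's before counting any lost formatting.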

You can see the post I wrote about how important accurate OCR is:

When you're creating ebooks... it's not JUST the raw text you have to worry about, but correctly recognizing all the formatting too:
  • Bold / Italics
  • Superscripts / Subscripts
  • Lists
  • Tables
  • Images
  • Headers / Footers
  • [...]

Last edited by Tex2002ans; 07-18-2023 at 11:56 PM.