MobileRead Forums - View Single Post

Hitch · 07-02-2010, 09:05 PM

Quote:

Originally Posted by KevinH

Hi,

FWIW: For spellchecking and creating epubs, my workflow now uses OpenOffice.org Writer, which runs on Windows, Linux, Mac OS X, Sun, FreeBSD, etc., and is completely free. It is a full word processor similar to Microsoft Word in functionality, and it can read and write Microsoft Office file formats.

Really? I tried using OO in lieu of Word when I was struggling to strip out the nine bazillion "spans" that OCR sticks in for text formatting. After figuring out that OO couldn't even do something as simple as S&R the section breaks put in by the OCR conversion to .doc format, I gave it up as a waste of time. I found the regex end of it extremely difficult to use, much harder than it needs to be, IMHO.
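For what it's worth, the span-stripping part can be done outside any word processor once the text is in (x)html form. Below is a minimal sketch in Python, not anything OO or Word does internally. It assumes the spans are plain `<span ...>` tags with no `>` character inside an attribute value; `strip_spans` and the sample text are made up for illustration.

```python
import re

# Matches an opening <span ...> or a closing </span> tag, case-insensitively.
# The \b keeps it from matching other tags that merely start with "span".
SPAN_TAG = re.compile(r"</?span\b[^>]*>", re.IGNORECASE)


def strip_spans(html: str) -> str:
    """Remove span tags but keep the text they wrapped."""
    return SPAN_TAG.sub("", html)


# Hypothetical OCR-style markup, for illustration only.
sample = (
    '<p><span class="c1">Call me </span>'
    '<SPAN style="font-size:10pt">Ishmael.</SPAN></p>'
)
print(strip_spans(sample))
```

This throws away whatever formatting the spans carried (italics done via a class, for instance), so it only makes sense when the styling is going to be rebuilt afterwards anyway.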

Quote:

You can load dictionaries for a large variety of languages and you can tell it which parts of your text are written in a particular language. It can spell check in many languages at once.

Well, I can see how that would be handy.

Quote:

Once I have completed the spellcheck, I then use "writer2xhtml.oxt" which is a free extension for OpenOffice.org Writer (again that runs on all platforms) that will export my document to xhtml.

You can then load the xhtml directly into Sigil, touch things up, add fonts, chapter breaks, etc., and have it produce the final epub.

I then check things using epubcheck 1.05 and fix as needed.

This seems to work quite well for *.doc documents and long xhtml documents that have been created from OCR (think Topaz Books converted to xhtml) that have lots of spelling errors from imperfect OCR.

Hope something here helps,

KevinH

You said, about 3 paras ago, "add fonts"? So you're basically exporting a completely stripped xhtml file w/o any css and then creating the css from scratch--do I understand you correctly? I'm curious, as I do a lot of ebook creation from OCR'd out-of-print (OOP) books.
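For anyone who wants to check this on their own export rather than guess, here is a rough Python sketch that counts the CSS an xhtml file carries: `<style>` elements, linked stylesheets, and inline `style` attributes. It assumes the file is well-formed XML and uses no named entities such as `&nbsp;`, which the standard-library parser would reject. `css_summary` and the sample document are hypothetical, and this says nothing about what writer2xhtml actually emits.

```python
import xml.etree.ElementTree as ET


def css_summary(xhtml: str) -> dict:
    """Count style elements, stylesheet links, and inline style attributes."""
    root = ET.fromstring(xhtml)
    summary = {"style_elements": 0, "stylesheet_links": 0, "inline_styles": 0}
    for el in root.iter():
        # Drop the "{namespace}" prefix that ElementTree puts on XHTML tags.
        tag = el.tag.rsplit("}", 1)[-1].lower()
        if tag == "style":
            summary["style_elements"] += 1
        elif tag == "link" and "stylesheet" in el.get("rel", "").lower():
            summary["stylesheet_links"] += 1
        if "style" in el.attrib:
            summary["inline_styles"] += 1
    return summary


# Made-up sample document, for illustration only.
sample = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>t</title>'
    '<style type="text/css">p { margin: 0; }</style></head>'
    '<body><p style="text-indent: 1em">Text</p></body></html>'
)
print(css_summary(sample))
```

All zeros would mean the export really is stripped and the css has to be written from scratch.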

Hitch