01-23-2011, 10:57 AM   #6
cybmole
Learning enough regex to get by is the key. There is sufficient material in the forums & online, or in a simple book like "Sams Teach Yourself Regular Expressions in 10 Minutes".

I think the issues break down into:

1. formatting - mostly line break issues, but there are also broken special characters like accents & dashes.

2. chapter detection & headings - once you understand what e-reader software looks for & how calibre handles heuristics, they are not so bad.

3. typos / scan errors

The first 2 of the 3 can be fixed well with calibre & Sigil.
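For anyone wondering what the regex side of items 1 and 2 looks like, here is a rough Python sketch. It is not what calibre or Sigil do internally; the function names and both patterns are my own assumptions, and real books will need the patterns adjusted. The first function joins hard line breaks that fall in the middle of a sentence, the second lists lines that look like chapter headings.

```python
import re

# Assumed pattern: a line break between a lowercase letter (or comma)
# and another lowercase letter is treated as a mid-sentence wrap.
WRAP_RE = re.compile(r"(?<=[a-z,])\n(?=[a-z])")

# Assumed pattern: "Chapter" or "Part" followed by an arabic or roman number.
CHAPTER_RE = re.compile(
    r"^[ \t]*(?:chapter|part)[ \t]+(?:\d+|[ivxlc]+)\b.*$",
    re.IGNORECASE | re.MULTILINE,
)


def unwrap_lines(text: str) -> str:
    """Replace mid-sentence line breaks with a single space."""
    return WRAP_RE.sub(" ", text)


def find_chapter_headings(text: str) -> list[str]:
    """Return every line that looks like a chapter heading."""
    return [m.group(0).strip() for m in CHAPTER_RE.finditer(text)]
```

Paragraph breaks (blank lines) and breaks after a full stop are left alone, which is usually what you want. The heading pattern will give the odd false positive, e.g. a sentence starting "Part I", so check the list by eye.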

Number 3 needs the most manual input, but if a word is mis-scanned once it's likely to be mis-scanned consistently throughout, so I use find / replace to look for other instances, e.g. find all I'11, replace with I'll.
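If you would rather script the find / replace than do it by hand, something like the following Python sketch works on a plain txt file. The table of mis-scans is hypothetical apart from the I'11 example; you build your own as you spot errors. It only replaces whole words, and it reports how many instances of each it fixed.

```python
import re

# Hypothetical table of known mis-scans; extend it as you find more.
SCAN_FIXES = {"I'11": "I'll", "we'11": "we'll", "tbe": "the"}


def fix_scan_errors(text: str, fixes: dict[str, str] = SCAN_FIXES):
    """Apply each whole-word replacement and count the hits per entry."""
    counts = {}
    for wrong, right in fixes.items():
        # The lookarounds stop a match inside a longer word.
        pattern = re.compile(r"(?<!\w)" + re.escape(wrong) + r"(?!\w)")
        text, n = pattern.subn(right, text)
        counts[wrong] = n
    return text, counts
```

The counts are handy as a sanity check: a replacement that fires hundreds of times is probably hitting a real word rather than a scan error.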

There is a free spell check tool (microspell) that will work with Sigil and/or scan your txt files, but I've found it too troublesome for regular use. It has to be taught all the proper nouns anyway, then it develops a tendency to accept some scan errors as uncommon words & has to be retrained. If you have the patience, it does have lots of options & can learn as it goes. http://www.microspell.com/ seems to work OK in Windows 7.

Stripping headers & footers (& page numbers) is only an issue with .pdf sources, which are best avoided anyway.
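If you are stuck with a .pdf source, the page numbers at least are easy to strip from the extracted text. A minimal Python sketch, assuming the page numbers sit on a line of their own, either bare or as "Page 12" (the pattern and function name are my own, not anything from calibre):

```python
import re

# Assumed layout: a line holding only a 1-4 digit number,
# optionally preceded by the word "page".
PAGE_NUMBER_RE = re.compile(r"^[ \t]*(?:page[ \t]+)?\d{1,4}[ \t]*$", re.IGNORECASE)


def strip_page_numbers(text: str) -> str:
    """Drop every line that consists only of a page number."""
    kept = [line for line in text.split("\n") if not PAGE_NUMBER_RE.match(line)]
    return "\n".join(kept)
```

Be aware that this also removes any genuine line that is just a number, such as a year used as a section heading, so compare before and after.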

When it comes to fine tuning book appearance, a little understanding of HTML styles & the stylesheet.css file (editable in Sigil) goes a long way. I keep googling & asking whenever I bump into something that I don't fully understand.

Last edited by cybmole; 01-23-2011 at 11:01 AM.