Oh well, I forgot the question about our workflow, so here goes:
- Scan the original source (a printed book) with FineReader. The source is often printed in blackletter, as we specialize in 19th and early 20th century literature in Danish, so we use FineReader's training feature a lot, since the blackletter fonts used back then vary a great deal!
- Save the result as TXT and PDF.
- Load the text file into NoteTab and process it with a clip program that removes most of the common OCR errors etc., then proofread on screen with NoteTab and a PDF reader side by side, modernising archaic language and inserting HTML formatting while reading.
- Copy the proofread text to Word for spell and grammar checking, then copy the text back to NoteTab and process it with another NoteTab clip to produce the XHTML source + CSS.
- Check the XHTML with the W3C Markup Validation Service, then:
- Load the file into Sigil (the XHTML contains codes that set the necessary metadata automagically and import the front page and other images) and split the file by hand into chunks of max. 250 KB.
- Save as epub, and validate in Sigil and ePubCheck.
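
For anyone who would rather script the OCR cleanup step than use a NoteTab clip, here is a minimal sketch in Python. The rules below are hypothetical examples of typical fixes (rejoining words hyphenated across a line break, replacing the long s found in blackletter prints, tidying whitespace); the actual clip's rules are not listed in this post, and the function name `clean_ocr_text` is just an illustration.

```python
import re

# Hypothetical cleanup rules, applied in order. These are illustrative
# assumptions, not the rules of the NoteTab clip mentioned above.
RULES = [
    # Rejoin a word that was hyphenated across a line break.
    (re.compile(r"(\w)-\n(\w)"), r"\1\2"),
    # Replace the long s (U+017F), common in blackletter prints, with "s".
    (re.compile("\u017f"), "s"),
    # Collapse runs of spaces or tabs into a single space.
    (re.compile(r"[ \t]{2,}"), " "),
    # Remove spaces that OCR inserted before punctuation.
    (re.compile(r" +([,.;:!?])"), r"\1"),
    # Strip trailing whitespace at the end of each line.
    (re.compile(r"[ \t]+\n"), "\n"),
]


def clean_ocr_text(text: str) -> str:
    """Apply every cleanup rule to the text, in the order listed."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text
```

Note that the first rule also joins genuine hyphenated compounds that happen to break at a line end, so the result still needs the on-screen proofreading pass.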
(It seems like a lot of work, but there aren't really any shortcuts if you want to produce output of reasonable quality ... the longest part of the process is of course the proofreading, and we like reading here.)
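
The splitting into chunks is done by hand in Sigil, but the idea can be sketched in Python for anyone who wants to automate it. This is only an illustration under stated assumptions: the input is assumed to be a list of complete XHTML fragments (for example one string per chapter or paragraph), "250 kb" is read as 250 × 1024 bytes, and the name `split_into_chunks` is hypothetical. It is not how Sigil itself splits files.

```python
MAX_CHUNK_BYTES = 250 * 1024  # assumption: "250 kb" means kilobytes


def split_into_chunks(blocks, max_bytes=MAX_CHUNK_BYTES):
    """Greedily group complete blocks into chunks of at most max_bytes.

    Splits only happen between blocks, so the markup stays well-formed
    as long as every block is a complete element. A single block larger
    than max_bytes ends up in a chunk of its own, over the limit.
    """
    chunks = []
    current = []
    current_size = 0
    for block in blocks:
        size = len(block.encode("utf-8"))  # measure bytes, not characters
        # Start a new chunk if adding this block would exceed the limit.
        if current and current_size + size > max_bytes:
            chunks.append("".join(current))
            current, current_size = [], 0
        current.append(block)
        current_size += size
    if current:
        chunks.append("".join(current))
    return chunks
```

Each returned chunk would still need to be wrapped in its own XHTML header and body before it can go into the epub, which Sigil takes care of when you split there.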