I have just been through this process.
Here is the procedure I used (on Linux):
a) Open the PDF in a viewer, do a "select all" and copy the text.
b) Paste the text into openoffice and save it as an .odt file.
c) Adjust the sentence length to 50% so that short fragments ending in newline characters are joined up.
d) Run a replace of "" with "\n" and a find for [a-z] - this gets rid of paragraphs that begin with a lower-case letter and dialogue that has been run together.
e) Split the .odt file into a separate file for each chapter or section you want in the TOC.
f) Clean up the .odt files against the original, checking sentences and paragraphs, and put markers, for example (i) and (ii), around italic text.
g) Convert these files to UTF-8 encoded text.
h) Start ecub and add these files for immediate conversion to HTML.
i) Clean up these HTML files, substituting <i> for (i) and </i> for (ii) etc., and putting in images etc.
j) Compile to an epub file and check in azardi that all the changes are OK.
k) Copy the resulting build folder to e.g. finalbuild.
l) Correct the cover page and place a reference to the TOC in content.opf.
m) Place a reference to the TOC in the title page.
n) Run "zip -Xr9D $1.epub mimetype * -x .DS_Store" in finalbuild to produce a new epub. Check it in azardi.
o) Run mobigen against content.opf with wine.
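
The marker substitution in the HTML clean-up step can be scripted rather than done by hand. This is only a rough sketch in Python, not what I actually ran: the function name replace_markers is made up for illustration, and it assumes (i) and (ii) are the only markers used.

```python
def replace_markers(html):
    """Swap the plain-text italic markers for HTML tags.

    Assumes the convention from the steps above: (i) opens italic text
    and (ii) closes it. Other marker pairs would need their own lines.
    """
    # "(i)" is not a substring of "(ii)", so the order of the two
    # replacements does not matter.
    return html.replace("(ii)", "</i>").replace("(i)", "<i>")


def replace_markers_in_file(path):
    """Rewrite one UTF-8 HTML file in place (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(replace_markers(text))
```

Be careful if the book itself contains real "(i)" or "(ii)" list numbering, since this would turn those into tags as well. In that case pick markers that cannot occur in the text.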
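
If you would rather not depend on the zip command line, the packaging step can probably be done with Python's zipfile module instead. Again a sketch under assumptions, not the method used above: build_epub is a made-up name, and the layout (a mimetype file at the top of the build folder) is taken from the zip command in the steps.

```python
import os
import zipfile


def build_epub(build_dir, epub_path):
    """Zip build_dir into epub_path, mimetype first and uncompressed."""
    out_abs = os.path.abspath(epub_path)
    with zipfile.ZipFile(epub_path, "w") as zf:
        # The epub container wants "mimetype" as the first entry,
        # stored without compression.
        zf.write(os.path.join(build_dir, "mimetype"), "mimetype",
                 compress_type=zipfile.ZIP_STORED)
        for root, dirs, files in os.walk(build_dir):
            dirs.sort()  # deterministic order
            for name in sorted(files):
                full = os.path.join(root, name)
                rel = os.path.relpath(full, build_dir)
                # Skip the entry already written, Mac debris, and the
                # output file itself if it lives inside build_dir.
                if rel == "mimetype" or name == ".DS_Store":
                    continue
                if os.path.abspath(full) == out_abs:
                    continue
                zf.write(full, rel, compress_type=zipfile.ZIP_DEFLATED)
```

For example, build_epub("finalbuild", "mybook.epub") should give much the same result as the zip command. Only files are written, no directory entries, which matches the -D switch.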