Old 09-03-2008, 07:34 AM   #2
ashkulz
Addict
Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
Quote:
Originally Posted by sassanik
Okay so this is probably not the right spot to ask this question, but I am hoping that you guys can point me in the right direction.

So I regularly import books to ebwlibrarian and their format then gets converted to imp.

Rather annoyingly a number of books seem to have some conversion issues mostly with ' and " being replaced with a ? instead. While the book is still readable it would be nice to not have that happen.

Is there a way to help prevent this? Would changing the font before import help? Suggestions, ideas?

Whatever ConvertLit does it seems to make the files import the best, better than html, txt, or rtf. It seems to have the least problems upon importing the books, probably because I am using the oeb format?

Anyway, suggestions ideas, pointers in the direction of another forum?

Thanks!

Amy
Most probably the book was produced by Word or some variant thereof, and uses smart quotes directly as raw characters rather than as HTML entities. During conversion, the ebook publisher process runs the file through HTML Tidy internally, and that step is what turns those characters into ?. The solution is simple: install Tidy yourself, and from the command line type:
Code:
tidy -m -win1252 input.html
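which should convert the file in place (that is what -m does) into a form suitable as input to the EBW Librarian (or any other conversion tool). The -win1252 option tells Tidy to read the input as Windows-1252, the encoding Word normally uses for smart quotes. Since the file is overwritten, keep a copy of the original first.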
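If you are curious what that conversion amounts to, here is a rough Python sketch of the idea. It is not what Tidy does internally, only an illustration of the character handling: the bytes are read as Windows-1252 and every non-ASCII character (smart quotes included) is written back as a numeric character reference. The function name cp1252_to_ascii_html is made up for this example, and it assumes the file really is Windows-1252.

```python
def cp1252_to_ascii_html(raw: bytes) -> bytes:
    """Illustrative helper (hypothetical name, not part of Tidy).

    Decodes the input as Windows-1252 and re-encodes it as plain ASCII,
    replacing each non-ASCII character with a numeric character
    reference such as &#8220;.
    """
    # Assumption: the source really is Windows-1252. Bytes that are
    # undefined in that code page become U+FFFD instead of raising.
    text = raw.decode("cp1252", errors="replace")
    return text.encode("ascii", errors="xmlcharrefreplace")


# 0x93 and 0x94 are the curly double quotes, 0x92 the curly apostrophe.
sample = b"<p>\x93Don\x92t panic,\x94 she said.</p>"
converted = cp1252_to_ascii_html(sample)
print(converted.decode("ascii"))
# prints: <p>&#8220;Don&#8217;t panic,&#8221; she said.</p>
```

The result is pure ASCII, so a later conversion step has nothing left to misinterpret. Unlike Tidy, this does not clean up the markup itself.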