10-12-2010, 09:08 PM   #49
KevinH
Hi,

I think the line endings depend on which type of machine was used to generate the original Mobi file. The ones you tested must have been made on a Mac. Luckily, HTML itself is immune to line-ending differences. But encodings (specifically utf-8) use bytes with the high bit set, so I would keep the 'wb' so that the bytes get written out untouched.
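
If you would rather fix the line endings inside Python before the file is written, here is a rough sketch of one way to do it. It assumes Python 3, and normalize_newlines is a name I made up for illustration; it is not part of mobiunpack.py. It works on raw bytes, so the utf-8 text passes through unchanged, and the result can still be written with 'wb'.

```python
def normalize_newlines(data: bytes, newline: bytes = b"\n") -> bytes:
    """Convert CRLF and lone CR line endings in raw bytes to `newline`.

    Hypothetical helper for illustration only.
    """
    # Collapse CRLF first so a Windows-style ending does not turn into
    # two line breaks when the lone CRs are converted.
    data = data.replace(b"\r\n", b"\n")
    data = data.replace(b"\r", b"\n")
    if newline != b"\n":
        data = data.replace(b"\n", newline)
    return data


# Small made-up sample: utf-8 text with one old Mac ending and one CRLF.
sample = "caf\u00e9\rna\u00efve\r\n".encode("utf-8")
fixed = normalize_newlines(sample)
print(fixed.decode("utf-8"))
```

This is safe for utf-8 because the bytes for '\r' and '\n' never occur inside a multi-byte utf-8 sequence.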

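If you are on Linux or Mac OS X, simply use tr to remove or change them:

To replace carriage returns '\r' with newlines '\n' (this suits files with bare '\r' endings; on a file with '\r\n' endings it would double every line break):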

cat FILE.html | tr '\r' '\n' > temp.html
mv temp.html FILE.html


To simply remove the carriage returns without replacing them:

cat FILE.html | tr -d '\r' > temp.html
mv temp.html FILE.html
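
If tr is not available (on Windows, for example), roughly the same thing can be done in Python. This is only a sketch that assumes Python 3; replace_carriage_returns is a hypothetical name, not something from the tools mentioned in this thread. It mirrors the two commands above: write the result to a temporary file, then move it over the original.

```python
import os
import tempfile


def replace_carriage_returns(path: str, replacement: bytes = b"") -> None:
    """Rewrite `path` with every CR byte replaced by `replacement`.

    Hypothetical helper: b"" behaves like tr -d '\\r',
    b"\\n" behaves like tr '\\r' '\\n'.
    """
    # Read and write in binary mode so utf-8 bytes are left untouched.
    with open(path, "rb") as src:
        data = src.read()
    data = data.replace(b"\r", replacement)

    # Write to a temporary file in the same directory, then swap it in,
    # which is the same idea as "> temp.html" followed by "mv".
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(data)
    os.replace(tmp_path, path)
```

Calling replace_carriage_returns("FILE.html", b"\n") would match the first tr command, and replace_carriage_returns("FILE.html") the second. Note that the temporary file is created with restrictive permissions, so the rewritten file may not keep the original file's mode.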


BTW: There is another tool, mobiml2html.py, that will take the Mobi-specific HTML file created by mobiunpack.py and turn it into XHTML, if you want to archive things or convert them to epub.

It is available as Python source code with a GUI front-end, as a zip archive from the same site:

http://code.google.com/p/ebook-conve...s.zip&can=2&q=

or you can check out the source tree itself:
http://code.google.com/p/ebook-conve...ource/checkout

It is also available in the "tools" package mentioned on the ApprenticeAlf site.

Hope this helps,

KevinH

Last edited by KevinH; 10-12-2010 at 09:19 PM. Reason: fixed a typo, added a download archive