Old 02-20-2008, 11:59 AM   #16
JSWolf
Resident Curmudgeon
Posts: 73,966
Karma: 128903250
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
I've figured out what is going on and why the bug exists.

The blank lines are <br /> tags, and they are not being picked up and converted to blank lines. If you fix that, you'll be good to go. I looked at the expanded HTML and yes, it had <br /> for the blank lines.

So no, I do not need to test version 7. Just get a fixed version 8 or a version 9.
Old 02-20-2008, 01:00 PM   #17
nrapallo
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
Quote:
Originally Posted by JSWolf View Post
I've figured out what is going on and why the bug exists.

The blank lines are <br /> tags, and they are not being picked up and converted to blank lines. If you fix that, you'll be good to go. I looked at the expanded HTML and yes, it had <br /> for the blank lines.

So no, I do not need to test version 7. Just get a fixed version 8 or a version 9.
You are right about the <br /> issue. It surfaces when using the '--nopara' option, as eBook Publisher doesn't seem to respond to a <br /> that sits right before a <div> construct. I have seen this in past conversions I did before the 'mobi2imp' days.

I have a work-around fix that could be inserted after line 289 in 'mobi2imp.pl' (just after the <body> tag substitution):
Code:
if (defined $opt_nopara) {
    $html =~ s/<br[^>]*><div/<BR \/><BR \/><div/g;  # double any <br /> directly before a <div> so it works better in eBook Publisher
}
This is better than just forcing two <br />'s everywhere (what I tried first and didn't like!). Further testing is required to ensure this doesn't 'break' something else...
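
For readers who don't read Perl, the same substitution can be sketched in Python. This is only an illustration of the idea, not code from mobi2imp; the function name and the `nopara` parameter are made up here to mirror the '--nopara' check above.

```python
import re

# Any <br ...> tag that sits directly in front of a <div
# (same pattern as the Perl substitution, case-sensitive like the original).
BR_BEFORE_DIV = re.compile(r"<br[^>]*><div")


def double_br_before_div(html: str, nopara: bool = True) -> str:
    """Hypothetical helper: double a <br /> that directly precedes a <div>."""
    if not nopara:
        # Without '--nopara' the workaround is not applied at all.
        return html
    return BR_BEFORE_DIV.sub("<BR /><BR /><div", html)


print(double_br_before_div("text<br /><div>next</div>"))
```

Like the Perl version, it only touches a <br /> that is immediately followed by <div, so line breaks elsewhere are left alone.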

I used this 'fix' to produce the attached .IMP version of 'The Heretic.prc'

Is this better?

-Nick

p.s. The links to Chapter 9, Chapter 10 and Chapter 22 don't get fixed by the mobi code in 'mobi2imp', so you may want to check the original .prc to see if they work properly there. Also, in Chapter 40, I noticed an extra paragraph break ('<br /><br /><div') where the original .prc has a '<br /><div', which probably shouldn't be there.

Last edited by nrapallo; 02-21-2008 at 05:31 PM. Reason: see JSWolf's .IMP version of 'The Heretic.prc'
Old 02-20-2008, 02:38 PM   #18
nrapallo
GuteBook/Mobi2IMP Creator
Quote:
Originally Posted by JSWolf View Post
I've figured out what is going on and why the bug exists.

The blank lines are <br /> tags, and they are not being picked up and converted to blank lines. If you fix that, you'll be good to go. I looked at the expanded HTML and yes, it had <br /> for the blank lines.

So no, I do not need to test version 7. Just get a fixed version 8 or a version 9.
Ok, I just produced mobi2imp.exe (version 8b) for now to test this <br /> issue.

I've tried it on some files with no unexpected results, so maybe it will work.

By the way, can you just put the below .bat file in a directory full of .prc/.mobi that you want to convert and check for any other problems?

Just make sure this new 'mobi2imp.exe' is in your 'path'.

Have fun!
Attached Files
File Type: bat NRconvert prc to imp.bat (557 Bytes, 1172 views)

Last edited by nrapallo; 02-21-2008 at 05:33 PM. Reason: version 9 now implements this; but can disable it too
Old 02-20-2008, 04:09 PM   #19
JSWolf
Resident Curmudgeon
I've tried your new version 8b and it seems to work. Check out the new version of The Heretic to see. I would like to know what you think of it on an actual EB1150.
Old 02-21-2008, 11:30 PM   #20
nrapallo
GuteBook/Mobi2IMP Creator
Quote:
Originally Posted by nrapallo View Post
Mobi2imp (version 8) with windows executable now out! (See post #3 above)

VERSION 8 - Changes:
- mobi2imp.exe (version 8) - windows executable (very stable now!)
- now allows you to specify the .IMP filename produced, overriding the default naming of 'Author - Title'.ext
- BUGFIX: now strips the <body> tag of any BD/mobi-specific in-line styles before starting to 'fix' things.

TO DO:
- better documentation and even a tutorial would be nice
- ability to add a (default) 'cover' image to every conversion from .mobi to .imp exists, but not yet ready for the consequences
- add more user defined settings along with some 'Mobiperl' fixes like TOC first, cover link, prefix title...

-Nick
Mobi2imp (version 9) with windows executable now out! (See post #3 above)

VERSION 9 - Changes:
- mobi2imp.exe (version 9) - windows executable
- can now handle (text) .pdb files properly i.e. ereader 'TEXt'/'REAd' type
- now makes the BookDesigner notice at the end 'small print' by default
- can make that BD notice 'big print' with '--BDbig' (case sensitive)
- can make that BD notice start on a newpage using '--BDnewpage'
- can even remove that BD notice at the end with '--BDremove'
- to add flair, can use '--bgcolor #FF80FF' to set background color for every page
- BUGFIX: Only when using '--nopara' option, some <br />'s get ignored so another <br /> is added; if this creates issues, then '--noBRfix' will not add the second <br />.

The 'mobi2imp' program is now very stable and mature enough to be used effectively in re-conversion efforts, i.e. using a .prc copy to rebuild an .IMP that was previously made with BD.

Enjoy!
Old 02-24-2008, 10:29 PM   #21
nrapallo
GuteBook/Mobi2IMP Creator
Mini-tutorial for Mobi2IMP

Quote:
Originally Posted by nrapallo View Post
Mobi2IMP (version 9.4) with windows executable now out! (See post here)
Mini-tutorial follows:

After installing Mobi2IMP 9.4 using the Windows installer, you can use the new Windows GUI instead of using the dos/command prompt or perl script.

REQUIRED: You must have the eBook Publisher software previously installed to facilitate the conversions.

mobi2imp.exe can be run from within an already opened DOS box (i.e. command prompt) and then needs only one argument, "My Source.prc". If the filename contains any spaces, you need to surround it with quotes. Don't double-click the .exe or the .pl directly; it is best run from within a batch file (see below).

You can also specify 'Category' (like Fiction), 'Author' or 'Title'.

Try this:
Code:
c:\> mobi2imp.exe --verbose "My Source.prc" Fiction
If you want to automate this, try running a batch file (just copy and paste this into a file called 'prc2imp.bat')
Code:
@echo off
rem Convert .mobi/.prc to .imp process devised by Nick Rapallo (Jan. 2008)
rem =============================================
rem Start the conversion of all .prc files in this directory to .imp format
rem For GEB 1150/EBW 1150 only output; add switch '--1200' for REB 1200 .IMP

for %%i in (*.prc)  do mobi2imp.exe --verbose "%%i" "%%~ni"
for %%i in (*.mobi) do mobi2imp.exe --verbose "%%i" "%%~ni"
for %%i in (*.pdb)  do mobi2imp.exe --verbose "%%i" "%%~ni"

rem That's it! We are now finished the conversion of all .prc files
echo WoW! All done.
pause
This will allow those with many mobipocket .prc/.mobi/.pdb files to migrate them to their ebookwise 1150 easily. For recursive batch processing, see post#11 below

Then all you have to do is put mobi2imp.exe in your path (or current directory), your 'prc2imp.bat' into the current directory containing your .prc, and then double-click 'prc2imp.bat' (just ensure you don't have too many .prc as ALL of them will be converted!)
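
If you would rather preview what the batch file is going to run before it touches anything, the loop can be sketched in Python. This is an illustration only and not part of Mobi2IMP; `build_commands` is a made-up helper name, and the only things taken from the batch file are the three extensions, the '--verbose' switch and the "file name, then name without extension" argument order.

```python
from pathlib import Path

# The three extensions the batch file loops over.
EXTENSIONS = (".prc", ".mobi", ".pdb")


def build_commands(directory, exe="mobi2imp.exe"):
    """Hypothetical helper: list the command each convertible file would get.

    Mirrors: mobi2imp.exe --verbose "%%i" "%%~ni"
    Nothing is executed; the argument lists are only returned.
    """
    commands = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.suffix.lower() in EXTENSIONS:
            # path.name ~ %%i (file name), path.stem ~ %%~ni (name without extension)
            commands.append([exe, "--verbose", path.name, path.stem])
    return commands


if __name__ == "__main__":
    for command in build_commands("."):
        print(command)
```

Note that this sketch sorts by file name, whereas the batch file goes through all .prc files first, then .mobi, then .pdb. Each list could be handed to `subprocess.run` from inside that directory once the preview looks right.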

Also, options like 'margins' and 'text-justification' can be better controlled in the mobi2imp via command-line '--options'. Popular options are:
'--out IMPFILENAME' set .IMP filename to use (overrides default naming)
'--smallerfont' use 'x-small' font size for body text like pre-fix BD not default 'small'
'--nojustify' no full justification (i.e. left-aligned) not 'justify'
'--nopara' use no paragraph separation not 'blank line' (1em) separation
'--indent' use small (1em) indent instead of no (0em) indent
These options (sometimes called switches) go just after mobi2imp.exe and ALWAYS start with two dashes (i.e. '--verbose'). Just forget about getting the .pl and ActiveState Perl setup working. With the .exe, you don't even need ActiveState Perl!
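
The two rules above (options go right after mobi2imp.exe, and every option starts with two dashes) can be checked mechanically. The sketch below is only an illustration with a made-up helper name, `build_command`; it assumes that an option taking a value, such as '--out IMPFILENAME', is written as one string with a space between the switch and its value.

```python
def build_command(source, options=(), exe="mobi2imp.exe"):
    """Hypothetical helper: put the options right after the .exe, source file last."""
    args = [exe]
    for option in options:
        # "--out NAME" -> ["--out", "NAME"]; "--nopara" -> ["--nopara"]
        tokens = option.split(maxsplit=1)
        if not tokens or not tokens[0].startswith("--"):
            raise ValueError(f"option must start with two dashes: {option!r}")
        args.extend(tokens)
    args.append(source)
    return args


print(build_command("My Source.prc", ["--verbose", "--nopara"]))
```

Passing the result to `subprocess.run` as a list also takes care of the quoting for file names with spaces.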

With mobi2imp, just beware that you're stuck with any inconsistencies introduced by the original .prc/.mobi when converting over. However, 'mobi2imp' also creates a .opf that can be loaded into eBook Publisher, and from there you can further edit/build it.

All in all, I like the output of mobi2imp.

I have been converting Madam Broshkina's .prc posts (with her permission) using mobi2imp.exe. For one I did recently, I used this command-line (in a batch file):
Code:
mobi2imp.exe --1200 "Authors Various_The Worlds Greatest Books Volume V.prc" "AUTHOR5" Fiction "Authors, Various" "The Worlds Greatest Books Vol 5"
Download "Authors Various_The Worlds Greatest Books Volume V.prc" and see if you can duplicate the .IMP posted for the same ebook with the above command.

p.s. Thanks to DaleDe there is now a wiki entry for mobi2imp here
Attached Files
File Type: bat prc2imp.bat (757 Bytes, 1125 views)

Last edited by nrapallo; 04-15-2008 at 05:26 PM. Reason: revised due to new Mobi2IMP GUI 9.4!
Old 03-24-2008, 09:59 AM   #22
Moonraker
Addict
Posts: 314
Karma: 1002965
Join Date: Mar 2006
Location: UK
Device: ILiad. Gen 3, PocketBook 360, Kobo Aura HD, Kindle Oasis 2
I have used ebook Publisher (v2.2.3) for many years to create .imp files from my own XHTML code.
I could not get Mobi2Imp working at first because it could not find Publisher but after I updated it to v2.2.5 all worked well.

I was surprised that Publisher had been updated because I understood it was not being supported any more.

Mobi2Imp is a great and useful tool.

I have only one gripe - why does it replace a closing </p> with <div height="0em"></div> <div height="0em"></div>?

This makes for a larger than necessary file and also it is difficult to edit the html code to correct it.
Old 03-24-2008, 10:33 AM   #23
nrapallo
GuteBook/Mobi2IMP Creator
Quote:
Originally Posted by Moonraker View Post
I have used ebook Publisher (v2.2.3) for many years to create .imp files from my own XHTML code.
I could not get Mobi2Imp working at first because it could not find Publisher but after I updated it to v2.2.5 all worked well.

I was surprised that Publisher had been updated because I understood it was not being supported any more.

Mobi2Imp is a great and useful tool.

I have only one gripe - why does it replace a closing </p> with <div height="0em"></div> <div height="0em"></div>?

This makes for a larger than necessary file and also it is difficult to edit the html code to correct it.
Mobi2IMP does not parse the HTML code; it just does some (global) search and replaces to "fix" things that are "broken" when used with eBook Publisher.

The </p> endings are being stripped/altered BEFORE the .prc is used by Mobi2IMP. I'm not sure if it is tompe's Mobiperl code or the use of BookDesigner. I think BookDesigner may be the culprit of the '<div height="0em"></div> <div height="0em"></div>' construct. Either way it's not Mobi2IMP's doing.

BTW, if you can read perl, check the source, mobi2imp.pl, for items replaced by Mobi2IMP.

I've used eBook Publisher for years and have come to respect its power and usefulness. It also doubles as a great "validator" of HTML v3.2 code since it gives detailed error messages and points to the error in the source file.

Sure, it has a few shortcomings (image re-sizing fails to honour bottom margin; missing fraction HTML numeric codes; ...) but it has seen some improvements over the years (margin indents work better now; new default 'small' font size for eBookwise 1150; ...).

If you have any more specific Mobi2IMP questions/comments, please be sure to post in the thread
Mobi2IMP 9.4 with new Windows GUI released! Here, we discuss the current (and future) version of Mobi2IMP.

p.s. drool... How do you like the iLiad vs. the eBookwise 1150?

Last edited by nrapallo; 10-18-2008 at 07:33 AM.
Old 03-24-2008, 03:21 PM   #24
Moonraker
Addict
Quote:
Mobi2IMP does not parse the HTML code; it just does some (global) search and replaces to "fix" things that are "broken" when used with eBook Publisher.

The </p> endings are being stripped/altered BEFORE the .prc is used by Mobi2IMP. I'm not sure if it is tompe's Mobiperl code or the use of BookDesigner. I think BookDesigner may be the culprit of the '<div height="0em"></div> <div height="0em"></div>' construct. Either way it's not Mobi2IMP's doing.
Regarding the stripping of the </p> endings: as I don't use BookDesigner or Perl, I don't suppose it is either of them. I always create my books in HTML. Then, using the HTML file, I create an .imp file using eBook Publisher and a .prc file using Mobipocket Creator. I think the culprit must be Mobipocket Creator.

Quote:
I've used eBook Publisher for years and have come to respect its power and usefulness. It also doubles as a great "validator" of HTML v3.2 code since it gives detailed error messages and points to the error in the source file.
I agree 100%. Another good validator I use is Amaya.

Quote:
p.s. drool... How do you like the iLiad vs. the eBookwise 1150?
I love all my ebook readers and wouldn't part with any of them.
I prefer my Cybook Gen 3 to the iLiad because of its longer lasting battery and faster boot-up time.
But the eBookwise 1150 beats them all in terms of ergonomics and ease of use. This would be my ideal if it had an e-ink screen.
Old 03-24-2008, 03:52 PM   #25
nrapallo
GuteBook/Mobi2IMP Creator
Quote:
Originally Posted by Moonraker View Post
Regarding the stripping of the </p> endings: as I don't use BookDesigner or Perl, I don't suppose it is either of them. I always create my books in HTML. Then, using the HTML file, I create an .imp file using eBook Publisher and a .prc file using Mobipocket Creator. I think the culprit must be Mobipocket Creator.
To investigate this further, you may want to try to convert the .prc directly to html using tompe's windows binaries here (use mobi2html.exe in the .zip).

Just issue the command:
Code:
mobi2html "Your.prc" TempDir
Then examine the resulting .html in the TempDir directory. If you want to see just the "raw html" before Mobiperl manipulates it, try:
Code:
mobi2html --rawhtml "Your.prc" Temp >My.html
BTW, I used 'mobi2html' as the base code for Mobi2IMP (at least the .prc to .html part)
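
Once both outputs exist, comparing them by eye gets tedious for a whole book. A small Python sketch using the standard difflib module can list only the lines that differ; the function name and the "default"/"rawhtml" labels are made up for this example, and reading the two .html files from disk is left to you.

```python
import difflib


def diff_html(default_html: str, raw_html: str) -> list[str]:
    """Hypothetical helper: unified diff between the two mobi2html outputs."""
    return list(
        difflib.unified_diff(
            default_html.splitlines(),
            raw_html.splitlines(),
            fromfile="default",
            tofile="rawhtml",
            lineterm="",
            n=0,  # no context lines, only the changes themselves
        )
    )


for line in diff_html("<p>one</p>\n<p>two</p>", "<p>one</p>\n<p>2</p>"):
    print(line)
```

An empty result means the two files are line-for-line identical.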

Hope this helps!
Old 03-24-2008, 05:37 PM   #26
Moonraker
Addict
Deleted post

Last edited by Moonraker; 03-24-2008 at 06:52 PM.
Old 03-24-2008, 06:01 PM   #27
nrapallo
GuteBook/Mobi2IMP Creator
Quote:
Originally Posted by Moonraker View Post
Thank you very much for the link and for the instructions.

This is all very interesting to me because I have never before seen the html code behind a prc file.
This is a recent ability perfected by tompe with his Mobiperl code. I had "hacked" makedoc9 (popular .pdb to .txt converter) years ago to strip out the images and fix the <img...> to substitute the 'filenames' for the 'recindex' tag. It allowed me to see the .html code behind the .prc for the first time. Sadly, I had no idea what a 'filepos' was and the href links were all broken. That's why I was so taken by tompe's efforts and wanted to combine the two worlds (.prc to .imp)!

Quote:
For the test I used the same prc file in two different folders, giving the prc files different names.

The result, as far as I can see, is that the two files are identical using either:

mobi2html "Your.prc" TempDir

or

mobi2html --rawhtml "Your.prc" Temp >My.html

Both files are the same size and have the same number of lines and both end with �</body></html>

Both files have all the closing </p>'s stripped and replaced by <div height="0em"></div>
<div height="0em"></div>.

All my curly quotes i.e. &8220; and &8221; have been changed to &quot; (straight quotes).
My em-dash codes &8212; have all been changed to &mdash; etc.
Note: I had to omit the # sign from the above numerics in order to get this posted.

And my HTML(XML) header is completely changed although the charset=UTF-8 has been kept.

It appears to be Mobipocket Creator that changes the code, don't you think?
BTW, those endings may be easy to strip out, as they don't mean anything and aren't needed. I haven't come across these issues with .prc's built by BookDesigner or HarryT. Any other quirks to watch out for?
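
As a rough idea of how such a strip could look, here is a Python sketch. It is not something Mobi2IMP does; the helper name is made up, and it assumes the construct always appears exactly as quoted above, i.e. <div height="0em"></div>, optionally followed by whitespace.

```python
import re

# The empty zero-height div exactly as quoted, plus any whitespace after it.
EMPTY_DIV = re.compile(r'<div height="0em"></div>\s*')


def strip_empty_divs(html: str) -> str:
    """Hypothetical helper: remove the empty zero-height <div> pairs."""
    return EMPTY_DIV.sub("", html)


sample = '<p>one</p><div height="0em"></div> <div height="0em"></div><p>two</p>'
print(strip_empty_divs(sample))
```

Since it also swallows the whitespace after each empty <div>, it is worth checking a converted chapter by eye before running it over a whole book.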
Old 03-24-2008, 06:53 PM   #28
Moonraker
Addict
Sorry for my previous post - I had missed the my.html file, thinking it would be in another folder. I have retested the files and the following are my findings.

Thank you for the link and for the instructions.

This is all very interesting to me because I have never before seen the html code behind a prc file.

For the test I used the same prc file but gave it two different names.

First file test (mobi2html "Your.prc" TempDir):

Size: 1152 KB

The � character appeared 250 times throughout the document. These would have to be removed.
e.g. adhering changed to adherin�g

</p> stripped and replaced by <div height="0em"></div> <div height="0em"></div>

&8220; changed to &ldquo;
&8221; changed to &rdquo;
&8217; changed to &rsquo;
&8212; changed to &mdash;

Headings - i.e. <h4>Chapter 10</h4> Changed to = <h4 align="center"><font size="+1"><b>Chapter 10</b></font></h4>

<b></b> added to headings but where <strong></strong> were in the original file they have been left unchanged.

<br style="page-break-after:always" /> inserted at end of file.




Second file test (mobi2html --rawhtml "Your.prc" Temp >My.html)

Size: 1155 KB

All numeric code unchanged.

<b></b> Added to Headings but <strong></strong> in original file left unchanged.

<font size="+1"> added to Headings

</p> left unchanged but <div height="0em"></div> <div height="0em"></div> added between paragraphs. This seems superfluous to me.

<mbpagebreak/> added to end of file.


When I put the file through Tidy.exe I got 8833 warnings that <div> attribute "height" has invalid value "0em"

NOTE: # omitted from numeric codes to get this posted.
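
The numeric-to-named changes seen in the first test (8220 to ldquo, 8221 to rdquo, 8217 to rsquo, 8212 to mdash) follow the standard HTML entity table, so they can be reproduced with Python's html.entities module. The sketch below is only an illustration of that mapping, not Mobiperl's code; the function name is made up, and here the numeric codes are written in full, with the # sign.

```python
import re
from html.entities import codepoint2name

# Decimal numeric character references such as &#8220;
NUMERIC_REF = re.compile(r"&#(\d+);")


def numeric_to_named(html: str) -> str:
    """Hypothetical helper: swap numeric references for named entities where one exists."""

    def replace(match):
        name = codepoint2name.get(int(match.group(1)))
        # Leave the reference untouched if the table has no name for it.
        return f"&{name};" if name else match.group(0)

    return NUMERIC_REF.sub(replace, html)


print(numeric_to_named("&#8220;Chapter 10&#8221; &#8212; it&#8217;s here"))
```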

Last edited by Moonraker; 03-24-2008 at 07:00 PM.
Old 03-24-2008, 08:01 PM   #29
nrapallo
GuteBook/Mobi2IMP Creator
That � entry is weird. I wonder what the rationale behind it was. I know this will sound like you are chasing your tail, but if you make an .imp with this .html using eBook Publisher, does it bomb? It does if the HTML char &# 20; exists in the ebook i.e.
Code:
<p>Html documents with this entity &# 20; bomb!  No output produced by eBook Publisher v2.2.5</p>
Note for display purposes, I put a space between '#' and '2' that shouldn't be there!
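
To check a file for these two trouble spots before feeding it to eBook Publisher, something like the Python sketch below could be used. It only looks for the two things mentioned in this thread, the &#20; entity (written without the space) and the � character (U+FFFD); the helper name is made up, and whether other entities cause the same failure is not something this sketch knows about.

```python
import re

# The entity reported to make eBook Publisher v2.2.5 fail, and the U+FFFD character.
SUSPECT = re.compile("&#20;|\ufffd")


def find_suspects(html: str) -> list[tuple[int, str]]:
    """Hypothetical helper: (offset, text) for every suspect found in the HTML."""
    return [(match.start(), match.group()) for match in SUSPECT.finditer(html)]


for offset, text in find_suspects("adherin\ufffdg and &#20; here"):
    print(offset, repr(text))
```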

I think we can conclude that the Mobiperl code strips the </p> and just leaves behind the Mobipocket empty <div>'s. I have seen this behaviour with the .pdb to .imp routine in Mobi2IMP. BTW, you can take a PalmDOC .pdb (TEXt/REAd) document and have Mobi2IMP create a .imp version.
Old 03-24-2008, 10:10 PM   #30
tompe
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköpng, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
Quote:
Originally Posted by nrapallo View Post
I think we can conclude that the Mobiperl code strips the </p> and just leaves behind the Mobipocket empty <div>'s. I have seen this behaviour with the .pdb to .imp routine in Mobi2IMP. BTW, you can take a PalmDOC .pdb (TEXt/REAd) document and have Mobi2IMP create a .imp version.
It should not modify the html in this way if you do not use a fixhtml flag. I will look at this when I am back from Eastercon (British SF con) in a couple of days.