IMP GUI Converter/Editor


mscott161
12-22-2008, 02:27 PM
Hello,

Since the start of the GUI version of my ConvertIMP program, I have continually added features as people have sent wish-list items. I try to add new features to the program if they are a good fit, and I also try to keep the GUI from becoming too difficult to use.

This all started when conversing with Nick who has posted many elements to the IMP thread and has helped make the program what it is today. I will be working with him and others to make a good solid utility for users to be able to do something with the IMP format.

I will continue to edit this post with the new releases as I make them. Please be patient with it because it is Beta and I do not want users to mess up their IMP files. If there is a bug or problem, please post or PM me and I will remove the last release and fix the bugs before doing another one.

Thank you.
--Michael


- Version 1.3.5
I have now added the ability to mass convert the IMP Text Content to LRF; it is found under the Tools menu. Alternatively, select a book from the library and go to the Text Content tab, where there is a Save LRF button. I would like to thank Chris Mumford for the BBeBLib in C# that allowed me to complete this milestone in the application.

- Version 1.3.6
I have added the TRow and TCel parsing, which may have different effects on the HTML saving until string runs have been parsed.

Unless someone posts that they want to see the source code, this will be the last release that includes it. I will continue to update the link below, but it will contain only the executable.

- Just the Executable is at http://www.ebizsoft.com/download/convertimpgui.zip

nrapallo
12-22-2008, 02:49 PM
Nice! :thumbsup:

At this rate, you'll be posting version 2.0 soon! :snicker:

OK time to write that wish list... ;)

nrapallo
12-22-2008, 10:47 PM
Just a few items...
In the Header section, display ImpType (2=EBW 1150/GEB 1150), ZoomState, Count of RESfiles and BytesRemainingInHeader.
Perform some sanity checks, mainly BytesRemainingInHeader - 24 = length of book properties; if this is not correct, a Button to adjust BytesRemainingInHeader should be provided.
The Book section (should be called Book Properties) text usually gets cropped; the text box should have a horizontal scroll bar.
In the Book section, consider adding a string for the property's length/size like the TOC file entries. This way the Contents and Size will be listed underneath the Book Property heading.
In the Table of Contents (TOC) section, the random 4-letter filenames are not too important, so I would just list each one along with the filetype it contains, i.e. XHBG (JPEG). Consider replacing the random 4-letter filenames with their 4-letter filetype equivalent, i.e. JPEG (JPEG). For filetypes with a space you can use an underscore. For DATA.FRK you can use DATA!
In the Images section, the image type could be displayed when expanded. I still don't see ALL the images used listed. Check the 'ImRn' filetype for the images used and their image type and original dimensions. Note for the EBW1150, those image types are written backwards like ' FIG' or 'GEPJ' :smack:
Under the Book Properties tab, for the RES File Name, provide a button that will automatically change it to the 'Author-Title' naming used for the .txt.
Allow for the unimp'ing and re-imp'ing of files, as facilitated by unimp.exe and reimp.exe.
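
The BytesRemainingInHeader sanity check from the list above could be sketched roughly like this. It is only an illustration in Python (not the program's actual C# code); the function names are made up here, and the constant 24 is taken straight from the wish list item.

```python
# Rough sketch of the header sanity check described in the wish list.
# Function names are hypothetical; only the "- 24" relation comes from the thread.

HEADER_OVERHEAD = 24  # BytesRemainingInHeader - 24 = length of book properties


def header_is_consistent(bytes_remaining_in_header: int,
                         book_properties_length: int) -> bool:
    """True when the header value agrees with the book properties length."""
    return bytes_remaining_in_header - HEADER_OVERHEAD == book_properties_length


def fixed_bytes_remaining(book_properties_length: int) -> int:
    """Value a 'fix' button would write back into the header."""
    return book_properties_length + HEADER_OVERHEAD


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    print(header_is_consistent(124, 100))
    print(fixed_bytes_remaining(100))
```

A "fix" button would only need to call the second function when the first one reports a mismatch.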

OK, those were the easy/cosmetic things.

For the really advanced stuff, allow batch processing, handle REB1200 .imp ebooks, convert the text to html with character-code and weird-character substitutions, allow linking to images and hyperlinks, use the embedded styles, <hr />, tables, forms, etc. (you know, the 'piece of cake' stuff :rolleyes:).

I'll help with the advanced stuff as I get the .IMP format better documented. ;)

Now when this is all done, we can then feed the resulting .html with images/links/styles to Calibre's html2lrf and presto, .lrf support! Or .epub via html2epub!

mscott161
12-24-2008, 12:31 AM
Nick and All,
I have made some GUI changes and provided image saving and display. Most of the items from Nick's wish list above have been addressed.

I have placed the new update in the first post in this thread.

Happy Holidays,
--Michael

nrapallo
12-24-2008, 08:30 AM
That's great, Michael! What turn-around time.

In the same time it took you to produce this new version, if you were a major corporation, I think you would be still at the stage of identifying your target users. :eek: :smack: :snicker: :rofl:

OK, some constructive/minor comments:
In the General tab, under Header, the RES File Count needs the closing ')';
note the ZoomState is the lower nibble, i.e. 0, which means Both Small View and Large View; 1=Small View and 2=Large View. The upper nibble of 0x20 (i.e. 32) is the ImpType 2=EBW1150.
under Book Properties, you now somehow omitted the SubCategory (it holds the number of pages the reader displays when in Small View and when in Large View respectively). While this cannot be edited, it could be displayed!
under Table of Contents, you list the file size. Other information to list would be Data size and Index size (which, when divided by the index header size of 14 (for EBW1150), yields the number of such Indices). So basically the format here is Filetype header (32 bytes), Data section, {IndexN}*
under Book Images, there are some duplicate images now listed :smack:. As my test .imp, I used REBtestdoc.imp; I don't understand why those multiple copies listed in 'ImRn' were stored. The 'ImRn' record indicates that there are 17 images stored, but only 8 seem unique, i.e. there should be 1 .gif, 6 .jpgs, and 1 .png. See this imp_dump.pl printout:
======== ImRn ========
Filename:BYVI, $0000, Filesize: 656, Filetype:ImRn, $0001
Header:TOCconst:0001, TOCfname:ImRn, TOCoffset: 642
Header:$00000001, Unknown:$0000, $00000282, $00000101, $00000000, $00000000
Data length = 610, Index length = 14
Number of images indexed = 17
width:153, height: 61, aspect:2.51, $FFFB, $0000, offset: 1527, $00BC70D0, imgtype: FIG, imgID:0080, $0000
width:153, height: 61, aspect:2.51, $FFFB, $0000, offset: 14851, $01E0DC48, imgtype: FIG, imgID:0080, $0000
width:153, height: 61, aspect:2.51, $FFFB, $0000, offset: 14888, $01F479E0, imgtype: FIG, imgID:0080, $0000
width:153, height: 61, aspect:2.51, $FFFB, $0000, offset: 14924, $00BE17B8, imgtype: FIG, imgID:0080, $0000
width:153, height: 61, aspect:2.51, $FFFC, $0000, offset: 15012, $00BC9D20, imgtype: GNP, imgID:0080, $0000
width:472, height:595, aspect:0.79, $FFFA, $0000, offset: 17845, $00BD94A0, imgtype:GEPJ, imgID:0080, $0000
width:472, height:595, aspect:0.79, $FFFB, $0000, offset: 17847, $01F444C0, imgtype:GEPJ, imgID:8D4D, $0000
width:153, height: 61, aspect:2.51, $FFFB, $0000, offset: 17898, $01EDFE80, imgtype: GNP, imgID:0080, $0000
width:153, height: 61, aspect:2.51, $FFFF, $0000, offset: 18201, $01F44380, imgtype: GNP, imgID:0080, $0000
width:153, height: 61, aspect:2.51, $FFFE, $0000, offset: 18512, $00C47C20, imgtype: GNP, imgID:0080, $0000
width:153, height: 61, aspect:2.51, $FFFC, $0000, offset: 18823, $01E1A4C0, imgtype: GNP, imgID:0080, $0000
width:153, height: 61, aspect:2.51, $FFFE, $0000, offset: 19156, $00BC7120, imgtype: GNP, imgID:0080, $0000
width:153, height: 61, aspect:2.51, $FFFE, $0000, offset: 19485, $00BCDE50, imgtype: GNP, imgID:0080, $0000
width:176, height:207, aspect:0.85, $FFFB, $0000, offset: 19949, $01F40D50, imgtype:GEPJ, imgID:4D80, $0000
width:174, height:207, aspect:0.84, $FFFB, $0000, offset: 19951, $01F15318, imgtype:GEPJ, imgID:B7B8, $0000
width:176, height:212, aspect:0.83, $FFFB, $0000, offset: 19953, $00BDD5C0, imgtype:GEPJ, imgID:8A1A, $0000
width:174, height:212, aspect:0.82, $FFFC, $0000, offset: 19955, $00BF8D68, imgtype:GEPJ, imgID:0F4E
Index1:Index1_const1:0080, len: 610, offset: 32, const0:0000

======== GIF ========
Filename:TGBQ, $0000, Filesize: 2154, Filetype:GIF , $0000
Header:TOCconst:0001, TOCfname:GIF , TOCoffset: 2140
Header:$00000001, Unknown:$0000, $0000085C, $00000101, $00000000, $00000000
Data length = 2108, Index length = 14
Number of GIF images = 1
Index1:Index1_const1:0080, $0000, len: 2108, offset: 32, $0000

======== JPEG ========
Filename:BEDO, $0000, Filesize: 116352, Filetype:JPEG, $0000
Header:TOCconst:0001, TOCfname:JPEG, TOCoffset: 116268
Header:$00000001, Unknown:$0000, $0001C62C, $00000101, $00000000, $00000000
Data length = 116236, Index length = 84
Number of JPEG images = 6
Index1:Index1_const1:0F4E, $0000, len: 7767, offset: 32, $0000
Index1:Index1_const1:8A1A, $0000, len: 7628, offset: 7799, $0000
Index1:Index1_const1:B7B8, $0000, len: 6149, offset: 15427, $0000
Index1:Index1_const1:4D80, $0000, len: 6004, offset: 21576, $0000
Index1:Index1_const1:8D4D, $0000, len: 44637, offset: 27580, $0000
Index1:Index1_const1:0080, $0000, len: 44051, offset: 72217, $0000

======== PNG ========
Filename:RYJS, $0000, Filesize: 3977, Filetype:PNG , $0000
Header:TOCconst:0001, TOCfname:PNG , TOCoffset: 3963
Header:$00000001, Unknown:$0000, $00000F7B, $00000101, $00000000, $00000000
Data length = 3931, Index length = 14
Number of PNG images = 1
Index1:Index1_const1:0080, $0000, len: 3931, offset: 32, $0000

after you save an image, then when re-constructing the .html you will have to refer to it in a unique way. I used, for the filename, the Imagetype followed by the Image ID (in hex, as it formats well and is always 4 characters). Note that the Image ID is unique only for its Imagetype. Examples from REBtestdoc.imp are GIF_0080.gif, JPEG_0080.jpg, and PNG_0080.png
display the Image ID as 4-character hex. Also, clicking on that Image ID entry causes the program to crash!
In the Book Properties tab, when I click the Rename IMP file button the program crashes.
I would relocate the Rename IMP file button to below and the Save button to the right where you had it in the previous version. This would allow those quick fix "Buttons" beside a property, i.e. for Author a "Button" could change "Firstname Initial Lastname" to "Lastname, Firstname Initial" format. ID (should be called BookID) can be auto-updated via a "Button" using the GUID 32 character naming (http://en.wikipedia.org/wiki/Globally_Unique_Identifier) convention. Place a Fix BytesRemainingInHeader "Button" and other sanity check "Buttons" at the bottom.
Save means Save .IMP Changes; and where is that Save Text "Button"? :)
In the Text Content tab, pressing it a second time should not display the pop-up saying that it may take a while, as the text has already been decompressed.
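
A few of the points above (the ImpType/ZoomState nibbles, the index count, and the unique image naming) can be illustrated with a short Python sketch. It is not taken from ConvertIMP or imp_dump.pl; all function names are invented for the example, and the sample values are copied from the REBtestdoc.imp printout above.

```python
# Illustrative sketch only; names are hypothetical, values come from the dump above.

EXTENSIONS = {"GIF": "gif", "JPEG": "jpg", "PNG": "png"}


def split_type_byte(value: int) -> tuple[int, int]:
    """Upper nibble = ImpType, lower nibble = ZoomState."""
    return value >> 4, value & 0x0F


def index_count(index_length: int, index_entry_size: int = 14) -> int:
    """Number of index entries; 14 bytes each on the EBW1150 per the thread."""
    count, remainder = divmod(index_length, index_entry_size)
    if remainder:
        raise ValueError("index length is not a multiple of the entry size")
    return count


def normalize_image_type(raw_type: str) -> str:
    """EBW1150 stores the type backwards: ' FIG' -> 'GIF', 'GEPJ' -> 'JPEG'."""
    return raw_type[::-1].strip()


def image_filename(raw_type: str, image_id: int) -> str:
    """Imagetype + 4-character hex Image ID, e.g. JPEG_0080.jpg."""
    image_type = normalize_image_type(raw_type)
    return f"{image_type}_{image_id:04X}.{EXTENSIONS[image_type]}"


def unique_filenames(entries: list[tuple[str, int]]) -> list[str]:
    """Drop repeated (type, ID) pairs, keeping first-seen order."""
    seen = set()
    names = []
    for raw_type, image_id in entries:
        key = (normalize_image_type(raw_type), image_id)
        if key not in seen:
            seen.add(key)
            names.append(image_filename(raw_type, image_id))
    return names


# The 17 ImRn entries (imgtype, imgID) from the printout above.
IMRN_ENTRIES = (
    [(" FIG", 0x0080)] * 4
    + [(" GNP", 0x0080), ("GEPJ", 0x0080), ("GEPJ", 0x8D4D)]
    + [(" GNP", 0x0080)] * 6
    + [("GEPJ", 0x4D80), ("GEPJ", 0xB7B8), ("GEPJ", 0x8A1A), ("GEPJ", 0x0F4E)]
)

if __name__ == "__main__":
    print(split_type_byte(0x20))                # (2, 0): EBW1150, both views
    print(index_count(84))                      # 6, matching the JPEG record
    print(len(unique_filenames(IMRN_ENTRIES)))  # 8 unique images out of 17
```

Keying on the (type, ID) pair rather than the ID alone matters because ID 0080 appears once each for GIF, JPEG and PNG.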


That's it for now....

Have yourself Happy Holidays as well. And thanks for your present (ConvertIMP)!

mscott161
12-24-2008, 02:45 PM
Nick and All,

Most of the items from Nick's wish list above have been addressed.

I have placed the new update in the first post in this thread.

Happy Holidays,
--Michael

DaleDe
12-24-2008, 03:55 PM
How is this one only 17K when the others were over 110K in size?

Dale

nrapallo
12-24-2008, 04:08 PM
Dale:

It's just the executable file and doesn't have the source and project overhead/clutter. Maybe Michael can include the source code as well.

Still for 17k zipped it's amazing!!!!

mscott161
12-25-2008, 11:09 PM
I replaced the zip file with one with the source code as well in it.
Sorry about that.

--Michael

mscott161
12-31-2008, 01:02 PM
I have posted an update to the conversion gui program in my first post in this forum.
The new version is 1.1.4

-- Michael

mscott161
12-31-2008, 06:20 PM
New Version 1.1.5 (In first post) - Have fixed the styles in the Create HTML. The Space Encyclopedia that Nick provided comes out pretty good. Still working on the Tables. Any Help with the format would be great.

-- Michael

mscott161
01-01-2009, 05:38 PM
New Version 1.1.6 - I have added to the tree the values retrieved from TRow and TCel. I believe my Tabl parse is correct. I am hoping for some help with the TRow and TCel parse. Also, Nick, if you still have the REB Test Document.IMP file, it displays pretty well when using Create HTML from the Tools menu, as does the Space.IMP file.

If anyone has code to take HTML to BBeB format or LRF, let me know. I know that there are programs like makelrf, but I would like to incorporate the code in the application.

Thank you
-- Michael

nrapallo
01-01-2009, 10:28 PM
Michael:

Great achievements in so little time. Nice to see the styles working and table entries getting better support and finally being able to create .html from .imp ebooks!

I've had a hard time analyzing your results and reporting my findings BEFORE I see another revision is out and must re-do my testing. :rolleyes:

For now, I have some quick findings (hopefully before your next revision gets posted) :snicker: :
Your Create HTML menu item is a great addition, but why not place it in a Save HTML button beside (or in place of) the Save Text button?
I've noticed that the extracted images (ID for the image type) are not always in the correct spot in the .html created. Check the first image in that Space Encyclopedia.imp against what shows up in the PC imp viewer.exe.
Under the General tab, the Header entry for IMP device shows 'Softbook 200/250e' when it should show 'EBW1150'.
It shows 'UnCompressed' even if the text is LZSS compressed.
The 'SubCategory' is not displayed as a Book Properties entries even though it does show up under the Book Properties tab.
When I click an image entry, you display the corresponding image. Consider displaying, in that same section, the text results that, say, imp_dump.pl would give for that record entry when "clicked". No editing would be allowed; just display the info for that filetype in that section.
Under the Book Properties tab, the Format M L, F doesn't work as it only returns one character for each. Consider changing that button to Format L, F M instead, as this "sorts" better and is more common. Ensure that multiple button clicking of Format F M L and Format L, F M properly show the Author name.
Consider adding an About menu item with version number and your name (and possibly a link to this thread)!


If anyone has code to take HTML to BBeB format or LRF let me know. I know that there are programs like makelrf but I would like to incorporate the code in the application.

Thank you
-- Michael

The only code I myself know of to make .lrf ebooks is in Python. Calibre's Python source code is available for perusal, as well as PDFRead's output.py, which uses images instead of text. Why not feed your created .html to an external program "html2lrf.exe" (or "html2epub.exe") using the Calibre command line support programs?

And thanks again for this program!

mscott161
01-02-2009, 01:51 AM
Nick,

I was able to put together the changes above.

I added the button to save HTML next to the Save Text button.
I fixed the image placement in the HTML so the images use the proper Resource ID.
I fixed the Book Header for the IMP Device.
I fixed the Uncompressed / Compressed flag in the Book Header display.
I added the SubCategory to the Book Properties tree node.
I added an About dialog.
???Can you give me a simple text layout of the image information you would like to see when clicking for the image display??? And do you want it added to the Tree or beside the Image Display???
???There are several possible F L M combinations to try to handle; without a specific set of rules this would be hard to control and some combinations may be left out. I still have the code but temporarily made the buttons invisible.


Thank you, I am glad the program is working for you. I appreciate the comments.

I added the list above to 1.1.7

-- Michael

nrapallo
01-02-2009, 08:57 AM
???Can you give me a simple text layout of the image information you would like to see when clicking for the image display??? And do you want it added to the Tree or beside the Image Display???

Sorry, I didn't exactly say what I meant to say. The image displaying routine is just fine as it is now. What I wanted to say was that for "other" RES filetype entries like BGcl, ImRn, etc just display where you showed previously the image (i.e. re-use that same section/space) some text output as shown above in post #5 (http://www.mobileread.com/forums/showthread.php?p=312923#post312923).
For example, BGcl could show:
======== BGcl ========
Filename:OFQJ, $0000, Filesize: 54, Filetype:BGcl, $0000
Header:TOCconst:0001, TOCfname:BGcl, TOCoffset: 40
Header:$00000001, Unknown:$0000, $00000028, $00000101, $00000000, $00000000
Data length = 8, Index length = 14
BGcl_const1:FFFF, Red:FF ($FF), Green:FF ($FF), Blue:FF ($FF)
Index1:Index1_const1:0080, len: 8, offset: 32, const0:0000
For ImRn, that would mean, show:
======== ImRn ========
Filename:BYVI, $0000, Filesize: 656, Filetype:ImRn, $0001
Header:TOCconst:0001, TOCfname:ImRn, TOCoffset: 642
Header:$00000001, Unknown:$0000, $00000282, $00000101, $00000000, $00000000
Data length = 610, Index length = 14
Number of images indexed = 17
width:153, height: 61, aspect:2.51, $FFFB, $0000, offset: 1527, $00BC70D0, imgtype: FIG, imgID:0080, $0000
width:153, height: 61, aspect:2.51, $FFFB, $0000, offset: 14851, $01E0DC48, imgtype: FIG, imgID:0080, $0000
width:153, height: 61, aspect:2.51, $FFFB, $0000, offset: 14888, $01F479E0, imgtype: FIG, imgID:0080, $0000
width:153, height: 61, aspect:2.51, $FFFB, $0000, offset: 14924, $00BE17B8, imgtype: FIG, imgID:0080, $0000
width:153, height: 61, aspect:2.51, $FFFC, $0000, offset: 15012, $00BC9D20, imgtype: GNP, imgID:0080, $0000
width:472, height:595, aspect:0.79, $FFFA, $0000, offset: 17845, $00BD94A0, imgtype:GEPJ, imgID:0080, $0000
width:472, height:595, aspect:0.79, $FFFB, $0000, offset: 17847, $01F444C0, imgtype:GEPJ, imgID:8D4D, $0000
width:153, height: 61, aspect:2.51, $FFFB, $0000, offset: 17898, $01EDFE80, imgtype: GNP, imgID:0080, $0000
width:153, height: 61, aspect:2.51, $FFFF, $0000, offset: 18201, $01F44380, imgtype: GNP, imgID:0080, $0000
width:153, height: 61, aspect:2.51, $FFFE, $0000, offset: 18512, $00C47C20, imgtype: GNP, imgID:0080, $0000
width:153, height: 61, aspect:2.51, $FFFC, $0000, offset: 18823, $01E1A4C0, imgtype: GNP, imgID:0080, $0000
width:153, height: 61, aspect:2.51, $FFFE, $0000, offset: 19156, $00BC7120, imgtype: GNP, imgID:0080, $0000
width:153, height: 61, aspect:2.51, $FFFE, $0000, offset: 19485, $00BCDE50, imgtype: GNP, imgID:0080, $0000
width:176, height:207, aspect:0.85, $FFFB, $0000, offset: 19949, $01F40D50, imgtype:GEPJ, imgID:4D80, $0000
width:174, height:207, aspect:0.84, $FFFB, $0000, offset: 19951, $01F15318, imgtype:GEPJ, imgID:B7B8, $0000
width:176, height:212, aspect:0.83, $FFFB, $0000, offset: 19953, $00BDD5C0, imgtype:GEPJ, imgID:8A1A, $0000
width:174, height:212, aspect:0.82, $FFFC, $0000, offset: 19955, $00BF8D68, imgtype:GEPJ, imgID:0F4E
Index1:Index1_const1:0080, len: 610, offset: 32, const0:0000
Or trim to what is only "new" information and not yet displayed in your entries under the General tab, i.e. for BGcl , show just its data contents, simplified to 'BGcl Color Red:FF (OFF), Green:FF (OFF), Blue:FF (OFF)'. The data that gets displayed can be "collected" and assembled as a "string" for each RES filetype when you parse the RES records and then just displayed when the user clicks the BGcl or ImRn entries, etc.... For ImRn, consider displaying (when clicked) only one image's text output at a time.

???There are several possible F L M combinations to try to handle; without a specific set of rules this would be hard to control and some combinations may be left out. I still have the code but temporarily made the buttons invisible.

I had previously "dissected" the Blackmask DVD with over 10,000 ebooks and was trying to get its 8.3 DOS filenames to display in 'Author - Title' format where Author was in format Firstname Middlename Lastname. Below are a few text lines that I derived from the original Blackmask index so as to get my desired results; just look at the last string on each line of this tabbed spreadsheet data:
865 The Acorn-Planter Jack London The Acorn-Planter The Acorn-Planter Jack London litmax/acornplant.lit "London, Jack - The Acorn-Planter"
866 Acres of Diamonds Russell H. Conwell Acres of Diamonds Acres of Diamonds Russell H Conwell litmax/acrdi.lit "Conwell, Russell H - Acres of Diamonds"
867 Why Certain Plants Are Acrid Professor William B. Lazenby Why Certain Plants Are Acrid Why Certain Plants Are Acrid Professor William B Lazenby litmax/acridplant.lit "Lazenby, Professor William B - Why Certain Plants Are Acrid"
868 Across the Years Eleanor H. Porter Across the Years Across the Years Eleanor H Porter litmax/acros.lit "Porter, Eleanor H - Across the Years"
869 Liber LIX - Across the Gulf Aleister Crowley Liber LIX - Across the Gulf Liber LIX-Across the Gulf Aleister Crowley litmax/acrossgulf.lit "Crowley, Aleister - Liber LIX-Across the Gulf"
870 Across the Moors William Fryer Harvey Across the Moors Across the Moors William Fryer Harvey litmax/acrossmoors.lit "Harvey, William Fryer - Across the Moors"
Oh, by the way, when I say 'Format F M L' should be 'Format L, F M' I don't mean using the Book Properties with the same name, but instead using just one field (Author Firstname) as a source and performing that string manipulation on it ONLY i.e. Author 'Russell H Conwell' becomes 'Conwell, Russell H'. The tricky part is spotting surname suffixes like Sr., Jr. or III.
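
As a rough illustration of that single-field string manipulation, here is a Python sketch (not the program's code). The function name, the suffix list, and the choice to put a suffix at the end of the result are all assumptions; the thread only says suffixes are the tricky part. Names that already contain a comma are left alone so that repeated button clicks do not scramble them.

```python
# Hypothetical sketch of 'Format L, F M' on one author string.
# The suffix list and the suffix placement are assumptions, not from the thread.

SUFFIXES = {"jr", "jr.", "sr", "sr.", "ii", "iii", "iv"}


def format_last_first(author: str) -> str:
    """'Russell H Conwell' -> 'Conwell, Russell H'."""
    author = " ".join(author.split())  # collapse stray whitespace
    if not author or "," in author:
        return author  # empty, or already in 'L, F M' form
    parts = author.split(" ")
    suffix = ""
    if len(parts) > 1 and parts[-1].lower() in SUFFIXES:
        suffix = parts.pop()
    if len(parts) < 2:
        return author  # a single name: nothing to reorder
    result = f"{parts[-1]}, {' '.join(parts[:-1])}"
    if suffix:
        result += f" {suffix}"
    return result


if __name__ == "__main__":
    for name in ("Russell H Conwell", "Jack London",
                 "Professor William B Lazenby", "Conwell, Russell H"):
        print(format_last_first(name))
```

With the sample rows above this gives "Conwell, Russell H", "London, Jack" and "Lazenby, Professor William B", matching the last column of the spreadsheet data.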

Thank you, I am glad the program is working for you. I appreciate the comments.

I added the list above to 1.1.7

-- Michael

I like where this program is going: IMP format displayer and editor. It's been a long time coming and I am only glad I can be a "catalyst" for you to produce same. I am a "hobby" programmer and most suited to tweaking code already written than writing my own code from scratch (especially with a GUI!).:thumbsup:

nrapallo
01-02-2009, 09:30 AM
Version 1.1.7 works like a charm! Sweet! :2thumbsup

OK, feature request time.....

Have you used the eBookwise Librarian (aka GEB Librarian) before?

It allows for, amongst other things, category editing where a standard list is maintained and added to as the user changes categories. It is simple in its implementation but quite powerful in its use. There could be a drop-down box in the Book Properties tab beside the Category to accomplish same.

Also, Imp Librarian (http://www.mobileread.com/forums/showthread.php?t=23003) has some nice library database manipulations, like a CSV Dump of its entire .imp ebook library. Perhaps ConvertIMP can "read" in a user directory (but not parse it just yet), then have the user select one of the library entries and then "go to town" on it. This two-tier/level approach will allow some Global editing/renaming to be facilitated as well as allowing your core program functions to be retained for a single .imp ebook. Food for thought....

Anyone else using this program have anything to add... please, I don't want to make this a one-trick (user) show... :snicker:

nrapallo
01-02-2009, 04:15 PM
Michael:

Have you seen this post (http://www.mobileread.com/forums/showthread.php?t=8380) featuring a lot of early LRF tools/code?

They even have some lrf to html/xml programs similar (in spirit) to your imp to html program.

Worth a look-see...

mscott161
01-02-2009, 05:41 PM
Nick,

I have not seen that post. I will look into it. I have posted a new version 1.2 for you to see.

--Michael

ashkulz
01-03-2009, 09:46 AM
Nick/Michael: have either of you figured out enough of the .IMP format so that you can write an image-only IMP [no text at all] so that I can incorporate it into pdfread? That way there'd be no dependency on eBook Publisher....

nrapallo
01-03-2009, 01:24 PM
Yeah, still trying to get there... ;)

A minimal list of .RES filetypes that are used (when using solely images as ebook pages) and what they do is presented below:
Filetype: (DATA.FRK text with control codes for images)
Filetype:!!sw (Standard coding can be obtained from sample .imp build)
Filetype:AncT (optional - if want to be able to link to images from TOC)
Filetype:AnTg (optional - if want to be able to link to images from TOC)
Filetype:BGcl (default = FFFFFF [white])
Filetype:BPgZ (actual ebook screen - the HARD one!!!!)
Filetype:BPgz (actual ebook screen - the HARD one!!!!)
Filetype:Devm (default = 01)
Filetype:ESts (use default x-sbp-widow-push and x-sbp-orphan-pull)
Filetype:ImRn (list of images stored with particulars and assigned Image ID)
Filetype:Mrgn (default, not used)
Filetype:PcZ0 (holds actual picture sizes to display on ebook screen)
Filetype:Pcz0 (holds actual picture sizes to display on ebook screen)
Filetype:pInf (compute last page number/image resources count info)
Filetype:GIF (repack list of images)
Filetype:JPEG (repack list of images)
Filetype:PNG (repack list of images)
Filetype:PPic (summary of number of images and borders used)
Filetype:StRn (optional - if want styles for headings and such)
Filetype:Styl (optional - if want styles for headings and such)

The remaining filetypes above left to reverse-engineer are few, if any. The creation/combination of the pieces into a .IMP without using the eBook Publisher DLLs should be trivial afterwards.

Close but just not there yet... :)

mscott161
01-04-2009, 12:37 PM
I have posted a bug fix to version 1.2. If the book list loaded from the XML file and you clicked on the General tab after selecting a book, it threw an error.

-Michael

mscott161
01-05-2009, 03:22 PM
Some things found while looking at the tabl, trow, and tcel res files:
The tables match pretty well. I think there are 3 values for the table border: 0, 1, and 2. Also, the TRow is actually the TD tag in HTML and TCel is the definition for TR, but I find the offset and length confusing because the numbers are not even close. You will also notice that in the Index for TRow and TCel the second Index is 1 byte larger than the others.

-- Michael

nrapallo
01-05-2009, 03:38 PM
but I find the offset and length confusing because the numbers are not even close. You will also notice in the Index for TRow and TCell the second Index is 1 byte larger than the others.

-- Michael

Off the top of my head, try printing the offset and length in hexadecimal and see if it is a LE vs BE issue ie. 0x1000 is 16 in LE and 4096 in BE.
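
To make the LE vs BE point concrete, here is a tiny Python sketch that reads the same two bytes both ways. The helper name is invented for the example; `int.from_bytes` is the standard library call.

```python
# Reading the two bytes 0x10 0x00 as little-endian and as big-endian.
# The helper name is hypothetical; int.from_bytes is standard Python.

def read_u16_both(raw: bytes) -> tuple[int, int]:
    """Return (little-endian value, big-endian value) of the first two bytes."""
    return (int.from_bytes(raw[:2], "little"),
            int.from_bytes(raw[:2], "big"))


if __name__ == "__main__":
    le, be = read_u16_both(bytes([0x10, 0x00]))
    print(f"LE = {le} (0x{le:04X}), BE = {be} (0x{be:04X})")
    # LE = 16 (0x0010), BE = 4096 (0x1000)
```

Printing an offset or length both ways like this makes it fairly quick to see which interpretation lands inside the file size.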

mscott161
01-06-2009, 01:36 PM
Nick,

I liked your suggestions and added an IMP Type column and allowed the inclusion of subdirectories.

I did look at both LE and BE. I know the IMP format says there should be an ID for the TRow and the TCel that matches its parent, but I have been unable to locate them other than in the indexes.

-- Michael

mscott161
01-06-2009, 05:26 PM
I have placed a bug fix for the REB1200 IMP File Format. It is in v.1.2.3; see post #1.

The Tabl, TRow, and TCel content and indexes match up nicely in that format, but not for EBW1150. I will post a new version when I can match both.

-- Michael

nrapallo
01-06-2009, 05:37 PM
Then for sure the problem is "LE vs BE", as the .IMP "specs" were written for the REB1200 format and need to be re-interpreted with the EBW1150's slightly longer index size as well as its reversed byte representation (BE).

Thanks for adding the REB1200 support! Much appreciated!!!

mscott161
01-07-2009, 02:17 AM
Nick,

I have published v.1.2.4 for you. It has the Tabl, TRow, TCel for the REB1200 displayed in the tree. I am still working on the 1150.

-- Michael

mscott161
01-07-2009, 10:42 PM
Nick,

I have noticed in the TCel and TRow sections that there is more than one record per index; the index length is greater than a single record. I will post a new version tonight. In the tree view I added the Groups for both the individual records under their parents. Also, for debug information, I have a Text Display of the Tree Expanded out on the Content Tab. I have not had any luck with the EBW 1150 for these two sections. I have tried BE and LE but the bytes do not match the format of REB1200 at all.

-- Michael

nrapallo
01-07-2009, 11:25 PM
Which source .imp file are you using as your "guide"? ResTestDocument.imp?

If so, I'll check that to see what I can come up with. It helps to look at both the .imp and _1200.imp versions at the same time when reverse-engineering this stuff.

But be careful with that one!!! The EBW1150 version is one I recently created using eBook Publisher v2.2 and the REB1200 version was the one the original poster had created using SBPublisher 1.5 (it's actually a Softbook 200/250e imp type 0).

You may want to try another ebook set like 16 Web Safe Colors (http://www.mobileread.com/forums/showthread.php?p=156435#post156435) for this table testing.

EDIT: I've added EBW1150 & REB1200 versions of RebTestDoc.imp created by eBook Publisher, so that you can continue to test with it (use it as your guide)! :cool:

mscott161
01-08-2009, 01:19 PM
Nick,

Thanks for the files; I will check them out. I have been using the Space.imp file for testing because it is small. I wonder if the EBW1150 uses an XOR to get the number. It almost seems compressed. I will be trying a few things.

I have a new version that takes the text content and creates an LRF file now, and a process that iterates through the library list and creates an LRF file for each book. I am currently running it on all 1600 books.

I would like to thank the creators and developers involved in the BBeBLib (Chris Mumford) for the source code to get this part working.

--Michael

PS - I plan to upload the new version later today or tomorrow.

ritibelle
03-05-2009, 10:43 AM
I would like to test this but for the life of me can't figure out how to use it :blink:
Can anyone give me some pointers?

Thank you very much!

mscott161
03-06-2009, 11:49 AM
ritibelle,

What problems do you have?

Michael

ritibelle
03-06-2009, 05:06 PM
Well, I unzipped your file and I just plain don't know what to do next...
What do I do with that bunch of files inside the zip?

mscott161
03-06-2009, 06:07 PM
Ritibelle,

If you are not using Visual Studio 2008 and just want to use the Convert Program, then the only files you will need are
ConvertIMPGUI.exe
ICSharpCode.SharpZipLib.dll

Put these two files in a folder and run the exe file.

On the form, use the Browse button at the top to select the directory where your IMP files are located. If you have subdirectories with IMP files, check the check box to include subdirectories.

This will load the list area on the Library tab. Select a book by single clicking it and then you can use the other tabs.

Michael

nrapallo
03-06-2009, 06:10 PM
Well, I unzipped your file and I just plain don't know what to do next...
What do I do with that bunch of files inside the zip?

Oh, that's easy, just drill down the folders bin/debug and double-click the ConvertIMPGUI.exe.

Then choose the directory with your .imp files.

Have fun!

By the way, I like your choice of devices. ;) :thumbsup:

EDIT: Oops, beat by Michael!

ritibelle
03-07-2009, 03:50 PM
Thanks, I've got it at last! It seems great!
Why would I want to use Visual Studio? I mean, what would I be able to do with it? Sorry to bother you with all these questions.

As for devices, I've been lusting for these newer machines, but when push comes to shove I don't think they are better than what I already have, just different. And I can't do without the backlight!

Thanks again for the help! :)

rtype
10-24-2009, 01:36 PM
Michael,

First of all thanks for writing such a great tool!

I wanted to use the functionality in your program from calibre (i.e. the convert from imp to HTML bit) since I use calibre to organize all my books.

I created my plugin in calibre, and also modified your code slightly, giving it the ability to accept the path to an imp and a path to an output file and have that conversion happen automatically.

I think it would be a good feature for the program to have. I can pass on the few changes if you would like to incorporate them.

In looking at this I also made a change to the FormatDATAToHTML method: changing the string strFile to a StringBuilder gives a huge performance benefit with large files (or at least on the couple I've tried).
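
The pattern behind this change can be sketched in Python as an analogy (the actual program is C#, and this is not its code). Repeatedly concatenating onto an immutable string may copy everything built so far on each step, while collecting the pieces and joining them once does a single pass. The function names below are hypothetical, and CPython sometimes optimizes in-place concatenation, so the measured gap may be smaller than with .NET strings.

```python
def build_html_concat(chunks):
    """Build the output by repeated concatenation (the slow pattern)."""
    html = ""
    for chunk in chunks:
        # Each step may allocate a new string and copy all previous text.
        html = html + chunk
    return html

def build_html_join(chunks):
    """Collect the pieces first and join once (the StringBuilder-like pattern)."""
    parts = []
    for chunk in chunks:
        parts.append(chunk)  # cheap append, no copying of earlier text
    return "".join(parts)   # single allocation and copy at the end

# Hypothetical input: many small fragments of generated markup.
fragments = ["<p>line %d</p>\n" % i for i in range(1000)]
slow = build_html_concat(fragments)
fast = build_html_join(fragments)
```

Both functions return the same text; only the amount of copying differs.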

mscott161
10-30-2009, 11:48 AM
rtype,

No problem, I would love to see any changes you have done. I have been working on a program suite like calibre but in .NET. It would allow most downloaded ebooks (secure and unsecure), then create the file for the device or devices you selected. It gets annoying to have to use a utility for each format to get it to text or HTML and then use my program to create the format for my device.

I wish someone would create an ebook device that supports the most popular formats, secure and unsecure, without having to convert them.

Michael

rtype
11-01-2009, 12:30 PM
Great,

here are the changes I made; I've included a changes.txt file in the solution describing the specifics.

test_IMPX_input_plugin.zip is the (very much a WIP but working) calibre plugin I have been writing (it assumes that ConvertIMPGUI.exe is in C:\ConvertIMPGUI; I haven't gotten to a config UI).
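
For readers curious how a plugin might hand a file to the modified converter, here is a minimal sketch. It assumes the modified ConvertIMPGUI.exe takes the input .imp path followed by the output path as two positional arguments; that argument order is an assumption, as are the helper names `build_command` and `convert`. Only the install folder C:\ConvertIMPGUI comes from the post above. This is not the plugin's actual code.

```python
import ntpath
import subprocess

# Install location assumed by the plugin described in this post.
CONVERTER_DIR = r"C:\ConvertIMPGUI"

def build_command(imp_path, out_path, converter_dir=CONVERTER_DIR):
    """Return the argument list for one conversion (argument order is assumed)."""
    exe = ntpath.join(converter_dir, "ConvertIMPGUI.exe")  # Windows-style join on any OS
    return [exe, imp_path, out_path]

def convert(imp_path, out_path):
    """Run the converter and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_command(imp_path, out_path), check=True)

command = build_command("book.imp", "book.html")
print(command)
```

Passing the arguments as a list avoids quoting problems with paths that contain spaces.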

I think it'll be a while before we have a reads-everything ebook reader; they would much rather you buy the book again in their format :-)

dragon
11-25-2009, 11:48 PM
- Just the Executable is at http://www.ebizsoft.com/download/convertimpgui.zip

That file also appears to be source code, not the executable. Also, the three attachments 1.30, 1.35 and 1.36 are all source code.

Is the executable available?

nrapallo
11-26-2009, 12:01 AM
That file also appears to be source code, not the executable. Also, the three attachments 1.30, 1.35 and 1.36 are all source code.

Is the executable available?

See Posts #34 and #35 above indicating that the executables are in the .zip file but deep within folders i.e. bin/debug. Just extract them like Michael says in Post #34 above.

dragon
11-26-2009, 10:34 AM
See Posts #34 and #35 above indicating that the executables are in the .zip file but deep within folders i.e. bin/debug. Just extract them like Michael says in Post #34 above.

Thanks! That solved it. I didn't look deep enough through the directories.
:thanks:

FizzyWater
11-26-2009, 07:05 PM
Does this only convert FROM imp TO something else (like LRF)? Or can you convert something else (like ePub) TO imp?

nrapallo
11-27-2009, 05:44 PM
Does this only convert FROM imp TO something else (like LRF)? Or can you convert something else (like ePub) TO imp?

Yes, it's only able to go FROM .imp to something more useful, like .txt or .html with images. It also should be able to go from .imp to .lrf, though, I've never tried that part since I don't own a Sony ebook reader.

It doesn't convert anything TO .imp (from any format).

BTW, you can convert .epub to .imp using eBook Publisher's latest release, though the conversion still requires some manual settings afterwards, like specifying the Category and that the ebook should be compressed and have links underlined. (See Edition settings...)

RikaStrom
04-23-2010, 08:25 PM
A quick question: I downloaded the most recent version, yet when I tried to open the .exe file I received an error message stating I needed to download the .NET Framework v2.0.50727.

What would that be?

Thanks for the assist.

Madam Broshkina
04-23-2010, 08:30 PM
The Microsoft .NET Framework version 2.0 (x86) redistributable package installs the .NET Framework runtime and associated files required to run applications developed to target the .NET Framework v2.0.


http://www.microsoft.com/downloads/details.aspx?FamilyID=0856EACB-4362-4B0D-8EDD-AAB15C5E04F5&displaylang=en

rtype
04-24-2010, 03:43 AM
The program is a .NET program (.NET is a program development framework supplied by Microsoft). The framework comes as standard on the later versions of Windows.

If you don't have it on your version of Windows, you can use Windows Update to install it.

Or you can get it from here (better to get it from Windows Update):

http://www.microsoft.com/downloads/details.aspx?FamilyID=0856eacb-4362-4b0d-8edd-aab15c5e04f5&displaylang=en

Westlyn
07-14-2010, 11:56 AM
Can anyone tell me what sort of conversion speed I should be getting?

I've seen no complaints re really slow conversion times in the forum but....

I'm using v1.3.6 and finding that a 1.5 Megabyte .IMP file is taking up to 8 hours to generate the html file. I see similar times on my dual core at work and my single core machine at home.

By comparison the deimp.exe tool that only produces text files runs in a minute or less although I realise that it has less to do since it's not rebuilding the html.

Am I doing something wrong, or is that the sort of conversion time I should be expecting? Is there anything I can do to speed that conversion up a bit?

rtype
07-16-2010, 04:23 AM
I've only converted smaller files, and it converts them quickly. I did, however, spot a possible performance issue and posted a fix in post #39.

This brought small file conversion for me from a minute or two to near instant. So it might help.

mscott161
07-16-2010, 11:30 AM
I attached the compile that is just a little slower than deimp but still less than 2 minutes on a 2Mb file.
Please let me know if the attached zip fixes your issue.

Michael

Westlyn
07-20-2010, 10:50 AM
I attached the compile that is just a little slower than deimp but still less than 2 minutes on a 2Mb file.
Please let me know if the attached zip fixes your issue.

Michael

Thanks for uploading a new version. I'll give it a whirl and report back though it could be a while. I've pretty well finished converting the IMPs and didn't keep them since I had the html. Maybe I'll have to convert the html back to IMP and then try to convert it back to html..... :smack:

nrapallo
07-20-2010, 11:00 AM
I've pretty well finished converting the IMPs and didn't keep them

Or you may just want to try any of the .imp ebooks uploaded here (http://www.mobileread.com/forums/ebooks.php?forumid=153&order=DESC&sort=dateline&pp=30&genreid=&ltr=). ;)

Westlyn
07-21-2010, 10:05 AM
Or you may just want to try any of the .imp ebooks uploaded here (http://www.mobileread.com/forums/ebooks.php?forumid=153&order=DESC&sort=dateline&pp=30&genreid=&ltr=). ;)


Well yes - hadn't found this resource before so thanks for making me aware. Will certainly allow me to find a couple of IMPs to test convert.

Of course, they may be better made than the ones I was converting before, so the test may not really prove anything other than that the new converter version is fast on the imps I downloaded to test with. ;)

Found a couple of epubs, in the library, I want to read as well (which was where I'm heading with my converted IMPs) so a double benefit.

Thanks to all for the tool and advice.

Westlyn
07-21-2010, 12:11 PM
I attached the compile that is just a little slower than deimp but still less than 2 minutes on a 2Mb file.
Please let me know if the attached zip fixes your issue.

Michael

OK, didn't have the original problem imp file to hand but as per suggestion in another reply downloaded some imps from the book library here.

Using your attached version (which by the way says v1.1.7 in the about dialog) I get the following:

Agent to the Stars = 812Kb took 17 minutes
Four by Laumer = 621Kb took 8 minutes
Accelerando = 769Kb took 21 minutes

So not the 2 minutes per 2 MB sadly.

These timings running on a Dualcore 2GHz Laptop with 3Gb Ram in Vista

I've just checked my recycle bin and actually found one of the original problem files so I'll post timings for that tomorrow. I'll also try and post matching timings for deimp just for comparison.

Time is not too much of an issue if I can convert an imp file in 10 mins or so but many hours did seem a bit too long to be practical - hence me wondering if I was doing something wrong.

rtype
07-22-2010, 05:08 AM
I converted Four by Laumer on my machine with version 1.1.7 as linked to on the first post of this thread.

It took 2mins on my machine, core2duo e8200 2.67

I made the tweak referred to in post #39 and it converted nearly instantly.

I've attached the exe in case it helps.

Just to clarify what I did was click on the imp file in the library view, click on the general tab and then from the tools menu click Create HTML.

Westlyn
07-22-2010, 12:04 PM
I converted Four by Laumer on my machine with version 1.1.7 as linked to on the first post of this thread.

It took 2mins on my machine, core2duo e8200 2.67

I made the tweak referred to in post #39 and it converted nearly instantly.

I've attached the exe in case it helps.

Just to clarify what I did was click on the imp file in the library view, click on the general tab and then from the tools menu click Create HTML.

Yes that's exactly the steps I take to create a HTML from the IMP file.

Converting the Laumer book wasn't instant for me, sadly; it took about 3 mins.

As additional info I also tried converting my original 2453 kb IMP file with the gui; after 4.5 hours, I have just used task manager to kill the executable since there was no sign of an output.

Running the same file through DEIMP produced a text file in about 30 seconds.

Note this particular IMP file doesn't generate any image files; ie it is all text whereas many of the other IMP files I've tested contain images that will reduce the amount of Ks used by the text content, I guess. So this will be a much bigger processing challenge for the converter as a large, text content only, file.

Another notable thing about this file is that DEIMP created an output file with unix linefeed only end of lines rather than the more normal CRLF output I've seen from some of the other IMPs I've converted.

So as a hypothesis, could the GUI converter be stumbling over the unix line endings and treating the whole text as a single (enormous) line of text as opposed to hundreds of much shorter lines of text? Would that slow the html rebuild process?
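
One way to check which line endings an extracted text file uses, and to convert LF-only text to CRLF before further testing, is sketched below. The helper names are hypothetical, and this only inspects and normalizes text; it does not confirm or rule out the hypothesis about the GUI converter.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs and bare LF characters in raw file bytes."""
    crlf = data.count(b"\r\n")
    lf_only = data.count(b"\n") - crlf  # every CRLF also contains one LF
    return {"crlf": crlf, "lf_only": lf_only}

def to_crlf(text: str) -> str:
    """Convert any mix of LF, CR and CRLF line endings to CRLF."""
    # Normalize to LF first so existing CRLF pairs are not doubled.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.replace("\n", "\r\n")

# Hypothetical sample mixing both styles.
sample = b"PROLOGUE\n\nI have a story\r\nto tell you.\n"
counts = count_line_endings(sample)
converted = to_crlf(sample.decode("ascii"))
```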

mscott161
07-23-2010, 11:58 AM
Westlyn,

I would like to have a copy of the imp file you are having problems with so that I may make corrections to the program if needed. I did not change the about box version number. If I make a change to the application again I will change the version and make sure that the first post has the correct version in it.

Just on a side note, if you have an Android phone I have written an IMP Reader for it. You can download the source or the apk from droidimpreader.codeplex.com

Michael

Westlyn
08-16-2010, 12:34 PM
Westlyn,

I would like to have a copy of the imp file you are having problems with so that I may make corrections to the program if needed. I did not change the about box version number. If I make a change to the application again I will change the version and make sure that the first post has the correct version in it.

Just on a side note, if you have an Android phone I have written an IMP Reader for it. You can download the source or the apk from droidimpreader.codeplex.com

Michael

Sorry to be slow replying - just got back from holiday.

Sadly, I don't have an android phone just a relatively ancient HTC Prophet running WinMobile v5. But I can use Freda to read epubs and Tiny reader to read .lit and .pdb.

Appreciate your offer to look at the file but I'm not too sure about uploading the file since, to be honest, I can't be too sure of its copyright status given that I have no idea where I originally got it from. I'd hate to break any forum rules.

mscott161
08-20-2010, 10:53 AM
Westlyn,

No problem on the IMP file. I have run through all 1000 IMP books I have, all ranging in size. Let me know if you are still having problems. If you want to email me the file, I do not believe it would break forum rules, and since I will not be using the file for anything but testing, it will not break the copyright. It is up to you.

Michael

Westlyn
08-30-2010, 12:09 PM
Westlyn,

No problem on the IMP file. I have run through all 1000 IMP books I have, all ranging in size. Let me know if you are still having problems. If you want to email me the file, I do not believe it would break forum rules, and since I will not be using the file for anything but testing, it will not break the copyright. It is up to you.

Michael

I find that any .IMP more than about 100k takes an ever increasing time eg 400Kb takes about 20 minutes but 1500K takes about 4 hours and a 2500Kb IMP hadn't produced html after 7 or 8 hours, so there seems to be an exponential-like increase in processing time as the file size increases.
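
The two completed timings quoted above (about 20 minutes for 400 KB and about 4 hours for 1500 KB) allow a rough estimate of how run time scales with file size. Assuming time grows as size**k, the sketch below solves for k. With only two data points this is a back-of-envelope estimate, and `growth_exponent` is a hypothetical helper.

```python
import math

def growth_exponent(size_a, time_a, size_b, time_b):
    """Solve time = c * size**k for k, given two (size, time) measurements."""
    return math.log(time_b / time_a) / math.log(size_b / size_a)

# Sizes in KB, times in minutes, taken from the timings quoted in this post.
k = growth_exponent(400, 20, 1500, 240)
print(f"estimated exponent k = {k:.2f}")
```

For these two points k comes out at about 1.9, which looks closer to quadratic growth than to truly exponential growth. That would be consistent with work that repeatedly rescans or copies everything produced so far, though the timings alone cannot identify the cause.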

IMP_DUMP seems to be much less size affected with the same 1500Kb file taking less than 30 seconds to output the txt file.

The file that takes 4 hours also generates a .html file where the text is unreadable but the imp_dump .txt file is readable

eg Output html in browser from ConvertIMPGUI:

Ä P) M a o “ %
…s C i’ +' ·

è 7 < F q F k s ™¹, s Á™ ™ K ˆ

7t 7 ™ ™ K r ¡ Ø sf aÁ™ ]4c q ‘{ Í£ ±c’ F\2 Ø { ^ cv  Û K



but the imp_dump output is readable; but is output with unix not windows line end termination. My highlighting of the line end chars.

"LF"PROLOGUE "LF" "LF"I have a story to tell you. It has many beginnings, and perhaps one ending. Perhaps not. Beginnings and endings are contingent things anyway; inventions, devices. Where does any story really begin? There is always context, always an encompassingly greater epic, always something before the described events, unless we are to start every story with, 'BANG! Expand! Sssss . . .', then itemise the whole subsequent history of the universe before settling down, at last, to the particular tale in question. Similarly, no ending is final, unless it is the end of all things . . . "LF"Nevertheless

I must be being dense today but I couldn't see how to attach a file to a private email in this forum. So bear with me until I find out how.

I'll send you a zip file with the .imp, the .html and .txt output from ConvertIMPGUI and IMP_DUMP. Hopefully that would make it easier to track down the issue. I'm assuming the massive performance hit is maybe related to the output formatting issue.

Thanks again for being willing to take a look

nrapallo
08-30-2010, 04:10 PM
I find that any .IMP more than about 100k takes an ever increasing time eg 400Kb takes about 20 minutes but 1500K takes about 4 hours and a 2500Kb IMP hadn't produced html after 7 or 8 hours, so there seems to be an exponential-like increase in processing time as the file size increases.

IMP_DUMP seems to be much less size affected with the same 1500Kb file taking less than 30 seconds to output the txt file.

IMP_DUMP just decompresses the text that was compressed as part of the .imp build process. It basically is a decompressor written in C and is extremely fast since it does no "processing" on THAT text.

I too do find the writing of this html file extremely slow, but it is a more complex process than just decompressing the text. It has to search/look-up the 'styles' used for each span of text and insert proper paragraph / new page breaks. There's even "special codes" spots that deal with images/tables/etc...

Perhaps Michael can review same for better efficiencies and try to curb the exponential growth in time when compared to file size. :fingersx:

The file that takes 4 hours also generates a .html file where the text is unreadable but the imp_dump .txt file is readable

eg Output html in browser from ConvertIMPGUI:

„ ™ P) M a o “ %
…s C i’ +' ·

è’ 7 < F q F k •s ™¹, s Á™ ™ K ˆ

7t 7 ™ ™ K r ¡ ˜ sf aÁ™ ]4c q ‘{ Í£ ±c’ F\2 ˜ { ^ cv  › K



but the imp_dump output is readable; but is output with unix not windows line end termination. My highlighting of the line end chars.

"LF"PROLOGUE "LF" "LF"I have a story to tell you. It has many beginnings, and perhaps one ending. Perhaps not. Beginnings and endings are contingent things anyway; inventions, devices. Where does any story really begin? There is always context, always an encompassingly greater epic, always something before the described events, unless we are to start every story with, 'BANG! Expand! Sssss . . .', then itemise the whole subsequent history of the universe before settling down, at last, to the particular tale in question. Similarly, no ending is final, unless it is the end of all things . . . "LF"Nevertheless

I must be being dense today but I couldn't see how to attach a file to a private email in this forum. So bear with me until I find out how.

You can't, so stop looking... :snicker: Best to send Michael a private message asking him for his email or better still provide him with your off-site email and then you can make a "connection" for further email attachments... :)

I'll send you a zip file with the .imp, the .html and .txt output from ConvertIMPGUI and IMP_DUMP. Hopefully that would make it easier to track down the issue. I'm assuming the massive performance hit is maybe related to the output formatting issue.

Thanks again for being willing to take a look

I'm interested in this too! ;)

mscott161
09-03-2010, 11:21 AM
Is it possible that the IMP file is also encrypted? The convertIMP does not handle encrypted IMP files and would probably have a tough time with it. I can add the code to it if I had one of your IMP books to test with.
Michael

Westlyn
09-07-2010, 05:50 AM
Is it possible that the IMP file is also encrypted? The convertIMP does not handle encrypted IMP files and would probably have a tough time with it. I can add the code to it if I had one of your IMP books to test with.
Michael

I'm pretty sure that the IMP is not encrypted, not least of which is that I don't think DeIMP can handle encrypted files and in any case I'm not providing any key when deimping.

Attached an encrypted zipfile with relevant files inside. Password sent via private mail.

mscott161
09-10-2010, 12:13 PM
Westlyn,

I found the problem. The book is uncompressed. I have fixed the code and updated the attachments and the download link in the first post of this thread.

Michael

Susant1
01-02-2011, 12:39 PM
Westlyn,

I found the problem. The book is uncompressed. I have fixed the code and updated the attachments and the download link in the first post of this thread.

Michael

I'm new here and have read the thread from the beginning. I have installed the executable as suggested by nrapallo on 11-26-2009. But when I try to open it, I get a box, "Application failed to initialize properly (0xc0000135)". I'm running WinXP, SP2. Any suggestions would be greatly appreciated. TIA Susan

Hat8ee
05-23-2011, 12:46 AM
Thanks so much for all your work. :):thanks:

GrannyGrump
11-06-2012, 10:07 PM
:blink: I am throwing this question out into cyberspace, with little hope, as it looks like the IMP format has been mostly abandoned.

A few titles in the library that I want are only in IMP format. So I tried this converter, but with mixed results.

I "installed" IMP GUI Converter v 1.36 as instructed --- mscott161 stated that all that is required is the ConvertIMPGUI executable and the ICSharpCode.SharpZipLib DLL. I launched the GUI and loaded a book into it. I CAN get the text extraction, but the HTML tool on the menu is grayed out / disabled.

The text extraction pulled out the text well enough, but apparently cannot deal with unicode characters. Lots of question marks/null glyphs for curly quotes, mdashes, diacritics, etc. I don't know if the HTML extraction would have better results, because it is disabled and not usable.

Any suggestions how to get the HTML extraction working?
:help:

nrapallo
11-07-2012, 05:47 PM
:blink: I am throwing this question out into cyberspace, with little hope, as it looks like the IMP format has been mostly abandoned.

:bigwave: I rarely use the .imp format anymore, but still am willing to dabble with it.... :grin2:

A few titles in the library that I want are only in IMP format. So I tried this converter, but with mixed results.

Which ones in particular? Early on, there were two distinct ways to produce .imp files, one using compressed text (the norm) and one using just images for each page (this doesn't have anything to extract/convert).

Perhaps, I can have a go at converting it, if you give me a link to the ebook...

I "installed" IMP GUI Converter v 1.36 as instructed --- mscott161 stated that all that is required is the ConvertIMPGUI executable and the ICSharpCode.SharpZipLib DLL. I launched the GUI and loaded a book into it. I CAN get the text extraction, but the HTML tool on the menu is grayed out / disabled.

Best to leave the .exe and associated files in that Debug directory under the bin folder; otherwise, you may encounter problems like you are noticing.

The text extraction pulled out the text well enough, but apparently cannot deal with unicode characters. Lots of question marks/null glyphs for curly quotes, mdashes, diacritics, etc. I don't know if the HTML extraction would have better results, because it is disabled and not usable.

My imp_dump tool (http://www.mobileread.com/forums/showthread.php?t=34212) just extracts the raw text as well. If you are adventurous, try using the CPAN EBook-Tools imp extraction as discussed in this thread (http://www.mobileread.com/forums/showthread.php?t=31142).

Any suggestions how to get the HTML extraction working?
:help:

I don't think HTML extraction was working 100% to begin with. We never got that far... :(

GrannyGrump
11-11-2012, 05:59 AM
Hi nrapallo, so glad to get your reply!
Sorry to be so late coming back.

Just as background, all I did was download the zip file for the IMP GUI Converter, and extracted that entire folder to my drive, then drilled down to the Debug / bin folder and launched the executable. It runs, I just don't know if I need to do anything else to get the html extraction to work.

One of the files I was trying was Zelda Pinwheel's "Jumping Frog" by Mark Twain, because I am doing that book in ePub, and wanted to see if she had restored the diacritics (I don't speak French). This is the link (http://www.mobileread.com/forums/showthread.php?t=20193).

I tried a couple of other IMP files as well, but can't remember titles just now.

I also googled unsuccessfully for a reader app that will display IMP files on my computer.

All I want to do is extract a reasonably clean text or html file that I can convert to ePub. Did I misunderstand what this tool does?

I will give a try to the other extraction tool you linked to, and see how that goes.

Thanks for the advice!

nrapallo
11-12-2012, 05:03 PM
Hi nrapallo, so glad to get your reply!
Sorry to be so late coming back.

No worries... :)

Just as background, all I did was download the zip file for the IMP GUI Converter, and extracted that entire folder to my drive, then drilled down to the Debug / bin folder and launched the executable. It runs, I just don't know if I need to do anything else to get the html extraction to work.

Using WinXP, when I download the latest ConvertIMPGUI.zip I can produce the .txt, .html and even the .lrf conversions. I attach same below for your perusal.

One of the files I was trying was Zelda Pinwheel's "Jumping Frog" by Mark Twain, because I am doing that book in ePub, and wanted to see if she had restored the diacritics (I don't speak French). This is the link (http://www.mobileread.com/forums/showthread.php?t=20193).

Ouch, diacritics support IS very limited or even non-existent!!!

I tried a couple of other IMP files as well, but can't remember titles just now.

I also googled unsuccessfully for a reader app that will display IMP files on my computer.

You can get an imp_viewer for your PC when you install the eBook-Publisher software available from here (http://www.ebooksystem.net/support_download.htm). It even allows you to print the .imp to a printer or to .pdf file using a PDF printer driver like PrimoPDF (http://www.primopdf.com/) (it's free).

All I want to do is extract a reasonably clean text or html file that I can convert to ePub. Did I misunderstand what this tool does?

The steps I take are usually click the tabs in order i.e. General, Book Properties, and then Text Content. Once there, I click the Save Text button first, then the Save Html button. The latter launches a browser to display the results, but saves the .html anyway.

I will give a try to the other extraction tool you linked to, and see how that goes.

I used the Ebook-Tools v0.4.6 to convert the .imp to .html and the results were quite acceptable. I did notice that the diacritics didn't translate well and discovered that the .imp stores them in "MacRoman" OS font encoding!!!! I knew the original SoftBook (like my REB1200) used the Mac OS as its base for the GUI and now I know it also used it for the text encoding! That was news to me!!! :2thumbsup [BTW, I used 'iconv' to convert from macroman text to ascii text.]
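
As an alternative to iconv, the same re-encoding can be sketched with Python's built-in `mac_roman` codec. This version converts to UTF-8 rather than ASCII so that diacritics and curly quotes survive. The sample bytes are made up for illustration and `macroman_to_utf8` is a hypothetical helper; it assumes the extracted text really is MacRoman, as observed above.

```python
def macroman_to_utf8(raw: bytes) -> bytes:
    """Re-encode MacRoman bytes as UTF-8."""
    return raw.decode("mac_roman").encode("utf-8")

# Hypothetical sample: 0x8E is e-acute, 0xD2/0xD3 are curly double quotes,
# and 0xD1 is an em dash in the MacRoman code page.
sample = b"caf\x8e \xd2quoted\xd3 \xd1 dash"
text = sample.decode("mac_roman")
utf8_bytes = macroman_to_utf8(sample)
print(text)
```

Decoding the same bytes as Latin-1 or Windows-1252 instead would turn these characters into unrelated symbols, which matches the kind of damage described earlier in the thread.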

Anyway, I attach the various conversions using ConvertIMPGUI, Ebook-Tools, imp_dump and PrimoPDF/imp_viewer!

Thanks for the advice!

No problem, this was a worthwhile exercise for me.

Oh, and the .html source (now attached as "frog-html.zip" below) used for Zelda's .imp version has been already converted to .epub and .pdf at Feedbooks.com here (http://www.feedbooks.com/book/2925/the-jumping-frog)! Just use these instead of converting.... :snicker:

Enjoy!

GrannyGrump
11-28-2012, 09:17 AM
I am SO sorry for the belated response, I didn't see a notification of another reply.

Thank you so much for doing these conversions, and maybe you gave me a boost by the file name with v 1.3.6 A. I will be trying this out again with that version, and keeping my fingers crossed.

And thank you for the FeedBooks link. I had downloaded books from there quite some time ago, and they all seemed to be in plain-text, a la Gutenberg. Who knew they have nicely formatted books... (would be nice if they gave a hint how the book looks)

Thank you for the time and trouble, Santa better be generous with you for your good deeds.

:thanks: