Quote:
Originally Posted by grannyGrumpy
I am throwing this question out into cyberspace, with little hope, as it looks like the IMP format has been mostly abandoned.
|
I rarely use the .imp format anymore, but I am still willing to dabble with it...
Quote:
A few titles in the library that I want are only in IMP format. So I tried this converter, but with mixed results.
|
Which ones in particular? Early on, there were two distinct ways to produce .imp files: one using compressed text (the norm), and one using only an image for each page (in which case there is no text to extract or convert).
Perhaps I can have a go at converting them if you give me a link to the ebooks...
Quote:
I "installed" IMP GUI Converter v 1.36 as instructed --- mscott161 stated that all that is required is the ConvertIMPGUI executable and the ICSharpCode.SharpZipLib DLL. I launched the GUI and loaded a book into it. I CAN get the text extraction, but the HTML tool on the menu is grayed out / disabled.
|
Best to leave the .exe and its associated files in the Debug directory under the bin folder; otherwise, you may run into problems like the one you are seeing.
Quote:
The text extraction pulled out the text well enough, but apparently cannot deal with unicode characters. Lots of question marks/null glyphs for curly quotes, mdashes, diacritics, etc. I don't know if the HTML extraction would have better results, because it is disabled and not usable.
|
My imp_dump tool just extracts the raw text as well. If you are adventurous, try using the CPAN EBook-Tools .imp extraction, as discussed in this thread.
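If the question marks come from the extracted bytes being read as plain ASCII, re-decoding the raw output might recover the curly quotes and accented letters. Here is a minimal Python sketch of that idea. It assumes the text is stored as Windows-1252, which I have not verified for every .imp file. `decode_imp_text` is a hypothetical helper of my own, not part of imp_dump or EBook-Tools, and the sample bytes are made up for illustration.

```python
def decode_imp_text(raw: bytes) -> str:
    """Decode extracted bytes as Windows-1252 (an assumption about the source).

    Bytes that have no mapping in cp1252 become U+FFFD instead of raising.
    """
    return raw.decode("cp1252", errors="replace")


# Made-up sample: curly quotes (0x93, 0x94), a long dash (0x97), and
# accented letters (0xE9, 0xEF) as they would appear in cp1252-encoded text.
sample = b"\x93Caf\xe9\x94 \x97 na\xefve"

# Reading the same bytes as ASCII turns every non-ASCII byte into a
# replacement glyph, which is roughly the symptom described above.
as_ascii = sample.decode("ascii", errors="replace")
as_cp1252 = decode_imp_text(sample)

print(as_ascii)
print(as_cp1252)
```

To try it on real output, read the extracted file in binary mode (`open(path, "rb").read()`) and pass the bytes to the helper. If the result still looks wrong, the encoding guess is probably off for that book.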
Quote:
Any suggestions how to get the HTML extraction working?
|
I don't think HTML extraction was working 100% to begin with. We never got that far...