Old 07-06-2009, 07:39 PM   #1
Nvidiot
Connoisseur
Posts: 57
Karma: 30
Join Date: Jul 2009
Location: Netherlands
Device: PW2
Question multi-page HTML with images to ePub or LRF

I'm trying to convert a multi-page HTML book (http://www.hq.nasa.gov/office/pao/Hi.../contents.html) to something I can read on my PRS-700. I've tried copying and pasting the text into an RTF file and then using Calibre to convert it to LRF or ePub. This works, but the images disappear. The same thing happens when I just put the RTF file on my reader. When I open the RTF file in MS Word (2007), the images are there and visible.

Any tips?
Old 07-06-2009, 08:01 PM   #2
Nate the great
Sir Penguin of Edinburgh
Posts: 10,604
Karma: 3586209
Join Date: Apr 2007
Location: DC Metro area
Device: Shake a stick plus 1
I don't think RTF support on the Sony Reader includes images. The best quick way to do it is copy it into calibre and convert to either LRF or Epub.
Old 07-06-2009, 08:11 PM   #3
Nvidiot
Quote:
Originally Posted by Nate the great View Post
I don't think RTF support on the Sony Reader includes images. The best quick way to do it is copy it into calibre and convert to either LRF or Epub.
The problem is, when I import the 'contents.html' into calibre, it thinks that that file is everything, obviously not what I want. When I import the RTF that I made with MS Word (with the pictures) and then convert to LRF or epub it converts but again misses the images.
Old 07-06-2009, 08:27 PM   #4
Nate the great
Well shoot. Is it possible to open the RTF in MSWord? You could save it as DOC, and then use calibre to convert that (I think).

Thanks for pointing this out. I spidered it, and I will place it on top of my TBC pile. If I get a chance, I'll throw up a Q&D conversion tonight.
Old 07-06-2009, 08:33 PM   #5
Nvidiot
Quote:
Originally Posted by Nate the great View Post
Well shoot. Is it possible to open the RTF in MSWord? You could save it as DOC, and then use calibre to convert that (I think).

Thanks for pointing this out. I spidered it, and I will place it on top of my TBC pile. If I get a chance, I'll throw up a Q&D conversion tonight.
Doc2lrf is not supported in Calibre (at least not in 0.5.14)

I'd be VERY happy with a Q&D conversion (especially if you tell me how you did it). I don't care about non-working links to footnotes etc, if they are at the end of a chapter I can find 'm easily enough, the chapters are short anyway.

I tried copying & pasting the text into the Atlantis editor and using its epub export option. That does seem to work better; however, I'll have to copy & paste the images one at a time. Selecting all of the html and pasting it in will not put the images in. Also, the right side of the images is cut off on the reader. At least it's progress.
Old 07-06-2009, 08:44 PM   #6
wallcraft
reader
Posts: 6,979
Karma: 5183568
Join Date: Mar 2006
Location: Mississippi, USA
Device: Kindle 3 and Fire
Try importing the RTF into OpenOffice and exporting it as an ODT file. This should be readable (with images, I think) by Calibre. Another possibility is to save as "web page, filtered" from Word.
Old 07-06-2009, 08:55 PM   #7
Nvidiot
Fixing the links to point to local pages (wget -k) did the trick. Calibre correctly read in all the html files and made a decent LRF out of it. The only problem I have is that for some reason it put some chapters in front of others when they should not be. Not sure what's going on with that; I opened the 'contents.html', which has all the chapters/pages linked in the proper order.
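
For anyone wondering what "fixing the links to point to local pages" amounts to, here is a rough Python sketch of the idea, not wget's actual implementation. It rewrites href/src values that fall under the mirrored site prefix into relative paths and leaves everything else alone. The function name `localize_links`, the example.org URLs and the regex-based matching are all assumptions made for illustration; query strings are ignored.

```python
import posixpath
import re
from urllib.parse import urljoin, urlparse

# Matches href="..." or src="..." with either quote style.
LINK_ATTR = re.compile(r'\b(href|src)=(["\'])(.*?)\2', re.IGNORECASE)


def localize_links(html, page_url, mirror_prefix):
    """Rewrite links that point inside mirror_prefix as relative paths.

    Hypothetical helper: page_url is where the page originally lived and
    mirror_prefix is the part of the site that was downloaded.
    """
    page_dir = posixpath.dirname(urlparse(page_url).path) or "/"

    def fix(match):
        attr, quote, value = match.groups()
        absolute = urljoin(page_url, value)
        if not absolute.startswith(mirror_prefix):
            return match.group(0)  # outside the mirror: leave untouched
        target = urlparse(absolute)
        relative = posixpath.relpath(target.path, page_dir)
        if target.fragment:
            relative += "#" + target.fragment  # keep in-page anchors
        return f"{attr}={quote}{relative}{quote}"

    return LINK_ATTR.sub(fix, html)
```

With links rewritten this way, a converter that follows links from the contents page can find the other pages on disk instead of treating them as external.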
Old 07-06-2009, 10:29 PM   #8
Nate the great
Here is the Q&D edition in Epub and Mobipocket.

I haven't done anything to the formatting, and I make no claims about the quality because the original html is horrible. But I will say that the links _should_ work correctly, the files _should_ be in the correct order, and all the important images _should_ have been included.

Enjoy.


EDIT: Having looked at the ebooks I must say that they're a lot better than I expected.

SECOND EDIT: I moved the files to the book upload section so others can find them.

Epub:
http://www.mobileread.com/forums/showthread.php?t=50384

Mobi:
http://www.mobileread.com/forums/showthread.php?t=50385

Last edited by Nate the great; 07-06-2009 at 11:24 PM.
Old 07-06-2009, 10:44 PM   #9
Nvidiot
Awesome!

If you could tell me how you did it I can do it myself next time around
Old 07-06-2009, 11:02 PM   #10
Nate the great
1. Downloaded the set of pages with WinHTTrack.
2. Started a new ebook project in Mobipocket Creator, and carefully added the files a few at a time to make sure they were in the correct order.
3. Failed to build the ebook several times so I could identify and delete the bad files created in the download step. (Don't worry, they were created by the download program and weren't source content.)
4. Built the Mobipocket ebook. Saved the ebook project.
5. Used html2epub.exe with the ebook project files to make the Epub version.


Total time invested: about an hour

Last edited by Nate the great; 07-06-2009 at 11:05 PM.
Old 07-07-2009, 01:54 PM   #11
nrapallo
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530531
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
Quote:
Originally Posted by Nate the great View Post
1. Downloaded the set of pages with WinHTTrack.
This is absolutely the RIGHT tool for building ebooks from webpages; much easier when the webpages stay on the same domain and go "downwards" from there. Did you realize there was a "cover.html" that would have been the best place to start the spidering instead of the "contents.html"? I spidered it last night and it took all of 6 minutes. The ensuing ebook conversion to .imp took several hours more (see below).

Quote:
2. Started a new ebook project in Mobipocket Creator, and carefully added the files a few at a time to make sure they were in the correct order.
I replicated the .html files ordering in TOC within the "contents.html" and used that as my starting point for the .opf.
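
A minimal sketch of that idea in Python, assuming the contents page links to each chapter file exactly once in reading order. The names `reading_order` and `opf_fragment`, the `pageN` ids and the default media type are illustrative assumptions, not what any of the tools mentioned here actually emit; different toolchains may expect a different media-type value.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import quoteattr


class LinkOrder(HTMLParser):
    """Collect local .html/.htm link targets in first-seen document order."""

    def __init__(self):
        super().__init__()
        self.files = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        name = href.split("#", 1)[0]  # drop any fragment
        is_page = name.lower().endswith((".html", ".htm"))
        if is_page and "://" not in name and name not in self.files:
            self.files.append(name)


def reading_order(contents_html):
    """Return the linked page files in the order the contents page lists them."""
    parser = LinkOrder()
    parser.feed(contents_html)
    return parser.files


def opf_fragment(files, media_type="application/xhtml+xml"):
    """Return <manifest> and <spine> elements listing files in order."""
    items = [
        f'    <item id="page{i}" href={quoteattr(name)} media-type={quoteattr(media_type)}/>'
        for i, name in enumerate(files, 1)
    ]
    refs = [f'    <itemref idref="page{i}"/>' for i in range(1, len(files) + 1)]
    lines = ["  <manifest>", *items, "  </manifest>", "  <spine>", *refs, "  </spine>"]
    return "\n".join(lines)
```

The output is only a fragment; the metadata block and the image entries in the manifest would still have to be added by hand or by the authoring tool.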

Quote:
3. Failed to build the ebook several times so I could identify and delete the bad files created in the download step. (Don't worry, they were created by the download program and weren't source content.)
This is the ONLY way, through several unsuccessful trials, to get things right. This takes MOST of the time to convert webpages to ebooks!

Quote:
4. Built the Mobipocket ebook. Saved the ebook project.
5. Used html2epub.exe with the ebook project files to make the Epub version.
After getting the .prc version, I used Mobi2IMP to convert it to .imp formats, but the eBook Publisher is a lot more picky and sensitive to badly coded html, so I had to "fix" a lot more problems, i.e.
  • ill-formed/corrupt images,
  • <h1> tags in the <head> section and BEFORE the <body> tag,
  • non-existent links due to typos,
  • non-existent images for previous, next and index links,
  • missing image retrieved from an old website copy using WayBackMachine at archive.org
  • many minor fixes to make the resulting .html look more presentable...
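
Two of the problems in the list above (headings placed inside <head>, and links or images that point at files that do not exist) can be flagged before a build is attempted. The following is a rough Python sketch under stated assumptions: all pages sit in one flat directory, `pages` maps file names to HTML text, and the names `PageAudit` and `audit` are made up for illustration. It is not how Mobi2IMP or eBook Publisher check their input.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class PageAudit(HTMLParser):
    """Record local link/image targets and headings that appear in <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.heading_in_head = False
        self.local_refs = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "body":
            self.in_head = False
        elif tag in HEADINGS and self.in_head:
            self.heading_in_head = True
        for name, value in attrs:
            if name in ("href", "src") and value:
                target = value.split("#", 1)[0]
                # Skip pure anchors and anything with a scheme (http:, mailto:).
                if target and ":" not in target:
                    self.local_refs.append(target)

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def audit(pages, assets=()):
    """Return {filename: [problems]} for pages that have at least one problem."""
    available = set(pages) | set(assets)
    report = {}
    for filename, html in pages.items():
        parser = PageAudit()
        parser.feed(html)
        problems = []
        if parser.heading_in_head:
            problems.append("heading inside <head>")
        for target in parser.local_refs:
            if target not in available:
                problems.append(f"missing target: {target}")
        if problems:
            report[filename] = problems
    return report
```

A report like this will not catch everything a strict converter rejects, but it can shorten the build, fail, fix cycle.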

Quote:
Total time invested: about an hour
Total time invested: almost 3 hours

Uploading the .imp formats, which differ slightly from your (.prc) version. Check here.

I can upload my .prc/.epub versions if you would like as well?

Last edited by nrapallo; 07-07-2009 at 02:55 PM. Reason: added link to .imp versions
Old 07-07-2009, 04:14 PM   #12
Nate the great
Quote:
Originally Posted by nrapallo View Post
  • ill-formed/corrupt images,
  • <h1> tags in the <head> section and BEFORE the <body> tag, yep
  • non-existent links due to typos, yep
  • non-existent images for previous, next and index links,
  • missing image retrieved from an old website copy using WayBackMachine at archive.org
  • many minor fixes to make the resulting .html look more presentable...still working on this
Those 3 link images are there; they're just linked to in an odd way. Also, can you let me have a copy of the missing image from file ch22-6.html?

I wish I'd known about the cover but it's okay. I like the one I made.
Old 07-07-2009, 09:26 PM   #13
nrapallo
Quote:
Originally Posted by Nate the great View Post
Those 3 link images are there; they're just linked to in an odd way. Also, can you let me have a copy of the missing image from file ch22-6.html?

I wish I'd known about the cover but it's okay. I like the one I made.
I changed the way those three links referenced their images to make them better to use.

The missing image was m493b.gif and is attached. There were two corrupt images that I could fix (attached as well), the others were corrupt from when the website was originally set up, as far as I can tell.
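
As a side note, grossly broken image files (for example an HTML error page saved under a .gif name) can be spotted by checking the first few bytes. This Python sketch is not how the images above were repaired; it only compares the file extension against the well-known GIF and JPEG signatures, so it will miss damage further inside a file. The function name `has_expected_signature` is an assumption for illustration.

```python
# Leading bytes that valid files of each type start with.
SIGNATURES = {
    ".gif": (b"GIF87a", b"GIF89a"),
    ".jpg": (b"\xff\xd8\xff",),
    ".jpeg": (b"\xff\xd8\xff",),
}


def has_expected_signature(filename, data):
    """Return True if data starts with a signature matching the extension."""
    lowered = filename.lower()
    for extension, magics in SIGNATURES.items():
        if lowered.endswith(extension):
            return data.startswith(magics)
    return True  # unknown extension: nothing to check in this sketch
```

In practice `data` would be the first bytes read from each downloaded image file.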

BTW, here's a snapshot of the cover page I used (basically their cover.html).
Attached Thumbnails:
m493b.gif (151.3 KB), m521.gif (34.9 KB), m498c.jpg (43.6 KB), Moonport-cover.jpg (150.5 KB)

Last edited by nrapallo; 07-07-2009 at 09:39 PM.
Old 07-07-2009, 10:18 PM   #14
Nate the great
Thank you for the images.

BTW, I sent the 2 files with all the link errors to the contact email listed. I also sent a list of the errors I found, and mentioned that I was making an ebook. This afternoon I received a response.

The History Division at NASA is planning to convert all of their documents to ebooks. They wanted to know about the tools I use and my work process. I wrote a fairly lengthy email.

And yes, I did direct them here.
Old 07-07-2009, 11:20 PM   #15
nrapallo
Quote:
Originally Posted by Nate the great View Post
Thank you for the images.

BTW, I sent the 2 files with all the link errors to the contact email listed. I also sent a list of the errors I found, and mentioned that I was making an ebook. This afternoon I received a response.

The History Division at NASA is planning to convert all of their documents to ebooks. They wanted to know about the tools I use and my work process. I wrote a fairly lengthy email.

And yes, I did direct them here.
Great news... who would have thought that a recreational hour or so would have resulted in a productive skillset valuable to others!