Old 10-06-2010, 10:11 PM   #1
N13L5
tenjooberrymuds
N13L5 began at the beginning.
 
Posts: 58
Karma: 12
Join Date: Sep 2010
Device: Android
would like a recipe to pull down a free online book

I'd love to be able to read this MIT book on Scheme programming offline on my phone.

The book is split across many separate HTML pages, with "next" and "previous" links for navigation.

Is it possible to pull down the whole book and save it as EPUB or CHM?

Structure and Interpretation of Computer Programs


Regrettably, I'm not doing well with the recipes... I've tried to adapt existing recipes for other websites, to no avail...

...maybe after I read it, I can make recipes too...

Last edited by N13L5; 10-06-2010 at 10:14 PM.
Old 10-07-2010, 10:01 AM   #2
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by N13L5 View Post
Is it possible to pull down the whole book and save it as EPUB or CHM?
Recipes can do this. But once the recipe was written, that's all it would do - grab that one book. Once it was run, there'd be no reason for anyone to run it again. If they're going to go to the effort to write a recipe, most volunteer recipe writers would rather write a recipe that creates a new book each week from a changing feed, rather than one that is only run once to get a single book.
Old 10-07-2010, 10:56 AM   #3
N13L5
tenjooberrymuds
N13L5 began at the beginning.
 
Posts: 58
Karma: 12
Join Date: Sep 2010
Device: Android
hmm, ughh

If someone did write a recipe to pull down a book, might it be reusable, in that it could be adapted with a different URL to pull a different book?


I feel people would have to admit that few news feeds contain information as valuable as an MIT 101 computer science course on Scheme and basic programming principles.


If I finished that book maybe I'd be more successful at creating these dang recipes...
Old 10-07-2010, 11:03 AM   #4
TonytheBookworm
Addict
TonytheBookworm is on a distinguished road
 
Posts: 264
Karma: 62
Join Date: May 2010
Device: kindle 2, kindle 3, Kindle fire
Quote:
Originally Posted by N13L5 View Post
hmm, ughh



If I finished that book maybe I'd be more successful at creating these dang recipes...
Ha, or you could simply read the sticky thread and the numerous posts that Starson17, others, and I have written, and figure it out.
Old 10-07-2010, 11:28 AM   #5
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by N13L5 View Post
If someone did write a recipe to pull down a book, might it be reusable, in that it could be adapted with a different URL to pull a different book?
Look at any multipage recipe (Adventure Gamers is one). They are just as adaptable. Search on the word "multipage" here to get further info. Feel free to adapt one. Ask questions - they'll get answered.

Quote:
I feel people would have to admit that few news feeds contain information as valuable as an MIT 101 computer science course on Scheme and basic programming principles.
I agree - it's just not something that needs a "recipe" to bake a cake (create a book) multiple times. You want to avoid the work of doing it manually by asking someone else to write a recipe that will do that one job automatically. I'm not blaming you - you probably didn't see the difference between a "feed" that changes and a "book" that doesn't. That's why I posted.

Quote:
If I finished that book maybe I'd be more successful at creating these dang recipes...
There are other better sources for Calibre recipe info.

You can also look at some automatic website grabbers, such as wget, HTTrack and web2disk.

Good luck
Edit: I probably shouldn't have pointed only to multipage recipes. They put multiple pages into a single page. For a book, it might be better to turn recursion on and let the recipe track down all the pages by restricting the URLs that can be followed.
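
As a rough illustration of the "follow links, but restrict the URLs that can be followed" idea, here is a minimal standalone Python sketch. It is not a Calibre recipe and does not use Calibre's API; `crawl`, `allowed_prefix`, `fetch_page` and `max_pages` are names made up for this example, and the approach assumes all of the book's pages live under one URL prefix.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def fetch(url):
    """Download one page and return its text (needs network access)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, allowed_prefix, fetch_page=fetch, max_pages=500):
    """Return {url: html} for pages reachable from start_url under allowed_prefix."""
    pages = {}
    queue = [start_url]
    while queue and len(pages) < max_pages:
        url = queue.pop(0)
        if url in pages:
            continue  # already downloaded via another link
        html = fetch_page(url)
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any "#fragment" part.
            target, _fragment = urldefrag(urljoin(url, href))
            # The restriction: only follow links that stay inside the book.
            if target.startswith(allowed_prefix) and target not in pages:
                queue.append(target)
    return pages
```

Passing the directory that holds the book's pages as `allowed_prefix` keeps the crawl from wandering off to the rest of the site or to other domains.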

Last edited by Starson17; 10-07-2010 at 11:43 AM.
Old 10-07-2010, 01:06 PM   #6
andyh2000
Avid reader
andyh2000 ought to be getting tired of karma fortunes by now.
 
Posts: 868
Karma: 6399168
Join Date: Apr 2009
Location: UK
Device: Samsung Galaxy Z Flip 4 / Kindle Paperwhite / TCL Nxtpaper 14
Quote:
Originally Posted by Starson17 View Post
You can also look at some automatic website grabbers, such as wget, HTTrack and web2disk.
I'd second the recommendation for HTTrack. I just tried it (Windows GUI version), loaded the root HTML file of the resulting website mirror into Calibre, and converted it. It produced a 1.2 MB ePub which looks quite usable. So don't bother with a Calibre recipe; use a website mirror tool.
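
The conversion above was done in the Calibre GUI. A rough command-line equivalent, assuming calibre's `ebook-convert` tool is installed and on the PATH, might look like the sketch below. The helper names and file paths are placeholders for illustration, not something from this thread.

```python
import subprocess


def ebook_convert_command(index_html, output_file, title=None):
    """Build the argument list for calibre's ebook-convert tool."""
    command = ["ebook-convert", index_html, output_file]
    if title:
        # --title sets the book title in the output metadata.
        command += ["--title", title]
    return command


def convert(index_html, output_file, title=None):
    """Run the conversion; raises CalledProcessError if ebook-convert fails."""
    subprocess.run(ebook_convert_command(index_html, output_file, title), check=True)
```

For example, `convert("mirror/index.html", "book.epub", title="My Book")` would ask calibre to follow the local links starting from the mirror's root page and write an EPUB.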

Andrew
Old 10-07-2010, 01:31 PM   #7
nrapallo
GuteBook/Mobi2IMP Creator
nrapallo ought to be getting tired of karma fortunes by now.
 
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
Quote:
Originally Posted by N13L5 View Post
I'd love to be able to read this MIT book on Scheme programming offline on my phone.

The book is split across many separate HTML pages, with "next" and "previous" links for navigation.

Is it possible to pull down the whole book and save it as EPUB or CHM?

Structure and Interpretation of Computer Programs


Regrettably, I'm not doing well with the recipes... I've tried to adapt existing recipes for other websites, to no avail...

...maybe after I read it, I can make recipes too...

Doing just this (converting a website into ebook form) was my forte before joining Mobileread.com.

My best tool (HTTrack Website Copier) for this job is similar to what was done for the free NASA (web) ebooks posted in the thread multi-page HTML with images to ePub or LRF.

Basically, you point HTTrack at the website and download a local copy of it, then create a content.opf (manually, or using Mobipocket Creator or Sigil), and finally create your desired ebook format using your favourite conversion software. Unfortunately, Sigil lost the internal "contents" and "index" links on each page (and probably anywhere else the filename was referenced before the link), so I couldn't use it to finish the epub version.
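
For the "create a content.opf manually" step, a minimal sketch of generating one from a list of mirrored pages could look like this. It assumes an OPF 2.0 package and lists only the HTML pages in reading order; images, stylesheets and an NCX table of contents are left out, and `build_opf`, the default `book_id` and the file names are hypothetical.

```python
from xml.sax.saxutils import escape, quoteattr


def build_opf(title, html_files, language="en", book_id="spidered-book"):
    """Return the text of a minimal OPF 2.0 package listing html_files in reading order."""
    manifest = []
    spine = []
    for number, name in enumerate(html_files, start=1):
        item_id = "page%d" % number
        # Each page is declared once in the manifest...
        manifest.append(
            '    <item id="%s" href=%s media-type="application/xhtml+xml"/>'
            % (item_id, quoteattr(name))
        )
        # ...and referenced from the spine, which fixes the reading order.
        spine.append('    <itemref idref="%s"/>' % item_id)
    lines = (
        [
            '<?xml version="1.0" encoding="utf-8"?>',
            '<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">',
            '  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">',
            "    <dc:title>%s</dc:title>" % escape(title),
            "    <dc:language>%s</dc:language>" % escape(language),
            '    <dc:identifier id="bookid">%s</dc:identifier>' % escape(book_id),
            "  </metadata>",
            "  <manifest>",
        ]
        + manifest
        + ["  </manifest>", "  <spine>"]
        + spine
        + ["  </spine>", "</package>", ""]
    )
    return "\n".join(lines)
```

Writing the returned text to `content.opf` next to the downloaded pages gives the conversion software a single file to open; whether a given tool also wants the images listed in the manifest is something to check against that tool.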

I can save you the effort though since I spidered that ebook yesterday and produced various ebooks (.prc/.imp/.epub) from it that I will post in the Mobileread.com Uploads directory soon. I haven't read or proof-read it so it may have some minor issues. I'll post the link to the uploaded ebooks when done.

My only "hang-up" was getting the links to work properly in the .epub version, after converting the .opf (local .html and images) using calibre, when viewed in ADE. The .epub worked initially in the Firefox EPUBReader plugin (my main epub reader, since I don't have an epub-capable hardware reader just yet), but previewing it in ADE has been a challenge, to say the least, since links were not followed "fully".

BTW, I did try the calibre command-line tool, web2disk, but it produced malformed .xhtml files that my conversion programs couldn't process.

Last edited by nrapallo; 10-07-2010 at 02:31 PM. Reason: fixed original poster's link to this book
Old 10-07-2010, 02:57 PM   #8
nrapallo
GuteBook/Mobi2IMP Creator
nrapallo ought to be getting tired of karma fortunes by now.
 
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
Quote:
Originally Posted by nrapallo View Post
I'll post the link to the uploaded ebooks when done.
Done, look here for the epub version.

I also did a .mobi/.prc version here and EBW1150 & REB1200 .imp versions here.
Old 10-07-2010, 03:14 PM   #9
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by nrapallo View Post
My best tool (HTTrack Website Copier) for this job is similar to what was done for the free NASA (web) ebooks posted in the thread multi-page HTML with images to ePub or LRF.
I started with HTTrack, but found some things it wouldn't do well for me. It's been a long time, and perhaps it has been updated, but I switched to wget and have been happy with it. I use it daily/hourly/weekly to automatically grab certain files for my wife on those sites that want you to come back each day/hour/week for something free.
Old 10-07-2010, 05:44 PM   #10
nrapallo
GuteBook/Mobi2IMP Creator
nrapallo ought to be getting tired of karma fortunes by now.
 
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
Quote:
Originally Posted by Starson17 View Post
I started with HTTrack, but found some things it wouldn't do well for me. It's been a long time, and perhaps it has been updated, but I switched to wget and have been happy with it. I use it daily/hourly/weekly to automatically grab certain files for my wife on those sites that want you to come back each day/hour/week for something free.
I think that's a fair assessment when the website is "troublesome", i.e. doesn't stay on the same URL path and/or goes off-domain with its hyperlinks.

The aforementioned MIT Press website book was very well constructed and "behaved nicely" when being spidered so I didn't have much to worry about when using HTTrack. Using wget should also not have any issues.

There are some tricks/techniques I employ when dealing with a "poorly linked website" for spidering purposes, but they usually get used ONCE and then the project is spidered/over.

For some websites that I've spidered and converted to ebooks, in the past, see the bottom of this thread.
Old 10-08-2010, 04:50 AM   #11
N13L5
tenjooberrymuds
N13L5 began at the beginning.
 
Posts: 58
Karma: 12
Join Date: Sep 2010
Device: Android
@nrapallo wow, awesome! thank you so much! and what a coincidence... ^^

I went to pull it with wget this morning, before I saw Starson17's new advice.

It sort of did it, but failed to pull down images even though they were in the same directory on the website... go figure. Maybe I needed one more command-line switch for that, who knows.

It didn't work anyway, because my 3G connection on this island is miserably bad, and it failed to complete too many pages, even though wget is designed to be resilient to bad connections.
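
For reference, here is a sketch of a wget invocation that also asks for images and retries on a flaky connection, wrapped in Python so the options can be commented. The flags are standard wget options, but whether a missing `--page-requisites` was really the cause of the missing images here is only a guess; the function names and the default values are made up for this example.

```python
import subprocess


def wget_mirror_command(start_url, tries=20, wait_seconds=1):
    """Build a wget command line for mirroring a book that lives under one directory."""
    return [
        "wget",
        "--recursive",        # follow links from the start page
        "--no-parent",        # never climb above the start directory
        "--page-requisites",  # also fetch images and stylesheets the pages need
        "--convert-links",    # rewrite links so the local copy works offline
        "--continue",         # resume partially downloaded files
        "--tries=%d" % tries,        # retry each file this many times
        "--wait=%d" % wait_seconds,  # pause between requests
        start_url,
    ]


def mirror(start_url):
    """Run wget; raises CalledProcessError if wget exits with an error."""
    subprocess.run(wget_mirror_command(start_url), check=True)
```

On a very poor connection, raising `tries` and re-running the same command lets `--continue` pick up where the previous attempt stopped.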

I don't blame the program, my ISP really does suck

Here's how bad: after 5 minutes, Flashget 3 managed 1.4% of your 1.1MB epub file...
Old 10-08-2010, 11:52 AM   #12
nrapallo
GuteBook/Mobi2IMP Creator
nrapallo ought to be getting tired of karma fortunes by now.
 
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
Quote:
Originally Posted by N13L5 View Post
@nrapallo wow, awesome! thank you so much! and what a coincidence... ^^
Thanks, I like the subject matter, so I kind of made it for myself as well.
I hope you enjoy it as an ebook...
Old 10-08-2010, 01:10 PM   #13
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by nrapallo View Post
I think that's a fair assessment when the website is "troublesome", i.e. doesn't stay on the same URL path and/or goes off-domain with its hyperlinks.

The aforementioned MIT Press website book was very well constructed and "behaved nicely" when being spidered so I didn't have much to worry about when using HTTrack.
Yes, IIRC, HTTrack was a bit easier, wget had more flexibility for some tricky situations I encountered.

BTW, if you've converted from spidered html sites, can you give me a summary of what Calibre does with the spidered site when you drag in the index.html? Does it track down locally stored links or relative links or do you need to tweak things? Any info or advice you want to share about the process (tips/tricks) would be appreciated. I've never needed to manually construct a content.opf file.
Old 10-09-2010, 12:35 AM   #14
N13L5
tenjooberrymuds
N13L5 began at the beginning.
 
Posts: 58
Karma: 12
Join Date: Sep 2010
Device: Android
Quote:
Originally Posted by TonytheBookworm View Post
Ha, or you could simply read the sticky thread and the numerous posts that Starson17, others, and I have written, and figure it out.
I think I should have spent my time on what you said, instead of learning all the command-line switches to run wget...
Old 10-09-2010, 12:53 AM   #15
N13L5
tenjooberrymuds
N13L5 began at the beginning.
 
Posts: 58
Karma: 12
Join Date: Sep 2010
Device: Android
Quote:
Originally Posted by nrapallo View Post
Thanks, I like the subject matter, so I kind of made it for myself as well.
I hope you enjoy it as an ebook...
I am!

I'm using wget to pull down Paul Graham's site and the Joel on Software blog at the moment... thinking you might like them as well...

http://www.joelonsoftware.com/
http://www.paulgraham.com/

I wish the new Lisp version Paul Graham et al. are working on would move beyond the experimental, "in flux" stage already, and that they'd go on to make an interpreter that can somehow make use of all the useful Python libraries.

As he points out himself, Lisp isn't much in use because, except for command shell scripts, it's tough to use for real-world stuff: there's not much in the way of libraries... I don't know about the interpreter/compiler situation.


You can already run Python on Android. If that isn't cool, I don't know what is... Now I can read my programming books and immediately try stuff out...