Old 10-09-2010, 01:47 AM   #16
DMcCunney
New York Editor
Posts: 6,384
Karma: 16540415
Join Date: Aug 2007
Device: PalmTX, Pocket eDGe, Alcatel Fierce 4, RCA Viking Pro 10, Nexus 7
Quote:
Originally Posted by N13L5 View Post
I wish the new Lisp version John Graham et al are working on would move beyond the experimental and "in flux" stage already, and continue to make an interpreter that can somehow make use of all the useful Python libraries

As he points out himself, Lisp isn't much in use because, except for command shell scripts, it's tough to use for real-world stuff: there's not much in the way of libraries... I don't know about the interpreter / compiler situation
I liked Dr. Nikolai Bezroukov's commentary on editors on his Softpanorama website. He talked about macro languages for text editors, and was less than thrilled with Emacs using a dialect of Lisp, because he felt the language ought to have some use outside of the editor, and what other use of Lisp did the majority of Emacs users make? Eric S. Raymond was going on a while back about what Emacs got wrong, and how much better Python would be for the purpose. When I said "So rewrite GNU Emacs to use Python instead of Lisp," the response was "Don't tempt me..."

I have to agree. I have GNU Emacs here, but the only use I made of elisp was years back, learning enough to hack my .emacsrc to properly support the various special keys on the machine I was running it on. I've had no cause to touch it since.
______
Dennis
Old 10-09-2010, 07:43 AM   #17
nrapallo
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
Quote:
Originally Posted by Starson17 View Post
Yes, IIRC, HTTrack was a bit easier, wget had more flexibility for some tricky situations I encountered.

BTW, if you've converted from spidered html sites, can you give me a summary of what Calibre does with the spidered site when you drag in the index.html?
After using WinHTTrack to spider the website, find the topmost .html file that will be the first page of your ebook (besides the cover page, which can be added afterward). In most cases, that first .html file will be the page the website address (URL) points to and is usually called "index.html". HTTrack will generate its own "index.html", which points to the entry point into the local copy of the spidered website. That pointed-to file is the one I usually use as the first page of the ebook.

Quote:
Does it track down locally stored links or relative links or do you need to tweak things? Any info or advice you want to share about the process (tips/tricks) would be appreciated.
If the original website book is built with a strictly top-down design, the spidering is easy. If, however, the website "jumps" to another domain, or traverses up the URL and then down another (unrelated) path, things become much more complicated and the "Experts only" tab options are a must (Travel mode and Global travel mode in particular). I usually use the default options, though I may adjust the "Scan Rules" tab options to exclude unwanted file types such as .pdf, .avi, or animated .gifs. Often I only learn what to exclude, or where to allow HTTrack to travel, on the SECOND attempt at spidering the website (and so on)...

Here's a tip: When spidering, the main rule for me is to not let HTTrack veer off to some domain/address that, on the whole, is unnecessary. If I need just one or two pages from that website location, I may just download them manually or do another mini "spider" to get the section relevant to the main spidering, and then merge that directory into the original spidered local directory.
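To make the "don't veer off" rule concrete, here is a minimal Python sketch of the kind of test a spider applies to each link. It is not HTTrack's Scan Rules syntax or its internal logic; the function name `should_fetch` and the excluded extensions are assumptions chosen for illustration, and it filters by file extension only (it cannot tell an animated .gif from a static one).

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical defaults for illustration; adjust per site.
EXCLUDED_EXTENSIONS = {".pdf", ".avi"}


def should_fetch(url, start_url, excluded=EXCLUDED_EXTENSIONS):
    """Return True if url is on the start host, at or below the start
    directory, and does not have an excluded file extension."""
    target = urlparse(url)
    start = urlparse(start_url)

    # Rule 1: never leave the starting host.
    if target.netloc.lower() != start.netloc.lower():
        return False

    # Rule 2: stay at or below the directory holding the start page.
    base_dir = posixpath.dirname(start.path)
    if not base_dir.endswith("/"):
        base_dir += "/"
    if not target.path.startswith(base_dir):
        return False

    # Rule 3: skip unwanted file types.
    extension = posixpath.splitext(target.path)[1].lower()
    return extension not in excluded
```

A link that fails rule 1 or 2 but is still needed would be the "one or two pages" case above: fetch it by hand and merge it into the local copy.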

Quote:
I've never needed to manually construct a content.opf file.
My main ebook reader is the REB1200, which uses the .imp format. The freely available eBook Publisher from www.ebooktechnologies.com has been my workhorse in preparing a local website's .html and images, but you can also use the freely available Mobipocket Creator from www.mobipocket.com. The tried and true method is to assemble the logical flow of .html pages to mimic the way you would read/traverse that website book. I sometimes look for a table of contents page, examine its ordering of .html files for top-down flow, and then sort the .html pages that way within eBook Publisher or Mobipocket Creator. This creates the .opf file that I will later "feed" to calibre to create the .epub, to Mobipocket Creator to create the .prc, and to eBook Publisher to create the REB1200 .imp as well as the EBW1150 .imp. I personally do this in reverse order, perfecting my .imp ebook first, then creating the .prc and .epub afterward.
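The "look at the table of contents page for the ordering" step can be sketched in Python. This is only an illustration of the idea, not what eBook Publisher or Mobipocket Creator do internally; the names `TocLinkCollector` and `reading_order` are made up here, and the sketch assumes the TOC page links to local .html/.htm files with relative hrefs.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag


class TocLinkCollector(HTMLParser):
    """Collect local .html targets in the order the TOC page links to them."""

    def __init__(self):
        super().__init__()
        self.pages = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Drop "#section" so each file is listed only once.
        page, _fragment = urldefrag(href)
        is_local_html = (
            page.lower().endswith((".html", ".htm")) and "://" not in page
        )
        if is_local_html and page not in self.pages:
            self.pages.append(page)


def reading_order(toc_html):
    """Return the unique local pages of a TOC document, first link first."""
    collector = TocLinkCollector()
    collector.feed(toc_html)
    return collector.pages
```

The resulting list is a starting point for the page order; pages that the TOC never links to still have to be placed by hand.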

I create and test the ebook for such things as improperly formed hyperlinks, removal of headers/footers if unnecessary, creation of a cover page, reordering of .html if necessary, etc. To correct certain deficiencies (change HTML code) in many .html files at once, I use the shareware TextPad text editor (www.textpad.com) and perform a regex search and replace over all files at the same time.
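For anyone without TextPad, the same kind of regex search and replace over a whole directory can be sketched in a few lines of Python. The function name `replace_in_html_files` is an assumption for illustration, and so is the UTF-8 default; spidered sites often use other encodings, so work on a copy of the directory.

```python
import re
from pathlib import Path


def replace_in_html_files(folder, pattern, replacement, encoding="utf-8"):
    """Apply one regex substitution to every .html file under folder.

    Returns a list of (file name, number of replacements) for changed files.
    """
    # DOTALL lets ".*?" span line breaks, which HTML blocks usually need.
    regex = re.compile(pattern, re.IGNORECASE | re.DOTALL)
    changed = []
    for path in sorted(Path(folder).rglob("*.html")):
        text = path.read_text(encoding=encoding)
        new_text, count = regex.subn(replacement, text)
        if count:
            path.write_text(new_text, encoding=encoding)
            changed.append((path.name, count))
    return changed
```

A call such as `replace_in_html_files("mybook", r'<div class="footer">.*?</div>', "")` would strip a repeated footer block, where both the folder name and the footer markup are hypothetical.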

I did try to use Sigil to create the .opf and actually did use its content.opf to get my ebook creation started, but the .epub created by Sigil lost some of the hyperlinks (due to a "bug" perhaps) where a hyperlink was referenced with an external filename before the #reference. I did use its Metadata Editor and TOC Editor before going the calibre route with the content.opf as input.

I accomplish the .epub build using calibre's command line tool, ebook-convert, using a simple .bat file whose contents I manually edit for each different build, namely:
Code:
ebook-convert "Structure and Interpretation of Computer Programs.opf" "Structure and Interpretation of Computer Programs.epub" --no-default-epub-cover --title "Structure and Interpretation of Computer Programs" --authors "Harold Abelson and Gerald Jay Sussman with Julie Sussman" --publisher="NR (nrapallo)" --dont-split-on-page-breaks --no-chapters-in-toc --chapter "//*[name()='h1' or name()='h2']"  --chapter-mark=none --output-profile=sony
pause
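As an alternative to hand-editing the .bat file for each build, the same command line can be assembled by a small Python function. This is a sketch under assumptions: the function name and parameters are invented here, and it uses only the ebook-convert switches that appear in the batch file above.

```python
def build_ebook_convert_args(opf_path, epub_path, title, authors, publisher,
                             heading_levels=("h1", "h2"),
                             output_profile="sony"):
    """Build the ebook-convert argument list used in the batch file above."""
    # Chapter detection XPath, e.g. //*[name()='h1' or name()='h2']
    name_tests = " or ".join("name()='%s'" % level for level in heading_levels)
    chapter_xpath = "//*[%s]" % name_tests
    return [
        "ebook-convert", opf_path, epub_path,
        "--no-default-epub-cover",
        "--title", title,
        "--authors", authors,
        "--publisher=" + publisher,
        "--dont-split-on-page-breaks",
        "--no-chapters-in-toc",
        "--chapter", chapter_xpath,
        "--chapter-mark=none",
        "--output-profile=" + output_profile,
    ]


# To run a build (requires calibre's ebook-convert on the PATH):
#   import subprocess
#   subprocess.run(build_ebook_convert_args(...), check=True)
```

Passing the arguments as a list avoids the quoting problems of a shell string, and changing `heading_levels` to `("h2", "h3")` adjusts chapter detection without retyping the XPath.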
I sometimes need to adjust the chapter detection command line switch to exclude h1 or include h3. Then, for website books, I usually open the resulting .epub with WinRAR and remove from stylesheet.css any spurious "page-break-before" or "page-break-after" declarations that insert too much white space in my ebook.
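The stylesheet cleanup can also be scripted, since an .epub is a zip archive. The sketch below is an assumption-laden illustration, not the manual WinRAR procedure: the function names are made up, it assumes the stylesheet is UTF-8 and named stylesheet.css, and it removes every page-break-before/after declaration rather than only the spurious ones, so check the result on a copy.

```python
import re
import zipfile

# Matches e.g. "page-break-before: always;" up to the next ";" or "}".
PAGE_BREAK_RE = re.compile(
    r"page-break-(?:before|after)\s*:\s*[^;}]*;?", re.IGNORECASE
)


def strip_page_breaks(css_text):
    """Remove all page-break-before/after declarations from CSS text."""
    return PAGE_BREAK_RE.sub("", css_text)


def clean_epub_stylesheet(src_epub, dst_epub, css_name="stylesheet.css"):
    """Copy src_epub to dst_epub, cleaning any file whose name ends in css_name.

    Entries are copied in their original order with their original
    compression settings, so the "mimetype" entry stays first and stored.
    """
    with zipfile.ZipFile(src_epub) as src, \
            zipfile.ZipFile(dst_epub, "w") as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename.endswith(css_name):
                cleaned = strip_page_breaks(data.decode("utf-8"))
                data = cleaned.encode("utf-8")
            dst.writestr(info, data)
```

Writing to a new file rather than editing in place keeps the original .epub available for comparison on the reading device.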

At this point, I call it a day and try out the ebook on a reading device and peruse it. Because I'm a bit of a perfectionist, I'll try to fix glaring "faux pas", ensure all images appear properly and try most of the hyperlinks to ensure they jump properly. Then I'll recreate that ebook until I'm satisfied...
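Part of the hyperlink checking can be done before the book ever reaches a device. The following Python sketch lists links whose target file is missing from the local directory; the function name is an assumption, and it only checks that the file exists, not that the #anchor inside it does, so it complements rather than replaces trying the links by hand.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urldefrag


class _HrefCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def find_broken_local_links(folder, encoding="utf-8"):
    """Return (page name, href) pairs whose local target file is missing."""
    broken = []
    for page in sorted(Path(folder).rglob("*.html")):
        collector = _HrefCollector()
        collector.feed(page.read_text(encoding=encoding, errors="replace"))
        for href in collector.hrefs:
            target, _fragment = urldefrag(href)
            # Skip same-page anchors, external URLs and mail links.
            if not target or "://" in target or target.startswith("mailto:"):
                continue
            if not (page.parent / unquote(target)).exists():
                broken.append((page.name, href))
    return broken
```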

I hope this gives you some insights into what for me has increased my enjoyment of (huge) ebooks, that is the spidering of meticulously fine-crafted website books and converting them into ebooks! It's all about content, content, content....
Old 10-09-2010, 10:38 AM   #18
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by nrapallo View Post
I hope this gives you some insights into what for me has increased my enjoyment of (huge) ebooks, that is the spidering of meticulously fine-crafted website books and converting them into ebooks! It's all about content, content, content....
Thank you for the comments. There have been several requests for book recipes, and I think they have a place in Calibre, but they don't really fit into the normal recipe system where Kovid stores them centrally. I'd suggest that anyone who wants to write a one-shot recipe for a book should post it here. wget files, HTTrack command files/batch files, and the like that have been configured to pull a book should also be welcome here.