04-27-2011, 07:25 AM | #1 |
Enthusiast
Posts: 45
Karma: 10
Join Date: Dec 2010
Device: Kindle 3 Wifi only
|
Questions about skipping re-downloading and parse_index browse to next page
Hi all,
I'm planning to set up a calibre server for my friends, who use different e-readers. Is it possible to reuse the already downloaded news/HTML files for conversion to different output formats without re-downloading them? For example, I create a .mobi file from the recipe via the CLI (ebook-convert), then want to create an .epub from the same recipe without wasting bandwidth (I'm currently on an EC2 instance).

My second question: how does one create a recipe/parse_index for a page that has no RSS feed AND has multiple section pages? For example, a site has a technology section where the last link on every page but the last is "next page", and I want to add the "h2" article items with the same article date to the feed from every page.

Thanks for any advice!
04-27-2011, 01:02 PM | #2 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
1) Use Windows scheduler or cron
2) to run a script or batch file
3) that first runs ebook-convert to make the recipe-created book,
4) then runs ebook-convert again to convert that recipe-created ebook to the 2nd, 3rd and 4th formats,
5) then runs calibredb with the add option to add the books to Calibre.
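For anyone who would rather drive steps 3) to 5) from one script, here is a rough Python sketch of that sequence. The names build_commands and run_all, the file names and the format list are all made up for illustration; only ebook-convert and calibredb add come from the steps above. The --library-path option is only added when a library path is passed in.

```python
import subprocess


def build_commands(recipe, basename, formats, library_path=None):
    """Return the command lines for one download, the conversions, and the add.

    Hypothetical helper: the first format is built from the recipe (the only
    step that downloads), the others are converted from that first book.
    """
    first, *others = formats
    first_book = f"{basename}.{first}"
    commands = [["ebook-convert", recipe, first_book]]
    books = [first_book]
    for fmt in others:
        book = f"{basename}.{fmt}"
        # Convert the already built book instead of running the recipe again.
        commands.append(["ebook-convert", first_book, book])
        books.append(book)
    add = ["calibredb", "add"]
    if library_path:
        add += ["--library-path", library_path]
    commands.append(add + books)
    return commands


def run_all(commands):
    """Run the commands in order and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the plan only; call run_all(...) to actually execute it.
    for cmd in build_commands("some.recipe", "some", ["epub", "mobi", "pdf"]):
        print(" ".join(cmd))
```

The script itself would still be started by cron or the Windows scheduler, as in steps 1) and 2).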
04-27-2011, 04:43 PM | #3 |
Enthusiast
Posts: 45
Karma: 10
Join Date: Dec 2010
Device: Kindle 3 Wifi only
|
Thanks for your answer!
I'll try out the solution for the first question tomorrow at my dev desk. But for the second, there's a little misunderstanding: the site I want to create a feed for does have a feed, but it's rather unusable. Most of the time it contains duplicate items, and it only has 10 articles, of which usually at least 4 are duplicates.
So I don't want to use it (yeah, I know I can filter out the duplicates). The site has sections (e.g. Sport, Technology: it's a local newspaper site). Each section lists all the articles related to it, like:

<h1>article 1</h1> <p>2011-04-27</p> (snippet)
<h1>article 2</h1> <p>2011-04-27</p> (snippet)
<h1>article 3</h1> <p>2011-04-27</p> (snippet)
<h1>article 4</h1> <p>2011-04-27</p> (snippet)
<h1>article 5</h1> <p>2011-04-27</p> (snippet)
<link to next page>

And on the next page:

<h1>article 6</h1> <p>2011-04-27</p> (snippet)
<h1>article 7</h1> <p>2011-04-27</p> (snippet)
<h1>article 8</h1> <p>2011-04-27</p> (snippet)
<h1>article 9</h1> <p>2011-04-26</p> (snippet)
<h1>article 10</h1> <p>2011-04-26</p> (snippet)
<link to next page>

Now what I want is to generate a custom feed of all of today's articles. For that I have to open the index page of the section, then follow the link to the next page(s) for as long as I can find articles with today's date to add to the feed. I can parse the index page and create the feed for it, but how do I get to the next page?

Thanks in advance!
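To make the paging idea concrete, here is a minimal standalone sketch of the loop described above: read a section page, keep the articles dated today, and follow "next page" until an older date shows up. It assumes a lot that the post does not state: that each <h1> wraps an <a href> to the article, that the date is the first <p> after it, that the paging link's text is literally "next page", and that listings are newest-first. SectionPageParser, collect_todays_articles and the fetch argument are made-up names, not calibre API.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class SectionPageParser(HTMLParser):
    """Collect articles and the 'next page' link from one section page.

    Assumed (hypothetical) markup:
    <h1><a href="...">title</a></h1> <p>YYYY-MM-DD</p> ... <a href="...">next page</a>
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.articles = []
        self.next_url = None
        self._in_h1 = self._in_a = self._in_date = False
        self._title, self._a_text, self._date = [], [], []
        self._h1_href = self._a_href = None
        self._pending = None  # article that still waits for its date

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._in_h1, self._title, self._h1_href = True, [], None
        elif tag == "a":
            self._in_a, self._a_text = True, []
            self._a_href = dict(attrs).get("href")
            if self._in_h1:
                self._h1_href = self._a_href
        elif tag == "p" and self._pending is not None:
            self._in_date, self._date = True, []

    def handle_data(self, data):
        if self._in_h1:
            self._title.append(data)
        if self._in_a:
            self._a_text.append(data)
        if self._in_date:
            self._date.append(data)

    def handle_endtag(self, tag):
        if tag == "h1" and self._in_h1:
            self._in_h1 = False
            self._pending = {
                "title": "".join(self._title).strip(),
                "url": urljoin(self.base_url, self._h1_href or ""),
                "description": "",
            }
        elif tag == "a" and self._in_a:
            self._in_a = False
            if "".join(self._a_text).strip().lower() == "next page" and self._a_href:
                self.next_url = urljoin(self.base_url, self._a_href)
        elif tag == "p" and self._in_date:
            self._in_date = False
            self._pending["date"] = "".join(self._date).strip()
            self.articles.append(self._pending)
            self._pending = None


def collect_todays_articles(fetch, start_url, today, max_pages=20):
    """Follow 'next page' links and return the articles dated `today`.

    `fetch` is any callable that maps a URL to the page's HTML text.
    """
    found, url, seen = [], start_url, set()
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        parser = SectionPageParser(url)
        parser.feed(fetch(url))
        todays = [a for a in parser.articles if a["date"] == today]
        found.extend(todays)
        # Newest-first assumption: an older article means later pages are older too.
        if len(todays) < len(parser.articles):
            break
        url = parser.next_url
    return found
```

Inside a recipe, parse_index could then return something like [('Technology', articles)], with fetch wrapping whatever the recipe uses to load pages (for instance self.index_to_soup, if the same walk is done on the soup instead of raw text). The max_pages cap and the seen set are only there so a broken "next page" link cannot loop forever.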
04-28-2011, 07:55 AM | #4 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
|
04-28-2011, 09:21 AM | #5 |
Enthusiast
Posts: 45
Karma: 10
Join Date: Dec 2010
Device: Kindle 3 Wifi only
|
Thank You, I'm experimenting with that now!
For the record, reusing the already downloaded HTML files was pretty easy:
Code:
ebook-convert some.recipe somedir/ && for format in mobi epub pdf ; do ebook-convert somedir/index.html "some.${format}" ; done
04-28-2011, 09:49 AM | #6 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Be aware that a recipe-created ebook of mobi format is not necessarily the same as a recipe-created ebook of epub format that has been converted to mobi format. When the device is set to Kindle, the recipe system makes some changes to the ebook that aren't made in a straight conversion.
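If that difference matters, one option is to run the recipe once per target format instead of converting from a shared download, at the price of fetching the site again each time. A small sketch of building those command lines follows. It assumes that "device is set to Kindle" corresponds to passing --output-profile kindle on the command line; recipe_commands and the format-to-profile mapping are made up for illustration.

```python
def recipe_commands(recipe, basename, profiles):
    """Build one ebook-convert call per format, each run from the recipe itself.

    `profiles` maps an output format to an output profile name, or to None
    to leave the profile at its default (hypothetical mapping).
    """
    commands = []
    for fmt, profile in profiles.items():
        cmd = ["ebook-convert", recipe, f"{basename}.{fmt}"]
        if profile:
            cmd += ["--output-profile", profile]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    for cmd in recipe_commands("some.recipe", "some", {"mobi": "kindle", "epub": None}):
        print(" ".join(cmd))
```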