Old 11-15-2010, 06:02 AM   #1
Leprecon
Junior Member
Leprecon began at the beginning.
 
Posts: 3
Karma: 10
Join Date: Nov 2010
Device: PRS 350
MetroTime Belgium

http://www.metrotime.be/digipapernl.html
On that page, look for a .pdf file. Unfortunately each PDF is only one page, so you have to keep downloading every consecutive page. Luckily the URLs make sense.
Code:
page1: http://www.metrotime.be/UserFiles/DigiPaper/nl/20101110/1/MVLMP-0-20101110-01.pdf
page2: http://www.metrotime.be/UserFiles/DigiPaper/nl/20101110/2/MVLMP-0-20101110-02.pdf
Each URL is basically the date followed by the page number, repeated a couple of times, and the paper is always 24 pages.
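
The URL pattern described above can be sketched as follows. This is only an illustration built from the two example URLs: the function name `pdf_page_urls` is made up for this sketch, and the assumption that pages 10 to 24 use an unpadded directory number and a two-digit file suffix is a guess, since only pages 1 and 2 are shown.

```python
from datetime import date

# Base path taken from the two example URLs in the post.
BASE = "http://www.metrotime.be/UserFiles/DigiPaper/nl"


def pdf_page_urls(issue_date, pages=24):
    """Build the per-page PDF URLs for one issue (hypothetical helper).

    Assumes the directory uses the plain page number and the file name
    uses a zero-padded two-digit page number, as in the examples.
    """
    stamp = issue_date.strftime("%Y%m%d")
    return [
        f"{BASE}/{stamp}/{n}/MVLMP-0-{stamp}-{n:02d}.pdf"
        for n in range(1, pages + 1)
    ]


urls = pdf_page_urls(date(2010, 11, 10))
```

Because the paper does not appear every day, a request for a given date may simply fail; any downloader built on this would need to handle that case.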

Could these be merged into one pdf file?

If not, you would have to "click" each story on the image of the page to get to a simple version. You would just have to look for a "storyId=" in the source of each of the 24 pages. (The front page also has short headlines with the page number they are on; these have a storyId as well and should be ignored. This could technically be done, since those "stories" all end with the identifier "e_SRitp".)

Is any of this possible?

Edit: forgot to mention that it doesn't update every day.

Last edited by Leprecon; 11-16-2010 at 04:45 AM.
Old 11-16-2010, 02:32 AM   #2
marbs
Zealot
marbs began at the beginning.
 
Posts: 122
Karma: 10
Join Date: Jul 2010
Device: nook
calibre does not support pdf file input. sorry.
Old 11-16-2010, 04:45 AM   #3
Leprecon
Junior Member
Leprecon began at the beginning.
 
Posts: 3
Karma: 10
Join Date: Nov 2010
Device: PRS 350
Quote:
Originally Posted by marbs View Post
calibre does not support pdf file input. sorry.
Ok ... then you read the rest of my post saying there are two ways of getting the news from that site, one leading to articles that look like this. Surely calibre can handle plain text?
Old 11-16-2010, 04:49 AM   #4
jfdeclercq
ePaper Enthousiast
jfdeclercq began at the beginning.
 
Posts: 5
Karma: 10
Join Date: Jun 2008
Location: Overijse, Belgium
Device: iRex DR1000S
What you could try is to download the HTML site using ScrapBook, then find out which HTML file could serve as the most complete table of contents, and use Calibre's ebook converter to get an ePub.

J-F
Old 11-16-2010, 09:48 AM   #5
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by Leprecon View Post
Ok ... then you read the rest of my post saying there are two ways of getting the news from that site, one leading to articles that look like this. Surely calibre can handle plain text?
It's not totally clear what the relationship is between the text link above and your question about pdf, but if you have links to html pages (text) with or without images, then Calibre's recipe system can handle them. If the html pages have links to other related pages and you need to put them together into a single article, Calibre can handle that, too.
Old 11-16-2010, 05:09 PM   #6
Leprecon
Junior Member
Leprecon began at the beginning.
 
Posts: 3
Karma: 10
Join Date: Nov 2010
Device: PRS 350
Apparently my English isn't as good as I thought.
My question was whether someone could make a recipe for me.

You can get each article in plain text, or with a couple of simple images, by going to this site and clicking the individual articles on the picture of the newspaper's page.
All of the articles have links that look like this:
Code:
http://www.metrotime.be/digipaperArticlenl.html?storyId=37748746
http://www.metrotime.be/digipaperArticlenl.html?storyId=37748752
If you were to go over pages 1 to 24
Code:
http://www.metrotime.be/digipapernl.html?pag=1&kdate=15/11/2010
...
http://www.metrotime.be/digipapernl.html?pag=24&kdate=15/11/2010
and look through the source of each page, collecting every story by going over every link that has
Code:
/digipaperArticlenl.html?storyId=
in it.

Then you would have to remove every story that ends with "e_SRitp" (like this one) because it is useless fluff.
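
The steps above could be sketched roughly like this in plain Python, before anyone turns it into an actual recipe. Everything here is an assumption layered on the post: the helper names (`page_urls`, `story_ids`, `is_fluff`) are invented, story ids are assumed to be numeric as in the two examples, and "ends with e_SRitp" is read as the fetched story text ending with that marker.

```python
import re

# URL templates copied from the examples in the post.
PAGE_URL = "http://www.metrotime.be/digipapernl.html?pag={page}&kdate={date}"
ARTICLE_URL = "http://www.metrotime.be/digipaperArticlenl.html?storyId={story_id}"

# Matches the article links; assumes story ids are digits only.
STORY_LINK = re.compile(r"/digipaperArticlenl\.html\?storyId=(\d+)")


def page_urls(day, month, year, pages=24):
    """Return the URLs of pages 1..pages for one issue (dd/mm/yyyy date)."""
    kdate = f"{day:02d}/{month:02d}/{year}"
    return [PAGE_URL.format(page=n, date=kdate) for n in range(1, pages + 1)]


def story_ids(html):
    """Collect unique story ids from a page's source, in order of appearance."""
    found = []
    for story_id in STORY_LINK.findall(html):
        if story_id not in found:
            found.append(story_id)
    return found


def is_fluff(story_text):
    """Guess whether a story is front-page fluff (assumed trailing marker)."""
    return story_text.rstrip().endswith("e_SRitp")


# Small offline example; the HTML snippet is made up for illustration.
sample = (
    '<a href="/digipaperArticlenl.html?storyId=37748746">A</a>'
    '<a href="/digipaperArticlenl.html?storyId=37748752">B</a>'
    '<a href="/digipaperArticlenl.html?storyId=37748746">A again</a>'
)
article_urls = [ARTICLE_URL.format(story_id=s) for s in story_ids(sample)]
```

Fetching the pages and the stories themselves is left out on purpose, so the sketch runs without network access; the same three helpers could feed whatever downloader is used.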