02-20-2011, 04:18 AM | #1 |
Connoisseur
Posts: 57
Karma: 10
Join Date: Nov 2009
Device: Kindle 3
|
Recipe: DER SPIEGEL?
Is it possible to build a recipe for the SPIEGEL Magazine? At "m.spiegel.de/epaper.do" registered users get access to the SPIEGEL Magazine via iPhone (html, no pdf).
Gany |
02-21-2011, 01:50 AM | #2 |
Connoisseur
Posts: 76
Karma: 12
Join Date: Nov 2010
Device: Android, PB Pro 602
|
Look at the already existing recipe spiegelde. It works quite well.
|
02-21-2011, 01:53 AM | #3 |
Connoisseur
Posts: 57
Karma: 10
Join Date: Nov 2009
Device: Kindle 3
|
|
04-16-2011, 05:14 AM | #4 |
Enthusiast
Posts: 43
Karma: 136
Join Date: Mar 2011
Device: Kindle Paperwhite
|
Ganymede is right. There is Spiegel Online, and then there is the actual magazine. Some of the magazine articles make it online, but I think only a limited number do. In addition, IMO the quality of the writing sometimes differs a lot between the magazine and the online version.
I was looking into a recipe for the magazine, and since I don't have much experience with recipes I would be happy to get some ideas on how to tackle this one. The layout of the online edition is very close to the actual printed magazine, i.e. page 1 is on 1.html, page 2 on 2.html, etc. If a page contains only an ad, the HTML page still exists and shows the page, but it has no text content, only the image of the page. There is a table of contents which looks like this (I replaced the German class names with English ones): Spoiler:
In addition, at the bottom of every page there is a link to the next and previous page. This navigation skips pages that contain only ads. From the table of contents it also looks like ads are not directly linked. I could not find any URLs that match the content, e.g. spiegel.de/..../deutschland instead of the page number, which would allow me to use the standard feeds approach of BasicNewsRecipe. Is there a recipe that already parses a site like this, i.e. one with page-number URLs or a similar table of contents layout, where I could take a peek? Thanks in advance.
Last edited by aerodynamik; 04-16-2011 at 05:17 AM. Reason: Fixed code section and added spoiler section for readability |
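One way to handle the page-number layout described above is to start at 1.html and follow the "next" link on each page, which would skip the ad-only pages automatically. The sketch below is only an illustration: the thread does not show the real markup, so the `class="next"` attribute, the `walk_pages` helper and the sample pages are all assumptions, and a dictionary stands in for the website.

```python
import re

# Hypothetical markup: the class name "next" and the sample pages below
# are assumptions, not taken from the real site.
NEXT_LINK = re.compile(r'<a[^>]*class="next"[^>]*href="([^"]+)"')


def walk_pages(fetch, start, limit=500):
    """Follow 'next' links from `start` and return the page names in order.

    `fetch` is any callable that maps a page name to its HTML, so the walk
    can be tried without network access.
    """
    order, seen, page = [], set(), start
    while page and page not in seen and len(order) < limit:
        seen.add(page)
        order.append(page)
        match = NEXT_LINK.search(fetch(page))
        page = match.group(1) if match else None
    return order


# Stand-in for the site: page 2 is an ad, so page 1 links straight to page 3.
SAMPLE = {
    '1.html': '<p>Article A</p><a class="next" href="3.html">next</a>',
    '2.html': '<img src="ad.png">',
    '3.html': '<p>Article B</p><a class="next" href="4.html">next</a>',
    '4.html': '<p>Article C</p>',
}

print(walk_pages(SAMPLE.__getitem__, '1.html'))
# → ['1.html', '3.html', '4.html']
```

In a real recipe the `fetch` callable would have to download the page with the subscriber's login, and the resulting list could be turned into the article list that the recipe returns from its index parsing.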
04-17-2011, 08:37 AM | #5 | |||
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
|
|||
04-17-2011, 12:51 PM | #6 |
Enthusiast
Posts: 43
Karma: 136
Join Date: Mar 2011
Device: Kindle Paperwhite
|
Deleting my old comment, since it was completely irrelevant.
I missed the most important information in this thread: "m.spiegel.de/epaper.do" (from ganymede's original post). This is a much simpler version: the index is on one page, there are no multipage articles, and there are never multiple articles on one page. I can work on this for less than an hour a day, but I should have something within a few days. Thanks again for your help Starson, and sorry for the confusion.
Last edited by aerodynamik; 04-19-2011 at 02:01 AM. |
04-19-2011, 05:52 PM | #7 |
Enthusiast
Posts: 43
Karma: 136
Join Date: Mar 2011
Device: Kindle Paperwhite
|
Der Spiegel Recipe (printed edition)
Here we go, a first recipe for the printed edition of Der Spiegel. You need a subscription to access it.
I tested it on my Kindle 3 and it looks very good. It would be great to get some more tests on other devices. When you copy the script, replace the character ◆ with "& # 9670 ;" (remove the spaces inside the quotes). My Kindle wasn't able to display this character correctly, so I just replaced it with a horizontal rule. Spoiler:
I am unhappy with this code:
Code:
for article in section.findNextSiblings(['dd','dt']):
    if article.name == 'dt': break
Code:
<dt>section 1</dt>
<dd class="spFirst">article 1</dd>
<dd>article 2</dd>
<dt>section 2</dt>
<dd>article 3</dd>
Is it okay to use a Wikipedia image for the masthead image?
Last edited by aerodynamik; 04-19-2011 at 05:54 PM. |
04-19-2011, 08:06 PM | #8 |
creator of calibre
Posts: 43,856
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
You can find dt first and then call find dd on each dt.
Using Wikipedia images should be fine, though it's better to use an image from the publisher so as not to load Wikipedia's servers. |
04-20-2011, 01:59 AM | #9 |
Enthusiast
Posts: 43
Karma: 136
Join Date: Mar 2011
Device: Kindle Paperwhite
|
Unlike e.g. <ul><li><li></ul>, where the <ul> contains its <li> items, <dt> does not "include" <dd>.
When I iterate over all <dt>s and call findAll('dd') on each of them, I get every dd in the overall index:
Code:
for section in index.findAll('dt'):
    section_title = self.tag_to_string(section).strip()
    self.log('Found section ', section_title)
    articles = []
    for article in section.findAll('dd'):
        # lists all dd's in the index, not only the ones below the current dt
Regarding the masthead: all I could find on the publisher's website is the corresponding online logo. To avoid confusion between Spiegel Online and Der Spiegel I would stick to the Wikipedia logo for now. There is an SVG source that renders the logo; does this help? |
04-20-2011, 09:24 AM | #10 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
|
04-20-2011, 11:13 AM | #11 |
creator of calibre
Posts: 43,856
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
If they're siblings then you essentially have to do something along the lines of what you did, i.e. keep track of the last seen dt and add dds to it.
Code:
current_section, current_articles = None, []
for x in findAll(['dt', 'dd']):
    if x.name == 'dt':
        if current_section and current_articles:
            sections.append((current_section, current_articles))
        current_section, current_articles = self.tag_to_string(x), []
    else:
        current_articles.append(...)
# store the last section, which is not followed by another dt
if current_section and current_articles:
    sections.append((current_section, current_articles))
|
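For anyone who wants to try the "remember the last seen dt" idea outside calibre, here is a minimal, self-contained sketch. It swaps calibre's BeautifulSoup calls for Python's standard `html.parser`, so it is not the code from the recipe; the `SectionParser` class and `group_sections` helper are made-up names, and the sample markup is the snippet from post #7. Unlike the loop above, it also keeps sections that have no articles.

```python
from html.parser import HTMLParser


class SectionParser(HTMLParser):
    """Group each <dd> under the most recently seen <dt>."""

    def __init__(self):
        super().__init__()
        self.sections = []   # list of (section title, [article titles])
        self._tag = None     # 'dt' or 'dd' while inside one, else None
        self._text = []      # text collected for the current dt/dd

    def handle_starttag(self, tag, attrs):
        if tag in ('dt', 'dd'):
            self._tag, self._text = tag, []

    def handle_data(self, data):
        if self._tag:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return
        text = ''.join(self._text).strip()
        if tag == 'dt':
            # a new section starts; later dd's are added to it
            self.sections.append((text, []))
        elif self.sections:
            # a dd that appears before any dt is ignored
            self.sections[-1][1].append(text)
        self._tag = None


def group_sections(html):
    """Hypothetical helper: return [(section, [articles])] for dt/dd markup."""
    parser = SectionParser()
    parser.feed(html)
    parser.close()
    return parser.sections


INDEX = """
<dt>section 1</dt> <dd class="spFirst">article 1</dd> <dd>article 2</dd>
<dt>section 2</dt> <dd>article 3</dd>
"""

print(group_sections(INDEX))
# → [('section 1', ['article 1', 'article 2']), ('section 2', ['article 3'])]
```

Inside a recipe the same bookkeeping would run over the tags returned by the soup, with the article entries built from each dd's link instead of its plain text.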
06-03-2012, 03:01 AM | #12 |
Junior Member
Posts: 5
Karma: 10
Join Date: Oct 2011
Device: Kindle
|
SPIEGEL download failure
Hi, I am a SPIEGEL subscriber. I use calibre to download the print version and convert it for an e-book reader (mine is a Kindle 3). It has been working great for more than a year. Today the download didn't work, and I got a failure report (see attachment). I suppose a change in the recipe might help, but I have no idea how to do it. Can anybody please help? Thank you very much. Jan
|
06-03-2012, 03:56 AM | #13 |
Enthusiast
Posts: 43
Karma: 136
Join Date: Mar 2011
Device: Kindle Paperwhite
|
The website is not working. If you go to m.spiegel.de/epaper.do, you already get an error message. Through the link "Der Spiegel" at the bottom right you can select the current issue (http://m.spiegel.de/spiegel/print/ep...x-2012-22.html), but this one does not work either.
Jan, I don't have an account with Der Spiegel anymore. Can you find a working link on the website? |
06-03-2012, 04:20 AM | #14 |
Junior Member
Posts: 5
Karma: 10
Join Date: Oct 2011
Device: Kindle
|
SPIEGEL error message
Hi, it looks like their service is down. I was not able to open the older issues either. Should I write them an e-mail? Once they fix the service, I should be able to retrieve the magazine again.
|
06-04-2012, 11:49 AM | #15 |
Junior Member
Posts: 5
Karma: 10
Join Date: Oct 2011
Device: Kindle
|
SPIEGEL fixed it
Yes, it was SPIEGEL's fault. Now it is working again.
|
|