02-11-2013, 11:58 AM | #1 |
Junior Member
Posts: 3
Karma: 10
Join Date: Feb 2013
Device: Kindle Keyboard
|
Include Author and Publication Date from feed
I have finally decided to start to improve on the simple, fully automatic recipes that Calibre generates. The first thing I would like to do is have the article include the name of the author of the article and the publication date after the title (or in some other logical location, for reference).
I believe this information is included in the feed data and in the page of the article itself. The feed in question is: http://politikon.es/feed/ Many thanks for your help. Perhaps seeing how this is done will help me make heads or tails of the rest. |
02-12-2013, 06:37 AM | #2 |
Junior Member
Posts: 3
Karma: 10
Join Date: Feb 2013
Device: Kindle Keyboard
|
First efforts
Here is my first effort. The articles are pretty clean and auto cleanup works fine. I just wanted to include a line under the title with the date and author. That information appears to be in a div with class "post-meta", but auto_cleanup_keep alone does not seem to be enough to pull it in.
Spoiler:
Thanks for any help |
02-12-2013, 06:49 AM | #3 |
creator of calibre
Posts: 43,850
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
If you can't get it to work with auto_cleanup_keep, you will have to clean up manually using the remove_tags and keep_only_tags directives instead.
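For readers unfamiliar with those directives, here is a minimal sketch of what manual cleanup might look like in a recipe. Only the "post-meta" class and the feed URL come from this thread; the class names 'entry-title', 'entry-content' and 'sharedaddy', and the recipe name itself, are hypothetical placeholders that would need to be checked against the real page source. The try/except fallback is only there so the snippet can be loaded outside calibre.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in so the sketch can be loaded outside calibre.
    # Inside calibre the real base class is used.
    class BasicNewsRecipe(object):
        pass


class PolitikonManual(BasicNewsRecipe):
    title = 'Politikon (manual cleanup sketch)'
    feeds = [('Politikon', 'http://politikon.es/feed/')]

    # Manual cleanup replaces the automatic heuristics.
    auto_cleanup = False
    # Download the full article page instead of the feed's embedded content.
    use_embedded_content = False

    # Keep only these elements. 'post-meta' is from the thread;
    # the other two class names are guesses (assumptions).
    keep_only_tags = [
        dict(name='h1', attrs={'class': 'entry-title'}),
        dict(name='div', attrs={'class': 'post-meta'}),
        dict(name='div', attrs={'class': 'entry-content'}),
    ]

    # Drop unwanted elements nested inside the kept ones
    # (hypothetical class name).
    remove_tags = [
        dict(name='div', attrs={'class': 'sharedaddy'}),
    ]
```

With this approach nothing is kept unless it is listed, so the title and body containers have to be named explicitly alongside the metadata block.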
|
02-13-2013, 05:40 PM | #4 |
Junior Member
Posts: 3
Karma: 10
Join Date: Feb 2013
Device: Kindle Keyboard
|
A little victory
Revisiting this again, I think I have achieved what I wanted. But I have some questions and would appreciate anyone shining some light on the subject.
The solution
All the information I wanted was already in the HTML of the article page. I kept auto_cleanup, since it had been working fine, and used auto_cleanup_keep to include every tag wrapping that information, which was three levels deep in some cases. One of the tags I wanted was <abbr>; since it is an unusual tag, I replaced it with a wildcard (*), as I suspect it was what caused my earlier attempt to fail. I also had to match one <span> on an unusual attribute (rel), since it had no 'id' or 'class' and its title was too specific. For all of this to work I had to set use_embedded_content=False. Here's how it came out:
Spoiler:
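The recipe from the spoiler is not preserved in the text here. As a rough illustration of the approach described (not the poster's actual recipe), auto_cleanup_keep takes a single XPath expression, and several selectors can be combined with "|". In the sketch below, only the "post-meta" class and the use of a rel attribute and a wildcard come from the post; the exact selectors and the recipe name are assumptions.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in so the sketch can be loaded outside calibre.
    class BasicNewsRecipe(object):
        pass


# Each entry is one XPath selector; all of these are assumptions
# about the page structure, apart from the 'post-meta' class.
KEEP_SELECTORS = [
    # The container holding author and date.
    '//div[@class="post-meta"]',
    # Every descendant of that container; the wildcard (*) stands in
    # for tags such as <abbr> instead of naming them.
    '//div[@class="post-meta"]//*',
    # Any <span> that carries a rel attribute, whatever its value.
    '//span[@rel]',
]


class PolitikonAutoKeep(BasicNewsRecipe):
    title = 'Politikon (auto_cleanup_keep sketch)'
    feeds = [('Politikon', 'http://politikon.es/feed/')]

    auto_cleanup = True
    # Work from the downloaded article page, not the feed content.
    use_embedded_content = False
    # Elements matching this XPath should survive the automatic cleanup.
    auto_cleanup_keep = '|'.join(KEEP_SELECTORS)
```

Note that @class="post-meta" only matches when the attribute is exactly that string; if the element carries several classes, contains(@class, "post-meta") is the usual XPath workaround.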
The problems
As I said, I had to force Calibre not to use the embedded content, even though everything is there and I can easily identify the bits of information I want in the source of the RSS feed. Applying the same technique to the embedded content, however, does not give the results I want. I don't understand how Calibre picks up and uses the tags from the RSS source. I am not a programmer, and from what I have read I cannot work out what is going on behind the scenes. Enabling a few HTML tags I understand, but the RSS content surely requires more processing.
Cheers for any advice or pointers regarding the RSS issue. Since the data is in the feed, it seems preferable to use it from there. |
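One possible explanation, offered as an assumption rather than something confirmed in this thread: in an RSS feed the author and date are usually separate XML elements next to the article body, not part of the body HTML, so an XPath meant for HTML cleanup would have nothing to match in the embedded content. The standalone sketch below uses only the Python standard library, not calibre, to show where those fields typically sit. The sample item is made up and merely shaped like a common WordPress-style feed; the real politikon.es feed may differ.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Made-up sample shaped like a typical RSS 2.0 item (an assumption).
SAMPLE_RSS = """<rss version="2.0"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Example article</title>
      <link>http://example.com/article</link>
      <pubDate>Mon, 11 Feb 2013 10:00:00 +0000</pubDate>
      <dc:creator>Example Author</dc:creator>
      <content:encoded><![CDATA[<p>Body text.</p>]]></content:encoded>
    </item>
  </channel>
</rss>"""

# Namespace prefixes used when looking up namespaced elements.
NS = {
    'dc': 'http://purl.org/dc/elements/1.1/',
    'content': 'http://purl.org/rss/1.0/modules/content/',
}


def bylines(rss_text):
    """Return (title, 'author, YYYY-MM-DD') for each item in the feed."""
    root = ET.fromstring(rss_text)
    result = []
    for item in root.iter('item'):
        title = item.findtext('title', default='')
        # Author and date are siblings of the body, not inside its HTML.
        author = item.findtext('dc:creator', default='', namespaces=NS)
        published = parsedate_to_datetime(item.findtext('pubDate'))
        byline = '%s, %s' % (author, published.strftime('%Y-%m-%d'))
        result.append((title, byline))
    return result


if __name__ == '__main__':
    for title, byline in bylines(SAMPLE_RSS):
        print(title, '|', byline)
```

If that is what is happening here, the body in content:encoded never contains the byline, which would explain why keeping extra HTML tags has no effect when the embedded content is used.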
Tags |
author, publication date |