11-26-2014, 02:34 PM | #1 | |
Member
Posts: 17
Karma: 10
Join Date: Nov 2014
Device: Kobo Mini
|
How to delete empty paragraphs?
Hi
I'm writing this recipe:
Code:
class AdvancedUserRecipe1416065639(BasicNewsRecipe):
    title = u'Ganemos Feminismos'
    oldest_article = 365
    max_articles_per_feed = 100
    auto_cleanup = True
    reverse_article_order = True
    remove_empty_feeds = True
    language = 'es_ES'
    publisher = 'Ganemos'
    publication_type = 'actas'
    feeds = [(u'Feminismos', u'http://ganemosmadrid.info/category/actas/actas_feminismos/feed/')]
    extra_css = '.calibre_navbar, *:empty {display:none;}'
    preprocess_regexps = [
        (re.compile(r' ',re.DOTALL|re.IGNORECASE), lambda match: ''),
        (re.compile(r'\s*<p[^>]*>\s*</p>\s*',re.DOTALL|re.IGNORECASE), lambda match: '')
    ]

    def get_cover_url(self):
        return 'http://ganemosmadrid.info/wp-content/uploads/2014/11/GM_ORG_SEPT.png'
Quote:
Code:
<p class="calibre8"> </p>
Thanks
Last edited by cyttorak; 11-26-2014 at 05:20 PM.
|
11-26-2014, 11:09 PM | #2 |
creator of calibre
Posts: 44,340
Karma: 23661992
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
You need to look at the actual source HTML of the articles in question, not the HTML in the final book. Visiting one of the articles in that feed, I see no <p> </p> in the article HTML. Something in that markup is getting mapped to empty paragraphs by auto_cleanup. You will have to figure out what that is. Or don't use auto_cleanup and instead use keep_only_tags/remove_tags.
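One way to act on the advice above is to test a candidate regex against markup copied from the article source before putting it into preprocess_regexps. The following is only a minimal sketch in plain Python: the SAMPLE string is hypothetical (it is not taken from the feed), and the names ORIGINAL_EMPTY_P, EMPTY_P and strip_empty_paragraphs are made up for illustration. It assumes the "empty" paragraphs might contain a non-breaking-space entity, which the second pattern in the recipe would not match; whether that is the actual cause for this feed is not known.

```python
import re

# Hypothetical sample markup for illustration; not taken from the real feed.
SAMPLE = (
    '<p>First paragraph.</p>\n'
    '<p class="calibre8">&nbsp;</p>\n'
    '<p> </p>\n'
    '<p>Second paragraph.</p>'
)

# The pattern from the recipe: only matches paragraphs holding plain whitespace.
ORIGINAL_EMPTY_P = re.compile(r'\s*<p[^>]*>\s*</p>\s*', re.DOTALL | re.IGNORECASE)

# Assumed variant: also treats non-breaking-space entities as "empty" content.
EMPTY_P = re.compile(r'\s*<p[^>]*>(?:\s|&nbsp;|&#160;)*</p>\s*', re.IGNORECASE)


def strip_empty_paragraphs(html, pattern=EMPTY_P):
    """Return html with every paragraph matched by pattern removed."""
    return pattern.sub('', html)


# The original pattern leaves the &nbsp; paragraph in place ...
with_original = strip_empty_paragraphs(SAMPLE, ORIGINAL_EMPTY_P)
# ... while the assumed variant removes both empty paragraphs.
cleaned = strip_empty_paragraphs(SAMPLE)

print(with_original)
print(cleaned)
```

If a pattern like this does match the source markup, it could be tried as an entry in preprocess_regexps. If the source contains no such paragraphs at all, as reported above, the empty paragraphs come from auto_cleanup and no regex on the source will remove them.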
|
11-27-2014, 02:41 AM | #3 | |
Member
Posts: 17
Karma: 10
Join Date: Nov 2014
Device: Kobo Mini
|
Thanks, kovidgoyal,
but the solution was this: Quote:
|