Old 11-26-2014, 02:34 PM   #1
cyttorak
Member
Posts: 17
Karma: 10
Join Date: Nov 2014
Device: Kobo Mini
Question: How to delete empty paragraphs?

Hi

I'm writing this recipe:

Code:
class AdvancedUserRecipe1416065639(BasicNewsRecipe):
	title	= u'Ganemos Feminismos'
	oldest_article = 365
	max_articles_per_feed = 100
	auto_cleanup = True
	reverse_article_order = True
	remove_empty_feeds = True
	language = 'es_ES'
	publisher = 'Ganemos'
	publication_type = 'actas'
	feeds	= [(u'Feminismos', u'http://ganemosmadrid.info/category/actas/actas_feminismos/feed/')]
	extra_css = '.calibre_navbar, *:empty {display:none;}'
	preprocess_regexps = [
		(re.compile(r' ',re.DOTALL|re.IGNORECASE), lambda match: ''),
		(re.compile(r'\s*<p[^>]*>\s*</p>\s*',re.DOTALL|re.IGNORECASE), lambda match: '')
	]

	def get_cover_url(self):
		return 'http://ganemosmadrid.info/wp-content/uploads/2014/11/GM_ORG_SEPT.png'
but I still see empty paragraphs in my .epub. There is a blank line for each:
Quote:
<p>&nbsp;</p>
I get:

Code:
<p class="calibre8"> </p>
How can I delete this kind of empty tag?

Thanks

Last edited by cyttorak; 11-26-2014 at 05:20 PM.
Old 11-26-2014, 11:09 PM   #2
kovidgoyal
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
You need to look at the actual source HTML of the articles in question, not the HTML in the final book. Visiting one of the articles in that feed, I see no <p>&nbsp;</p> in the article HTML. Something in that markup is being mapped to empty paragraphs by auto_cleanup, and you will have to figure out what it is. Alternatively, don't use auto_cleanup and use keep_only_tags/remove_tags instead.
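
One way to look for candidates in the source markup is sketched below. It is only an illustration: the RAW string is a made-up stand-in for a downloaded article, and the helper name blank_paragraphs is hypothetical, not part of calibre. It lists paragraph bodies that consist only of whitespace, U+00A0, non-breaking-space entities, or <br> tags.

```python
import re

# Hypothetical snippet standing in for the raw article HTML (not real data).
RAW = u'<div><p>Texto</p><p>&nbsp;</p><p>\xa0</p><p><br/></p><p>&#160;</p></div>'

# A paragraph body counts as blank if it holds nothing but whitespace,
# a literal U+00A0, a non-breaking-space entity, or <br> tags.
BLANK = re.compile(u'^(?:\\s|\xa0|&nbsp;|&#160;|&#xa0;|<br\\s*/?>)*$', re.IGNORECASE)
PARA = re.compile(r'<p[^>]*>(.*?)</p>', re.DOTALL | re.IGNORECASE)


def blank_paragraphs(html):
    """Return the repr of every paragraph body that would render as blank."""
    return [repr(m.group(1)) for m in PARA.finditer(html) if BLANK.match(m.group(1))]


found = blank_paragraphs(RAW)
for body in found:
    # repr() makes invisible characters such as U+00A0 show up as escapes.
    print(body)
```

Running something like this against the saved article HTML (rather than the sample string) should show which form of "empty" the site actually emits, and therefore which pattern a cleanup rule has to target.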
Old 11-27-2014, 02:41 AM   #3
cyttorak

Thanks, kovidgoyal,

but the solution was this:

Quote:
preprocess_regexps = [
	(re.compile(u'\xa0'), lambda match: ' '),
	(re.compile(r'&nbsp;',re.DOTALL|re.IGNORECASE), lambda match: ' '),
	(re.compile(r'\s*<p[^>]*>\s*</p>\s*',re.DOTALL|re.IGNORECASE), lambda match: '')
]
I saw it here http://stackoverflow.com/questions/1...a0-from-string
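
A likely reason this works, sketched outside calibre: the empty-paragraph pattern relies on \s, and under Python 2 (which calibre used at the time) \s did not match U+00A0 unless re.UNICODE was set, so the paragraph had to be normalized to an ordinary space first. The example below runs on Python 3 and uses re.ASCII to approximate that behaviour. The SAMPLE string and the apply_rules helper are assumptions for illustration; apply_rules only imitates applying (pattern, callable) pairs in order and is not calibre's own code.

```python
import re

# Made-up markup: one paragraph holds a literal U+00A0, one holds the
# &nbsp; entity, and the last one has real text.
SAMPLE = u'<p>\xa0</p><p class="x">&nbsp;</p><p>Acta</p>'

# re.ASCII restricts \s to ASCII whitespace, approximating Python 2's
# default treatment of \s (an assumption made for this demonstration).
EMPTY_P = re.compile(r'\s*<p[^>]*>\s*</p>\s*',
                     re.DOTALL | re.IGNORECASE | re.ASCII)

RULES = [
    (re.compile(u'\xa0'), lambda match: ' '),                   # literal NBSP
    (re.compile(r'&nbsp;', re.IGNORECASE), lambda match: ' '),  # NBSP entity
    (EMPTY_P, lambda match: ''),                                # empty <p>
]


def apply_rules(html, rules):
    """Apply (compiled pattern, callable) pairs in list order."""
    for pattern, func in rules:
        html = pattern.sub(func, html)
    return html


# Only the empty-paragraph rule: nothing is removed, the NBSPs block \s*.
without_nbsp_rules = apply_rules(SAMPLE, RULES[2:])
# All three rules: NBSPs become spaces, then the empty paragraphs match.
with_nbsp_rules = apply_rules(SAMPLE, RULES)

print(repr(without_nbsp_rules))
print(repr(with_nbsp_rules))
```

The order of the rules matters here: the two non-breaking-space substitutions have to run before the empty-paragraph rule, otherwise that rule sees nothing it can match.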