#1486 |
onlinenewsreader.net
Posts: 328
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
|
Mobipocket reader, also Kindle. I could try EPUB, but I doubt that's where the issue is. I ran a debug pipeline, and the emdash has already been replaced in the input directory, before any of the output-specific processing is performed. I think the substitution must be happening in BeautifulSoup.
|
#1487 |
creator of calibre
Posts: 45,410
Karma: 27757236
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
The news download system replaces entities with their UTF-8 equivalents. That's expected. Are you saying they're being saved as cp1252 in the input subdirectory?
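As a rough illustration of what "replacing entities with their UTF-8 equivalents" means, here is a minimal sketch using only the Python 3 standard library. It is not calibre's own code, and the sample string is invented. It also shows why the two encodings are easy to tell apart in a saved file: the em dash is three bytes in UTF-8 but a single byte in cp1252.

```python
import html

# Hypothetical snippet of article HTML using a named and a numeric entity.
source = 'goal &mdash; scored &#8212; again'

# Resolve entities to the Unicode character they stand for (U+2014).
text = html.unescape(source)

# The same character is three bytes in UTF-8 but one byte in cp1252.
utf8_bytes = text.encode('utf-8')
cp1252_bytes = text.encode('cp1252')

print(utf8_bytes)
print(cp1252_bytes)
```

Looking at the raw bytes of a file in the input subdirectory should therefore show which of the two encodings was actually written.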
|
#1488 | ||
onlinenewsreader.net
|
The original news source looks like
Quote:
Quote:
|
||
#1489 |
creator of calibre
|
Basically the UTF-8 byte sequence for an emdash is rendered as an emdash by viewers that understand UTF-8 and have the necessary fonts to render the character.
Does the resultant MOBI display correctly in the calibre viewer? |
#1490 | |
Guru
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
|
Quote:
|
|
#1491 | |
onlinenewsreader.net
|
Quote:
|
|
#1492 | |
onlinenewsreader.net
|
Quote:
http://www.theprovince.com/sports/20...576/story.html you'll see the problem: two emdashes in the body of the article.
Last edited by nickredding; 02-24-2010 at 10:07 AM. |
|
#1493 |
creator of calibre
|
That webpage incorrectly declares its encoding to be iso-8859-1, when it is actually utf-8. Set encoding='utf-8' in your recipe.
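To see why a wrong encoding declaration garbles the em dash, here is a small standard-library sketch. The sample sentence is made up and this is not how calibre itself decodes pages; it only shows the effect of trusting the wrong label on UTF-8 bytes.

```python
# UTF-8 bytes for a sentence containing an em dash (U+2014).
raw = 'win \u2014 finally'.encode('utf-8')

# What a reader gets when it trusts a wrong iso-8859-1 declaration:
# each of the three UTF-8 bytes becomes its own (wrong) character.
wrong = raw.decode('iso-8859-1')

# What it gets when the bytes are decoded as what they really are.
right = raw.decode('utf-8')

print(repr(wrong))
print(repr(right))
```

The misdecoded string is two characters longer than the correct one, because the single em dash has turned into three separate characters.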
|
#1494 | |
onlinenewsreader.net
|
Quote:
|
|
#1495 |
onlinenewsreader.net
|
How did you deduce the encoding is utf-8?
|
#1496 |
creator of calibre
|
Simple: I tried decoding it using utf-8, and the emdash was correctly decoded. I use a program called iconv to do this conveniently, but you can use calibre-debug as well.
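The same "try decoding it" check can be sketched in plain Python. The helper name `decodes_as` and the sample bytes are hypothetical, chosen only for illustration. A strict UTF-8 decode that succeeds on text containing non-ASCII bytes is good evidence that the page is UTF-8, whereas iso-8859-1 accepts any byte sequence, so a successful decode there tells you nothing.

```python
def decodes_as(raw, encoding):
    """Return True if the bytes decode cleanly with the given encoding."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False

# Hypothetical page fragments, invented for illustration.
utf8_page = b'two \xe2\x80\x94 dashes'   # em dash stored as UTF-8
latin1_page = b'caf\xe9'                 # e-acute stored as iso-8859-1

print(decodes_as(utf8_page, 'utf-8'))        # valid UTF-8
print(decodes_as(latin1_page, 'utf-8'))      # lone 0xe9 is not valid UTF-8
print(decodes_as(utf8_page, 'iso-8859-1'))   # always succeeds, so uninformative
```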
|
#1497 |
Junior Member
Posts: 3
Karma: 10
Join Date: Feb 2010
Device: Barnes & Noble Nook, Sony 505
|
Detroit News and Detroit Free Press
I created recipes for both the Detroit News and the Free Press, but I can't get them right! The biggest problem is that both have a background: the News one is light enough, but the Free Press one is really dark. Also, both have lots of junk after the article that I don't know how to get rid of.
Can anybody help? |
#1498 |
Junior Member
Posts: 1
Karma: 10
Join Date: Feb 2010
Device: Sony Reader Pocket Ed. (PRS-300)
|
I used the editor to make a quick and dirty recipe for Kukuburi.com.
I'm pretty happy with the result, but can't seem to export the recipe from Calibre. Would anyone like to clean it up and save it as a file? I didn't know how to trim the bottom buttons out of the feed.

    from calibre.web.feeds.news import BasicNewsRecipe

    class AdvancedUserRecipe1267141443(BasicNewsRecipe):
        title = u'Kukuburi'
        oldest_article = 30
        max_articles_per_feed = 100
        feeds = [(u'http://feeds.feedburner.com/kukuburi?format=xml', u'http://feeds.feedburner.com/kukuburi?format=xml')]
|
#1499 |
Junior Member
Posts: 2
Karma: 10
Join Date: Feb 2010
Device: Sony PRS 600
|
|
#1500 | |
Memento Mori
Posts: 36
Karma: 10
Join Date: Apr 2007
Device: eClicto, iPad WiFi, Kindle 3 WiFi
|
Quote:
Code:
    #!/usr/bin/env python

    __license__ = 'GPL v3'
    __author__ = 'Mori'
    __version__ = 'v. 0.1'
    '''
    Kukuburi.com
    '''

    from calibre.web.feeds.news import BasicNewsRecipe
    import re

    class KukuburiRecipe(BasicNewsRecipe):
        __author__ = 'Mori'
        language = 'en'

        title = u'Kukuburi'
        publisher = u'Ramón Pérez'
        description = u'KUKUBURI by Ramón Pérez'

        no_stylesheets = True
        remove_javascript = True

        oldest_article = 100
        max_articles_per_feed = 100

        feeds = [
            (u'Kukuburi', u'http://feeds2.feedburner.com/Kukuburi')
        ]

        preprocess_regexps = [
            (re.compile(i[0], re.IGNORECASE | re.DOTALL), i[1]) for i in [
                (r'<!--.*?-->', lambda match: ''),
                (r'<div class="feedflare".*?</div>', lambda match: '')
            ]
        ]
|
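To see what those two `preprocess_regexps` entries do to a page, here is a standalone sketch that applies the same patterns with plain `re`. It assumes each pair is applied in order as a pattern substitution; the helper `apply_regexps` and the sample HTML are hypothetical, invented for illustration, and are not part of calibre.

```python
import re

# Same (pattern, replacement) pairs as in the recipe.
preprocess_regexps = [
    (re.compile(pattern, re.IGNORECASE | re.DOTALL), repl)
    for pattern, repl in [
        (r'<!--.*?-->', lambda match: ''),
        (r'<div class="feedflare".*?</div>', lambda match: ''),
    ]
]

def apply_regexps(html, rules):
    """Apply each (compiled_pattern, replacement) pair in order."""
    for pattern, repl in rules:
        html = pattern.sub(repl, html)
    return html

# Hypothetical feed item, invented for illustration.
sample = (
    '<p>Page 42</p><!-- tracking -->\n'
    '<div class="feedflare">\n<a href="#">Share</a>\n</div>'
)

cleaned = apply_regexps(sample, preprocess_regexps)
print(repr(cleaned))
```

Because `.*?` is non-greedy, the second pattern stops at the first `</div>` it finds, so a feedflare block containing nested `<div>` elements would only be partly removed.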
|