I am afraid I found some more problems. I don't really mind issues 2-4, but would like to solve them if it's easy. Issue 1, however, is more of a critical error.
Issue 1: Some articles show up with completely garbled text (see "gardbledText.jpg"), both in Calibre and on my PRS-300. Every time I download the news, different articles come out corrupted, so it's not an issue with a specific article. Could it be a problem with the server?
Issue 2: I had to delete the "Ecosfera" feed from the recipe, because it was making my PRS-300 freeze & reboot, although the articles from said feed displayed just fine in Calibre. As a result, some articles from the main feed (which conform to the "Ecosfera" structure) are showing up empty in the resulting ebook. This also happens with articles from other feeds, which are completely empty, such as
http://desporto.publico.pt/noticia.aspx?id=1442218
Is there an EASY way to say, "if you find an empty article, delete it from the book and from the TOC"?
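To make clearer what I mean by "empty", here is a rough sketch of the kind of check I have in mind. It is plain Python 3 and not tied to the recipe API; the helper names (visible_text, is_empty_article) and the 200-character threshold are my own assumptions, and I have not worked out where in the recipe such a check should be called.

```python
import re
from html import unescape


def visible_text(html):
    """Return the readable text of an HTML page, without scripts, styles or tags."""
    # Drop <script> and <style> blocks together with their contents.
    html = re.sub(r'(?is)<(script|style)\b.*?</\1\s*>', ' ', html)
    # Replace every remaining tag with a space.
    text = re.sub(r'(?s)<[^>]+>', ' ', html)
    # Decode entities such as &amp; and collapse runs of whitespace.
    return ' '.join(unescape(text).split())


def is_empty_article(html, min_chars=200):
    """Guess whether an article is empty: True if it has fewer than
    min_chars characters of visible text (the threshold is arbitrary)."""
    return len(visible_text(html)) < min_chars
```

If I read the documentation correctly, newer calibre versions offer an abort_article() method that can be called from the preprocess hooks to skip an article, so a check like this could perhaps be combined with it. I have not verified this against the version I am running.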
Issue 3: Sometimes the feed provides the same article twice. For instance, "Proposta de composição no exame do 9º ano provocou mais um corrupio nas escolas" under the "Educação" section appears twice, with the same URL, the same title and the same exact content.
Is there an EASY way to say, "if you find repeated articles, delete all of them except for the newest one"?
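In case it helps, this is roughly the logic I imagine, as a minimal sketch. It assumes that each feed object has an "articles" list and each article has a "url" attribute, which is how I understand the feed objects returned by the recipe's parse_feeds() method; the function name drop_duplicate_articles is made up, and the SimpleNamespace objects below are only stand-ins for a quick test outside calibre. Since the duplicates are identical, the sketch simply keeps the first copy it meets.

```python
from types import SimpleNamespace


def drop_duplicate_articles(feeds):
    """Keep only the first article seen for each URL, across all feeds."""
    seen_urls = set()
    for feed in feeds:
        unique = []
        for article in feed.articles:
            if article.url in seen_urls:
                continue  # already kept a copy of this article
            seen_urls.add(article.url)
            unique.append(article)
        # Replace the list contents in place so the feed object itself is reused.
        feed.articles[:] = unique
    return feeds


# Stand-in data for trying the function outside calibre (hypothetical URLs).
a1 = SimpleNamespace(title='Exame do 9º ano', url='http://example.invalid/1')
a2 = SimpleNamespace(title='Exame do 9º ano', url='http://example.invalid/1')
a3 = SimpleNamespace(title='Outro artigo', url='http://example.invalid/2')
example_feeds = [SimpleNamespace(title='Educação', articles=[a1, a2, a3])]
drop_duplicate_articles(example_feeds)
```

In a recipe, I suppose this would go inside an overridden parse_feeds() that first calls BasicNewsRecipe.parse_feeds(self) and then filters the result, but I have not tried it. Newer calibre releases also seem to document an ignore_duplicate_articles recipe attribute, which may make this unnecessary.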
Issue 4: Some articles have the "Next" link disabled. On the PRS-300, I cannot navigate to them. In Calibre, clicking on them makes no difference. This happens, for instance, with the 9th article in the "Desporto" section, "Australiano Tim Cahill suspenso por um jogo". Any EASY way to solve this?
I ran the recipe with the debugging parameters as follows:
ebook-convert publico_pt_test.recipe .epub -vv --debug-pipeline p --extract-to x
I ran the resulting ePUB through Adobe's Epubcheck (http://code.google.com/p/epubcheck/) and it returned hundreds of errors. Is this normal?
Attached:
1. parsing_debug.zip > Results of debugging with -vv
2. ebook-convert_log.txt > Terminal messages from debugging
3. epubcheck_log.txt > Results of epubcheck for compliance
4. gardbledText.jpg > Garbled text on my Reader
5. publico_pt_test.epub > ePUB with today's news
6. publico_pt_test.txt > Current recipe