#1 |
Junior Member
Posts: 8
Karma: 10
Join Date: Aug 2011
Device: Kobo Touch
|
Errors with diacritic characters
I'm trying to convert some PDF files that contain diacritic characters, but the resulting EPUB has many line breaks inserted at random, which break up the paragraphs.
Searching the documentation returns "NO RESULTS": http://manual.calibre-ebook.com/sear...s&area=default Is there any help or specification to consult before converting files with these characters?
Last edited by mosker; 08-21-2011 at 03:27 AM.
#2 | |
US Navy, Retired
Posts: 9,895
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Kindle PaperWhite SE 11th Gen
|
Quote:
|
|
#3 |
Junior Member
Posts: 8
Karma: 10
Join Date: Aug 2011
Device: Kobo Touch
|
Yes. That FAQ doesn't help, because it gives no information on diacritic characters or on converting with one's own fonts.
When I convert the PDF to EPUB, one or more line breaks appear at many of the diacritic characters. Changing the heuristic options has no effect. Is there any information in the Calibre documentation on diacritic characters?
#4 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
Your problem description isn't clear: is the diacritic character itself displayed correctly, with the paragraph breaking at the character, or is the character not rendered correctly?
PDFs define diacritics in a lot of ways. Calibre handles some of the common cases, but taking care of the more obscure ones can be difficult. Beyond that, support for diacritics will depend on your reading system: most reading systems don't have comprehensive fonts that cover all languages.
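To illustrate what "a lot of ways" can mean at the text level, here is a minimal Python sketch showing three different character sequences that can all be meant as "ā". It uses only the standard `unicodedata` module; the helper name `describe` and the sample strings are made up for this example, and it says nothing about how any particular PDF actually stores its text.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) pairs for every character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in text]

precomposed = "\u0101"   # one code point: a with macron
decomposed = "a\u0304"   # base letter followed by a combining macron
spacing = "\u00AFa"      # standalone (spacing) macron, then the base letter

# NFC normalization merges a base letter with a following combining mark ...
composed = unicodedata.normalize("NFC", decomposed)

# ... but leaves a standalone spacing accent alone, so that form needs
# explicit repair before it can become a single accented letter.
still_split = unicodedata.normalize("NFC", spacing)

for label, value in [("precomposed", precomposed),
                     ("decomposed", decomposed),
                     ("spacing", spacing)]:
    print(label, describe(value))
```

The first two forms are equivalent after normalization; the third is the kind of output that tends to need a converter-side fix, since the accent and the letter are two unrelated characters.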
#5 |
Junior Member
Posts: 8
Karma: 10
Join Date: Aug 2011
Device: Kobo Touch
|
No, the diacritic characters are not displayed correctly, and there are also paragraph breaks at those characters.
However, I have some EPUB files downloaded from the internet that use diacritic characters, and I know they were converted with Calibre. I'm using XP and the Calibre viewer to check the result. When I decompress these files, I see only UTF-8 encoding and the following CSS specification: font-family: "Times Ext Roman", "Indic Times", "Doulos SIL", Tahoma, "Arial Unicode MS", Gentium;
Then I tried:
1 - decompress the wrongly converted EPUB
2 - change its CSS to include the same font-family specification as those EPUB files
3 - rebuild the EPUB
but with no success. In the decompressed, wrongly converted EPUB, the paragraphs are already broken with <p></p> wherever there is a diacritic character, so rebuilding has no effect. I suppose these specifications should be included before converting the PDF. How can one include one's own font to display diacritics? I cannot find instructions on how to do it. Thx,
Last edited by mosker; 08-21-2011 at 06:38 PM.
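For anyone repeating the decompress / edit / rebuild steps described above, here is a minimal sketch using Python's standard `zipfile` module. The function names `unpack_epub` and `pack_epub` are hypothetical helpers, not part of Calibre. The one EPUB-specific detail is that the `mimetype` file has to be the first entry in the archive and stored without compression.

```python
import zipfile
from pathlib import Path

def unpack_epub(epub_path, out_dir):
    """Extract every file of the EPUB (a ZIP archive) into out_dir."""
    with zipfile.ZipFile(epub_path) as zf:
        zf.extractall(out_dir)

def pack_epub(src_dir, epub_path):
    """Rebuild an EPUB from src_dir.

    epub_path should live outside src_dir, otherwise the half-written
    archive could be picked up while walking the directory.
    """
    src = Path(src_dir)
    mimetype = src / "mimetype"
    with zipfile.ZipFile(epub_path, "w") as zf:
        # The mimetype entry goes first and is stored uncompressed.
        zf.write(mimetype, "mimetype", compress_type=zipfile.ZIP_STORED)
        for path in sorted(src.rglob("*")):
            if path.is_file() and path != mimetype:
                zf.write(path, path.relative_to(src).as_posix(),
                         compress_type=zipfile.ZIP_DEFLATED)
```

Typical use would be `unpack_epub("book.epub", "book_dir")`, editing the stylesheet inside `book_dir`, then `pack_epub("book_dir", "book_fixed.epub")`. As the post notes, this only changes styling: if the text itself was already split into empty paragraphs during conversion, repacking with a different font-family will not repair it.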
#6 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
If the character itself is not displayed correctly, then Calibre doesn't support the way the diacritics are defined. Messing with the fonts won't help.
The list of diacritic characters that Calibre supports is here: http://bazaar.launchpad.net/~kovid/c.../preprocess.py Search for "# Fix Accents" (without the quotes) to see the relevant part of the source. If the characters you're concerned about are already covered in that list, then there isn't anything to be done; you just have a set of junk PDFs. The only thing you could do is use the search and replace wizard to replace whatever garbage is being generated with the correct character, but whether that is practical depends on how many replacements you need to make.
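As a rough illustration of the search-and-replace idea, here is a minimal Python sketch that rejoins a standalone accent mark with the letter that follows it, skipping any whitespace, `<br>` tag, or paragraph boundary in between. This is not Calibre's code: the mapping `SPACING_TO_COMBINING`, the function `fix_accents`, and the assumption that the accent comes before its base letter are all made up for the example, and the garbage a given PDF produces may look quite different.

```python
import re
import unicodedata

# Assumed mapping from standalone (spacing) accents to combining marks.
SPACING_TO_COMBINING = {
    "\u00A8": "\u0308",  # diaeresis
    "\u00B4": "\u0301",  # acute accent
    "`": "\u0300",       # grave accent
    "\u00AF": "\u0304",  # macron
    "\u02DC": "\u0303",  # small tilde
}

_ACCENTS = "".join(re.escape(c) for c in SPACING_TO_COMBINING)

# An accent, then any run of whitespace, <br> tags, or paragraph
# boundaries, then one unaccented ASCII letter.
_PATTERN = re.compile(
    r"([" + _ACCENTS + r"])(?:\s|<br\s*/?>|</p>\s*<p[^>]*>)*([A-Za-z])"
)

def fix_accents(html):
    """Replace 'accent + junk + letter' with a single accented letter."""
    def repl(match):
        combining = SPACING_TO_COMBINING[match.group(1)]
        # NFC merges the pair into one code point when one exists.
        return unicodedata.normalize("NFC", match.group(2) + combining)
    return _PATTERN.sub(repl, html)
```

A pattern this loose will also rewrite a legitimate backtick or accent that happens to precede a letter, so it is best run on a copy and checked by eye, in line with the caveat above about how many replacements are needed.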
#7 |
Junior Member
Posts: 8
Karma: 10
Join Date: Aug 2011
Device: Kobo Touch
|
The part of the Python code you cited should cover all the characters in my texts.
What do you mean by "junk PDF"? I don't know about the internal characteristics of PDFs, but my files seem to be fine. (As an example, here is one of them: http://www.archive.org/details/Cetasikas ) I don't know whether the PDF needs some internal definition in place before it can be converted. Thanks for the help,