#1 |
Member
Posts: 20
Karma: 10
Join Date: Aug 2010
Device: Kindle 3
|
Convert Cyrillic document
Hello,
I have some Cyrillic documents/books (RTF and PDF) that I would like to convert to EPUB using calibre. At the moment I get errors. Could the Cyrillic letters be causing the problem? Best regards, -Adrian |
#2 |
Sigil & calibre developer
Posts: 2,487
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
What are the errors?
|
#3 |
Member
Posts: 20
Karma: 10
Join Date: Aug 2010
Device: Kindle 3
|
calibre, version 0.7.55
ERROR: Conversion error: <b>Failed</b>: Convert book 1 of 1 (basinskiy pavel lev tolstoy begstvo iz raia)
Convert book 1 of 1 (basinskiy pavel lev tolstoy begstvo iz raia)
Resolved conversion options
calibre version: 0.7.55
{'asciiize': False, 'author_sort': None, 'authors': None, 'base_font_size': 0.0, 'book_producer': None, 'change_justification': u'original', 'chapter': u"//*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part|prologue|epilogue\\s+', 'i')) or @class = 'chapter']", 'chapter_mark': u'pagebreak', 'comments': None, 'cover': None, 'debug_pipeline': None, 'dehyphenate': True, 'delete_blank_paragraphs': True, 'disable_font_rescaling': False, 'dont_split_on_page_breaks': False, 'enable_heuristics': False, 'epub_flatten': False, 'extra_css': None, 'extract_to': None, 'fix_indents': True, 'flow_size': 260, 'font_size_mapping': None, 'format_scene_breaks': True, 'html_unwrap_factor': 0.4, 'input_encoding': None, 'input_profile': <calibre.customize.profiles.InputProfile object at 0x05A2E7D0>, 'insert_blank_line': False, 'insert_metadata': False, 'isbn': None, 'italicize_common_cases': True, 'keep_ligatures': False, 'language': None, 'level1_toc': None, 'level2_toc': None, 'level3_toc': None, 'line_height': 0.0, 'linearize_tables': False, 'margin_bottom': 5.0, 'margin_left': 5.0, 'margin_right': 5.0, 'margin_top': 5.0, 'markup_chapter_headings': True, 'max_toc_links': 50, 'minimum_line_height': 120.0, 'no_chapters_in_toc': False, 'no_default_epub_cover': False, 'no_inline_navbars': False, 'no_svg_cover': False, 'output_profile': <calibre.customize.profiles.GenericEink object at 0x05A2E9D0>, 'page_breaks_before': u"//*[name()='h1' or name()='h2']", 'prefer_metadata_cover': False, 'preserve_cover_aspect_ratio': False, 'pretty_print': True, 'pubdate': None, 'publisher': None, 'rating': None, 'read_metadata_from_opf': 'c:\\users\\adrian\\appdata\\local\\temp\\calibre_0.7.55_tmp_hpkf_3\\calibre_0.7.55_ch0ntf.opf', 'remove_fake_margins': True, 'remove_first_image': False, 'remove_paragraph_spacing': False, 'remove_paragraph_spacing_indent_size': 1.5, 'renumber_headings': True, 'replace_scene_breaks': u'', 'series': None, 'series_index': None, 'smarten_punctuation': False, 'sr1_replace': None, 'sr1_search': None, 'sr2_replace': None, 'sr2_search': None, 'sr3_replace': None, 'sr3_search': None, 'tags': None, 'timestamp': None, 'title': None, 'title_sort': None, 'toc_filter': None, 'toc_threshold': 6, 'unwrap_lines': True, 'use_auto_toc': False, 'verbose': 2}
InputFormatPlugin: RTF Input running on F:\eBooks\Unbekannt\basinskiy pavel lev tolstoy begstvo iz r (535)\basinskiy pavel lev tolstoy begstvo iz r - Unbekannt.rtf
Converting RTF to XML...
Python function terminated unexpectedly
invalid literal for int() with base 10: 'true' (Error Code: 1)
Traceback (most recent call last):
  File "site.py", line 103, in main
  File "site.py", line 85, in run_entry_point
  File "site-packages\calibre\utils\ipc\worker.py", line 119, in main
  File "site-packages\calibre\gui2\convert\gui_conversion.py", line 31, in gui_convert_override
  File "site-packages\calibre\gui2\convert\gui_conversion.py", line 25, in gui_convert
  File "site-packages\calibre\ebooks\conversion\plumber.py", line 915, in run
  File "site-packages\calibre\customize\conversion.py", line 204, in __call__
  File "site-packages\calibre\ebooks\rtf\input.py", line 270, in convert
  File "site-packages\calibre\ebooks\rtf\input.py", line 132, in generate_xml
  File "site-packages\calibre\ebooks\rtf2xml\ParseRtf.py", line 317, in parse_rtf
  File "site-packages\calibre\ebooks\rtf2xml\default_encoding.py", line 75, in find_default_encoding
  File "site-packages\calibre\ebooks\rtf2xml\default_encoding.py", line 111, in _encoding
ValueError: invalid literal for int() with base 10: 'true' |
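The last frame of the traceback shows a non-numeric string reaching int() where a number was expected. A minimal sketch of that failure mode, and of a more defensive way to read the code page an RTF header declares, might look like the following. This is not calibre's actual code: the helper name `declared_codepage`, the regex, and the fallback to 1252 are assumptions made for illustration only.

```python
import re

# Hypothetical helper, not part of calibre: read the ANSI code page that an
# RTF header declares via the \ansicpgN control word.
ANSICPG = re.compile(rb"\\ansicpg(\d+)")

def declared_codepage(rtf_bytes, default=1252):
    """Return the code page number from \\ansicpgN, or `default` if absent."""
    match = ANSICPG.search(rtf_bytes[:4096])  # only look near the file start
    if match is None:
        return default  # assumed fallback, chosen for this sketch
    return int(match.group(1))

# The error from the log: int() cannot parse a non-numeric string.
try:
    int("true")
except ValueError as exc:
    message = str(exc)

header = rb"{\rtf1\ansi\ansicpg1251\deff0"
codepage = declared_codepage(header)          # 1251 (Windows Cyrillic)
missing = declared_codepage(rb"{\rtf1\ansi")  # no \ansicpg, so the default
```

Running something like this over the first bytes of the failing file would show whether it declares a code page at all, which could be useful detail for a bug report.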
#4 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
The encoding detection function is failing. I'm not sure whether that's a bug or the result of bad data in your RTF file. Non-ASCII support in RTF files is relatively new. You could open a bug on the bug tracker and attach a test file that reproduces the problem:
https://bugs.launchpad.net/calibre
For the RTF file there is a workable way to get a good conversion:
A PDF file wouldn't give you the same problem, so if that's failing too you should post that error separately. |
#5 |
Calibre Plugins Developer
Posts: 4,729
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Has the encoding detection been changed recently? Yesterday I converted a TXT document and the detection failed, so I lost my quotes. For the first time ever in using calibre I had to set the encoding manually to cp1252.
Note that I hadn't converted a TXT file for weeks before this. Unfortunately I no longer have the file to attach to a bug report. Perhaps this is just normal, expected behaviour, but having never seen it before I was surprised, and I am sure I have converted documents with angled quotes before.
Edit: sorry, I maybe should have given this post its own thread, but when you mentioned some additions to encoding detection I thought it might be vaguely relevant.
Last edited by kiwidude; 04-23-2011 at 04:57 AM. |
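To illustrate the lost quotes: in cp1252 the curly double quotation marks are the single bytes 0x93 and 0x94, and other single-byte encodings map those bytes to different characters. A small sketch of what choosing the encoding by hand changes, using invented sample bytes rather than the original file:

```python
# Sample bytes are invented for illustration: "Hello" wrapped in cp1252
# curly double quotes (0x93 opens, 0x94 closes).
raw = b"\x93Hello\x94"

as_cp1252 = raw.decode("cp1252")   # U+201C and U+201D: the quotes survive
as_latin1 = raw.decode("latin-1")  # U+0093 and U+0094: invisible C1 controls

quotes_kept = as_cp1252[0] == "\u201c" and as_cp1252[-1] == "\u201d"
quotes_lost = as_latin1[0] == "\x93" and as_latin1[-1] == "\x94"
```

The setting in question appears as 'input_encoding': None in the options dump in post #3; None there presumably means the encoding was left to auto-detection.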
#6 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
Two different things have happened recently. user_none added auto-detection of text file encoding; prior to that you always had to specify a non-ASCII encoding manually. I think the detection has some limitations, though. It's not expected to catch all cases, so you may be in the margins.
RTF is a slightly different story: historically it hasn't supported any non-ASCII text. However, a lot of work was done on RTF in the last few months (by Sengian, I believe), and I believe non-ASCII is now generally supported, but I'm not sure all the wrinkles have been ironed out. |
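For context on why non-ASCII is harder in RTF than in plain text: an RTF file is itself 7-bit, and non-ASCII characters are commonly written as \'hh hex escapes that have to be interpreted in the document's code page. A rough sketch of decoding that form, assuming a cp1251 (Cyrillic) document. The helper name is hypothetical, the sample string is invented, and the sketch ignores \uN escapes and every other RTF control word:

```python
import re

# Matches an RTF hex escape such as \'d2 and captures the two hex digits.
HEX_ESCAPE = re.compile(r"\\'([0-9a-fA-F]{2})")

def decode_hex_escapes(rtf_text, encoding="cp1251"):
    """Replace each \\'hh escape with the character it encodes."""
    def repl(match):
        # One escape is one byte; decode it in the assumed code page.
        return bytes([int(match.group(1), 16)]).decode(encoding)
    return HEX_ESCAPE.sub(repl, rtf_text)

sample = r"\'d2\'ee\'eb\'f1\'f2\'ee\'e9"
decoded = decode_hex_escapes(sample)  # the surname "Tolstoy" in Cyrillic
```

If the converter picks the wrong code page, or fails to read one at all, every one of these escapes decodes to the wrong letter, which is why the encoding step runs before anything else.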
#7 |
Sigil & calibre developer
Posts: 2,487
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
TXT and RTF detect the encoding in two completely different ways.
Detecting cp1252 in a TXT file is very hit or miss. See http://chardet.feedparser.org/docs/h...ow.windows1252 for more info about this limitation. |
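One reason detection of single-byte encodings is hit or miss: almost any byte sequence is valid in several of them, so a detector can only guess from character statistics. A small illustration with invented sample bytes, which does not use or reproduce chardet's algorithm:

```python
# Invented sample: the same six bytes are valid text in two code pages.
raw = bytes([0xCF, 0xF0, 0xE8, 0xE2, 0xE5, 0xF2])

as_cyrillic = raw.decode("cp1251")  # a Russian word (Cyrillic letters)
as_western = raw.decode("cp1252")   # accented Latin letters, equally "valid"

# Neither decode raises an error, so validity alone cannot pick the
# encoding; both results even have the same length.
both_valid = len(as_cyrillic) == len(as_western) == len(raw)
```

With short inputs, or text that is mostly ASCII with only a few high bytes such as curly quotes, there is very little statistical signal to work with, so setting the encoding by hand remains the reliable option.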