Old 04-22-2011, 04:46 PM   #1
adrian142
Member
Posts: 20
Karma: 10
Join Date: Aug 2010
Device: Kindle 3
Convert Cyrillic document

Hello,

I have some Cyrillic documents/books (RTF and PDF) that I would like to convert to EPUB using calibre.

At the moment I get errors.

Is the problem caused by the Cyrillic letters?

Best regards,
-Adrian
Old 04-22-2011, 05:50 PM   #2
user_none
Sigil & calibre developer
Posts: 2,487
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
What are the errors?
Old 04-23-2011, 01:35 AM   #3
adrian142
calibre, version 0.7.55
ERROR: Conversion error: <b>Failed</b>: Convert book 1 of 1 (basinskiy pavel lev tolstoy begstvo iz raia)

Convert book 1 of 1 (basinskiy pavel lev tolstoy begstvo iz raia)
Resolved conversion options
calibre version: 0.7.55
{'asciiize': False,
'author_sort': None,
'authors': None,
'base_font_size': 0.0,
'book_producer': None,
'change_justification': u'original',
'chapter': u"//*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part|prologue|epilogue\\s+', 'i')) or @class = 'chapter']",
'chapter_mark': u'pagebreak',
'comments': None,
'cover': None,
'debug_pipeline': None,
'dehyphenate': True,
'delete_blank_paragraphs': True,
'disable_font_rescaling': False,
'dont_split_on_page_breaks': False,
'enable_heuristics': False,
'epub_flatten': False,
'extra_css': None,
'extract_to': None,
'fix_indents': True,
'flow_size': 260,
'font_size_mapping': None,
'format_scene_breaks': True,
'html_unwrap_factor': 0.4,
'input_encoding': None,
'input_profile': <calibre.customize.profiles.InputProfile object at 0x05A2E7D0>,
'insert_blank_line': False,
'insert_metadata': False,
'isbn': None,
'italicize_common_cases': True,
'keep_ligatures': False,
'language': None,
'level1_toc': None,
'level2_toc': None,
'level3_toc': None,
'line_height': 0.0,
'linearize_tables': False,
'margin_bottom': 5.0,
'margin_left': 5.0,
'margin_right': 5.0,
'margin_top': 5.0,
'markup_chapter_headings': True,
'max_toc_links': 50,
'minimum_line_height': 120.0,
'no_chapters_in_toc': False,
'no_default_epub_cover': False,
'no_inline_navbars': False,
'no_svg_cover': False,
'output_profile': <calibre.customize.profiles.GenericEink object at 0x05A2E9D0>,
'page_breaks_before': u"//*[name()='h1' or name()='h2']",
'prefer_metadata_cover': False,
'preserve_cover_aspect_ratio': False,
'pretty_print': True,
'pubdate': None,
'publisher': None,
'rating': None,
'read_metadata_from_opf': 'c:\\users\\adrian\\appdata\\local\\temp\\calibre_0.7.55_tmp_hpkf_3\\calibre_0.7.55_ch0ntf.opf',
'remove_fake_margins': True,
'remove_first_image': False,
'remove_paragraph_spacing': False,
'remove_paragraph_spacing_indent_size': 1.5,
'renumber_headings': True,
'replace_scene_breaks': u'',
'series': None,
'series_index': None,
'smarten_punctuation': False,
'sr1_replace': None,
'sr1_search': None,
'sr2_replace': None,
'sr2_search': None,
'sr3_replace': None,
'sr3_search': None,
'tags': None,
'timestamp': None,
'title': None,
'title_sort': None,
'toc_filter': None,
'toc_threshold': 6,
'unwrap_lines': True,
'use_auto_toc': False,
'verbose': 2}
InputFormatPlugin: RTF Input running
on F:\eBooks\Unbekannt\basinskiy pavel lev tolstoy begstvo iz r (535)\basinskiy pavel lev tolstoy begstvo iz r - Unbekannt.rtf
Converting RTF to XML...
Python function terminated unexpectedly
invalid literal for int() with base 10: 'true' (Error Code: 1)
Traceback (most recent call last):
File "site.py", line 103, in main
File "site.py", line 85, in run_entry_point
File "site-packages\calibre\utils\ipc\worker.py", line 119, in main
File "site-packages\calibre\gui2\convert\gui_conversion.py", line 31, in gui_convert_override
File "site-packages\calibre\gui2\convert\gui_conversion.py", line 25, in gui_convert
File "site-packages\calibre\ebooks\conversion\plumber.py", line 915, in run
File "site-packages\calibre\customize\conversion.py", line 204, in __call__
File "site-packages\calibre\ebooks\rtf\input.py", line 270, in convert
File "site-packages\calibre\ebooks\rtf\input.py", line 132, in generate_xml
File "site-packages\calibre\ebooks\rtf2xml\ParseRtf.py", line 317, in parse_rtf
File "site-packages\calibre\ebooks\rtf2xml\default_encoding.py", line 75, in find_default_encoding
File "site-packages\calibre\ebooks\rtf2xml\default_encoding.py", line 111, in _encoding
ValueError: invalid literal for int() with base 10: 'true'
Old 04-23-2011, 04:37 AM   #4
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
The encoding detection function is failing; I'm not sure whether that's a bug or the result of bad data in your RTF file. Non-ASCII support in RTF files is relatively new. You could open a bug on the bug tracker and attach a test file that reproduces the problem:
https://bugs.launchpad.net/calibre
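
The traceback ends with int() being handed the string 'true', which suggests that a value expected to be a numeric code page was not numeric. The following is a minimal sketch of defensive header parsing under that assumption, not calibre's actual code: the helper name find_rtf_codepage and the cp1252 fallback are hypothetical, while \ansicpg is the standard RTF control word that declares the ANSI code page.

```python
import re

# \ansicpgN is the RTF control word that declares the document's ANSI code page.
ANSICPG = re.compile(rb'\\ansicpg(\d+)')

def find_rtf_codepage(header: bytes, default: str = 'cp1252') -> str:
    """Return a Python codec name for the code page declared in an RTF header.

    Hypothetical helper: it falls back to `default` instead of raising when
    the header carries no numeric \\ansicpg value.
    """
    match = ANSICPG.search(header)
    if match is None:
        return default
    return 'cp' + match.group(1).decode('ascii')

print(find_rtf_codepage(rb'{\rtf1\ansi\ansicpg1251\deff0'))  # Cyrillic code page declared
print(find_rtf_codepage(rb'{\rtf1\ansi\deff0'))              # nothing declared, uses the fallback
```

Matching only digits means a malformed value never reaches int() in the first place; whether a silent fallback is the right behaviour for a converter is a separate design question.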

For the RTF file there is a workable way to get a good conversion:
  1. Open the file in MS Word
  2. Save/Export the file as HTML (Filtered), make sure the encoding is UTF-8
  3. Import the HTML file to Calibre, edit metadata as desired, then convert

A PDF file wouldn't give you the same error, so if that conversion is also failing you should post its error separately.
Old 04-23-2011, 04:48 AM   #5
kiwidude
Calibre Plugins Developer
Posts: 4,729
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
Has the encoding detection been changed recently? I ask because yesterday I converted a TXT document, the encoding was not detected, and I lost my quotes. For the first time ever in using calibre I had to set the encoding manually to cp1252.
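
As an illustration of how quotes can get lost this way: in cp1252 the curly quotation marks are the single bytes 0x93 and 0x94, which latin-1 maps to invisible control characters instead. A small sketch with made-up sample bytes (this shows the general effect, not what calibre did with the file in question):

```python
# Bytes as a cp1252 text file would store the word Hello in curly quotes.
raw = b'\x93Hello\x94'

as_cp1252 = raw.decode('cp1252')   # correct guess
as_latin1 = raw.decode('latin-1')  # wrong guess, but no error is raised

print(as_cp1252)        # the curly quotes survive
print(repr(as_latin1))  # the quotes have become C1 control characters
```

Because the wrong decode succeeds silently, the only visible symptom is the missing punctuation, which is why setting the input encoding by hand fixes it.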

Note that I hadn't converted a TXT file for weeks before this. Unfortunately I no longer have the file to attach to a bug report.

Perhaps this is just expected behaviour, but I had never seen it before and was surprised, and I am sure I have converted documents with angled quotes before.

Edit: sorry, maybe I should have given this post its own thread, but when you mentioned some additions to encoding detection I thought it might be vaguely relevant.

Last edited by kiwidude; 04-23-2011 at 04:57 AM.
Old 04-23-2011, 06:40 AM   #6
ldolse
Two different things have happened recently. user_none added auto-detection of text file encoding; before that you always had to specify a non-ASCII encoding manually. The detection has some limitations, though: it's not expected to catch all cases, so you may have hit an edge case.

RTF is a slightly different story: historically, RTF input didn't support any non-ASCII text. However, a lot of work was done on RTF in the last few months (I believe by Sengian), and I believe non-ASCII is now generally supported, though I'm not sure all the wrinkles have been ironed out.
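
For context on what non-ASCII support in RTF involves: RTF commonly stores such characters as \'hh hex escapes whose meaning depends on the document's code page. A rough sketch of decoding them, assuming cp1251 for Cyrillic; decode_hex_escapes is an illustrative name rather than a calibre function, and real RTF also has \u escapes and per-font charsets that this ignores.

```python
import re

# Matches a backslash, an apostrophe, and two hex digits, e.g. \'cf
HEX_ESCAPE = re.compile(r"\\'([0-9a-fA-F]{2})")

def decode_hex_escapes(text: str, codepage: str = 'cp1251') -> str:
    """Replace each \\'hh escape with the character it denotes in `codepage`."""
    def repl(match):
        # Turn the two hex digits into one byte, then decode that byte.
        return bytes([int(match.group(1), 16)]).decode(codepage)
    return HEX_ESCAPE.sub(repl, text)

sample = r"\'cf\'f0\'e8\'e2\'e5\'f2"
print(decode_hex_escapes(sample))  # a Russian word when read as cp1251
```

The same escapes decoded with a different code page give different letters, which is why the converter has to determine the code page before it can do anything else.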
Old 04-23-2011, 01:24 PM   #7
user_none
TXT and RTF detect the encoding in two completely different ways.

Detecting cp1252 in a TXT file is very hit or miss. See http://chardet.feedparser.org/docs/h...ow.windows1252 for more info about this limitation.
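
One way to see why single-byte encodings such as cp1252 are hard to detect: the same bytes decode without error under several of them, so a detector cannot rule candidates out by validity and has to fall back on guesswork. A small illustration using only Python's standard codecs (the sample bytes are made up, and this does not reproduce chardet's algorithm):

```python
raw = b'caf\xe9'
candidates = ('cp1252', 'latin-1', 'cp1251')

# Every candidate decodes successfully; validity alone cannot pick one.
decodings = {codec: raw.decode(codec) for codec in candidates}

for codec, text in decodings.items():
    print(codec, '->', text)
```

Two of the three results are identical and the third is a different but equally valid string, so nothing in the bytes themselves says which reading was intended.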