Could not find reasonable point at which to split

Manhattan · 07-24-2021, 09:37 AM

I'm trying to convert a book from Amazon. I'm not sure what file format my Kindle 1.117 app downloads.
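One way to narrow down the file format is to look at the first bytes of the downloaded file. The sketch below is only a rough check, not how calibre detects formats: the function names are made up for illustration, and it only knows the PalmDB identifiers `BOOKMOBI` and `TEXtREAd` (stored at byte offsets 60 to 67) and the ZIP signature. Anything else is reported as unknown.

```python
def sniff_ebook_container(header: bytes) -> str:
    """Guess the container type from the first 68 bytes of a file.

    Hypothetical helper for illustration; it covers only a few signatures.
    """
    # PalmDB files store a 4-byte type and a 4-byte creator at offsets 60-67.
    palm_id = header[60:68]
    if palm_id == b"BOOKMOBI":
        return "Mobipocket/AZW (PalmDB BOOKMOBI)"
    if palm_id == b"TEXtREAd":
        return "PalmDoc"
    # ZIP archives (EPUB is a ZIP container) start with the local file header magic.
    if header[:4] == b"PK\x03\x04":
        return "ZIP-based (for example EPUB)"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read just enough of the file at `path` to run the check above."""
    with open(path, "rb") as f:
        return sniff_ebook_container(f.read(68))
```

Calling `sniff_file` on the downloaded book would show whether it is at least a Mobipocket-style container, which matches what the "Decrypting Mobipocket 4 ebook" line in the log suggests.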

At first, when I converted the file to EPUB, Calibre gave an error message that reads: The splits were too big for some EPUB readers. So I tried other output formats, but the only thing that comes out right is the cover page. Instead of the text of the book, the rest is just pages upon pages of random characters, like when you open an image file in Notepad.

Then I tried enabling heuristic processing, but that didn't work. Next I set the split size to the size of the whole file. That got rid of the error message, but the output was still random characters.
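For anyone reproducing these attempts outside the GUI, the same two settings exist as options on calibre's `ebook-convert` command-line tool: `--enable-heuristics`, and `--flow-size` (in KB, where 0 disables size-based splitting). Below is a minimal sketch that only builds the command line. The helper name and the file names are placeholders, and it assumes `ebook-convert` is on the PATH.

```python
def build_convert_command(src, dst, flow_size_kb=None, heuristics=False):
    """Build an argument list for calibre's ebook-convert tool.

    Hypothetical helper: it assembles the command but does not run it.
    """
    cmd = ["ebook-convert", src, dst]
    if flow_size_kb is not None:
        # EPUB output option: split HTML files larger than this many KB.
        # A value of 0 turns size-based splitting off.
        cmd += ["--flow-size", str(flow_size_kb)]
    if heuristics:
        cmd.append("--enable-heuristics")
    return cmd


# Example with placeholder file names; pass the list to subprocess.run to execute it.
command = build_convert_command("book.mobi", "book.epub", flow_size_kb=0)
print(" ".join(command))
```

Turning off splitting only avoids the SplitError. It would not be expected to fix the garbled text, since the parser warnings appear in the log before the splitting step.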

Code:

Convert book 1 of 1 (Volume 1)
DeDRM v7.2.1: Trying to decrypt i9t3knr0.mobi
Using Library AlfCrypto DLL/DYLIB/SO
Using Library AlfCrypto DLL/DYLIB/SO
MobiDeDrm v1.0.
Copyright © 2008-2020 The Dark Reverser, Apprentice Harper et al.
Decrypting Mobipocket 4 ebook: Volume 1
Got DSN key from database default_key
Found 4 keys to try after 0.3 seconds
Crypto Type is: 0
This book is not encrypted.
Decryption succeeded after 0.3 seconds
DeDRM v7.2.1: Finished after 0.5 seconds
Conversion options changed from defaults:
  verbose: 2
  read_metadata_from_opf: 'C:\\...\\calibre_vycg3n56\\m738vugq.opf'
  output_profile: 'generic_eink'
  cover: 'C:\\...\\calibre_vycg3n56\\wvk9dzg4.jpeg'
Resolved conversion options
calibre version: 5.23.0
{'asciiize': False,
 'author_sort': None,
 'authors': None,
 'base_font_size': 0.0,
 'book_producer': None,
 'change_justification': 'original',
 'chapter': "//*[((name()='h1' or name()='h2') and re:test(., "
            "'\\s*((chapter|book|section|part)\\s+)|((prolog|prologue|epilogue)(\\s+|$))', "
            "'i')) or @class = 'chapter']",
 'chapter_mark': 'pagebreak',
 'comments': None,
 'cover': 'C:\\...\\calibre_vycg3n56\\wvk9dzg4.jpeg',
 'debug_pipeline': None,
 'dehyphenate': True,
 'delete_blank_paragraphs': True,
 'disable_font_rescaling': False,
 'dont_split_on_page_breaks': False,
 'duplicate_links_in_toc': False,
 'embed_all_fonts': False,
 'embed_font_family': None,
 'enable_heuristics': False,
 'epub_flatten': False,
 'epub_inline_toc': False,
 'epub_toc_at_end': False,
 'epub_version': '2',
 'expand_css': False,
 'extra_css': None,
 'extract_to': None,
 'filter_css': '',
 'fix_indents': True,
 'flow_size': 260,
 'font_size_mapping': None,
 'format_scene_breaks': True,
 'html_unwrap_factor': 0.4,
 'input_encoding': None,
 'input_profile': <calibre.customize.profiles.InputProfile object at 0x07BFD730>,
 'insert_blank_line': False,
 'insert_blank_line_size': 0.5,
 'insert_metadata': False,
 'isbn': None,
 'italicize_common_cases': True,
 'keep_ligatures': False,
 'language': None,
 'level1_toc': None,
 'level2_toc': None,
 'level3_toc': None,
 'line_height': 0.0,
 'linearize_tables': False,
 'margin_bottom': 5.0,
 'margin_left': 5.0,
 'margin_right': 5.0,
 'margin_top': 5.0,
 'markup_chapter_headings': True,
 'max_toc_links': 50,
 'minimum_line_height': 120.0,
 'no_chapters_in_toc': False,
 'no_default_epub_cover': False,
 'no_inline_navbars': False,
 'no_svg_cover': False,
 'output_profile': <calibre.customize.profiles.GenericEink object at 0x07BFD898>,
 'page_breaks_before': "//*[name()='h1' or name()='h2']",
 'prefer_metadata_cover': False,
 'preserve_cover_aspect_ratio': False,
 'pretty_print': True,
 'pubdate': None,
 'publisher': None,
 'rating': None,
 'read_metadata_from_opf': 'C:\\...\\calibre_vycg3n56\\m738vugq.opf',
 'remove_fake_margins': True,
 'remove_first_image': False,
 'remove_paragraph_spacing': False,
 'remove_paragraph_spacing_indent_size': 1.5,
 'renumber_headings': True,
 'replace_scene_breaks': '',
 'search_replace': '[]',
 'series': None,
 'series_index': None,
 'smarten_punctuation': False,
 'sr1_replace': None,
 'sr1_search': None,
 'sr2_replace': None,
 'sr2_search': None,
 'sr3_replace': None,
 'sr3_search': None,
 'start_reading_at': None,
 'subset_embedded_fonts': False,
 'tags': None,
 'timestamp': None,
 'title': None,
 'title_sort': None,
 'toc_filter': None,
 'toc_threshold': 6,
 'toc_title': None,
 'transform_css_rules': '[]',
 'unsmarten_punctuation': False,
 'unwrap_lines': True,
 'use_auto_toc': False,
 'verbose': 2}
DeDRM v7.2.1: Trying to decrypt pe9u4_op.mobi
MobiDeDrm v1.0.
Copyright © 2008-2020 The Dark Reverser, Apprentice Harper et al.
Decrypting Mobipocket 4 ebook: Second Language Acquisition: Volume 1
Got DSN key from database default_key
Found 4 keys to try after 0.2 seconds
Crypto Type is: 0
This book is not encrypted.
Decryption succeeded after 0.2 seconds
DeDRM v7.2.1: Finished after 0.5 seconds
InputFormatPlugin: MOBI Input running
on C:\...\calibre_vycg3n56\7ognbxx0.mobi
Extracting text...
Adding anchors...
Extracting images...
Cleaning up HTML...
Parsing HTML...
Malformed markup, parsing using html5-parser
Converting style information to CSS...
Creating OPF...
Parsing all content...
Parsing index.html ...
Initial parse failed, using more forgiving parsers
Parsing index.html as HTML
HTML 5 parsing failed, falling back to older parsers
Traceback (most recent call last):
  File "calibre\ebooks\oeb\parse_utils.py", line 211, in parse_html
  File "calibre\utils\xml_parse.py", line 27, in safe_xml_fromstring
  File "src/lxml/etree.pyx", line 3237, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1896, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1777, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1082, in lxml.etree._BaseParser._parseUnicodeDoc
  File "src/lxml/parser.pxi", line 615, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 725, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 654, in lxml.etree._raiseParseError
  File "<string>", line 724
lxml.etree.XMLSyntaxError: Attribute _ redefined, line 724, column 675

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "calibre\ebooks\oeb\parse_utils.py", line 218, in parse_html
  File "calibre\utils\xml_parse.py", line 27, in safe_xml_fromstring
  File "src/lxml/etree.pyx", line 3237, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1896, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1777, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1082, in lxml.etree._BaseParser._parseUnicodeDoc
  File "src/lxml/parser.pxi", line 615, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 725, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 654, in lxml.etree._raiseParseError
  File "<string>", line 724
lxml.etree.XMLSyntaxError: Attribute _ redefined, line 724, column 675

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "calibre\ebooks\oeb\parse_utils.py", line 224, in parse_html
  File "calibre\ebooks\oeb\parse_utils.py", line 105, in html5_parse
ValueError: HTML 5 parsing resulted in a tree with nesting depth > 100

Forcing index.html into XHTML namespace
Parsing styles.css ...
Generating default TOC from spine...
Merging user specified metadata...
Detecting structure...
Auto generated TOC with 0 entries.
Flattening CSS and remapping font sizes...
Source base font size is 12.00000pt
Removing fake margins...
Found 1 items of level: p_75
Ignoring level p_75
Cleaning up manifest...
Trimming unused files from manifest...
Trimming 'images/00002.jpg' from manifest
Trimming 'images/00001.jpg' from manifest
Creating EPUB Output...
Rescaling image from 600x857 to 526x751 cover.jpeg
Splitting markup on page breaks and flow limits, if any...
	Looking for large trees in index.html...
	Found large tree #0
		Splitting...
			Split point: {http://www.w3.org/1999/xhtml}wc__________________________v______w__gk___________w__w__y________h4__e__i____________n /*/*[2]/*[54]/*/*/*/*/*/*/*[2]/*/*/*/*/*/*[2]/*/*/*/*
			Split tree still too large: 8925 KB
		Splitting...
			Split point: {http://www.w3.org/1999/xhtml}u____h5__ /*/*[2]/*[24]/*
			Split tree still too large: 565 KB
		Splitting...
			Split point: {http://www.w3.org/1999/xhtml}m____xj___r________r_______i_________wd____________sh3____oxu____xp__1w____fq________o___x_________ /*/*[2]/*[16]
			Committed sub-tree #1 (231 KB)
			Split tree still too large: 334 KB
		Splitting...
Traceback (most recent call last):
  File "runpy.py", line 194, in _run_module_as_main
  File "runpy.py", line 87, in _run_code
  File "site.py", line 82, in <module>
  File "site.py", line 77, in main
  File "site.py", line 49, in run_entry_point
  File "calibre\utils\ipc\worker.py", line 216, in main
  File "calibre\gui2\convert\gui_conversion.py", line 41, in gui_convert_override
  File "calibre\gui2\convert\gui_conversion.py", line 28, in gui_convert
  File "calibre\ebooks\conversion\plumber.py", line 1271, in run
  File "calibre\ebooks\conversion\plugins\epub_output.py", line 207, in convert
  File "calibre\ebooks\oeb\transforms\split.py", line 66, in __call__
  File "calibre\ebooks\oeb\transforms\split.py", line 75, in split_item
  File "calibre\ebooks\oeb\transforms\split.py", line 224, in __init__
  File "calibre\ebooks\oeb\transforms\split.py", line 372, in split_to_size
  File "calibre\ebooks\oeb\transforms\split.py", line 372, in split_to_size
  File "calibre\ebooks\oeb\transforms\split.py", line 372, in split_to_size
  File "calibre\ebooks\oeb\transforms\split.py", line 350, in split_to_size
calibre.ebooks.oeb.transforms.split.SplitError: Could not find reasonable point at which to split: index.html Sub-tree size: 334 KB