|  02-20-2017, 06:07 PM | #1 | |
| Member  Posts: 11 Karma: 10 Join Date: Nov 2015 Device: none | 
				
				Converting UTF-8 TXT to Epub
			 
			
			Example file: spaghetti_sparkle_2_-_galaonline.txt (don't judge me). Quote: 
 Conversion log for .docx-to-epub:
Code:
Convert book 1 of 1 (spaghetti sparkle 2)
DeDRM v6.1.0: In __init__
DeDRM v6.1.0: In load_resources
DeDRM v6.1.0: verdir C:\Users\N\AppData\Roaming\calibre\plugins\DeDRM\6.1.0
DeDRM v6.1.0: In initialize
Conversion options changed from defaults:
  search_replace: '[]'
  output_profile: 'kindle_pw'
  sr2_search: None
  transform_css_rules: '[]'
  sr2_replace: None
  verbose: 2
  filter_css: u''
  sr3_search: None
  read_metadata_from_opf: u'C:\\Users\\N\\AppData\\Local\\Temp\\calibre_o1jd4a\\bc57t_.opf'
  sr1_search: None
  sr3_replace: None
  sr1_replace: None
Resolved conversion options
calibre version: 2.79.0
{'asciiize': False,
 'author_sort': None,
 'authors': None,
 'base_font_size': 0.0,
 'book_producer': None,
 'change_justification': u'original',
 'chapter': u"//*[((name()='h1' or name()='h2') and re:test(., '\\s*((chapter|book|section|part)\\s+)|((prolog|prologue|epilogue)(\\s+|$))', 'i')) or @class = 'chapter']",
 'chapter_mark': u'pagebreak',
 'comments': None,
 'cover': None,
 'debug_pipeline': None,
 'dehyphenate': True,
 'delete_blank_paragraphs': True,
 'disable_font_rescaling': False,
 'docx_inline_subsup': False,
 'docx_no_cover': False,
 'docx_no_pagebreaks_between_notes': False,
 'dont_split_on_page_breaks': False,
 'duplicate_links_in_toc': False,
 'embed_all_fonts': False,
 'embed_font_family': None,
 'enable_heuristics': False,
 'epub_flatten': False,
 'epub_inline_toc': False,
 'epub_toc_at_end': False,
 'expand_css': False,
 'extra_css': None,
 'extract_to': None,
 'filter_css': u'',
 'fix_indents': True,
 'flow_size': 260,
 'font_size_mapping': None,
 'format_scene_breaks': True,
 'html_unwrap_factor': 0.4,
 'input_encoding': None,
 'input_profile': <calibre.customize.profiles.InputProfile object at 0x0000000005483CF8>,
 'insert_blank_line': False,
 'insert_blank_line_size': 0.5,
 'insert_metadata': False,
 'isbn': None,
 'italicize_common_cases': True,
 'keep_ligatures': False,
 'language': None,
 'level1_toc': None,
 'level2_toc': None,
 'level3_toc': None,
 'line_height': 0.0,
 'linearize_tables': False,
 'margin_bottom': 5.0,
 'margin_left': 5.0,
 'margin_right': 5.0,
 'margin_top': 5.0,
 'markup_chapter_headings': True,
 'max_toc_links': 50,
 'minimum_line_height': 120.0,
 'no_chapters_in_toc': False,
 'no_default_epub_cover': False,
 'no_inline_navbars': False,
 'no_svg_cover': False,
 'output_profile': <calibre.customize.profiles.KindlePaperWhiteOutput object at 0x00000000054983C8>,
 'page_breaks_before': u'/',
 'prefer_metadata_cover': False,
 'preserve_cover_aspect_ratio': False,
 'pretty_print': True,
 'pubdate': None,
 'publisher': None,
 'rating': None,
 'read_metadata_from_opf': u'C:\\Users\\N\\AppData\\Local\\Temp\\calibre_o1jd4a\\bc57t_.opf',
 'remove_fake_margins': True,
 'remove_first_image': False,
 'remove_paragraph_spacing': False,
 'remove_paragraph_spacing_indent_size': 1.5,
 'renumber_headings': True,
 'replace_scene_breaks': u'',
 'search_replace': '[]',
 'series': None,
 'series_index': None,
 'smarten_punctuation': False,
 'sr1_replace': None,
 'sr1_search': None,
 'sr2_replace': None,
 'sr2_search': None,
 'sr3_replace': None,
 'sr3_search': None,
 'start_reading_at': None,
 'subset_embedded_fonts': False,
 'tags': None,
 'timestamp': None,
 'title': None,
 'title_sort': None,
 'toc_filter': None,
 'toc_threshold': 6,
 'toc_title': None,
 'transform_css_rules': '[]',
 'unsmarten_punctuation': False,
 'unwrap_lines': True,
 'use_auto_toc': False,
 'verbose': 2}
InputFormatPlugin: DOCX Input running
on C:\Users\N\AppData\Local\Temp\calibre_o1jd4a\fgzz3b.docx
Converting Word markup to HTML
Converting styles to CSS
Cleaning up redundant markup generated by Word
Parsing all content...
Parsing index.html ...
Initial parse failed, using more forgiving parsers
Parsing index.html as HTML
Parsing docx.css ...
Generating default TOC from spine...
Merging user specified metadata...
Detecting structure...
Auto generated TOC with 0 entries.
Flattening CSS and remapping font sizes...
Source base font size is 10.50000pt
Removing fake margins...
Found 183 items of level: p_1
p_1  left margin stats: Counter({u'0': 183})
p_1  right margin stats: Counter({u'0': 183})
Cleaning up manifest...
Trimming unused files from manifest...
Creating EPUB Output...
Splitting markup on page breaks and flow limits, if any...
	Looking for large trees in index.html...
	No large trees found
Generating default cover
This EPUB file has no Table of Contents. Creating a default TOC
EPUB output written to C:\Users\N\AppData\Local\Temp\calibre_o1jd4a\datz4i.epub
Code:
Convert book 1 of 1 (spaghetti_sparkle_2_-_galaonline)
DeDRM v6.1.0: In __init__
DeDRM v6.1.0: In load_resources
DeDRM v6.1.0: verdir C:\Users\N\AppData\Roaming\calibre\plugins\DeDRM\6.1.0
DeDRM v6.1.0: In initialize
Conversion options changed from defaults:
  sr3_replace: None
  sr1_replace: None
  search_replace: '[]'
  output_profile: 'kindle_pw'
  markdown_extensions: u'toc, tables, footnotes'
  sr2_search: None
  transform_css_rules: '[]'
  sr2_replace: None
  verbose: 2
  filter_css: u''
  sr3_search: None
  read_metadata_from_opf: u'C:\\Users\\N\\AppData\\Local\\Temp\\calibre_o1jd4a\\xy4rwu.opf'
  sr1_search: None
Resolved conversion options
calibre version: 2.79.0
{'asciiize': False,
 'author_sort': None,
 'authors': None,
 'base_font_size': 0.0,
 'book_producer': None,
 'change_justification': u'original',
 'chapter': u"//*[((name()='h1' or name()='h2') and re:test(., '\\s*((chapter|book|section|part)\\s+)|((prolog|prologue|epilogue)(\\s+|$))', 'i')) or @class = 'chapter']",
 'chapter_mark': u'pagebreak',
 'comments': None,
 'cover': None,
 'debug_pipeline': None,
 'dehyphenate': True,
 'delete_blank_paragraphs': True,
 'disable_font_rescaling': False,
 'dont_split_on_page_breaks': False,
 'duplicate_links_in_toc': False,
 'embed_all_fonts': False,
 'embed_font_family': None,
 'enable_heuristics': False,
 'epub_flatten': False,
 'epub_inline_toc': False,
 'epub_toc_at_end': False,
 'expand_css': False,
 'extra_css': None,
 'extract_to': None,
 'filter_css': u'',
 'fix_indents': True,
 'flow_size': 260,
 'font_size_mapping': None,
 'format_scene_breaks': True,
 'formatting_type': u'auto',
 'html_unwrap_factor': 0.4,
 'input_encoding': None,
 'input_profile': <calibre.customize.profiles.InputProfile object at 0x0000000005352D68>,
 'insert_blank_line': False,
 'insert_blank_line_size': 0.5,
 'insert_metadata': False,
 'isbn': None,
 'italicize_common_cases': True,
 'keep_ligatures': False,
 'language': None,
 'level1_toc': None,
 'level2_toc': None,
 'level3_toc': None,
 'line_height': 0.0,
 'linearize_tables': False,
 'margin_bottom': 5.0,
 'margin_left': 5.0,
 'margin_right': 5.0,
 'margin_top': 5.0,
 'markdown_extensions': u'toc, tables, footnotes',
 'markup_chapter_headings': True,
 'max_toc_links': 50,
 'minimum_line_height': 120.0,
 'no_chapters_in_toc': False,
 'no_default_epub_cover': False,
 'no_inline_navbars': False,
 'no_svg_cover': False,
 'output_profile': <calibre.customize.profiles.KindlePaperWhiteOutput object at 0x0000000005364438>,
 'page_breaks_before': u"//*[name()='h1' or name()='h2']",
 'paragraph_type': u'auto',
 'prefer_metadata_cover': False,
 'preserve_cover_aspect_ratio': False,
 'preserve_spaces': False,
 'pretty_print': True,
 'pubdate': None,
 'publisher': None,
 'rating': None,
 'read_metadata_from_opf': u'C:\\Users\\N\\AppData\\Local\\Temp\\calibre_o1jd4a\\xy4rwu.opf',
 'remove_fake_margins': True,
 'remove_first_image': False,
 'remove_paragraph_spacing': False,
 'remove_paragraph_spacing_indent_size': 1.5,
 'renumber_headings': True,
 'replace_scene_breaks': u'',
 'search_replace': '[]',
 'series': None,
 'series_index': None,
 'smarten_punctuation': False,
 'sr1_replace': None,
 'sr1_search': None,
 'sr2_replace': None,
 'sr2_search': None,
 'sr3_replace': None,
 'sr3_search': None,
 'start_reading_at': None,
 'subset_embedded_fonts': False,
 'tags': None,
 'timestamp': None,
 'title': None,
 'title_sort': None,
 'toc_filter': None,
 'toc_threshold': 6,
 'toc_title': None,
 'transform_css_rules': '[]',
 'txt_in_remove_indents': False,
 'unsmarten_punctuation': False,
 'unwrap_lines': True,
 'use_auto_toc': False,
 'verbose': 2}
InputFormatPlugin: TXT Input running
on C:\Users\N\AppData\Local\Temp\calibre_o1jd4a\prmnfp.txt
Reading text from file...
Detected input encoding as ISO-8859-2 with a confidence of 84.8260567914%
Auto detected paragraph type as unformatted
Auto detected formatting as heuristic
Running text through basic conversion...
Language not specified
Creator not specified
Building file list...
	Found files...
		 HTMLFile:0:a:C:\Users\N\AppData\Local\Temp\calibre_o1jd4a\index.html
Normalizing filename cases
Rewriting HTML links
Parsing index.html ...
*********  Heuristic processing HTML  *********
There are 12 blank lines. 0.107142857143 percent blank
minimum chapters required are: 1
found 0 pre-existing headings
Total wordcount is: 1240, Average words per section is: 1240, Marked up 0 chapters
Hard line breaks check returned True
Median line length is 39, calculated with html format
Fixing hyphenated content
Looking for more split points based on punctuation, currently have 0
marked 1 section markers based on punctuation. - Fucking embarrassing</p>
Formatting scene breaks
Forcing index.html into XHTML namespace
Merging user specified metadata...
Detecting structure...
Auto generated TOC with 0 entries.
Flattening CSS and remapping font sizes...
Source base font size is 12.00000pt
Removing fake margins...
Found 112 items of level: p_1
p_1  left margin stats: Counter({u'0': 112})
p_1  right margin stats: Counter({u'0': 112})
Cleaning up manifest...
Trimming unused files from manifest...
Creating EPUB Output...
Splitting markup on page breaks and flow limits, if any...
		Splitting on page-break at id=calibre_pb_0
	Looking for large trees in index.html...
	No large trees found
	Split into 2 parts
Generating default cover
This EPUB file has no Table of Contents. Creating a default TOC
EPUB output written to C:\Users\N\AppData\Local\Temp\calibre_o1jd4a\bbtamz.epub
Last edited by ij26; 02-20-2017 at 09:56 PM. Reason: Correcting quote. | |
|   |   | 
|  02-20-2017, 06:43 PM | #2 | 
| Grand Sorcerer            Posts: 28,880 Karma: 207000000 Join Date: Jan 2010 Device: Nexus 7, Kindle Fire HD | Moderator Notice Mobileread strives to be a family-friendly site. Please refrain from profanity (even if it's quoted text) | 
|   |   | 
|  02-20-2017, 06:57 PM | #3 | 
| Grand Sorcerer            Posts: 28,880 Karma: 207000000 Join Date: Jan 2010 Device: Nexus 7, Kindle Fire HD | 
			
			If you explicitly set the input character encoding to utf8 (Look & feel -> Text -> Input character encoding) in the conversion settings, the character is properly preserved when converting TXT to EPUB. It was for me, anyway.
		 | 
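To illustrate why the explicit setting matters: the TXT log above shows calibre guessing ISO-8859-2 for the file. The following is a minimal Python sketch of what such a wrong guess does to UTF-8 bytes. The sample string and the `decode_txt` helper are hypothetical stand-ins (the character that was actually garbled is not shown in the log), and this is not calibre's own code.

```python
# Hypothetical sample; any non-ASCII character behaves the same way.
sample = "caf\u00e9"
raw = sample.encode("utf-8")  # the bytes as stored in a UTF-8 TXT file


def decode_txt(raw_bytes: bytes, encoding: str = "utf-8") -> str:
    """Decode raw file bytes with an explicitly chosen encoding."""
    return raw_bytes.decode(encoding)


# Correct encoding: the text round-trips unchanged.
print(decode_txt(raw) == sample)

# Wrong guess: each byte of a multi-byte UTF-8 sequence is read as its own
# ISO-8859-2 character, so the accented letter turns into two unrelated ones.
garbled = decode_txt(raw, "iso-8859-2")
print(garbled == sample, len(sample), len(garbled))
```

Decoding with the wrong single-byte encoding does not raise an error, which may be why the conversion finishes without any warning in the log.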
|   |   | 
|  02-20-2017, 07:23 PM | #4 | 
| Resident Curmudgeon            Posts: 80,740 Karma: 150249619 Join Date: Nov 2006 Location: Roslindale, Massachusetts Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3 | 
			
			What is this nonsense being converted?
		 | 
|   |   | 
|  02-20-2017, 07:25 PM | #5 | 
| Well trained by Cats            Posts: 31,249 Karma: 61360164 Join Date: Aug 2009 Location: The Central Coast of California Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A | |
|   |   | 
|  02-20-2017, 09:59 PM | #6 | |
| Member  Posts: 11 Karma: 10 Join Date: Nov 2015 Device: none | Quote: 
 Is it safe to leave that as the general setting in Preferences, or are there ways that it can go wrong? Also: Is there a fix for the line break issue? And is there a way to omit fonts when converting .docx files, or would I have to manually open each one and click the "Clear all formatting" button? | |
|   |   | 
|  02-20-2017, 11:05 PM | #7 | 
| Well trained by Cats            Posts: 31,249 Karma: 61360164 Join Date: Aug 2009 Location: The Central Coast of California Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A | 
			
			You can do specific overrides when you start a conversion. Preferences sets the defaults. BTW, when you convert a book, THOSE settings are remembered, even if you change the defaults. | 
|   |   | 
|  02-20-2017, 11:23 PM | #8 | 
| creator of calibre            Posts: 45,604 Karma: 28548974 Join Date: Oct 2006 Location: Mumbai, India Device: Various | 
			
			Look at the Look & Feel section of the conversion dialog under the Styling tab. And look at the options in the TXT input section of the conversion dialog.
		 | 
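For batch use, the style filtering mentioned above can also be reached from the command line. Below is a rough Python sketch that builds an `ebook-convert` call using its `--filter-css` option, which takes a comma-separated list of CSS properties to strip. The helper name `docx_to_epub_command` and the file names are hypothetical, and whether dropping `font-family` is enough to get rid of the fonts in a given .docx is an assumption worth checking on one file first.

```python
from __future__ import annotations

import shlex


def docx_to_epub_command(src: str, dst: str,
                         drop_properties: tuple[str, ...] = ("font-family",)) -> list[str]:
    """Build an ebook-convert argument list that filters out CSS properties."""
    cmd = ["ebook-convert", src, dst]
    if drop_properties:
        # --filter-css expects one comma-separated string of property names.
        cmd += ["--filter-css", ",".join(drop_properties)]
    return cmd


# Hypothetical file names, for illustration only.
cmd = docx_to_epub_command("book.docx", "book.epub")
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` on a machine where calibre's command-line tools are on the PATH.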
|   |   | 
|  02-21-2017, 10:49 AM | #9 | |
| Member  Posts: 11 Karma: 10 Join Date: Nov 2015 Device: none | Quote: 
 Edit: I see "unformatted" also strips line breaks. "single" seems to work best, although all the settings result in space appearing between lines that doesn't appear when using the Word route.
Last edited by ij26; 02-21-2017 at 11:10 AM. | |
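The settings discussed in this thread (explicit UTF-8 input and the "single" paragraph type) can also be given to `ebook-convert` directly. The Python sketch below assembles such a call from the `--input-encoding`, `--paragraph-type` and `--formatting-type` options. The helper `txt_to_epub_command`, its defaults, and the file names are hypothetical, and the list of accepted paragraph types is an assumption that should be checked against `ebook-convert --help` for the installed calibre version.

```python
from __future__ import annotations

import shlex

# Assumed set of values accepted by --paragraph-type; verify for your version.
PARAGRAPH_TYPES = {"auto", "block", "single", "print", "unformatted", "off"}


def txt_to_epub_command(src: str, dst: str, encoding: str = "utf-8",
                        paragraph_type: str = "single",
                        formatting_type: str | None = None) -> list[str]:
    """Build an ebook-convert argument list for a TXT input file."""
    if paragraph_type not in PARAGRAPH_TYPES:
        raise ValueError(f"unknown paragraph type: {paragraph_type!r}")
    cmd = ["ebook-convert", src, dst,
           "--input-encoding", encoding,        # skip encoding auto-detection
           "--paragraph-type", paragraph_type]  # "single": every line is a paragraph
    if formatting_type is not None:
        cmd += ["--formatting-type", formatting_type]
    return cmd


# Hypothetical file names, for illustration only.
print(shlex.join(txt_to_epub_command("story.txt", "story.epub")))
```

Building the argument list separately from running it keeps the sketch testable without calibre installed; the result can be handed to `subprocess.run` when the tool is available.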
|   |   | 
|  | 
| 
 | 