MobileRead Forums > E-Book Software > Calibre > Conversion

06-25-2023, 04:33 AM   #1
skunkworks
Member
skunkworks began at the beginning.
 
Posts: 19
Karma: 10
Join Date: Mar 2023
Device: none
line breaks in EPUB

When I convert a DOCX to EPUB, several paragraphs that look fine in Word come out with extra blank lines between them in the EPUB.

Word:
Arriving in San Francisco we were transported to the Victorian home we had arranged to rent in Pacific Heights located high on a hill overlooking the city. We were entranced by the sparkling waters of the bay and I remarked it wouldn’t be surprising if someday they figured out how to build a bridge across there.
Invigorated by the view we arose early next day to admire the beauty of the area.

Date: April 18, 1906. Time: A little past 5:00 AM.

“Well, Fran, my dear, here we are in our thirty-eighth year together and we’re both awake bright and early. How would you like to celebrate today?”
“Oh, Matt, it doesn’t seem possible we’ve been together that long. We’ve certainly had some wonderful times. It seems to me like every day is a celebration.”
“Me, too. I thought maybe we might…”
“Did you feel that? It felt like...”
“Something’s shaking...”
“What’s that rumbling noise? Sounds like...”

(Editor's note: At 5:12 AM on April 18, 1906 San Francisco was struck by a devastating earthquake.)

We were startled, frightened, and bewildered.

EPUB:
Arriving in San Francisco we were transported to the Victorian home we had arranged to rent in Pacific Heights located high on a hill overlooking the city. We were entranced by the sparkling waters of the bay and I remarked it wouldn’t be surprising if someday they figured out how to build a bridge across there.
Invigorated by the view we arose early next day to admire the beauty of the area.

Date: April 18, 1906. Time: A little past 5:00 AM.

“Well, Fran, my dear, here we are in our thirty-eighth year together and we’re both awake bright and early. How would you like to celebrate today?”

“Oh, Matt, it doesn’t seem possible we’ve been together that long. We’ve certainly had some wonderful times. It seems to me like every day is a celebration.”

“Me, too. I thought maybe we might…”

“Did you feel that? It felt like...”

“Something’s shaking...”

“What’s that rumbling noise? Sounds like...”

(Editor's note: At 5:12 AM on April 18, 1906 San Francisco was struck by a devastating earthquake.)

We were startled, frightened, and bewildered.

*****
Spoiler:

No conversion options have been changed.

Conversion log:
Convert book 1 of 1 (ebookD2D)
Conversion options changed from defaults:
verbose: 2
output_profile: 'generic_eink'
read_metadata_from_opf: 'C:\\Users\\Owner\\AppData\\Local\\Temp\\calibre_5ec2affu\\g91qejnm.opf'
Resolved conversion options
calibre version: 6.19.1
{'asciiize': False,
'author_sort': None,
'authors': None,
'base_font_size': 0.0,
'book_producer': None,
'change_justification': 'original',
'chapter': "//*[((name()='h1' or name()='h2') and re:test(., "
"'\\s*((chapter|book|section|part)\\s+)|((prolog|prologue|epilogue)(\\s+|$))', "
"'i')) or @class = 'chapter']",
'chapter_mark': 'pagebreak',
'comments': None,
'cover': None,
'debug_pipeline': None,
'dehyphenate': True,
'delete_blank_paragraphs': True,
'disable_font_rescaling': False,
'docx_inline_subsup': False,
'docx_no_cover': False,
'docx_no_pagebreaks_between_notes': False,
'dont_split_on_page_breaks': False,
'duplicate_links_in_toc': False,
'embed_all_fonts': False,
'embed_font_family': None,
'enable_heuristics': False,
'epub_flatten': False,
'epub_inline_toc': False,
'epub_max_image_size': 'none',
'epub_toc_at_end': False,
'epub_version': '2',
'expand_css': False,
'extra_css': None,
'extract_to': None,
'filter_css': '',
'fix_indents': True,
'flow_size': 260,
'font_size_mapping': None,
'format_scene_breaks': True,
'html_unwrap_factor': 0.4,
'input_encoding': None,
'input_profile': <calibre.customize.profiles.InputProfile object at 0x000001FBEE649EA0>,
'insert_blank_line': False,
'insert_blank_line_size': 0.5,
'insert_metadata': False,
'isbn': None,
'italicize_common_cases': True,
'keep_ligatures': False,
'language': None,
'level1_toc': None,
'level2_toc': None,
'level3_toc': None,
'line_height': 0.0,
'linearize_tables': False,
'margin_bottom': 5.0,
'margin_left': 5.0,
'margin_right': 5.0,
'margin_top': 5.0,
'markup_chapter_headings': True,
'max_toc_links': 50,
'minimum_line_height': 120.0,
'no_chapters_in_toc': False,
'no_default_epub_cover': False,
'no_inline_navbars': False,
'no_svg_cover': False,
'output_profile': <calibre.customize.profiles.GenericEink object at 0x000001FBEE64A1A0>,
'page_breaks_before': '/',
'prefer_metadata_cover': False,
'preserve_cover_aspect_ratio': False,
'pretty_print': True,
'pubdate': None,
'publisher': None,
'rating': None,
'read_metadata_from_opf': 'C:\\Users\\Owner\\AppData\\Local\\Temp\\calibre_5ec2affu\\g91qejnm.opf',
'remove_fake_margins': True,
'remove_first_image': False,
'remove_paragraph_spacing': False,
'remove_paragraph_spacing_indent_size': 1.5,
'renumber_headings': True,
'replace_scene_breaks': '',
'search_replace': '[]',
'series': None,
'series_index': None,
'smarten_punctuation': False,
'sr1_replace': None,
'sr1_search': None,
'sr2_replace': None,
'sr2_search': None,
'sr3_replace': None,
'sr3_search': None,
'start_reading_at': None,
'subset_embedded_fonts': False,
'tags': None,
'timestamp': None,
'title': None,
'title_sort': None,
'toc_filter': None,
'toc_threshold': 6,
'toc_title': None,
'transform_css_rules': '[]',
'transform_html_rules': '[]',
'unsmarten_punctuation': False,
'unwrap_lines': True,
'use_auto_toc': False,
'verbose': 2}
InputFormatPlugin: DOCX Input running
on C:\Users\Owner\AppData\Local\Temp\calibre_5ec2affu\dr3o8txt.docx
Converting Word markup to HTML
Converting styles to CSS
Cleaning up redundant markup generated by Word
Generating Table of Contents from headings
Parsing all content...
Parsing docx.css ...
Parsing index.html ...
Initial parse failed, using more forgiving parsers
Parsing index.html as HTML
Reading TOC from NCX...
Merging user specified metadata...
Detecting structure...
Detected chapter: Chapter 1
Detected chapter: Chapter 2
Detected chapter: Chapter 3
Detected chapter: Chapter 4
Detected chapter: Chapter 5
Detected chapter: Chapter 6
Detected chapter: Chapter 7
Detected chapter: Chapter 8
Detected chapter: Chapter 9
Detected chapter: Chapter 10
Detected chapter: Chapter 11
Detected chapter: Chapter 12
Detected chapter: Chapter 13
Detected chapter: Chapter 14
Detected chapter: Chapter 15
Detected chapter: Chapter 16
Detected chapter: Chapter 17
Detected chapter: Chapter 18
Detected chapter: Chapter 19
Detected chapter: Chapter 20
Detected chapter: Chapter 21
Detected chapter: Epilogue
Flattening CSS and remapping font sizes...
Source base font size is 14.00000pt
Removing fake margins...
Found 1322 items of level: p_1
Found 22 items of level: div_1
p_1 left margin stats: Counter({'0': 1322})
p_1 right margin stats: Counter({'0': 1322})
div_1 left margin stats: Counter()
div_1 right margin stats: Counter()
Cleaning up manifest...
Trimming unused files from manifest...
Creating EPUB Output...
Splitting markup on page breaks and flow limits, if any...
Splitting on page-break at id=calibre_pb_0
Splitting on page-break at id=calibre_pb_1
Splitting on page-break at id=calibre_pb_2
Splitting on page-break at id=calibre_pb_3
Splitting on page-break at id=calibre_pb_4
Splitting on page-break at id=calibre_pb_5
Splitting on page-break at id=calibre_pb_6
Splitting on page-break at id=calibre_pb_7
Splitting on page-break at id=calibre_pb_8
Splitting on page-break at id=calibre_pb_9
Splitting on page-break at id=calibre_pb_10
Splitting on page-break at id=calibre_pb_11
Splitting on page-break at id=calibre_pb_12
Splitting on page-break at id=calibre_pb_13
Splitting on page-break at id=calibre_pb_14
Splitting on page-break at id=calibre_pb_15
Splitting on page-break at id=calibre_pb_16
Splitting on page-break at id=calibre_pb_17
Splitting on page-break at id=calibre_pb_18
Splitting on page-break at id=calibre_pb_19
Splitting on page-break at id=calibre_pb_20
Splitting on page-break at id=calibre_pb_21
Splitting on page-break at id=calibre_pb_22
Splitting on page-break at id=calibre_pb_23
Splitting on page-break at id=calibre_pb_24
Splitting on page-break at id=calibre_pb_25
Splitting on page-break at id=calibre_pb_26
Splitting on page-break at id=calibre_pb_27
Splitting on page-break at id=calibre_pb_28
Splitting on page-break at id=calibre_pb_29
Splitting on page-break at id=calibre_pb_30
Splitting on page-break at id=calibre_pb_31
Splitting on page-break at id=calibre_pb_32
Splitting on page-break at id=calibre_pb_33
Splitting on page-break at id=calibre_pb_34
Splitting on page-break at id=calibre_pb_35
Splitting on page-break at id=calibre_pb_36
Splitting on page-break at id=calibre_pb_37
Splitting on page-break at id=calibre_pb_38
Splitting on page-break at id=calibre_pb_39
Splitting on page-break at id=calibre_pb_40
Splitting on page-break at id=calibre_pb_41
Splitting on page-break at id=calibre_pb_42
Splitting on page-break at id=calibre_pb_43
Splitting on page-break at id=calibre_pb_44
Looking for large trees in index.html...
No large trees found
Split into 24 parts
Generating default cover
Removing anchor from TOC href: index_split_002.html#id_Toc525564898
Removing anchor from TOC href: index_split_003.html#id_Toc525564899
Removing anchor from TOC href: index_split_004.html#id_Toc525564900
Removing anchor from TOC href: index_split_005.html#id_Toc525564901
Removing anchor from TOC href: index_split_006.html#id_Toc525564902
Removing anchor from TOC href: index_split_007.html#id_Toc525564903
Removing anchor from TOC href: index_split_008.html#id_Toc525564904
Removing anchor from TOC href: index_split_009.html#id_Toc525564905
Removing anchor from TOC href: index_split_010.html#id_Toc525564906
Removing anchor from TOC href: index_split_011.html#id_Toc525564907
Removing anchor from TOC href: index_split_012.html#id_Toc525564908
Removing anchor from TOC href: index_split_013.html#id_Toc525564909
Removing anchor from TOC href: index_split_014.html#id_Toc525564910
Removing anchor from TOC href: index_split_015.html#id_Toc525564911
Removing anchor from TOC href: index_split_016.html#id_Toc525564912
Removing anchor from TOC href: index_split_017.html#id_Toc525564913
Removing anchor from TOC href: index_split_018.html#id_Toc525564914
Removing anchor from TOC href: index_split_019.html#id_Toc525564915
Removing anchor from TOC href: index_split_020.html#id_Toc525564916
Removing anchor from TOC href: index_split_021.html#id_Toc525564917
Removing anchor from TOC href: index_split_022.html#id_Toc525564918
Removing anchor from TOC href: index_split_023.html#id_Toc525564919
EPUB output written to C:\Users\Owner\AppData\Local\Temp\calibre_5ec2affu\o5_tp31y.epub

Last edited by theducks; 06-25-2023 at 09:27 AM. Reason: Please Spoiler Log files
06-25-2023, 07:27 AM   #2
Quoth
Still reading
Quoth ought to be getting tired of karma fortunes by now.
Posts: 14,016
Karma: 105092227
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
Check your MS Word styles. Make sure there is no direct formatting.

Also no headers, footers, page numbers or automatic hyphenation (all of those are only for paper / PDF export).

No inches or cm. Use only pt (or em, if your version of Word supports it; 1em = 12 pt for ebooks, but not for paper!).

In your dialogue, make sure each part is exactly one paragraph. Do not use Shift+Enter or similar, just Enter. A Shift+Enter may translate to a <br />.

It should be:
<p class="some-body">Arriving in San Francisco we were transported to the Victorian home we had arranged to rent in Pacific Heights located high on a hill overlooking the city. We were entranced by the sparkling waters of the bay and I remarked it wouldn’t be surprising if someday they figured out how to build a bridge across there.</p>
<p class="some-body">“Something’s shaking...”</p>
<p class="some-body">“What’s that rumbling noise? Sounds like...”</p>
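One way to check whether a stray Shift+Enter made it into the converted book is to look for <br> elements inside paragraphs. What follows is only a minimal sketch, assuming you have pulled one of the HTML files out of the EPUB and read it into a string. The names BreakFinder and paragraphs_with_breaks are made up for this illustration and are not part of calibre.

```python
from html.parser import HTMLParser


class BreakFinder(HTMLParser):
    """Collect the text of every <p> that contains a <br> element."""

    def __init__(self):
        super().__init__()
        self.in_p = False     # are we currently inside a <p>?
        self.has_br = False   # has the current <p> contained a <br>?
        self.text = []        # text chunks of the current <p>
        self.hits = []        # text of each paragraph that had a <br>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_p, self.has_br, self.text = True, False, []
        elif tag == "br" and self.in_p:
            self.has_br = True
            self.text.append(" | ")  # mark where the forced break sits

    def handle_endtag(self, tag):
        if tag == "p" and self.in_p:
            if self.has_br:
                self.hits.append("".join(self.text).strip())
            self.in_p = False

    def handle_data(self, data):
        if self.in_p:
            self.text.append(data)


def paragraphs_with_breaks(html):
    """Return the text of all paragraphs that contain a forced line break."""
    finder = BreakFinder()
    finder.feed(html)
    finder.close()
    return finder.hits
```

Any paragraph it reports is a candidate for a Shift+Enter in the Word source; an empty list suggests the extra spacing comes from the paragraph styles instead.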

Don't enter blank lines for spacing; instead, use a paragraph style that adds a non-zero top (and optionally bottom) margin.

Give the regular body text style, used for narration and dialogue, a first-line indent. A separate paragraph style for text after a heading or scene break can have a zero first-line indent.

Don't set any line spacing.

Last edited by Quoth; 06-25-2023 at 07:36 AM.
06-25-2023, 07:47 AM   #3
JSWolf
Resident Curmudgeon
JSWolf ought to be getting tired of karma fortunes by now.
Posts: 79,745
Karma: 145864619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Actually...

Should be
<p>Arriving in San Francisco we were transported to the Victorian home we had arranged to rent in Pacific Heights located high on a hill overlooking the city. We were entranced by the sparkling waters of the bay and I remarked it wouldn’t be surprising if someday they figured out how to build a bridge across there.</p>
<p>“Something’s shaking...”</p>
<p>“What’s that rumbling noise? Sounds like...”</p>

With the style for p in CSS such as...
Code:
p {
  margin-top: 0;
  margin-bottom: 0;
  text-indent: 1.2em;
}
06-25-2023, 04:38 PM   #4
Quoth
or
.some-body {
  margin-top: 0;
  margin-bottom: 0;
  text-indent: 1.2em;
}

I didn't bother posting example CSS because it's created automatically to match the docx paragraph style.

No need to edit any CSS or HTML for any text or heading if the docx is correct.
06-25-2023, 06:37 PM   #5
JSWolf
It's a simple fix. I don't like <p class="somereallystupiduselessclass">.

I prefer the default paragraph format to be <p>.
06-25-2023, 07:38 PM   #6
skunkworks
Thank you, Quoth and JSWolf! I'll study your tips and give it a go. Much appreciated!
06-25-2023, 08:23 PM   #7
skunkworks
Update: For some odd reason, the troublesome paragraphs, although formatted the same as others in Word, had a different class block. Changing it from 20 to 13 did the trick.

Thanks again!
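For anyone who hits the same thing: a handful of paragraphs getting a different class from the rest can be spotted by tallying the class attribute of every <p>. This is a minimal sketch, assuming the HTML has been extracted from the EPUB and read into a string. ClassCounter and paragraph_class_counts are names invented here, and block_13 / block_20 in the usage note are only a guess at what the class names look like, based on the numbers mentioned above.

```python
from collections import Counter
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count how often each class value appears on <p> elements."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            # Paragraphs without a class attribute are grouped together.
            cls = dict(attrs).get("class") or "(no class)"
            self.counts[cls] += 1


def paragraph_class_counts(html):
    """Return a Counter mapping paragraph class names to their frequency."""
    counter = ClassCounter()
    counter.feed(html)
    counter.close()
    return counter.counts
```

Calling paragraph_class_counts(html).most_common() lists the classes from most to least frequent. A class such as the assumed block_20 that appears on only a few paragraphs, next to a dominant block_13, is worth comparing against the stylesheet.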
06-26-2023, 06:55 AM   #8
Quoth
Quote:
Originally Posted by JSWolf
It's a simple fix. I don't like <p class="somereallystupiduselessclass">.

I prefer the default paragraph format to be <p>.
But you are mostly editing existing ebooks. Creating ebooks from Word or LibreOffice Writer via docx shouldn't involve any ebook editing at all, except maybe for image classes that need percentage and auto values rather than absolute pixel sizes.
06-26-2023, 07:37 AM   #9
JSWolf
Quote:
Originally Posted by Quoth
But you are mostly editing existing ebooks. Creating ebooks from Word or LibreOffice Writer via docx shouldn't involve any ebook editing at all, except maybe for image classes that need percentage and auto values rather than absolute pixel sizes.
If the EPUB produced from the DOCX has <p class="someuselessclass">, then yes, it should be edited to remove the class.
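Removing the class by hand gets tedious in a long book, so here is a minimal sketch of doing it with a regular expression. It assumes the stylesheet already has a plain p rule carrying the same declarations, otherwise the paragraphs lose their formatting. It also only handles the simple case where class is the paragraph's sole attribute. strip_paragraph_class is a made-up helper name, not a calibre function.

```python
import re


def strip_paragraph_class(html, class_name):
    """Turn <p class="class_name"> into a bare <p>.

    Only paragraphs whose single attribute is exactly this class are
    changed; paragraphs with other classes or extra attributes are
    left untouched.
    """
    pattern = r'<p\s+class="%s"\s*>' % re.escape(class_name)
    return re.sub(pattern, "<p>", html)
```

A similar search and replace can be done in an EPUB editor; the point of the sketch is just to show how narrow the match should be so that other classes survive.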