01-20-2021, 03:16 AM   #1
Thomas_Georg
epub to epub: removing hyphens without other changes

Hi! I need to remove unnecessary hyphens in an epub3, and have tried the heuristic processing option when converting from epub to epub. It seems that Calibre also edits the file structure and contents, leaving the resulting epub3 unusable for my needs.

I have tried to tweak the settings to make Calibre edit just the hyphens, but to no avail. Is there a way to just remove the unnecessary hyphens without editing the entire epub?
01-20-2021, 08:59 AM   #2
JSWolf
Quote:
Originally Posted by Thomas_Georg View Post
Hi! I need to remove unnecessary hyphens in an epub3, and have tried the heuristic processing option when converting from epub to epub. It seems that Calibre also edits the file structure and contents, leaving the resulting epub3 unusable for my needs.

I have tried to tweak the settings to make Calibre edit just the hyphens, but to no avail. Is there a way to just remove the unnecessary hyphens without editing the entire epub?
The hyphens are not in the eBook. They come from the program being used to display the eBook. You can turn off hyphenation in the CSS. This is the body style I use; its last three properties turn off hyphenation for the entire eBook.

Code:
body {
  widows: 1;
  orphans: 1;
  margin-top: 0;
  margin-right: 0;
  margin-bottom: 0;
  margin-left: 0;
  text-align: justify;
  -epub-hyphens: none;
  -webkit-hyphens: none;
  hyphens: none;
}
01-21-2021, 03:02 AM   #3
Thomas_Georg
Quote:
Originally Posted by JSWolf View Post
The hyphens are not in the eBook. They come from the program being used to display the eBook. You can turn off hyphenation in the CSS. This is the body style I use; its last three properties turn off hyphenation for the entire eBook.
Thanks for the tip, but I use these epubs for further conversion into accessible formats for the sight impaired. Hyphens in the middle of words exist within the body text and are kept across different formats, so I am looking for a method to erase these hyphens early in the conversion process without deleting every single one manually.
01-21-2021, 10:17 AM   #4
theducks
Quote:
Originally Posted by Thomas_Georg View Post
Thanks for the tip, but I use these epubs for further conversion into accessible formats for the sight impaired. Hyphens in the middle of words exist within the body text and are kept across different formats, so I am looking for a method to erase these hyphens early in the conversion process without deleting every single one manually.
So what does the sight-impaired person get for phrases that do NEED hyphens, like double-barreled? doublebarreled? double barreled?

Seems to me that the program needs to adjust its interpretation based upon context: double<very short new word pause>barreled

I believe the CSS code JUST instructs the render engine (viewer) on how to BREAK words for visual line fit. Those break points might not always be coded in the book (as soft hyphens).
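
For what it's worth, if soft hyphens are coded in the text, they can be stripped without touching the real hyphens. A minimal sketch in Python, assuming the content files have already been read into a string; the name `strip_soft_hyphens` is made up for illustration:

```python
import re

# Soft hyphen as a literal character (U+00AD), as the named entity &shy;,
# or as a decimal/hexadecimal character reference.
SOFT_HYPHEN = re.compile("\u00ad|&shy;|&#173;|&#x0*ad;", re.IGNORECASE)

def strip_soft_hyphens(text: str) -> str:
    """Remove soft hyphens; ordinary '-' characters are not touched."""
    return SOFT_HYPHEN.sub("", text)
```

A hard hyphen such as the one in double-barreled is left alone, so this only helps when the unwanted hyphens really are soft ones.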
01-21-2021, 01:34 PM   #5
retiredbiker
Quote:
Hyphens in the middle of words exist within the body text, and are kept across different formats,
If your book came from an OCR process, it is very possible to have real hyphens scattered through the book, where the print copy had them. Some OCR is smart enough to remove them from the ends of lines, but not all, especially older ones.

If that is your case, you could remove them with search and replace, but like theducks said, what do you do where they are valid, like "forty-two"? If you just have a bad book to start with, your only option is to proofread it and manually remove the bad ones. Very tedious indeed.

One thing you might try--examine them closely in the editor. I've seen the odd book where the end-of-line hyphens came through as "- " (hyphen space), but the valid ones had no space, and I could clean it up by searching out the "- " instances. But never perfect: proofing still needed.
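
The hyphen-space cleanup can be sketched as a regular expression. This is only an illustration in Python, assuming the text has been extracted to a plain string; `join_broken_words` is a hypothetical helper, and the same pattern could be tried in any regex search and replace:

```python
import re

# A hyphen followed by whitespace, but only when it sits between two letters,
# e.g. "conver- sion". "forty-two" (no space) and "yes - no" (space before
# the hyphen) do not match.
BROKEN_HYPHEN = re.compile(r"(?<=[^\W\d_])-\s+(?=[^\W\d_])")

def join_broken_words(text: str) -> str:
    """Join words that were split as 'hyphen + space' at an old line end."""
    return BROKEN_HYPHEN.sub("", text)
```

It will also wrongly join suspended hyphens such as "pre- and post-war", which is one more reason proofing is still needed.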
01-21-2021, 02:59 PM   #6
JSWolf
Spell check the book. It should find words broken with hyphens.
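
A rough way to automate the same idea, sketched in Python: drop the hyphen only when the joined word is one the dictionary knows. This is an adaptation, not the editor's spell checker; `known_words` stands in for whatever word list is available, and `dehyphenate` is a made-up name:

```python
import re

# Two runs of letters joined by a single hyphen, e.g. "conver-sion".
HYPHENATED = re.compile(r"\b([^\W\d_]+)-([^\W\d_]+)\b")

def dehyphenate(text: str, known_words: set) -> str:
    """Remove a hyphen only if the joined word is in known_words (lowercase)."""
    def fix(match):
        joined = match.group(1) + match.group(2)
        # Keep the original spelling when the joined form is not a known word.
        return joined if joined.lower() in known_words else match.group(0)
    return HYPHENATED.sub(fix, text)
```

Pairs like re-sign/resign, where both spellings are real words, will be joined wrongly, so the result still needs a look.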
01-21-2021, 03:52 PM   #7
BetterRed
I would use the Editor's Tools->Reports->Words list with a '-' in the filter. It should then be easy to spot any questionable hyphenation and fix it in the code editor, viz:

[Attached screenshot: Screenshot 2021-01-22 074435.jpg]
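
The same kind of list can be approximated outside the editor. A small Python sketch, assuming the book text is available as a string; `hyphenated_word_counts` is a hypothetical name and this does not use the editor's report itself:

```python
import re
from collections import Counter

# One or more letter runs joined by hyphens, e.g. "forty-two", "mother-in-law".
HYPHENATED_WORD = re.compile(r"[^\W\d_]+(?:-[^\W\d_]+)+")

def hyphenated_word_counts(text: str) -> Counter:
    """Count every hyphenated word in text, ignoring case."""
    return Counter(word.lower() for word in HYPHENATED_WORD.findall(text))
```

Entries that occur only once are probably worth checking first, though that is only a heuristic.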

BR
04-18-2022, 10:11 PM   #8
Phil MTL
Brilliant!