Go Back   MobileRead Forums > E-Book Software > Calibre > Conversion

Old 03-28-2019, 12:35 AM   #16
davidfor
Grand Sorcerer
Posts: 24,905
Karma: 47303824
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
Quote:
Originally Posted by DNSB View Post
Given that a reflowable epub simply shows what is between the opening and closing tags, using heuristics in an attempt to unwrap lines is going to be somewhat futile. The LFs embedded in the epub from FadedPages show as a space in all the renderers I tested (and that includes a couple of really dumb, disregard the book's CSS renderers).
Firstly, I always assumed the unwrap heuristics only applied to PDF and similar formats. It didn't seem to make sense for an epub. But, playing with it on that book, it does have an effect on the epub. Using the default of 0.40 splits the existing paragraphs in what are probably the wrong places. As I don't like artificial line breaks like the ones this book has, I had a play to see if I could achieve what I think @lumpynose wants. Setting it high, to 1, meant basically each line was split into its own paragraph. Setting it low, to 0.05, unwrapped it correctly and removed the extra line breaks.
Quote:
Both calibre's editor and Sigil will happily remove those unneeded LFs when you prettify the file. Sigil and calibre's editor do have some differences which makes it handy to have choices.
The calibre editor prettify doesn't fix this. It won't unwrap lines like this. I have a regex that does it for me as it is a fairly common thing. I just find it offensive
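For anyone who wants to repeat the unwrap-factor experiment from the command line, here is a minimal sketch. It assumes calibre's `ebook-convert` is on the PATH and uses its `--enable-heuristics` and `--html-unwrap-factor` options; the function name and the file names are made up for illustration, and the script only builds and prints the commands rather than running them.

```python
import shlex


def build_convert_command(src, dst, unwrap_factor=None):
    """Return the argument list for a calibre ebook-convert run.

    unwrap_factor=None leaves heuristics switched off, which is
    calibre's default. Otherwise heuristics are enabled and the
    line-unwrap factor (a decimal between 0 and 1) is passed along.
    """
    cmd = ["ebook-convert", src, dst]
    if unwrap_factor is not None:
        if not 0.0 <= unwrap_factor <= 1.0:
            raise ValueError("unwrap factor must be between 0 and 1")
        cmd += ["--enable-heuristics", "--html-unwrap-factor", str(unwrap_factor)]
    return cmd


if __name__ == "__main__":
    # book.epub and book.azw3 are placeholder file names.
    for factor in (None, 0.05, 0.4, 1.0):
        print(shlex.join(build_convert_command("book.epub", "book.azw3", factor)))
```

Passing the returned list to `subprocess.run` would perform the conversion; comparing the output for 0.05, 0.4 and 1.0 on a copy of the book is a cheap way to see which value suits it.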
Old 03-28-2019, 01:11 AM   #17
DNSB
Bibliophagist
Posts: 46,374
Karma: 169098492
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Quote:
Originally Posted by DNSB View Post
Both calibre's editor and Sigil will happily remove those unneeded LFs when you prettify the file. Sigil and calibre's editor do have some differences which makes it handy to have choices.
Quote:
Originally Posted by davidfor View Post
The calibre editor prettify doesn't fix this. It won't unwrap lines like this. I have a regex that does it for me as it is a fairly common thing. I just find it offensive
Interesting. I hadn't noticed that the calibre editor's prettify did not remove extraneous EOLs. Admittedly, Sigil is my go-to editor, with the calibre editor being used for such things as adding unmanifested files to the manifest.

I admit to being rather surprised that calibre actually tries to unwrap lines in a reflowable epub. Perhaps there are excellent reasons that Kovid decided that heuristics are off by default.
Old 03-28-2019, 09:48 AM   #18
theducks
Well trained by Cats
Posts: 31,079
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by DNSB View Post
Interesting. I hadn't noticed that the calibre editor's prettify did not remove extraneous EOLs. Admittedly, Sigil is my go-to editor, with the calibre editor being used for such things as adding unmanifested files to the manifest.

I admit to being rather surprised that calibre actually tries to unwrap lines in a reflowable epub. Perhaps there are excellent reasons that Kovid decided that heuristics are off by default.

I also start with Sigil for cleanup, then I switch to calibre's editor for fix-up (correcting the errors from its bug report, and spelling).

I like each the way they are. This allows me to catch a lot of things that would slip past the other.
Old 03-28-2019, 02:34 PM   #19
lumpynose
Wizard
Posts: 1,086
Karma: 6719822
Join Date: Jul 2012
Device: Palm Pilot M105
Quote:
Originally Posted by BetterRed View Post
@lumpynose - Did you try Prettifying the EPUB HTML with an editor (calibre or Sigil), and then converting to AZW using the calibre default settings - that's usually the best place to start.

BR
No, but it sounds like I need to give Sigil a try.

I'm flabbergasted at how many books I can load onto my new kindle and as a result I've gone buck wild putting public domain books on it. And as a result of that I don't look at the conversion of each one and then it's irritating when I open one and find the formatting is wacko.

I also like a ragged right margin. If there's a conversion recipe that people find usually works, I'd love to hear about it. A list of best practices for adding public domain books would also be helpful; in the beginning I was under the mistaken impression that mobi was a better format to download and start with, but now I'm using epub.
Old 03-28-2019, 02:40 PM   #20
lumpynose
Wizard
Posts: 1,086
Karma: 6719822
Join Date: Jul 2012
Device: Palm Pilot M105
Quote:
Originally Posted by theducks View Post
I did NOT use heuristics. there was no need

I do prefer Justified, but that is personal taste. I also do my touch ups manually, so heuristics is not ticked
Ok, thanks. (I'm a fan of ragged right since that gives better spacing between words when there are long words on a line.) I've been loading gobs of books onto my new kindle. One of them produced a badly formatted azw3 file, and turning on heuristics fixed it, so I've gotten into the habit of using that. It sounds like I need to do a preclean using Sigil before I have calibre convert to azw3.
Old 03-28-2019, 02:50 PM   #21
lumpynose
Wizard
Posts: 1,086
Karma: 6719822
Join Date: Jul 2012
Device: Palm Pilot M105
Quote:
Originally Posted by davidfor View Post
The calibre editor prettify doesn't fix this. It won't unwrap lines like this. I have a regex that does it for me as it is a fairly common thing. I just find it offensive
From my misadventure with this it seems to me that what calibre would work best with is EPUBs where the text between the beginning and ending p tags is one long line with no linefeeds.

It sounds like you may have a regex that does that? If so could you post it with instructions for where (which program) you use it?

Thanks
Old 03-28-2019, 03:05 PM   #22
lumpynose
Wizard
Posts: 1,086
Karma: 6719822
Join Date: Jul 2012
Device: Palm Pilot M105
Quote:
Originally Posted by davidfor View Post
The calibre editor prettify doesn't fix this. It won't unwrap lines like this. I have a regex that does it for me as it is a fairly common thing. I just find it offensive
It looks like Sigil does this; I just tried its Tools > Reformat HTML > Mend and Prettify all HTML Files and it looks like it's just what I (Calibre) needs.
Old 03-28-2019, 09:11 PM   #23
davidfor
Grand Sorcerer
Posts: 24,905
Karma: 47303824
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
Quote:
Originally Posted by lumpynose View Post
From my misadventure with this it seems to me that what calibre would work best with is EPUBs where the text between the beginning and ending p tags is one long line with no linefeeds.
Because of the way epubs work, it doesn't really matter. Any epub reader/device that doesn't handle them is broken. The one place it causes a little confusion is highlighting text on my Kobo devices. If there is a line feed in the text, it shows in the annotations list on the device. But, the main reason I fix it is because I don't like the look of it.
Quote:
It sounds like you may have a regex that does that? If so could you post it with instructions for where (which program) you use it?
It's a saved search in the calibre editor. It started as:

Code:
(\w)
(\w)
with the replacement:
Code:
\1 \2
It has since expanded to add other possible characters for the last and first characters of the lines. There's probably a better way to do it, but this is the quick and dirty way. There's always one or two lines that it doesn't catch. It's on my home machine. I'll try to remember to post it tonight.

After seeing what the conversion heuristics does, I might have a look at it and see how it works and if I can use it. My current version misses just enough to be annoying.
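To show what the starting search does and why it misses lines, here is a small Python sketch. `BASIC` is the search from the post written as one pattern with `\n` for the line break. `BROADER` is only an assumed illustration of one way to widen it, not the expanded saved search mentioned above: it joins any line break that has text rather than a tag on both sides. The sample paragraph is made up.

```python
import re

# The starting search from the post, written as a single pattern.
BASIC = re.compile(r"(\w)\n(\w)")

# Hypothetical broader variant (an assumption, not the saved search from
# the post): join a line break when the character before it is not
# whitespace or '>' and the character after it is not whitespace or '<'.
BROADER = re.compile(r"(?<=[^\s>])[ \t]*\r?\n[ \t]*(?=[^\s<])")


def unwrap_basic(html):
    """Join lines only where a word character sits on both sides of the break."""
    return BASIC.sub(r"\1 \2", html)


def unwrap_broader(html):
    """Join lines inside running text regardless of punctuation."""
    return BROADER.sub(" ", html)


if __name__ == "__main__":
    sample = "<p>It was a dark\nand stormy night,\nand the rain fell.</p>\n<p>Next.</p>"
    print(unwrap_basic(sample))
    print(unwrap_broader(sample))
```

On the sample, the basic pattern joins "dark" and "and" but leaves the break after the comma, which is the kind of line it does not catch. The broader variant joins both and keeps the break between the two paragraphs. It deliberately skips breaks that touch a tag, so a line ending in `</i>` would still be left alone.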
Old 03-29-2019, 12:31 PM   #24
lumpynose
Wizard
Posts: 1,086
Karma: 6719822
Join Date: Jul 2012
Device: Palm Pilot M105
Thanks David. The Sigil program seems to be doing the line fixing thing, removing the newlines between p tags. Hopefully that's all that I need to fix these random books that confuse calibre and end up with undesirable formatting.

So now my recipe before downloading a book to my kindle will be to use Sigil's Mend and Prettify and then use Calibre to convert it to azw3.