#1 |
Member
Posts: 12
Karma: 10
Join Date: May 2014
Device: None
|
Narrow text width, double spacing
The book came in PDF format and I converted it to EPUB. Both versions show a narrow text width of about six words per line. Is there an easy, reliable and quick way to make the lines fill the screen width, but only where the text should run longer than the screen width? (Some of the lines should stay very short, such as "it arrived"; others need to be full width.)
For some reason the line spacing is also double. I would like to reduce that to single. |
#2 |
Member
Posts: 12
Karma: 10
Join Date: May 2014
Device: None
|
I looked in the manual and could not see an easy way to reformat the book, which has come to me with double spacing and very short lines. Is there a way to do it that does not involve editing every line manually?
|
#3 |
Resident Curmudgeon
Posts: 79,786
Karma: 146391129
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Can you post a sample of the HTML code?
|
#4 |
Well trained by Cats
Posts: 31,068
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
When you convert the PDF, try a slightly smaller 'Line unwrap factor' (LUF, on the conversion screen form).
LUF tries to join lines. If it was .45, try .42 and see what that produces. It may take a few tries; overkill is not a good idea. |
#5 |
Member
Posts: 12
Karma: 10
Join Date: May 2014
Device: None
|
Sample HTML
This is from the file browser, which shows the EPUB HTML that was converted from the PDF. Note also the page numbers that appear in the text.
Code:
<p class="calibre1">He was invited to stay for dinner</p>
<p class="calibre1">but he was expected with his sister</p>
<p class="calibre1">and her family out in the yuppie</p>
<p class="calibre1">188/1467</p>
<p class="calibre1">That morning he had also had an</p>
<p class="calibre1">invitation to celebrate Christmas</p>
<p class="calibre1">jöbaden. He said no, but thank you, </p>
<p class="calibre1">certain that there was a limit to</p>
<p class="calibre1">Beckman’s indulgence and quite</p>
<p class="calibre1">sure that he had no ambition to find</p>
<p class="calibre1">out what that limit might be. </p>
<p class="calibre1">Instead he was knocking on the</p>
<p class="calibre1">door where Annika Blomkvist, now</p>
<p class="calibre1">Italian-born husband and their two</p>
<p class="calibre1">children. With a platoon of her hus-</p>
<p class="calibre1">band’s relatives, they were about to</p>
<p class="calibre1">carve the Christmas ham. During</p>
<p class="calibre1">dinner he answered questions about</p>
<p class="calibre1">the trial and received much well-</p>
<p class="calibre1">meaning and quite useless advice. </p>
<p class="calibre1">The only one who had nothing to</p>
<p class="calibre1">say about the verdict was his sister, </p>
<p class="calibre1">although she was the only lawyer in</p>
<p class="calibre1">189/1467</p>
<p class="calibre1">the room. She had worked as clerk</p>
<p class="calibre1">of a district court and as an assistant</p>
<p class="calibre1">prosecutor for several years before</p>
<p class="calibre1">she and three colleagues opened a</p>
<p class="calibre1">law firm of their own with offices on</p>
<p class="calibre1">having taken stock of its happening, </p>
<p class="calibre1">his little sister began to appear in</p>
<p class="calibre1">newspapers as representing battered</p>
<p class="calibre1">or threatened women, and on panel</p>
<p class="calibre1">discussions on TV as a feminist and</p>
<p class="calibre1">wome |
#6 | ||
Member
Posts: 12
Karma: 10
Join Date: May 2014
Device: None
|
PDF formatting
Quote:
This is from the pdf Quote:
|
||
#7 |
Wizard
Posts: 4,553
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
|
Normally, if you enable heuristic processing and experiment with reducing the line-unwrap value, the conversion from PDF can combine short lines in a satisfactory way.
|
#8 | |
Member
Posts: 12
Karma: 10
Join Date: May 2014
Device: None
|
Heuristics
Quote:
Based on the sample, what line-unwrap value should I try first? |
|
#9 |
Wizard
Posts: 4,553
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
|
When you select a book and click on the convert option, it brings up the Convert dialog. Heuristic processing is one of the areas shown in the left-hand panel for which you can set conversion settings. Tick the 'Enable Heuristic processing' checkbox to enable the other settings related to heuristic processing. The line unwrap setting normally starts at 0.40, so try reducing it from this (e.g. to 0.35) and see if it improves the conversion.
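For anyone who would rather script several tries than click through the dialog each time, here is a minimal sketch that only builds command lines for calibre's `ebook-convert` tool. It assumes the PDF input factor is exposed as `--unwrap-factor` and the heuristic one as `--enable-heuristics` plus `--html-unwrap-factor`; check `ebook-convert --help` on your install before relying on those names. The helper `build_convert_command` and the file names are made up for illustration.

```python
def build_convert_command(src, dst, pdf_unwrap=None, heuristic_unwrap=None):
    """Return an ebook-convert argument list; nothing is executed here."""
    for value in (pdf_unwrap, heuristic_unwrap):
        if value is not None and not 0 < value < 1:
            raise ValueError("unwrap factors must be between 0 and 1")
    cmd = ["ebook-convert", src, dst]
    if pdf_unwrap is not None:
        # Assumed flag name for the PDF input 'Line unwrap factor'.
        cmd += ["--unwrap-factor", f"{pdf_unwrap:.2f}"]
    if heuristic_unwrap is not None:
        # Assumed flag names for the heuristic processing settings.
        cmd += ["--enable-heuristics",
                "--html-unwrap-factor", f"{heuristic_unwrap:.2f}"]
    return cmd


if __name__ == "__main__":
    # Print one command per candidate factor so each result can be compared.
    for factor in (0.40, 0.35, 0.30):
        out = f"book-{factor:.2f}.epub"
        print(" ".join(build_convert_command("book.pdf", out,
                                             heuristic_unwrap=factor)))
```

Writing each try to its own output file makes it easy to open the results side by side and stop at the first value that joins the short lines without gluing real paragraphs together.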
|
#10 | ||
Member
Posts: 12
Karma: 10
Join Date: May 2014
Device: None
|
Quote:
Quote:
|
||
#11 |
Well trained by Cats
Posts: 31,068
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Those are the perils of PDF.
There is no perfect conversion. You are lucky if there is even a close conversion. May your regex-fu get stronger, because that is what you need (using the calibre Editor or Sigil): pass after pass of carefully thought-out searches. (If you do them in the wrong order, you make later pattern matches more difficult.) First I would remove the standalone page-number lines. Then I would remove as much junk as can be caught with a single pattern. BACKUP before each new cleaning pass, so that if you get it WRONG you can discard the bad edit. |
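As a rough illustration of that first pass, here is a small Python sketch that drops the standalone page-number paragraphs seen in the posted sample (`188/1467`, `189/1467`). It assumes every such line is a `calibre1` paragraph holding only digits, a slash and digits; the function name `strip_page_numbers` is just for this example. Because `\d+` matches any number of digits, one pass covers every page number regardless of length.

```python
import re

# A whole paragraph containing nothing but "digits/digits".
PAGE_NUMBER_P = re.compile(r'<p class="calibre1">\s*\d+/\d+\s*</p>\s*')


def strip_page_numbers(html: str) -> str:
    """Remove standalone page-number paragraphs, leaving other text alone."""
    return PAGE_NUMBER_P.sub("", html)


sample = (
    '<p class="calibre1">and her family out in the yuppie</p> '
    '<p class="calibre1">188/1467</p> '
    '<p class="calibre1">That morning he had also had an</p>'
)
cleaned = strip_page_numbers(sample)
print(cleaned)
```

The same pattern should be usable as a regex search in the calibre Editor or Sigil with an empty replacement, but test it on a backup copy first.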
#12 |
Member
Posts: 12
Karma: 10
Join Date: May 2014
Device: None
|
The longer lines are acceptable now.
I managed to remove all the inbuilt page numbers by repeating the same regex with one less \d each time I ran Replace All. The replacements are shockingly fast. I don't really understand how to do the rest of the clean-up of the page I showed in post #10 above. What type of regex will distinguish between a single word that should be the only one on its line, such as "Hello", and a single word that should be joined to the next line? One way would be to join all words into one continuous line until a full stop is found, but is that level of control possible? |
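The "join until a full stop" idea can be sketched outside the editor in a few lines of Python. This is only an illustration under stated assumptions: every line is a `calibre1` paragraph as in the posted sample, a sentence ends with `.`, `!` or `?` (optionally followed by a closing quote or bracket), and the name `join_until_full_stop` is made up here. It cannot tell that a short line such as "Hello" without a full stop was meant to stand alone, and it turns every sentence into its own paragraph, so the result still needs a read-through.

```python
import re

P_TAG = re.compile(r'<p class="calibre1">(.*?)</p>', re.S)
# Sentence-ending punctuation, optionally followed by closing quotes/brackets.
ENDS_SENTENCE = re.compile(r"""[.!?]["'”’)]*$""")


def join_until_full_stop(html: str) -> str:
    """Merge consecutive paragraphs until the text ends a sentence."""
    paragraphs, buffer = [], ""
    for text in P_TAG.findall(html):
        text = text.strip()
        if not text:
            continue
        if not buffer:
            buffer = text
        elif buffer.endswith("-"):
            # Keep the hyphen and join without a space; deciding whether the
            # hyphen should go is left to a later, supervised pass.
            buffer += text
        else:
            buffer += " " + text
        if ENDS_SENTENCE.search(buffer):
            paragraphs.append(buffer)
            buffer = ""
    if buffer:
        # Trailing text with no full stop stays as its own paragraph.
        paragraphs.append(buffer)
    return "\n".join(f'<p class="calibre1">{p}</p>' for p in paragraphs)


sample = (
    '<p class="calibre1">the trial and received much well-</p> '
    '<p class="calibre1">meaning and quite useless advice. </p> '
    '<p class="calibre1">The only one who had nothing to</p>'
)
joined = join_until_full_stop(sample)
print(joined)
```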
#13 | |
Well trained by Cats
Posts: 31,068
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
I have 5 different 'Join' saved searches. 3 are basic and run as is. 2 more need to be tweaked case by case, because they need to match the current class= portion to fine-tune greediness. Almost all are run in Replace Next mode (use Find to skip a match). Removing hyphens at line ends is a plague. It could be a hyphenated word (join with no space), or it could be a pseudo em dash (--), where context is everything in the break/no-break decision. There are examples in the stickies (and other places) over in the Sigil forum. For the most part they also work in calibre. |
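To show why the hyphen case resists a single blind Replace All, here is a hedged Python sketch of one possible decision rule. It is not one of the saved searches mentioned above. The assumption is that you have some word list to consult (the tiny `vocabulary` set below is a stand-in): if the two halves glued together form a known word, the hyphen was a line-break hyphen and is dropped; otherwise the hyphen is kept. Lines ending in `--` are returned as `None` so a person can decide. The function name `join_hyphenated` is hypothetical.

```python
import re

LAST_WORD = re.compile(r"[^\W\d_]+$")   # letters at the end of a string
FIRST_WORD = re.compile(r"^[^\W\d_]+")  # letters at the start of a string


def join_hyphenated(first: str, second: str, vocabulary: set):
    """Join two lines; return None when a human should decide."""
    if first.endswith("--"):
        return None  # pseudo em dash: context decides break or no break
    if not first.endswith("-"):
        return first + " " + second
    stem = LAST_WORD.search(first[:-1])
    rest = FIRST_WORD.search(second)
    if stem and rest and (stem.group() + rest.group()).lower() in vocabulary:
        # "hus-" + "band's" -> "husband's": drop the line-break hyphen.
        return first[:-1] + second
    # "well-" + "meaning" -> "well-meaning": keep the real hyphen.
    return first + second


vocabulary = {"husband"}  # stand-in for a real dictionary word list
a = join_hyphenated("children. With a platoon of her hus-",
                    "band’s relatives, they were about to", vocabulary)
b = join_hyphenated("the trial and received much well-",
                    "meaning and quite useless advice.", vocabulary)
print(a)
print(b)
```

Even with a large word list this will guess wrong sometimes, which is a good argument for stepping through matches one at a time rather than replacing everything at once.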
|