Old 12-22-2014, 11:07 PM   #1
Nyssa
Series Addict
Posts: 6,180
Karma: 167189477
Join Date: Dec 2010
Location: Florida, USA
Device: Kindle Paperwhite (2nd Gen)
How can I fix it when every line is a paragraph?

I have a book that for some reason has every line, regardless of punctuation, listed as a paragraph. (Please see image)

Is there a way, using the editor, that I can remove all of the paragraph HTML tags at once?

I'm practically reading the book as I'm trying to correct it (not an enjoyable experience), and it seems all of the books in this particular group were formatted the same way.

[Attached image: Screen Shot 2014-12-22 at 11.00.31 PM.png]
Old 12-22-2014, 11:33 PM   #2
BetterRed
null operator (he/him)
Posts: 21,725
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by Nyssa View Post
I have a book that for some reason has every line, regardless of punctuation, listed as a paragraph. (Please see image)

Is there a way, using the editor, that I can remove all of the paragraph HTML tags at once?

I'm practically reading the book as I'm trying to correct it (not an enjoyable experience), and it seems all of the books in this particular group were formatted the same way.

That's a poorly converted PDF. If you have the PDF, tweak the conversion's Line Wrap setting up a little; if the result is worse, tweak it down a bit. You may need to use an alternate converter: there's a sticky thread at the top of the Conversion subforum all about converting PDFs.

Psst - it's better to start your own thread rather than tack onto someone else's. Then, if they are different problems, the different answers don't get confused.

BR
Old 12-22-2014, 11:44 PM   #3
Nyssa
Series Addict
Quote:
Originally Posted by BetterRed View Post
That's a poorly converted PDF. If you have the PDF, tweak the conversion's Line Wrap setting up a little; if the result is worse, tweak it down a bit. You may need to use an alternate converter: there's a sticky thread at the top of the Conversion subforum all about converting PDFs.

Psst - it's better to start your own thread rather than tack onto someone else's. Then, if they are different problems, the different answers don't get confused.

BR
  1. I don't have the PDFs
  2. Sorry, hopefully a mod can correct my error.
Old 12-23-2014, 12:08 AM   #4
BetterRed
null operator (he/him)
Nyssa, I am a mod, but still using training wheels. I can't see a split-thread option like I get on other sites. Not to worry - I'll let you off with a warning.

Someone can probably give you a couple of regexes to fix the broken lines.

I'm more comfortable fixing things like that without the HTML markup, so I'd convert to formatted text, use an editor like Notepad++ (or the one I just found called Bowpad - Scintilla wrapped in a pretty ribbon), and then convert that back to EPUB.

BR
Old 12-23-2014, 12:39 AM   #5
Nyssa
Series Addict
Quote:
Originally Posted by BetterRed View Post
Nyssa, I am a mod, but still using training wheels. I can't see a split-thread option like I get on other sites. Not to worry - I'll let you off with a warning.
Sorry, I'm used to the guys (and girls) in green.

Quote:
Originally Posted by BetterRed View Post
Someone can probably give you a couple of regexes to fix the broken lines.

I'm more comfortable fixing things like that without the HTML markup, so I'd convert to formatted text, use an editor like Notepad++ (or the one I just found called Bowpad - Scintilla wrapped in a pretty ribbon), and then convert that back to EPUB.

BR
Umm...
Old 12-23-2014, 03:00 AM   #6
BetterRed
null operator (he/him)
Quote:
Originally Posted by Nyssa View Post
Sorry, I'm used to the guys (and girls) in green.

Only the generals get Green Jackets, us NCOs have to skive around in mufti.



Quote:
Originally Posted by Nyssa View Post
Umm...
Does "Umm..." need an answer?

BR
Old 12-23-2014, 03:26 AM   #7
Nyssa
Series Addict
Quote:
Originally Posted by BetterRed View Post
Does "Umm..." need an answer?

BR
First of all, "mufti" is a new term for me. I know them as "civvies".

Second of all, the "Umm..." was because I have no idea what that (regex) means, but maybe I can figure it out, especially since I wouldn't know what to ask from here anyway.

I was able to find "cleaner" versions of two of the books (maybe three; I haven't checked the last one yet), but one of them has the same issue as my previous version. And I honestly don't feel like cleaning that mess up manually... It's too much.
Old 12-23-2014, 08:16 AM   #8
dmonasse
Member
Posts: 23
Karma: 10
Join Date: Apr 2014
Location: Paris
Device: ipad 2, Ubuntu
First of all: remove the
Code:
</p>

<p class="calibre2">
when they are followed by a lowercase letter. Use this regex:
Code:
</p>\n+<p class="calibre2">(?=[a-z])
For the replacement, use a single space (or nothing, if the broken lines already end in a space) so the joined words stay separated. It will look nicer. Hope this helps.
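
For anyone who wants to try this pattern outside the editor, here is a minimal Python sketch of the same search and replace. The sample text and the names `pattern`, `sample` and `merged` are made up for illustration; only the regex comes from the post above, and a single space is used as the replacement.

```python
import re

# Pattern from the post: a paragraph break followed by a lowercase letter.
# The lookahead (?=[a-z]) checks the next character without consuming it.
pattern = re.compile(r'</p>\n+<p class="calibre2">(?=[a-z])')

# Hypothetical sample: the first break falls mid-sentence, the second does not.
sample = (
    '<p class="calibre2">The dragon looked</p>\n\n'
    '<p class="calibre2">at the princess.</p>\n\n'
    '<p class="calibre2">Then it spoke.</p>'
)

# Replace each matched break with one space so the words stay separated.
merged = pattern.sub(' ', sample)
print(merged)
```

Only the first break is merged, because the second one is followed by a capital letter. Lines that continue with a capitalised word (a name, for instance) would be missed by this pattern and need a separate pass.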

Last edited by dmonasse; 12-23-2014 at 08:20 AM.
Old 12-23-2014, 09:06 AM   #9
Tex2002ans
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
First, you want to get rid of the useless spaces before the closing "</p>"

Regex #1:

Search: \s+</p>
Replace: </p>

Explanation: This looks for one or more whitespace characters (spaces, tabs, or line breaks) followed by "</p>", and replaces the match with just "</p>".

Example:

Code:
<p>This is a sample line </p>
Code:
<p>This is a sample line</p>
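
As a rough illustration only, the same step can be reproduced with Python's `re` module. The second sample string is an assumption added here to show that `\s` also catches a line break before the closing tag.

```python
import re

# Regex #1 from the post: whitespace immediately before </p>.
TRAILING_SPACE = re.compile(r'\s+</p>')

sample = '<p>This is a sample line </p>'
cleaned = TRAILING_SPACE.sub('</p>', sample)
print(cleaned)

# Hypothetical second case: the closing tag sits on its own line.
wrapped = '<p>Another sample line\n</p>'
cleaned_wrapped = TRAILING_SPACE.sub('</p>', wrapped)
print(cleaned_wrapped)
```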
Regex #2:

Search: -</p>\s+<p>
Replace: (leave empty)

Explanation: This removes a hyphen at the very end of the "paragraph" and joins the paragraph to the next line.

Side Note: I apply the above regex one match at a time, because some end-of-line hyphens are real (part of a compound such as "well-known") rather than "soft hyphens" introduced by the PDF's line breaking, and removing those would wrongly fuse the word.

Example:

Code:
<p>Blah blah blah govern-</p>
<p>ment.</p>
Code:
<p>Blah blah blah government.</p>
Regex #2 (Variant):

Search: -</p>\s+<p>
Replace: -

Note: I don't use this one, although if there are TONS of hyphens at the end of each line, it might be best to do it this way and take care of the hyphen situation on your own at a later step. I personally prefer to use the Spell Check Tool and search for a single hyphen by itself: '-'. This will give you a list of every single word with a hyphen in it. Then I can check for and fix mistakes there much more quickly.

Example:

Code:
<p>Blah blah blah govern-</p>
<p>ment.</p>
Code:
<p>Blah blah blah govern-ment.</p>
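
The difference between Regex #2 and its variant can be sketched in Python as follows. This is an illustration under the assumption that the paragraphs use a bare `<p>` tag, as in the examples above; the names `HYPHEN_BREAK`, `joined` and `kept` are invented here.

```python
import re

# Shared search pattern: a hyphen right before </p>, then the next <p>.
HYPHEN_BREAK = re.compile(r'-</p>\s+<p>')

sample = '<p>Blah blah blah govern-</p>\n<p>ment.</p>'

# Regex #2: drop the hyphen and join the two halves of the word.
joined = HYPHEN_BREAK.sub('', sample)

# Variant: join the lines but keep the hyphen for later review.
kept = HYPHEN_BREAK.sub('-', sample)

print(joined)
print(kept)
```

Passing `count=1` to `sub` replaces only the first match, which is one way to approximate the one-by-one approach described in the side note.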
Regex #3:

Search: ([^>”\?\!\.])</p>\s+<p>
Replace: \1 (with one trailing space; see the note below)

Explanation: This regex searches for a paragraph that DOES NOT end in a "greater than sign", "right double quote", "question mark", "exclamation point", or "period". It then combines that paragraph with the next one.

Note: There is a space after the "\1".

Example:

Code:
<p>Susie said</p>
<p>that she was going to jump over a tree.</p>
<p>She also said,</p>
<p>that this was just a sample.</p>
Code:
<p>Susie said that she was going to jump over a tree.</p>
<p>She also said, that this was just a sample.</p>
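
Here is a minimal Python sketch of Regex #1 followed by Regex #3 applied to the example above. The function name `unwrap_paragraphs` is hypothetical, and Regex #2 is left out on purpose because the post recommends applying it case by case.

```python
import re

def unwrap_paragraphs(html: str) -> str:
    """Apply Regex #1 and then Regex #3 from the post, in that order."""
    # Regex #1: strip whitespace before the closing tag.
    html = re.sub(r'\s+</p>', '</p>', html)
    # Regex #3: join paragraphs that do not end in >, ”, ?, ! or .
    # The replacement is group 1 followed by one space.
    return re.sub(r'([^>”\?\!\.])</p>\s+<p>', r'\1 ', html)

sample = (
    '<p>Susie said</p>\n'
    '<p>that she was going to jump over a tree.</p>\n'
    '<p>She also said,</p>\n'
    '<p>that this was just a sample.</p>'
)

result = unwrap_paragraphs(sample)
print(result)
```

The break after "tree." survives because the period is in the excluded character set, while the breaks after "said" and "said," are merged.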

Last edited by Tex2002ans; 12-23-2014 at 09:23 AM.
Old 12-23-2014, 10:53 AM   #10
Nyssa
Series Addict
Based on what I've seen as I've tried to clean up, I would need Regex #3. However, when I enter those search and replace terms, I get:
Code:
 
Searching done: Replaced 0 occurrences of ([^>”\?\!\.])</p>\s+<p>
I made sure the mode was set to "Regex".

Regex 1 ran without a problem, and I didn't change anything in the code (I just copied/pasted).
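
One possible reason for the zero matches, not confirmed anywhere in this thread: Regex #3 looks for a bare `<p>`, while the markup quoted in post #8 uses `<p class="calibre2">`. The Python sketch below compares the original pattern with an assumed variant, `TOLERANT`, that accepts attributes on the opening tag. The sample text is made up.

```python
import re

# Regex #3 exactly as posted: expects a bare <p> after the break.
ORIGINAL = r'([^>”\?\!\.])</p>\s+<p>'
# Assumed variant: [^>]* allows any attributes inside the opening tag.
TOLERANT = r'([^>”\?\!\.])</p>\s+<p[^>]*>'

sample = (
    '<p class="calibre2">Susie said</p>\n'
    '<p class="calibre2">that she was going to jump over a tree.</p>'
)

original_hits = len(re.findall(ORIGINAL, sample))
tolerant_hits = len(re.findall(TOLERANT, sample))
merged = re.sub(TOLERANT, r'\1 ', sample)

print(original_hits, tolerant_hits)
print(merged)
```

Note that `<p[^>]*>` is deliberately loose and would also match tags such as `<pre>`, so matches are worth reviewing before replacing everything.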
Old 12-23-2014, 11:04 AM   #11
eschwartz
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Each regex should be run, in order.
Old 12-23-2014, 11:11 AM   #12
Nyssa
Series Addict
Quote:
Originally Posted by eschwartz View Post
Each regex should be run, in order.
Oh! Sorry, I thought I could choose between options. I didn't realize they were steps... Hyphens aren't a problem, so I figured I didn't need Regex #2 or its variant.

Thank you.
Old 12-23-2014, 11:16 AM   #13
eschwartz
Ex-Helpdesk Junkie
#1 cleans up the ebook so that #2 can run, and #2 does the same for #3.

If hyphens aren't a problem, you can probably skip #2. The worst it can do is nothing.
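
To illustrate why the order matters, here is a small Python sketch with a made-up sample in which a finished sentence has a stray space before `</p>`. The tuple names are invented for this example; the patterns are Regex #1 and Regex #3 from post #9.

```python
import re

STRIP_SPACES = (r'\s+</p>', '</p>')                # Regex #1
JOIN_LINES = (r'([^>”\?\!\.])</p>\s+<p>', r'\1 ')  # Regex #3

# Hypothetical input: a complete sentence with a stray space before </p>.
sample = '<p>The end. </p>\n<p>A new paragraph.</p>'

# Regex #3 alone sees the space, not the period, right before </p>,
# so it wrongly merges the two paragraphs.
skipped = re.sub(*JOIN_LINES, sample)

# Running Regex #1 first removes the space, so Regex #3 leaves the break alone.
in_order = re.sub(*JOIN_LINES, re.sub(*STRIP_SPACES, sample))

print(skipped)
print(in_order)
```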
Old 12-23-2014, 11:20 AM   #14
eschwartz
Ex-Helpdesk Junkie
Ooh, The Enchanted Forest Chronicles! I have all the pbooks, pity they were never released digitally.
Old 12-23-2014, 11:35 AM   #15
Nyssa
Series Addict
Quote:
Originally Posted by eschwartz View Post
Ooh, The Enchanted Forest Chronicles! I have all the pbooks, pity they were never released digitally.
Yep. These were gifted to me a while ago and I'm just now ready to read them.
All times are GMT -4.

