Old 06-29-2015, 01:55 AM   #1
Chris_Snow
</p> at beginning of paragraph - how do I change?

Greetings,

I have a file I'm using as a test bed to learn regex. It is a PDF that was badly converted to EPUB. In the original file, the actual body text was stored under the Misc folder within the EPUB.

I didn't know how to fix that, so I converted the file to HTML and then back to EPUB. That seems to have fixed the problem: the body text is now where it should be.

However...the code for the paragraphs is...

Code:
<p class="calibre2"></p>The paragraph goes in here.
As you can see the "</p>" is at the beginning and not at the end. Is there an easy way to change this? There must be a regex I can use to "find/replace" it?

I have learned a lot about how to tweak things using regex on this file, but this is somewhat beyond me.

Appreciate the assistance.

Update: I found out I could use .* and some variations to achieve a result. Got all the end tags where they belong now.

Last edited by Chris_Snow; 06-29-2015 at 02:18 AM.
Old 06-29-2015, 02:24 AM   #2
doubleshuffle
The experts will certainly have more interesting solutions, but I just tested this and it works:

Search:
Code:
</p>(.*?)
<(.*?)>
Replace with:
Code:
\1</p> <\2>
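For anyone who wants to try this search and replace outside Sigil, here is a minimal Python sketch. It assumes the two lines of the search pattern are joined by a single newline, and the sample text is made up for illustration.

```python
import re

# Search and replace strings as posted above. The line break inside the
# search pattern is assumed to be a literal newline (\n).
SEARCH = r"</p>(.*?)\n<(.*?)>"
REPLACE = r"\1</p> <\2>"

# Hypothetical sample in the same broken shape as the original post.
sample = (
    '<p class="calibre2"></p>The paragraph goes in here.\n'
    '<p class="calibre2"></p>Is this one fixed too?\n'
    "</body>"
)

# Each misplaced </p> is moved to the end of its line, just before the
# tag that opens the following line.
fixed = re.sub(SEARCH, REPLACE, sample)
print(fixed)
```

Because the pattern needs a tag at the start of the following line, the last paragraph is only fixed when something such as `</body>` comes after it.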
Old 06-29-2015, 02:57 AM   #3
rubeus
Quote:
Originally Posted by Chris_Snow View Post
Code:
<p class="calibre2"></p>The paragraph goes in here.
As you can see the "</p>" is at the beginning and not at the end. Is there an easy way to change this? There must be a regex I can use to "find/replace" it?
For this particular problem there's a regex for sure. But I guess this example does not cover all cases. There might be sentences that end not with a full stop but with question marks, exclamation marks, parentheses, etc.

My approach would be to delete all </p> and let tiny do the rest.
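The idea can be illustrated with a rough Python sketch. This is not the cleanup tool itself: the second step is a simplified, hypothetical stand-in that closes each paragraph just before the next `<p>` tag or before `</body>`, and the function name and sample text are invented for the example.

```python
import re

def strip_and_reclose(html: str) -> str:
    """Delete every </p>, then re-close paragraphs the way a repair tool might."""
    # Step 1: remove all closing paragraph tags, misplaced or not.
    opened_only = html.replace("</p>", "")
    # Step 2 (hypothetical stand-in for the automatic cleanup): end each
    # paragraph just before the next <p ...> tag or before </body>.
    return re.sub(
        r"(<p\b[^>]*>)(.*?)\s*(?=<p\b|</body>)",
        r"\1\2</p>\n",
        opened_only,
        flags=re.DOTALL,
    )

# Made-up sample whose paragraphs do not end with a full stop.
sample = (
    '<p class="calibre2"></p>Does this work?\n'
    '<p class="calibre2"></p>(It seems to.)\n'
    "</body>"
)
repaired = strip_and_reclose(sample)
print(repaired)
```

Since nothing here depends on how a sentence ends, question marks, exclamation marks and parentheses need no special handling.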
Old 06-29-2015, 06:03 AM   #4
Notjohn
Quote:
Originally Posted by rubeus View Post
My approach would be to delete all </p> and let tiny do the rest.
Neat! I was going to replace <p class="calibre2"></p> with </p><p>
Old 06-29-2015, 07:02 AM   #5
DiapDealer
Quote:
Originally Posted by rubeus View Post
My approach would be to delete all </p> and let tiny do the rest.
I assume you're referring to Tiny, the helpful Sigil gnome?
Old 06-29-2015, 10:25 AM   #6
rubeus
Quote:
Originally Posted by DiapDealer View Post
I assume you're referring to Tiny, the helpful Sigil gnome?
Tiny, the Sigil gnome also known as tidy
Old 06-29-2015, 10:29 AM   #7
avantman42
I don't claim to be great at regular expressions, but I think this should work, and preserve any class/style attributes in the paragraph:

Find:
Code:
(<p[^>]+>)<\/p>(.*)
Replace:
Code:
\1\2</p>
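A quick way to check this Find/Replace pair is a short Python sketch. It assumes each paragraph sits on its own line and relies on Python's default mode, where `.` does not match a newline. The sample text is made up.

```python
import re

# Find and Replace strings as posted above.
FIND = r"(<p[^>]+>)<\/p>(.*)"
REPLACE = r"\1\2</p>"

sample = (
    '<p class="calibre2"></p>The paragraph goes in here.\n'
    '<p class="calibre2"></p>Does it end with a question mark?'
)

# Group 1 keeps the opening tag with its attributes; group 2 is the rest
# of the line, after which the closing tag is appended.
fixed = re.sub(FIND, REPLACE, sample)
print(fixed)
```

Because `(.*)` simply takes the rest of the line, the final punctuation of the paragraph does not matter.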
Old 06-29-2015, 10:37 AM   #8
rubeus
I would use * instead of + to catch tags without attributes.
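The difference is easy to see in a small Python sketch that applies the Find pattern posted earlier in the thread to a paragraph tag with no attributes. The sample string is invented for the example.

```python
import re

REPLACE = r"\1\2</p>"
WITH_PLUS = r"(<p[^>]+>)<\/p>(.*)"  # as originally posted
WITH_STAR = r"(<p[^>]*>)<\/p>(.*)"  # with the suggested change

# A bare <p> tag: there is nothing between "<p" and ">".
bare = "<p></p>No attributes on this one."

plus_result = re.sub(WITH_PLUS, REPLACE, bare)  # no match, text unchanged
star_result = re.sub(WITH_STAR, REPLACE, bare)  # closing tag moved to the end
print(plus_result)
print(star_result)
```

With `+` the pattern requires at least one character between `<p` and `>`, so a bare `<p>` is skipped; with `*` both bare and attributed tags match.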
Old 06-29-2015, 01:57 PM   #9
avantman42
Good point. I admit I didn't have time to test it much.
Old 06-29-2015, 04:09 PM   #10
Chris_Snow
Thx very muchly for all the pointers. You are right that my small regex didn't pick up all the paragraph instances (endings with question marks, etc.), but surprisingly there were very few, and I figured out how to modify the regex to pick up a question mark. I seem to be able to sort out small changes, but I have a lot of trouble getting one regex to pick up everything.

I'll trial the regexes here and see what the results are. Thx again.
Old 06-29-2015, 04:12 PM   #11
Chris_Snow
Quote:
Originally Posted by rubeus View Post
My approach would be to delete all </p> and let tiny do the rest.
Will Sigil automagically fix things in this instance? There are lots of times when there is no undo in Sigil, so I worry about stuffing things up. Usually I have to close the program without saving (that is, if I haven't already saved without thinking!).

Update: Yep, I found that it does (well, at least in small doses).

Last edited by Chris_Snow; 06-29-2015 at 08:49 PM.
Old 06-30-2015, 12:55 AM   #12
eschwartz
Quote:
Originally Posted by avantman42 View Post
I don't claim to be great at regular expressions, but I think this should work, and preserve any class/style attributes in the paragraph:

Find:
Code:
(<p[^>]+>)<\/p>(.*)
Replace:
Code:
\1\2</p>
I would explicitly exclude paragraph tags from the matched body text, like so:

Code:
(<p(?: [^>]+)?>)</p>((?:(?!</?p>).)+)
Using the power of negative lookarounds.
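A short Python sketch can show what the lookahead changes when two paragraphs share one line. The replacement string is an assumption (the post gives only the search pattern, so the Replace string from the quoted post is reused), and the sample text is made up.

```python
import re

SIMPLE = r"(<p[^>]+>)<\/p>(.*)"                     # pattern from the quoted post
GUARDED = r"(<p(?: [^>]+)?>)</p>((?:(?!</?p>).)+)"  # pattern with the lookahead
REPLACE = r"\1\2</p>"  # assumed: same replacement as in the quoted post

# Two broken paragraphs on a single line.
sample = '<p class="calibre2"></p>First.<p></p>Second.'

# (.*) runs to the end of the line and swallows the second paragraph.
simple_result = re.sub(SIMPLE, REPLACE, sample)
# The lookahead stops the capture at the next <p> or </p>.
guarded_result = re.sub(GUARDED, REPLACE, sample)
print(simple_result)
print(guarded_result)
```

As written, the lookahead only recognises bare `<p>` and `</p>` tags, so a following paragraph that opens with attributes (for example `<p class="calibre2">`) would not stop the capture.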
Old 06-30-2015, 11:47 PM   #13
Hitch
Quote:
Originally Posted by Chris_Snow View Post
Will Sigil automagically fix things in this instance? There are lots of times when there is no undo in Sigil, so I worry about stuffing things up. Usually I have to close the program without saving (that is, if I haven't already saved without thinking!).

Update: Yep, I found that it does (well, at least in small doses).
Big-time experts say, "make copy of file first, before committing Regexus Non-Interruptus."

Just sayin'! All these years, and Regex Buddy is still my closest, well...buddy.

Hitch
Old 06-30-2015, 11:49 PM   #14
theducks
Quote:
Originally Posted by Hitch View Post
Big-time experts say, "make copy of file first, before committing Regexus Non-Interruptus."


Hitch
Small fry users say same thing

oops is a 4 letter word
Old 07-01-2015, 01:48 AM   #15
avantman42
Quote:
Originally Posted by Hitch View Post
Big-time experts say, "make copy of file first, before committing Regexus Non-Interruptus."
Extremely good advice. I've never used RegexBuddy, but I do use the online tester at https://regex101.com/
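The "copy first" habit can also be scripted. Below is a minimal Python sketch; the helper name, the `.bak` suffix and the file name `test.epub` are all made up for illustration, and the demonstration runs on a throwaway file in a temporary folder.

```python
import shutil
import tempfile
from pathlib import Path

def backup_before_regex(book: Path) -> Path:
    """Copy the book next to itself with a .bak suffix and return the copy's path."""
    backup = book.with_name(book.name + ".bak")
    shutil.copy2(book, backup)  # copy2 also tries to preserve timestamps
    return backup

# Demonstration on a throwaway file; "test.epub" is a hypothetical name.
with tempfile.TemporaryDirectory() as tmp:
    book = Path(tmp) / "test.epub"
    book.write_bytes(b"placeholder contents")
    backup = backup_before_regex(book)
    backup_name = backup.name
    backup_matches = backup.read_bytes() == book.read_bytes()

print(backup_name, backup_matches)
```

If a Replace All goes wrong, the untouched copy can simply be renamed back.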