#1 |
Zealot
Posts: 148
Karma: 8170
Join Date: Jul 2013
Device: kobo glo
|
</p> at beginning of paragraph - how do I change?
Greetings,
I have a file I'm using as a test bed to learn regex. It is a badly converted PDF-to-EPUB file. The original file had the actual body of the text under the misc folder within the epub. I didn't know how to fix that, so I converted it to HTML, then back to EPUB. That seems to have fixed the problem: the text body is now where it should be. However, the code for the paragraphs is:
Code:
<p class="calibre2"></p>The paragraph goes in here.
I have learned a lot about how to tweak things using regex on this file, but this is somewhat beyond me. Appreciate the assistance.
Update: I found out I could use .* and some variations to achieve a result. Got all the end tags where they belong now.
Last edited by Chris_Snow; 06-29-2015 at 02:18 AM.
#2 |
Unicycle Daredevil
Posts: 13,944
Karma: 185432100
Join Date: Jan 2011
Location: Planet of the Pudding Brains
Device: Aura HD (R.I.P. After six years the USB socket died.) tolino shine 3
|
The experts will certainly have more interesting solutions, but I just tested this and it works:
Search:
Code:
</p>(.*?) <(.*?)>
Replace:
Code:
\1</p> <\2>
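For anyone who wants to try this pair outside the editor, here is a minimal Python sketch of the same search and replace. The pattern and replacement are copied from the post as-is; the sample markup, the single space between paragraphs, and the function name `move_end_tags` are made up for the demo.

```python
import re

# Pattern and replacement copied from the post above.
SEARCH = r'</p>(.*?) <(.*?)>'
REPLACE = r'\1</p> <\2>'


def move_end_tags(html: str) -> str:
    """Move each misplaced </p> to just before the next ' <tag>'."""
    return re.sub(SEARCH, REPLACE, html)


# Hypothetical sample: paragraphs separated by a single space.
sample = ('<p class="calibre2"></p>First paragraph. '
          '<p class="calibre2"></p>Second paragraph. <div>')
print(move_end_tags(sample))
```

Two things to watch for, as far as I can tell: the pattern needs a space followed by another tag after the paragraph text, so a final paragraph with nothing after it is left untouched, and a paragraph containing an inline tag after a space (such as ` <i>`) would get its `</p>` placed too early.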
#3 | |
Banned
Posts: 272
Karma: 1224588
Join Date: Sep 2014
Device: Sony PRS 650
|
Quote:
My approach would be to delete all </p> and let Tidy do the rest.
|
#4 |
mostly an observer
Posts: 1,518
Karma: 987654
Join Date: Dec 2012
Device: Kindle
|
|
#5 |
Grand Sorcerer
Posts: 28,359
Karma: 203720150
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
|
#6 |
Banned
Posts: 272
Karma: 1224588
Join Date: Sep 2014
Device: Sony PRS 650
|
|
#7 |
Wizard
Posts: 1,090
Karma: 6058305
Join Date: Sep 2010
Location: UK
Device: Kindle Paperwhite
|
I don't claim to be great at regular expressions, but I think this should work, and preserve any class/style attributes in the paragraph:
Find:
Code:
(<p[^>]+>)<\/p>(.*)
Replace:
Code:
\1\2</p>
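A quick way to sanity-check this pair is to run it through Python's `re` module. This is only a sketch: the find and replace strings are taken from the post, while the two-line sample and the function name `fix_line_paragraphs` are made up, and it assumes each paragraph sits on its own line.

```python
import re

# Find/replace pair copied from the post above.
FIND = r'(<p[^>]+>)<\/p>(.*)'
REPLACE = r'\1\2</p>'


def fix_line_paragraphs(html: str) -> str:
    """Move the stray </p> to the end of the line, keeping the <p ...> attributes."""
    return re.sub(FIND, REPLACE, html)


# Hypothetical sample: one paragraph per line, one ending in a question mark.
sample = ('<p class="calibre2"></p>The paragraph goes in here.\n'
          '<p class="calibre2"></p>Is this one fixed too?')
print(fix_line_paragraphs(sample))
```

Because `.` does not cross a newline by default and `(.*)` is greedy, the closing tag lands at the end of the line whatever punctuation the paragraph ends with. Two paragraphs on the same line would be merged into one, so this relies on the one-paragraph-per-line assumption.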
#8 |
Banned
Posts: 272
Karma: 1224588
Join Date: Sep 2014
Device: Sony PRS 650
|
I would use * instead of + to catch tags without attributes.
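To see the difference between the two quantifiers, here is a small Python sketch that runs both variants of the `(<p[^>]+>)<\/p>(.*)` pattern against a bare `<p>`. The sample text and the names `WITH_PLUS` and `WITH_STAR` are made up for illustration.

```python
import re

REPLACE = r'\1\2</p>'

# "+" needs at least one character between "<p" and ">", so a bare <p> is skipped.
WITH_PLUS = re.compile(r'(<p[^>]+>)<\/p>(.*)')
# "*" also accepts zero characters, so a bare <p> is matched as well.
WITH_STAR = re.compile(r'(<p[^>]*>)<\/p>(.*)')

# Hypothetical sample: a paragraph tag with no attributes.
bare = '<p></p>No attributes on this one.'

plus_result = WITH_PLUS.sub(REPLACE, bare)
star_result = WITH_STAR.sub(REPLACE, bare)
print(plus_result)  # unchanged, the "+" version finds nothing to fix
print(star_result)
```

Note that either variant would also match other tags whose names start with `p`, such as `<pre>`, since `[^>]` accepts any character after `<p`.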
|
#9 |
Wizard
Posts: 1,090
Karma: 6058305
Join Date: Sep 2010
Location: UK
Device: Kindle Paperwhite
|
Good point. I admit I didn't have time to test it much.
|
#10 |
Zealot
Posts: 148
Karma: 8170
Join Date: Jul 2013
Device: kobo glo
|
Thx very muchly for all the pointers. You are right that my small regex didn't pick up all the paragraph instances (endings with question marks, etc.), but surprisingly there were very few, and I figured out how to mod the regex to pick up a question mark. I seem to be able to sort out small changes but have a lot of trouble trying to get one regex to pick up everything.
I'll trial the regexes here and see what the results are. Thx again.
#11 | |
Zealot
Posts: 148
Karma: 8170
Join Date: Jul 2013
Device: kobo glo
|
Quote:
Update: Yep...found that it does (well, at least in small doses).
Last edited by Chris_Snow; 06-29-2015 at 08:49 PM.
|
#12 | |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
Quote:
Code:
(<p(?: [^>]+)?>)</p>((?:(?!</?p>).)+)
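Here is a rough Python sketch of how this pattern behaves. The post gives only the search expression, so the replacement `\1\2</p>` is an assumption, as are the function name `fix` and the sample strings.

```python
import re

# Search pattern copied from the post above.
FIND = re.compile(r'(<p(?: [^>]+)?>)</p>((?:(?!</?p>).)+)')
# Assumption: the post does not show a replacement; this one moves </p> to the end.
REPLACE = r'\1\2</p>'


def fix(html: str) -> str:
    """Move a stray </p> after the text that follows it, stopping at <p> or </p>."""
    return FIND.sub(REPLACE, html)


# Hypothetical samples.
print(fix('<p class="calibre2"></p>Text here.'))  # attributes are kept
print(fix('<p></p>One.<p></p>Two.'))              # two bare paragraphs on one line
print(fix('<pre></p>x'))                          # not a <p> tag, left alone
```

The `(?: [^>]+)?` part makes the attributes optional while requiring a space after `<p`, so tags like `<pre>` are not touched. The `(?!</?p>)` lookahead stops the captured text at the next bare `<p>` or `</p>`. As written it does not recognise an opening `<p ...>` that carries attributes, so several such paragraphs on a single line may still need separate handling.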
|
|
#13 | |
Bookmaker & Cat Slave
Posts: 11,503
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
Quote:
Just sayin'! All these years, and RegexBuddy is still my closest, well...buddy.
Hitch
|
#14 |
Well trained by Cats
Posts: 30,912
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
|
#15 | |
Wizard
Posts: 1,090
Karma: 6058305
Join Date: Sep 2010
Location: UK
Device: Kindle Paperwhite
|
Quote:
|
|