#1
Wizard
Posts: 1,087
Karma: 447222
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
epub with no <p>'s
Recently downloaded an epub report (I think) that might have started life as a .txt file (maybe). Example in the spoiler.

Problem 1. There are no <p>...</p> tags, just text 'paragraphs' separated by <br/>.
Problem 2. Sometimes there is a <blockquote> around a block of text, but no <p> tags.
Problem 3. Sometimes there are large 'blank' sections that are various combinations of PARAGRAPH SEPARATOR + SPACE, or SPACE + PARAGRAPH SEPARATOR, or PARAGRAPH SEPARATOR + PARAGRAPH SEPARATOR + PARAGRAPH SEPARATOR + etc. [Beautify] and [Fix HTML] don't clean them up. A Find: \n{2,} Replace: \n does catch a lot, but leaves the spaces.

Are there (probably) 3 RegEx's that will:
1. Put <p> tags around the blocks of text that don't have any? I can delete the <br/>'s.
2. Put <p> tags around the blocks of text inside <blockquote>'s that don't have any? I can delete the <blockquote>'s manually if I need to.
3. Remove the spaces that are outside <p>'s?

Spoiler:
Thanks
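For problem 3, a rough sketch of one way to collapse those 'blank' sections outside the editor, in Python. It assumes the PARAGRAPH SEPARATOR shown in the editor is either a plain newline or the Unicode character U+2029; that is a guess, not something the thread confirms. The function name `collapse_blank_runs` and the sample text are made up for illustration.

```python
import re

def collapse_blank_runs(text: str) -> str:
    """Collapse every whitespace run that contains a line break into one newline.

    Assumption: the 'blank' sections consist of spaces, tabs, newlines and/or
    U+2029 (PARAGRAPH SEPARATOR). For Python str patterns, \\s covers all of these.
    """
    return re.sub(r"\s*[\n\u2029]\s*", "\n", text)

# Hypothetical sample: two text blocks separated by a messy blank section.
sample = "<br/>First block \n \n\u2029 \n<br/>Second block"
print(collapse_blank_runs(sample))
```

Spaces between words on the same line are left alone, since a run is only collapsed when it contains a line break. The sketch does not know about tags, so it would also flatten whitespace inside a <pre> block if the file had one.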
#2
Wizard
Posts: 2,181
Karma: 8888888
Join Date: Jun 2010
Device: Kobo Clara HD,Hisence Sero 7 Pro RIP, Nook STR, jetbook lite
Quote:
Code:
find: <br/>\s*(.*?)\s*<
replace: <p>\1</p>\n<
To get this:
Spoiler:
bernie
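The find/replace pair above can be tried outside the editor with a short Python sketch. One detail is changed here as an assumption, not taken from the post: the trailing `<` becomes a lookahead `(?=<)`, because Python's `re.sub` replaces non-overlapping matches and would otherwise swallow the `<` that opens the next <br/>, so every second block would be skipped in a single pass. The helper name `br_to_p` and the sample markup are hypothetical.

```python
import re

# Pattern from the post, with the final "<" turned into a lookahead
# (an assumption for this sketch, so the next tag is not consumed).
BR_BLOCK = re.compile(r"<br/>\s*(.*?)\s*(?=<)")

def br_to_p(html: str) -> str:
    """Wrap the text that follows each <br/> in <p>...</p>."""
    return BR_BLOCK.sub(r"<p>\1</p>\n", html)

# Hypothetical sample where every text block is preceded by a <br/>.
sample = "<body>\n<br/>\nFirst para\n<br/>\nSecond para\n</body>"
print(br_to_p(sample))
```

Text that comes before the first <br/> is not wrapped, a <br/> followed directly by another tag produces an empty <p></p>, and a block that spans several lines is not matched because `.` stops at line breaks, so the result probably still needs a manual check.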
#3
Age improves with wine.
Posts: 576
Karma: 95229
Join Date: Nov 2014
Device: Kindle Oasis, Kobo Libra II
You need four "replace all"s in total:
Quote:
(1) Wrap the body in <p>...</p>
Find: <body>(.*?)</body>
Replace: <body>\n<p>\1</p>\n</body>
(2) Replace all the <br/>s
Find: <br/>
Replace: </p>\n<p>
Quote:
Replace: <blockquote>\n<p>\1</p>\n</blockquote>
Or, if you just want plain paras instead of blockquotes,
Replace: <p>\1</p>
Find: </p>\s*<p>
Replace: </p>\n<p>
Last edited by Phssthpok; 08-23-2015 at 11:29 AM.
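As a minimal sketch, the steps whose Find and Replace both appear above (wrapping the body, splitting on <br/>, and tidying the gaps between paragraphs) could be chained in Python like this. The blockquote step is left out because its Find pattern does not appear in the post. The function names are hypothetical, and `re.DOTALL` is an assumption so that `.` can span line breaks inside <body>.

```python
import re

def wrap_body(html: str) -> str:
    # (1) Wrap the whole body content in one <p>...</p>.
    # re.DOTALL is assumed here; a <body> tag with attributes is not handled.
    return re.sub(r"<body>(.*?)</body>", r"<body>\n<p>\1</p>\n</body>",
                  html, flags=re.DOTALL)

def split_on_br(html: str) -> str:
    # (2) Every <br/> closes the current paragraph and opens a new one.
    return html.replace("<br/>", "</p>\n<p>")

def tidy_paragraph_gaps(html: str) -> str:
    # Last step: whatever whitespace sits between </p> and <p> becomes one newline.
    return re.sub(r"</p>\s*<p>", "</p>\n<p>", html)

def br_text_to_paragraphs(html: str) -> str:
    return tidy_paragraph_gaps(split_on_br(wrap_body(html)))

# Hypothetical sample input.
print(br_text_to_paragraphs("<body>First<br/>Second</body>"))
```

Whitespace that ends up inside a paragraph, for example a blank run right after a <br/>, is not touched by the last step, since it only looks between a closing </p> and the next <p>.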