MobileRead Forums > E-Book Software > Sigil

Old 02-25-2013, 08:55 AM   #16
JSWolf
Resident Curmudgeon
Posts: 37,813
Karma: 18701526
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Sony Reader PRS-650, iPad, nook STR
Quote:
Originally Posted by Dybbuk View Post
I don't. I was just trying to make clear that I don't want to erase the style and formatting tags inside paragraphs.
But you do want to erase the style and formatting tags outside of the <p></p> tags.

For example, a div can be used to give a bit of a top margin. Take that away and you lose that. You have to know what it is you are removing before removing it. There are blockquotes you do not want to remove. There are divs that are used to simulate blockquotes (stupid coding, but that's how they do it in India these days) that you do not want to remove.

Look at the code before you delete it. See what it is that you are deleting. Then you can decide if it's OK to delete. For example, I've seen divs that have no purpose other than to carry an ID that is not actually used anywhere. Those can go. Then there are divs that help deal with chapter titles, and those can go too, because you can do it all in something like an h2 as long as h2 is defined properly in the CSS.
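
A rough way to spot IDs that nothing points at, as a sketch only. It assumes the book's files have been read into a dict of filename to markup, and it treats an ID as "used" only if some href or src attribute ends in #that-id. The function name and the sample files are made up for illustration, and CSS selectors that use the ID are not checked.

```python
import re

def unused_ids(files):
    """Return ids that no href/src fragment in any file points at.

    files: dict mapping a filename to its markup (hypothetical layout).
    CSS selectors such as #x1 are NOT checked, so review before deleting.
    """
    declared = set()
    referenced = set()
    for text in files.values():
        # Every id="..." attribute that is declared somewhere.
        declared.update(re.findall(r'\bid="([^"]+)"', text))
        # Every fragment target, e.g. href="ch1.xhtml#chap1" or src="...#chap1".
        referenced.update(re.findall(r'(?:href|src)="[^"#]*#([^"]+)"', text))
    return sorted(declared - referenced)

sample = {
    "ch1.xhtml": '<div id="x1"><h2 id="chap1">One</h2><p>Text</p></div>',
    "toc.xhtml": '<a href="ch1.xhtml#chap1">One</a>',
}
print(unused_ids(sample))
```

Here only x1 is reported, because chap1 is the target of a link in toc.xhtml.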

Now, a lot of chapter titles are defined using h1 or h2 and if you remove all but <p></p> then you would lose all the chapter titles if they use h1 or h2.
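
To see what actually sits outside the paragraphs before removing anything, something like the following could take an inventory. It is a minimal sketch using Python's standard html.parser, not anything built into Sigil. The class and function names are invented for this example, and it assumes p tags are not nested.

```python
from collections import Counter
from html.parser import HTMLParser

class TagInventory(HTMLParser):
    """Counts start tags that occur outside <p>...</p> (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()
        self.p_depth = 0  # greater than 0 while inside a paragraph

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.p_depth += 1
        elif self.p_depth == 0:
            # Record the tag, with its class if it has one, e.g. "div.chapter".
            cls = dict(attrs).get("class")
            self.counts[f"{tag}.{cls}" if cls else tag] += 1

    def handle_endtag(self, tag):
        if tag == "p" and self.p_depth > 0:
            self.p_depth -= 1

def inventory(markup):
    parser = TagInventory()
    parser.feed(markup)
    parser.close()
    return parser.counts

sample = ('<div class="chapter"><h2>Chapter One</h2><p>Text <i>here</i>.</p>'
          '<blockquote><p>Quoted</p></blockquote></div>')
for name, count in sorted(inventory(sample).items()):
    print(name, count)
```

The inline i tag inside the paragraph is ignored, while the h2 and blockquote show up in the list, which is exactly the kind of thing you would lose by stripping everything except the p tags.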
Old 02-25-2013, 10:53 AM   #17
Perkin
Guru
Posts: 645
Karma: 64171
Join Date: Sep 2010
Location: Kent, England, Sol 3, ZZ9 plural Z Alpha
Device: Sony PRS-300, Kobo Aura HD
Adding/extending the post from theducks
Quote:
Originally Posted by theducks View Post
(?sm)</p>\s+(.+?)\s+<p>
Should work to remove things outside those tags
Don't try this on any copy you want to be usable after you are done,

BUT YOU WERE WARNED that there are other valid things between the closing </p> and the Next <p> that should not be removed: The list is big, so I am not wasting my time typing it.
Because it's highlighting/selecting the p tags, just change the replace to </p>\n<p>
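
For anyone who wants to see what that search and replace does before trying it in Sigil, here is a small sketch in Python. Python's re module stands in for Sigil's regex engine here, which is an assumption, and the sample markup and function name are invented.

```python
import re

# The search expression from the quoted post, unchanged.
PATTERN = re.compile(r"(?sm)</p>\s+(.+?)\s+<p>")

def strip_between_paragraphs(markup):
    # The match swallows the closing </p> and the next <p>,
    # so the replacement has to put them back.
    return PATTERN.sub("</p>\n<p>", markup)

# A wrapper div around the second paragraph: both div tags are removed.
wrapped = '<p>One</p>\n<div class="box">\n<p>Two</p>\n</div>\n<p>Three</p>'
print(strip_between_paragraphs(wrapped))

# Paragraphs with nothing between them: (.+?) must match something,
# so it grabs the whole middle paragraph and that paragraph is lost.
adjacent = "<p>One</p>\n<p>Two</p>\n<p>Three</p>"
print(strip_between_paragraphs(adjacent))
```

The second case is the dangerous one: with only whitespace between two paragraphs, the expression reaches forward to the following <p> and deletes a whole paragraph along the way. That is a good reason to work on a copy and step through the matches.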

Overall problem: the way I'd do it is to convert in calibre to TXTZ with Textile output, then convert that back to EPUB, keeping copies of the original just in case.
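
Scripted, that round trip might look roughly like this. It is only a sketch: it assumes calibre's ebook-convert command is on the PATH and that the TXT output option is spelled --txt-output-formatting, so check ebook-convert's help output for your version. The helper names, the .bak suffix and the _rebuilt suffix are made up for the example.

```python
import shutil
import subprocess
from pathlib import Path

def build_roundtrip_commands(epub_path):
    """Return the two ebook-convert command lines for EPUB -> TXTZ -> EPUB."""
    src = Path(epub_path)
    txtz = src.with_suffix(".txtz")
    rebuilt = src.with_name(src.stem + "_rebuilt.epub")
    # Assumed option name; verify it against your calibre version.
    to_txtz = ["ebook-convert", str(src), str(txtz),
               "--txt-output-formatting", "textile"]
    to_epub = ["ebook-convert", str(txtz), str(rebuilt)]
    return [to_txtz, to_epub]

def run_roundtrip(epub_path):
    """Back up the original, then run both conversions (needs calibre installed)."""
    src = Path(epub_path)
    shutil.copy2(src, src.with_name(src.name + ".bak"))
    for cmd in build_roundtrip_commands(src):
        subprocess.run(cmd, check=True)

for cmd in build_roundtrip_commands("book.epub"):
    print(" ".join(cmd))
```

Nothing is converted until run_roundtrip is called. The loop at the end only prints the two command lines so they can be checked first.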

Last edited by Perkin; 02-25-2013 at 10:57 AM.
Old 02-25-2013, 10:57 AM   #18
theducks
Grand Sorcerer
Posts: 15,071
Karma: 5939999
Join Date: Aug 2009
Location: (The original) Silicon Valley, USA
Device: Galaxy Tab 2, Astak Pocket Pro, K4NT
Quote:
Originally Posted by Perkin View Post
Adding/extending the post from theducks


Because it's highlighting/selecting the p tags, just change the replace to </p>\n<p>

I forgot to mention that the tags need to be put back in the replacement, because the match swallows them along with the capture.
I always step through a few matches before hitting the 'all' button.
Now you see why.
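
Outside Sigil, the same "look before you replace all" habit can be sketched as a dry run that only lists what each match would remove. Python's re module is used as a stand-in for Sigil's regex engine, which is an assumption, and the function name and sample text are invented.

```python
import re

# Same search expression as discussed in the thread.
PATTERN = re.compile(r"(?sm)</p>\s+(.+?)\s+<p>")

def preview_removals(markup):
    """List the text each match would delete, without changing anything."""
    return [match.group(1) for match in PATTERN.finditer(markup)]

sample = ("<p>One</p>\n<h2>Chapter Two</h2>\n<p>Two</p>\n"
          "<p>Three</p>\n<p>Four</p>")
for number, removed in enumerate(preview_removals(sample), start=1):
    print(number, repr(removed))
```

On this sample the preview lists a chapter heading and a whole paragraph as the two things that would be deleted, neither of which you would want to lose.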
All times are GMT -4.

