Old 08-18-2014, 05:58 PM   #1
phossler
Wizard
Posts: 1,087
Karma: 447222
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
Invisible Char and too many <blockquote>'s

Q1 - There seems to be an invisible char at the end of line 24 that causes an editor line wrap. Click at the end of the colon and the lower right says

COLON 24 46

Right arrow positions it to the beginning of line 25, but the lower right says

PARAGRAPH SEPARATOR 24 0

I can enter a space after the colon and hit delete, and that will make it pretty again.

Is there a regex that will fix it, or is it a bug?

Q2 - For some reason the original converter produced 5 or 6 or more nested <blockquote>s. I probably removed all the classes, but is there a RegEx to eliminate all but a single set?


Thanks
Attached thumbnail: Capture.JPG (96.1 KB)
Attached file: Forum_Questions.epub (1.6 KB)
Old 08-18-2014, 08:02 PM   #2
mrmikel
Color me gone
Posts: 2,089
Karma: 1445295
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
You can highlight it, copy it and find it along with the space then replace it with just a space.

You might also try pasting in from the insert special character function. It shows up as psep. But it is less reliable that way.

But this shows only in the code, not in the output.

If you succeed in eliminating them, you will need to run Beautify on the code to make it look the way you expect.

There was a discussion very recently in another topic about doing just this with blockquotes. I think it was DiapDealer who was involved in it, in connection with plugins, but I could be wrong about that.
Old 08-18-2014, 08:16 PM   #3
DiapDealer
Grand Sorcerer
Posts: 28,391
Karma: 203720150
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Q1: I think it's one of those super-secret line-feed characters that you have to use the backspace or delete key to get rid of.

But if you insist on using regex: you can search for \n (or \u000A if you want it to look cooler) and replace the one you want to remove with nothing (or a space if you need some separation).

I think you're just used to the way Tidy prettifies markup.

Q2: Eliminating nested blockquote monstrosities (leaving the outermost) is a tool I'd like to get around to creating for the editor, but alas ... it's not something I'm currently working on.

Last edited by DiapDealer; 08-18-2014 at 08:45 PM.
Old 08-18-2014, 09:03 PM   #4
AnotherCat
....
Posts: 1,547
Karma: 18068960
Join Date: May 2012
Device: ....
Q2.

The following is pretty obvious so maybe I have misunderstood your problem, apologies if so.

All I do in such circumstances is, in the Editor, highlight with the mouse a <blockquote> that has no text after it and move the mouse cursor to the start of the next line (so, for example, in your file select <blockquote> on line 11 and carry the selection to the first column on line 12). Right-click > Copy that and paste it into the Find box, leaving the Replace box empty.

Do a Normal mode Replace all and all the {unwanted} <blockquotes> without any text after them disappear.

Then do a "Beautify" and all the now-unmatched </blockquote> closing tags will disappear.
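
For anyone who would rather script it, the same idea can be sketched in Python. This is only a rough sketch under loud assumptions: the function name `collapse_blockquotes` is made up for illustration, the tags carry no attributes, and the nested tags sit directly next to each other with nothing but whitespace between them (as in the sample file). It is not how the editor or any plugin does it.

```python
import re

def collapse_blockquotes(markup):
    """Collapse runs of directly nested <blockquote> tags into a single pair.

    Assumes attribute-free tags separated only by whitespace (hypothetical helper).
    """
    # Two or more opening tags in a row become one opening tag.
    markup = re.sub(r"(?:<blockquote>\s*){2,}", "<blockquote>\n", markup)
    # Two or more closing tags in a row become one closing tag.
    markup = re.sub(r"(?:</blockquote>\s*){2,}", "</blockquote>\n", markup)
    return markup

sample = (
    "<blockquote>\n<blockquote>\n<blockquote>\n"
    "<p>Quoted text</p>\n"
    "</blockquote>\n</blockquote>\n</blockquote>\n"
)
print(collapse_blockquotes(sample))
```

If the opening and closing runs do not line up (for example, text sits between two of the closing tags), this will leave unbalanced tags behind, so check the result before saving.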

Last edited by AnotherCat; 08-18-2014 at 09:53 PM. Reason: added {unwanted} for clarity
Old 08-18-2014, 10:37 PM   #5
phossler
@mrmikel and DiapDealer

Yes, I am used to [Beautify]. It makes the code a lot easier to read.

PSEP is \u2029, but searching for that didn't seem to help.

Peeking into the hex confirms that it's an LF (\x0A). I got a sort of F&R to work. It's not perfect, but replacing every \n with a space really messed up the code, since it did it everywhere and not just in the 20 or 30 places I needed it.
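
The hex check can also be done in a few lines of Python instead of a hex viewer. A minimal sketch, assuming the file's text has already been read into a string; the function name `describe_line_endings` and the sample string are made up for illustration.

```python
import unicodedata

def describe_line_endings(text):
    """Return (index, code point, Unicode name) for each line-break-like character."""
    suspects = {"\n", "\r", "\u2028", "\u2029"}
    found = []
    for index, char in enumerate(text):
        if char in suspects:
            # Control characters such as LF have no Unicode name, hence the fallback.
            name = unicodedata.name(char, "<control>")
            found.append((index, "U+%04X" % ord(char), name))
    return found

# Hypothetical sample: a plain LF right after the colon.
print(describe_line_endings("Chapter One:\nIt began"))
```

A real U+2029 in the file would show up here with the name PARAGRAPH SEPARATOR, while a plain linefeed shows up as U+000A.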

My RegEx is mostly cookbook, i.e. look up a recipe, bake it, and see if I like it

Find: \x0a([^\s?])
Replace: space\1
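
To see what that Find/Replace pair does outside the editor, here is a small Python sketch. It assumes the editor's regex mode behaves like Python's `re` for this pattern, and that "space" in the Replace box stands for one literal space character. The sample markup and the tighter variant `FIND_SKIP_TAGS` are illustrative assumptions, not something from the book in question.

```python
import re

# The Find pattern from the post. Inside [...] the "?" is a literal question
# mark, so the class means "any character that is not whitespace and not '?'".
FIND = r"\x0a([^\s?])"
REPLACE = r" \1"

# Hypothetical variant: also refuse to join when the next line starts a tag.
FIND_SKIP_TAGS = r"\x0a([^\s<])"

sample = "<p>He said:\nthe rest of the sentence.</p>\n\n<p>Next paragraph.</p>\n"

joined = re.sub(FIND, REPLACE, sample)
joined_skip_tags = re.sub(FIND_SKIP_TAGS, REPLACE, sample)

print(repr(joined))
print(repr(joined_skip_tags))
```

With this sample the original pattern also pulls the `<p>` of the next paragraph up onto the preceding line, which may be part of why it is "not perfect"; the variant leaves lines that begin with `<` alone.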

The lower right corner saying PARAGRAPH SEPARATOR really threw me off (when all else fails, look at the hex).

@anothercat

Obvious to you, but not to me. That works -- thanks very much. Going to be awfully hard for me to remember. Hopefully diapdealer will find the time to add it to his plugin.
Old 08-18-2014, 11:34 PM   #6
DiapDealer
Quote:
Originally Posted by phossler View Post
My RegEx is mostly cookbook, i.e. look up a recipe, bake it, and see if I like it

Find: \x0a([^\s?])
Replace: space\1
I don't know about the Replace, but see what you think about:
Code:
(?<!(>|\n))\n
for finding linefeeds that aren't immediately preceded by a ">" (or another linefeed).

Will probably match the linefeed in a split-line DOCTYPE, but those tend to disappear in calibre's editor anyway.
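
A quick way to try the lookbehind outside the editor is a short Python sketch like the one below. The sample markup is made up, and it assumes the editor's regex mode agrees with Python's `re` on this pattern.

```python
import re

# A linefeed that is NOT immediately preceded by ">" or by another linefeed.
STRAY_LF = re.compile(r"(?<!(>|\n))\n")

sample = "<p>First line:\nsecond line.</p>\n\n<p>Next.</p>\n"

# Offsets of the linefeeds the pattern flags.
positions = [m.start() for m in STRAY_LF.finditer(sample)]
# Replace only those linefeeds with a space.
repaired = STRAY_LF.sub(" ", sample)

print(positions)
print(repr(repaired))
```

Only the linefeed after the colon is flagged here; the ones that follow `</p>` or another linefeed are left alone.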
Old 08-19-2014, 08:23 AM   #7
phossler
Quote:
I don't know about the Replace, but see what you think about:
I did a quick test, but it seems to do too much. I'd really like to just do a [Replace All] and not have to single-step through hundreds of Finds.
Old 08-19-2014, 08:54 AM   #8
DiapDealer
Quote:
Originally Posted by phossler View Post
I did a quick test, but it seems to do too much. I'd really like to just do a [Replace All] and not have to single-step through hundreds of Finds.
Do you have the dotall box checked? Uncheck it if so. That regex finds absolutely nothing but the DOCTYPE line in several huge books I've tested it on (and the extraneous linefeed like the one in your sample). If there are no doctype statements, that regex often finds nothing in the books I try it on.

It WILL catch a lot of style stuff in the headers if you run into a lot of that (which I don't) and the "traditional" svg cover image pages, but other than that, it seems to point out the non-typical linefeeds in otherwise prettified markup fairly well for me. *shrug*

Having said that though... I'll rarely nominate ANY regex as a candidate for a Replace All action in a novel, if ever. My plan is always to reduce the step-through process to something manageable.

Last edited by DiapDealer; 08-19-2014 at 10:16 AM.
Old 08-19-2014, 10:33 AM   #9
phossler
I think I had dotall checked (my bad). Isn't there a RE flag to do that? Back to the cookbook
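
On the flag question: in Python-style regex the inline flag `(?s)` at the start of a pattern switches dot-all on, the same as passing `re.DOTALL`. A minimal sketch, assuming the editor's search follows Python regex syntax here; the sample text is made up.

```python
import re

text = "line one\nline two"

# Without dot-all, "." does not match the newline, so there is no match.
without_flag = re.search(r"one.line", text)
# The inline flag (?s) turns dot-all on from inside the pattern itself.
with_inline = re.search(r"(?s)one.line", text)
# re.DOTALL is the equivalent flag passed as an argument.
with_argument = re.search(r"one.line", text, flags=re.DOTALL)

print(without_flag is None, with_inline is not None, with_argument is not None)
```

The flag only changes what `.` matches, so a pattern with no dot in it should behave the same either way, at least in Python's `re`.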

Agree that Replace All is 'dangerous'. If I have a lot of replaces, I do a Replace, Find Next until I'm pretty comfortable I didn't mess up, and then I just 'let 'er rip' (and hope for the best)

If I have to clean a pile of crap epub, I try to catch the low hanging fruit, and run the PI to clean the dead tags.

In that particular epub, one at a time would take forever.

Last edited by phossler; 08-19-2014 at 06:17 PM.