Old 08-18-2014, 06:58 PM   #1
phossler
Addict
Posts: 371
Karma: 51406
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: kindle
Invisible Char and too many <blockquote>'s

Q1 - There seems to be an invisible char at the end of line 24 that causes an editor line wrap. Click at the end of the colon and the lower right says

COLON 24 46

Right arrow positions it to the beginning of line 25, but the lower right says

PARAGRAPH SEPARATOR 24 0

I can enter a space after the colon and hit delete and that will make it pretty again

Is there regex that will fix it, or is it a bug?

Q2 - For some reason the original converter produced 5, 6, or more nested <blockquote>s. I have probably removed all the class attributes, but is there a RegEx to eliminate all but a single set?


Thanks
Attached Thumbnails: Capture.JPG (96.1 KB)
Attached Files: Forum_Questions.epub (1.6 KB)
Old 08-18-2014, 09:02 PM   #2
mrmikel
Color me gone
Posts: 2,086
Karma: 1444487
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
You can highlight it, copy it and find it along with the space then replace it with just a space.

You might also try pasting in from the insert special character function. It shows up as psep. But it is less reliable that way.

But this shows only in the code, not in the output.

If you succeed in eliminating them, you will need to run Beautify to make the code look the way you expect.

There was discussion very recently in another topic about doing just this with blockquotes. I think it was DiapDealer who was involved in it, in connection with plugins, but I could be wrong about that.
Old 08-18-2014, 09:16 PM   #3
DiapDealer
Grand Sorcerer
Posts: 9,539
Karma: 44002482
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Q1: I think it's one of those super-secret line-feed characters that you have to use the backspace or delete key to get rid of.

But if you insist on using regex: you can search for \n (or \u000A if you want it to look cooler) and replace the one you want to remove with nothing (or a space if you need some separation).

I think you're just used to the way Tidy prettifies markup.

Q2: eliminating nested blockquote monstrosities (leaving the outermost) is a tool I'd like to get around to creating for the editor, but alas ... it's not something I'm currently working on.

Last edited by DiapDealer; 08-18-2014 at 09:45 PM.
Old 08-18-2014, 10:03 PM   #4
AnotherCat
Evangelist
Posts: 429
Karma: 1211911
Join Date: May 2012
Device: Sony PRS-T1
Q2.

The following is pretty obvious so maybe I have misunderstood your problem, apologies if so.

All I do in such circumstances is, in the Editor, highlight a <blockquote> that has no text after it with the mouse and move the mouse cursor to the start of the next line (so, for example, in your file select <blockquote> on line 11 and carry the selection to the first column on line 12). Right click>Copy that with the mouse and paste it into the Find box, and leave the Replace box empty.

Do a Normal mode Replace all and all the {unwanted} <blockquotes> without any text after them disappear.

Then do a "Beautify" and all the now-unmatched </blockquote> tags will disappear.
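For anyone who would rather script it, the same idea can be sketched in Python. This is only a rough sketch under stated assumptions: the tags are bare `<blockquote>` with no attributes (the classes were already removed), and the nesting is a pure stack of wrappers, so the opening tags sit next to each other and so do the closing tags. The function name `collapse_blockquotes` is made up for illustration and is not part of calibre.

```python
import re

# A run of two or more opening tags separated only by whitespace.
OPEN_RUN = re.compile(r"<blockquote>(?:\s*<blockquote>)+")
# A run of two or more closing tags separated only by whitespace.
CLOSE_RUN = re.compile(r"</blockquote>(?:\s*</blockquote>)+")


def collapse_blockquotes(html):
    """Collapse each run of stacked blockquote wrappers to a single level.

    Assumes attribute-free tags and purely stacked nesting (hypothetical
    helper, not a calibre function).
    """
    collapsed = OPEN_RUN.sub("<blockquote>", html)
    collapsed = CLOSE_RUN.sub("</blockquote>", collapsed)
    # Sanity check: if the nesting was not a pure stack, the open and
    # close counts no longer agree, so refuse to return broken markup.
    opens = collapsed.count("<blockquote>")
    closes = collapsed.count("</blockquote>")
    if opens != closes:
        raise ValueError(f"unbalanced result: {opens} open, {closes} close")
    return collapsed


nested = (
    "<blockquote>\n<blockquote>\n<blockquote>\n"
    "<p>Quoted text</p>\n"
    "</blockquote>\n</blockquote>\n</blockquote>"
)
print(collapse_blockquotes(nested))
```

Unlike the manual method, this does not rely on Beautify to clean up leftover closing tags; it raises an error instead of guessing when the markup is not a simple stack.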

Last edited by AnotherCat; 08-18-2014 at 10:53 PM. Reason: added {unwanted} for clarity
Old 08-18-2014, 11:37 PM   #5
phossler
@mrmikel and DiapDealer

Yes, I am used to [Beautify]. It makes it a lot easier to read

PSEP is \u2029 but that didn't seem to help

Peeking into the hex confirms that it's a LF (\x0A). I got a Find & Replace to sort of work. It's not perfect, but replacing every \n with a space really messed up the code, since it did it everywhere and not just in the 20 or 30 places I needed it.
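For anyone without a hex viewer handy, a few lines of Python give the same answer. This is just a sketch: the `describe` helper and the sample string are made up for illustration, and it uses only the standard `unicodedata` module. Note that a linefeed is a control character with no Unicode name, while U+2029 really is named PARAGRAPH SEPARATOR.

```python
import unicodedata


def describe(text, start, end):
    """Return (code point, Unicode name) pairs for text[start:end]."""
    result = []
    for ch in text[start:end]:
        # Control characters such as LF have no Unicode name,
        # so supply a fallback instead of letting name() raise.
        name = unicodedata.name(ch, "<unnamed control>")
        result.append((f"U+{ord(ch):04X}", name))
    return result


# Hypothetical stand-in for the problem line: a colon followed by a stray LF.
sample = "heading:\nnext line"
print(describe(sample, 7, 9))
print(describe("\u2029", 0, 1))
```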

My RegEx is mostly cookbook, i.e. look up a recipe, bake it, and see if I like it

Find: \x0a([^\s?])
Replace: space\1
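To see what that recipe does outside the editor, here is a small sketch using Python's standard `re` module (calibre's own regex engine may differ in details, so treat this as an approximation). Two things stand out: inside a character class the `?` is a literal question mark, so `[^\s?]` means "not whitespace and not a question mark", and the pattern also fires on a linefeed that comes right before a tag, which would explain replacements landing in more places than intended. The sample text and the `join_lines` name are made up for illustration.

```python
import re

# The Find expression from the post: a linefeed followed by one character
# that is neither whitespace nor a literal "?".
pattern = re.compile(r"\x0a([^\s?])")


def join_lines(text):
    """Apply the post's Replace: a space followed by the captured character."""
    return pattern.sub(r" \1", text)


sample = "<p>The list follows:\nfirst item</p>\n\n<p>Next</p>"
print(repr(join_lines(sample)))
```

In this sample the stray linefeed after the colon is fixed, but the linefeed directly before `<p>Next</p>` is also turned into a space.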

The lower right corner saying PARAGRAPH SEPARATOR really threw me off (when all else fails, look at the hex)

@anothercat

Obvious to you, but not to me. That works -- thanks very much. Going to be awfully hard for me to remember. Hopefully diapdealer will find the time to add it to his plugin.
Old 08-19-2014, 12:34 AM   #6
DiapDealer
Quote:
Originally Posted by phossler
My RegEx is mostly cookbook, i.e. look up a recipe, bake it, and see if I like it

Find: \x0a([^\s?])
Replace: space\1
I don't know about the Replace, but see what you think about:
Code:
(?<!(>|\n))\n
for finding linefeeds that aren't immediately preceded by a ">" (or another linefeed).

Will probably match the linefeed in a split-line DOCTYPE, but those tend to disappear in calibre's editor anyway.
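A quick way to try the expression outside the editor is Python's standard `re` module, which accepts this lookbehind because both alternatives are one character wide. This is a sketch only: the sample markup is invented, and calibre's engine may behave differently in edge cases.

```python
import re

# Linefeeds NOT immediately preceded by ">" or by another linefeed.
stray_lf = re.compile(r"(?<!(>|\n))\n")

sample = (
    "<p>First paragraph.</p>\n"
    "\n"
    "<p>The list follows:\n"
    "first item</p>\n"
)

# Report the character just before each hit to see why it matched.
hits = [m.start() for m in stray_lf.finditer(sample)]
for pos in hits:
    print(pos, repr(sample[pos - 1]))

# Replacing each hit with a space rejoins the split line only.
print(repr(stray_lf.sub(" ", sample)))
```

Only the linefeed after the colon is reported; the ones that follow a closing tag or a blank line are left alone.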
Old 08-19-2014, 09:23 AM   #7
phossler
Quote:
I don't know about the Replace, but see what you think about:
I did a quick test, but it seems to do too much. I'd really like to just do a [Replace All] and not have to single-step through hundreds of Finds
Old 08-19-2014, 09:54 AM   #8
DiapDealer
Quote:
Originally Posted by phossler
I did a quick test, but it seems to do too much. I'd really like to just do a [Replace All] and not have to single-step through hundreds of Finds
Do you have the dotall box checked? Uncheck it if so. That regex finds absolutely nothing but the DOCTYPE line in several huge books I've tested it on (and the extraneous linefeed like the one in your sample). If there are no doctype statements, that regex often finds nothing in the books I try it on.

It WILL catch a lot of style stuff in the headers if you run into a lot of that (which I don't) and the "traditional" svg cover image pages, but other than that, it seems to point out the non-typical linefeeds in otherwise prettified markup fairly well for me. *shrug*

Having said that though... I'll rarely nominate ANY regex as a candidate for a Replace All action in a novel, if ever. My plan is always to reduce the step-through process to something manageable.

Last edited by DiapDealer; 08-19-2014 at 11:16 AM.
Old 08-19-2014, 11:33 AM   #9
phossler
I think I had dotall checked (my bad). Isn't there a RE flag to do that? Back to the cookbook
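For the record, in Python-style regular expressions there is such a flag: `(?s)` at the very start of the expression turns dotall on, the same as `re.DOTALL`. The sketch below uses the standard `re` module with an invented sample and helper name; how an inline flag interacts with the editor's dotall checkbox is an assumption I have not verified.

```python
import re


def spans_lines(pattern, text, flags=0):
    """Return True if the pattern finds a match anywhere in the text."""
    return re.search(pattern, text, flags) is not None


text = "<p>one\ntwo</p>"

# By default "." does not match a linefeed, so the match cannot cross lines.
print(spans_lines(r"<p>.*</p>", text))
# With the DOTALL flag, "." also matches "\n".
print(spans_lines(r"<p>.*</p>", text, re.DOTALL))
# The inline flag (?s) at the start of the pattern does the same thing.
print(spans_lines(r"(?s)<p>.*</p>", text))
```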

Agree that Replace All is 'dangerous'. If I have a lot of replaces, I do a Replace, Find Next until I'm pretty comfortable I didn't mess up, and then I just 'let 'er rip' (and hope for the best)

If I have to clean a pile of crap epub, I try to catch the low hanging fruit, and run the PI to clean the dead tags.

In that particular epub, one at a time would take forever.

Last edited by phossler; 08-19-2014 at 07:17 PM.
All times are GMT -4.

