09-13-2018, 07:45 PM   #16
retiredbiker
Connoisseur
Posts: 52
Karma: 29896
Join Date: May 2013
Device: Kindle KB, Oasis, Ubuntu, Jutoh
Quote:
Originally Posted by deback View Post
Glad it worked for you. I knew it would, because I always have to run the Beautify Files before I search for anything. If I don't do that, the search function will almost always NOT work correctly. I don't know if this is a bug in the Search function or not, but running the Beautify Files is definitely a necessity when you want to search for something.

This solution should be added to the instructions for the Search function.
I think this is most important for searches that span variable spaces or line breaks. Some of it can be got around by using regex \s+ where appropriate, but that isn't always enough.

Other than beautifying, I have a couple of things that help.

One is a saved search that replaces spaces before a </p> or </div>.

The other is my dynamite fix for files that have way too much white space: regex search for \n and replace it with a space. Then search for two spaces and replace with a single space--run that a few times until the results hit zero replacements. Then beautify the file. This will neaten up the worst mess imaginable, and I've never yet had it destroy a file, even though it looks like it will after the first step!
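For anyone who wants to try the same idea outside the editor, here is a rough Python sketch of the two-step squeeze described above. The function name and the sample string are made up for illustration; it only mirrors the search-and-replace steps and is not how calibre does anything internally. One caution: it flattens every newline, so whitespace inside a <pre> block would be changed too.

```python
import re


def collapse_whitespace(text: str) -> str:
    """Turn newlines into spaces, then squeeze double spaces down to one."""
    # Step 1: regex search for \n, replace with a single space.
    text = re.sub(r"\n", " ", text)
    # Step 2: replace two spaces with one, repeating until a pass
    # reports zero replacements (same as re-running the search by hand).
    while True:
        text, count = re.subn(r"  ", " ", text)
        if count == 0:
            break
    return text


# Hypothetical messy input, not taken from any real book.
messy = "<p>Some   text\n\n   spread over\n lines.</p>\n\n\n<p>Next.</p>"
print(collapse_whitespace(messy))
```

After this step the markup is all on one line, which is why a beautify pass afterwards is needed to make it readable again.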
09-13-2018, 10:52 PM   #17
deback
Book E d i t o r
Posts: 285
Karma: 31930
Join Date: May 2015
Device: Laptop
You can do various searches, but you always have to run Beautify Files first for the search function to work, no matter what you'll be searching for.

After you run Beautify Files, then you can run just one Regex search for \s+</p> and replace with </p>--to delete one space or more than one space with this simple command. There's no need to run multiple searches to delete the spaces before ending tags, such as </p>, </div>, </span>, etc. You don't even have to run Beautify Files afterward, since you ran it before the search.
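A small Python sketch of that search, for illustration only. Extending the pattern with an alternation so one pass covers </p>, </div> and </span> is my own assumption, not something stated in the post, and the sample markup is invented. Note that \s also matches line breaks, so a closing tag sitting on its own line gets pulled up against the preceding text.

```python
import re

# \s+ matches one or more whitespace characters (spaces, tabs, newlines).
# The group captures the closing tag so it can be put back unchanged.
TRAILING_WS = re.compile(r"\s+(</(?:p|div|span)>)")


def strip_space_before_closing_tags(markup: str) -> str:
    """Remove any whitespace that sits directly before a closing tag."""
    return TRAILING_WS.sub(r"\1", markup)


# Hypothetical sample, not from a real book.
sample = "<p>First paragraph.  </p>\n<div><span>Text </span>\n</div>"
print(strip_space_before_closing_tags(sample))
```

In the editor's search box the equivalent would be searching for the same pattern in Regex mode and replacing with \1.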
09-13-2018, 11:15 PM   #18
davidfor
Grand Sorcerer
Posts: 15,241
Karma: 24732480
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo: Touch, Glo, Aura H2O, Glo HD, Aura ONE, Clara HD
Quote:
Originally Posted by deback View Post
You can do various searches, but you always have to run Beautify Files first for the search function to work, no matter what you'll be searching for.

After you run Beautify Files, then you can run just one Regex search for \s+</p> and replace with </p>--to delete one space or more than one space with this simple command. There's no need to run multiple searches to delete the spaces before ending tags, such as </p>, </div>, </span>, etc. You don't even have to run Beautify Files afterward, since you ran it before the search.
I'll have to disagree with this. When cleaning a book, I do a lot of the search and replaces before the beautify. In fact, one of my saved searches is to unwrap text that someone thought should be only 80 characters long. Fixing that after a beautify is a lot harder.

For the situation that @roger64 is reporting, I generally do that fairly early, so it is before the beautify. I haven't had any issues doing this. The catch is knowing exactly what type of space is used in these otherwise empty paragraphs. Generally, it means selecting one, pressing Ctrl+F and changing that, and then repeating this with one of the ones that were missed to get the rest. It would be interesting to see the original file to see what was actually going on.
09-13-2018, 11:16 PM   #19
kovidgoyal
creator of calibre
Posts: 33,470
Karma: 10205098
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
The test file you sent me has

Code:
<p>(non breaking space as an HTML entity)</p>
So if you want to search for it use

Code:
<p>(non breaking space as an entity)</p>
in the search box. With that it found 41 matches for me. When you use Beautify, calibre will remove all these ugly entities, replacing them with their Unicode characters, which is why searching for <p>\s+</p> will then work.
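To see why the entity form defeats \s+, here is a short Python sketch. It assumes the entity in question is written as &nbsp; or &#160; (the exact spelling in the test file is not shown above), and it uses the standard library's html.unescape only as a stand-in for the entity replacement; it is not what calibre's Beautify actually runs, and on real markup unescape would also convert things like &lt; that should stay escaped.

```python
import html
import re

# Matches a paragraph that contains nothing but whitespace characters.
EMPTY_PARA = re.compile(r"<p>\s+</p>")
# Hypothetical pattern that also accepts the entity spellings as literal text.
EMPTY_PARA_OR_ENTITY = re.compile(r"<p>(?:&nbsp;|&#160;|\s)+</p>")

# Invented sample: two "empty" paragraphs written with entities.
raw = "<p>&nbsp;</p>\n<p>Text</p>\n<p>&#160;</p>"

# As written, the entities are just the characters '&', 'n', 'b', ... so
# \s+ has nothing to match.
before = len(EMPTY_PARA.findall(raw))

# Once the entities become the real U+00A0 character, \s matches it
# (Python's \s on str patterns covers Unicode whitespace).
converted = html.unescape(raw)
after = len(EMPTY_PARA.findall(converted))

# Searching for the entity text directly works without any conversion.
direct = len(EMPTY_PARA_OR_ENTITY.findall(raw))

print(before, after, direct)
```

So there are two ways in: search for the entity text itself before beautifying, or let the entities be converted first and then use \s+.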
09-14-2018, 01:20 AM   #20
roger64
Wizard
Posts: 2,302
Karma: 2385865
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
@kovidgoyal
Thank you for your explanation. I shall pay attention to how these offending characters may have been introduced into the ePub.

Thanks to all for your various useful tips.

Last edited by roger64; 09-14-2018 at 01:23 AM.
All times are GMT -4.