Old 12-05-2013, 12:43 PM   #1
magmanpi
Closing unwanted line breaks

Hello,

I've looked in the other threads about fixing line breaks using Sigil but can't find the solution to my problem:

I'm trying to fix an ebook (a hated Calibre conversion from PDF, I think) in which almost every sentence that contains an em dash, though not all, is broken after the dash. It looks like this in code view:

blah blah blah —</p>

<p class="calibre2">blah blah blah

In some instances, the code looks like this:

blah blah blah —</span></p>

<p class="calibre2">blah blah blah

I've tried a lot of the regex search strings suggested in other threads but none of them finds any matches, either when I click Find or Count All. Any ideas why the search strings aren't finding any matches?

I would greatly appreciate any help! Thanks!
Old 12-05-2013, 01:19 PM   #2
theducks
Were you in Regex mode?
Was the scope set to All HTML?

Sigil comes with a few saved searches, and Join is one of them, but it only handles simple joins.
Both of the cases you show need care (Replace All is NOT a good idea).

In the first case the break may be intentional (my regex for THAT ONE only allows the join IF the first letter of the paragraph being joined is lower case).
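The regex described above was never posted in this thread, so the following is only a guess at the idea, sketched in Python's `re` module rather than in Sigil itself. The lookahead `(?=[a-z])` and the names `JOIN_IF_LOWER` and `join_lowercase_continuations` are assumptions made for illustration: the join is applied only when the next paragraph starts with a lower-case letter.

```python
import re

# Hypothetical pattern (not the one the poster actually uses): match the
# paragraph break only when the next paragraph begins with a lower-case
# letter, which suggests the sentence was split in the middle.
JOIN_IF_LOWER = re.compile(r'</p>\s+<p class="calibre2">(?=[a-z])')


def join_lowercase_continuations(html: str) -> str:
    """Replace qualifying paragraph breaks with a single space."""
    return JOIN_IF_LOWER.sub(" ", html)


sample = (
    'the sentence breaks here</p>\n\n'
    '<p class="calibre2">and continues here</p>\n\n'
    '<p class="calibre2">A new paragraph starts here.</p>'
)

# Only the first break is closed; the paragraph starting with "A" is kept.
print(join_lowercase_continuations(sample))
```

In Sigil the same idea would be entered in the Find field and stepped through one match at a time rather than applied with Replace All.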


Your second case: what do you want to do about the </span> tag(s)?
Things can get uglier, not better, FAST with a Replace All.
Old 12-05-2013, 01:38 PM   #3
magmanpi
Yes, I was in REGEX mode and the scope was All HTML. And no, I would never use replace all. I can see the danger there.

I'm totally puzzled why Sigil isn't finding any matches for my search strings. Could you please post your REGEX string for the first fix? Thanks!
Old 12-05-2013, 02:07 PM   #4
DiapDealer
Quote:
I'm totally puzzled why Sigil isn't finding any matches for my search strings. Could you please post your REGEX string for the first fix? Thanks!
You've not said what your search string was. It might be easier (and more instructive) to show what you might be doing wrong.

However, if I had to guess, I'd say that whatever search expression you're using isn't taking into account the line breaks/white-space between the two lines of code you're trying to match.

Assuming the example you gave is consistent throughout the code, an expression to find what you're looking for (the one without the </span> involved) could be as simple as the following (make sure you type the actual em dash character):
Code:
—</p>\s+<p class="calibre2">
Replace it with an em dash (or an em dash followed by a space, depending on your preference).
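For anyone who wants to try the expression outside Sigil, here is a small sketch using Python's `re` module. Sigil uses PCRE, not Python, but this simple pattern should behave the same way in both. The function name `close_em_dash_breaks` and the sample string are made up for the illustration.

```python
import re

# The expression from the post: an em dash, the closing </p>, any run of
# white-space (including the line breaks between the two tags), and the
# opening tag of the next paragraph.
EM_DASH_BREAK = re.compile(r'—</p>\s+<p class="calibre2">')


def close_em_dash_breaks(html: str, replacement: str = "—") -> str:
    """Join paragraphs that were split right after an em dash."""
    return EM_DASH_BREAK.sub(replacement, html)


sample = 'blah blah blah —</p>\n\n<p class="calibre2">blah blah blah</p>'

print(close_em_dash_breaks(sample))        # em dash only
print(close_em_dash_breaks(sample, "— "))  # em dash followed by a space

# Without \s+ the line breaks between the tags are not accounted for,
# so the same sample is not matched at all.
print(re.search(r'—</p><p class="calibre2">', sample))
```

The `\s+` is the important part: it matches one or more white-space characters, which covers the blank line between the closing and opening tags.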
Old 12-05-2013, 02:55 PM   #5
magmanpi
I wondered about the white space between the two lines. This is the search string I was using:

Find: ([a-z])</p>\s+<p class="calibre2">
Replace: \1
In the replace expression, it is \1 followed by a single space.
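One plausible reason this search string finds nothing in the samples from the first post: `[a-z]` has to match the character immediately before `</p>`, and in those samples that character is an em dash, not a lower-case letter. The thread does not confirm this diagnosis, so treat the sketch below (Python's `re`, with made-up variable names) as a check against the posted sample only.

```python
import re

# Sample taken from the first post: the paragraph ends with an em dash.
sample = 'blah blah blah —</p>\n\n<p class="calibre2">blah blah blah</p>'

# The search string above requires a lower-case letter before </p>.
letter_pattern = re.compile(r'([a-z])</p>\s+<p class="calibre2">')

# The expression that worked uses the em dash itself.
dash_pattern = re.compile(r'—</p>\s+<p class="calibre2">')

letter_match = letter_pattern.search(sample)
dash_match = dash_pattern.search(sample)

print("letter pattern matched:", letter_match is not None)
print("dash pattern matched:", dash_match is not None)
```

The white-space handling (`\s+`) is the same in both expressions; only the character in front of `</p>` differs.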
Old 12-05-2013, 03:51 PM   #6
magmanpi
OMG! The code worked. I can't thank you guys enough for your help.
Does the Sigil manual have a section on how to write code for regex? I looked in the manual trying to solve this problem but it kinda made my head spin.

Now, about those other code items with </span>?
Old 12-05-2013, 04:36 PM   #7
DiapDealer
Quote:
Originally Posted by magmanpi View Post
Does the Sigil manual have a section on how to write code for regex? I looked in the manual trying to solve this problem but it kinda made my head spin.
No. Regex is completely separate from and independent of Sigil. Sigil uses the PCRE flavor of regex, so just look for a tutorial somewhere, such as http://www.regular-expressions.info/

Quote:
Now, about those other code items with </span>?
I would probably leave the closing span in there (who knows what css class may be assigned to the opening tag), but do something very similar.
Code:
—</span></p>\s+<p class="calibre2">
Replace with "—</span>" or "—</span> " (without the quotes of course).
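Here is the same replacement sketched in Python's `re` module (a stand-in for Sigil's PCRE engine; this pattern should behave the same in both). The `italic` class in the sample and the function names are invented for the example. The second function is a variation that is not from this thread: it makes the `</span>` optional with a capture group so that one expression could cover both cases.

```python
import re

# The expression from the post: the closing span is kept in the replacement.
SPAN_BREAK = re.compile(r'—</span></p>\s+<p class="calibre2">')

# Assumed variation: capture an optional </span> and put it back with \1.
EITHER_BREAK = re.compile(r'—((?:</span>)?)</p>\s+<p class="calibre2">')


def close_span_breaks(html: str, replacement: str = "—</span>") -> str:
    """Join paragraphs split after an em dash that sits inside a span."""
    return SPAN_BREAK.sub(replacement, html)


def close_either_break(html: str) -> str:
    """Join after an em dash whether or not a </span> follows it."""
    return EITHER_BREAK.sub(r"—\1", html)


with_span = (
    '<p class="calibre2"><span class="italic">blah blah blah —</span></p>\n\n'
    '<p class="calibre2">blah blah blah</p>'
)
without_span = 'blah blah blah —</p>\n\n<p class="calibre2">blah blah blah</p>'

print(close_span_breaks(with_span))
print(close_either_break(with_span))
print(close_either_break(without_span))
```

Keeping `</span>` in the replacement leaves the opening tag balanced, which is the reason given above for not stripping it.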
Old 12-06-2013, 08:04 AM   #8
magmanpi
Thank you very much for all the help, DiapDealer. I appreciate you sharing your knowledge. Now I'm off to check out your regex link!
Old 12-06-2013, 08:20 AM   #9
DiapDealer
No problem. Good luck on your regex journey!