Old 06-23-2012, 02:19 AM   #91
Toxaris
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
Don't use Calibre to clean up the filtered HTML. Either do it manually in Sigil or use a program/macro to do it.
Conversion to ePUB in Calibre will cause big changes in your styles. Furthermore, it is not necessary, since Sigil can import HTML without issues.
Old 06-23-2012, 08:49 AM   #92
goldilocks
Addict
Posts: 344
Karma: 1222222
Join Date: Aug 2009
Location: Florida
Device: Sony PRS-505
Quote:
Originally Posted by DiapDealer View Post
You could very well end up with a disaster if you're not careful. I would start with the paragraphs first as spans can get a bit hairy.

If you're absolutely sure that you want to change everything that has a class name of "MsoNormalXX" (X being numerals) to "paragraphtext", then:

Find: <p class="MsoNormal\d+">
Replace: <p class="paragraphtext">

Make sure you have good backups in case things don't turn out the way you've planned.
Thanks DiapDealer, but it didn't work. I keep originals and backups separate from my "working" folder.

Karen
Old 06-23-2012, 09:14 AM   #93
goldilocks
Addict
Posts: 344
Karma: 1222222
Join Date: Aug 2009
Location: Florida
Device: Sony PRS-505
Quote:
Originally Posted by Toxaris View Post
Don't use Calibre to clean up the filtered HTML. Either do it manually in Sigil or use a program/macro to do it.
Conversion to ePUB in Calibre will cause big changes in your styles. Furthermore, it is not necessary, since Sigil can import HTML without issues.
Toxaris, thanks for your suggestion. I did not use Calibre on the htm file, but it really isn't much better. There is no style sheet, and there are over 3000 expressions in the /*<![CDATA[*/ area. Every paragraph of text is filled with another paragraph of commands. Also, it is one large file - I do know how to split it.

But, I'll keep working on it and eventually I will have a decent looking, if not perfect, eBook!

Karen
Old 06-23-2012, 10:01 AM   #94
DiapDealer
Grand Sorcerer
Posts: 28,848
Karma: 207000000
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Thanks DiapDealer, but it didn't work. I keep originals and backups separate from my "working" folder.
I'm not sure what you mean by "didn't work."

It didn't do what it was intended to do?... or it didn't do what you wanted/expected it to do? There's a difference.

It certainly should have done what I said it would do... if you had the ePub open in Sigil, in Code View (an html file), with the F&R widget open (and in Regex mode) and set to "All HTML Files".
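
For anyone who wants to check the expression outside Sigil first, here is a minimal Python sketch of the same find/replace. Python's `re` module is not Sigil's PCRE engine, but this particular pattern only uses syntax the two share. The sample markup and the function name are made up for illustration.

```python
import re

# Same pattern as the Find field: "MsoNormal" followed by one or more digits.
MSO_PARAGRAPH = re.compile(r'<p class="MsoNormal\d+">')


def rename_mso_paragraphs(html: str) -> str:
    """Replace every <p class="MsoNormalNN"> opening tag with the new class."""
    return MSO_PARAGRAPH.sub('<p class="paragraphtext">', html)


sample = (
    '<p class="MsoNormal12">First paragraph.</p>\n'
    '<p class="MsoNormal">No digits, so this one is left alone.</p>\n'
    '<p class="MsoNormal3" style="margin:0">Extra attribute, also left alone.</p>'
)
print(rename_mso_paragraphs(sample))
```

Note that the pattern needs at least one digit after "MsoNormal" and nothing else inside the tag, so it may be worth comparing it against the actual markup when a Replace All finds nothing.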

Last edited by DiapDealer; 06-23-2012 at 10:03 AM.
Old 06-26-2012, 06:10 AM   #95
roger64
Wizard
Posts: 2,625
Karma: 3120635
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
Suppressing <br /> tags only in "body text" style.

Could there be a way to destroy the soft hyphens only when they are included in a "body text" paragraph?

Rationale:

After using a new (and not perfect) OCR program, I found that my recognized text was interspersed with a lot of <br /> tags (soft hyphens?). I usually import the html file into OpenOffice and clear all formatting to begin with. Even so, these resilient tags survived.

It is not that bad. Some poems or songs are thus nicely transcribed. On the other hand, I have to clean these tags for many standard paragraphs of text.

Sigil provides a simple way out. The user can either remove every one of them, good and bad, or selectively and patiently suppress the useless tags...

There could be a better way.

Give your songs or poems their own style, keep standard text in its "body text" class and then launch the following Regex...
Old 06-26-2012, 09:50 AM   #96
DiapDealer
Grand Sorcerer
Posts: 28,848
Karma: 207000000
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
<br />'s are not soft-hyphens.... just to be clear.

Quote:
Originally Posted by roger64
Give your songs or poems their own style, keep standard text in its "body text" class and then launch the following Regex...
Tricky... but—strictly speaking of Sigil (PCRE) here—then possibly:

If there's only one occurrence of the <br /> tag inside a paragraph, this expression should find it (only inside p tags of the class "body-text"):
Code:
<p class="body-text">(?!</p>).*\K<br[^>]*?/>
(If there's more than one occurrence of <br /> the above expression will only match the last one)

The following expression should match the first occurrence (if there's more than one) of a <br /> tag inside p tags of the class "body-text".
Code:
(?U)<p class="body-text">(?!</p>).*\K<br[^>]*?/>
Leaving the "Replace" field blank when replacing should then get rid of the <br /> tags.

It's certainly not ideal, but if you have multiple <br /> tags inside the targeted paragraph (class name "body-text"), you could conceivably run one or the other of these "Replace All" expressions multiple times until the search no longer matches anything. Still quicker than stepping through each occurrence (and will ignore all other p classes), though.
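
Outside Sigil, the repeated "Replace All" rounds can be avoided with a short script. The following is only a sketch in Python, not something Sigil does itself: Python's `re` module has no `\K`, so instead it matches each whole "body-text" paragraph and strips the tags inside it with a callback. The function name and sample markup are assumptions for illustration.

```python
import re

# A whole body-text paragraph: opening tag, content, closing tag.
# re.DOTALL lets the content span several lines; the lazy .*? stops at the first </p>.
BODY_PARAGRAPH = re.compile(r'(<p class="body-text">)(.*?)(</p>)', re.DOTALL)
BR_TAG = re.compile(r'<br[^>]*?/>')


def strip_br_in_body_text(html: str) -> str:
    """Remove every <br /> inside body-text paragraphs, leaving other classes alone."""
    def clean(match: re.Match) -> str:
        opening, content, closing = match.groups()
        return opening + BR_TAG.sub('', content) + closing

    return BODY_PARAGRAPH.sub(clean, html)


sample = (
    '<p class="body-text">One<br />two<br/>three</p>\n'
    '<p class="poem">Line one<br />Line two</p>'
)
print(strip_br_in_body_text(sample))
```

If deleting the tags would run words together, substituting a single space instead of the empty string in `BR_TAG.sub` is a simple variation.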
Old 06-26-2012, 10:22 AM   #97
roger64
Wizard
Posts: 2,625
Karma: 3120635
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
@DiapDealer

Thanks very much for your reply. I will put it to work soon.

Do you think it is possible to join your two commands with a kind of AND/OR link so that it would destroy the tags two by two or be happy with one?

Thanks for the vocabulary. I was not sure about it. Now I know.

Last edited by roger64; 06-26-2012 at 10:24 AM.
Old 06-26-2012, 10:32 AM   #98
DiapDealer
Grand Sorcerer
Posts: 28,848
Karma: 207000000
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by roger64 View Post
Do you think it is possible to join your two commands with a kind of AND/OR link so that it would destroy the tags two by two or be happy with one?
I certainly wouldn't know of any way to easily combine them. It really boils down to the laziness/greediness aspects of the various regex repetition-control characters. I can't imagine it would take that many clicks of the "replace all" button to rid the "body-text" paragraphs of <br /> tags, but then again... I'm not looking at the afflicted code either.
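
To illustrate the greedy/lazy difference, here is a small Python sketch. It is an approximation of the two Sigil expressions, not a copy: Python's `re` has no `\K` or `(?U)`, so a capture group and an explicit `.*?` stand in for them. The sample line is made up.

```python
import re

line = '<p class="body-text">one<br />two<br />three</p>'

# Greedy: .* grabs as much as it can, so the captured <br /> is the last one.
greedy = re.search(r'<p class="body-text">.*(<br[^>]*?/>)', line)
# Lazy: .*? grabs as little as it can, so the captured <br /> is the first one.
lazy = re.search(r'<p class="body-text">.*?(<br[^>]*?/>)', line)

print(greedy.start(1))  # offset of the last <br />
print(lazy.start(1))    # offset of the first <br />
```

Either way a single search reports only one tag per paragraph, which is why several rounds are needed.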
Old 06-27-2012, 02:27 AM   #99
roger64
Wizard
Posts: 2,625
Karma: 3120635
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
@DiapDealer

I am very pleased to report full success of your Regex (I used the first one), which deleted successively in seven busy rounds: 53/22/7/5/2/2/2 occurrences of the <br /> tag.

This is only the tip of the iceberg, because I had previously destroyed probably over one hundred by hand in the odt. I did not know then that I would use your regex.

For information, this is the style breakdown of the test EPUB (classes only):
Spoiler:

Code:
class="Textbody" 1676
class="frameGraphics" 66
class="let" 64
class="let2" 64
class="let1" 64
class="Centrage" 62
class="smcpTypeV" 46
class="smcpTypeA" 16
class="smcpDroite" 16
class="Header" 8
class="smcpCentrage" 6
class="Italdroite" 4
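
How the list above was produced is not stated. One possible way to get a similar breakdown is sketched below in Python; it simply counts `class="..."` attribute values in a string of markup. The function name and sample are assumptions for illustration.

```python
import re
from collections import Counter

CLASS_ATTR = re.compile(r'class="([^"]*)"')


def count_classes(html: str) -> Counter:
    """Count how often each class attribute value appears in the markup."""
    return Counter(CLASS_ATTR.findall(html))


sample = (
    '<p class="Textbody">a</p><p class="Textbody">b</p>'
    '<p class="Centrage">c</p>'
)
for name, total in count_classes(sample).most_common():
    print(f'class="{name}" {total}')
```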

Last edited by roger64; 06-27-2012 at 08:53 AM.
Old 06-27-2012, 02:15 PM   #100
DiapDealer
Grand Sorcerer
Posts: 28,848
Karma: 207000000
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by roger64 View Post
I am very pleased to report full success of your Regex (I used the first one), which deleted successively in seven busy rounds: 53/22/7/5/2/2/2 occurrences of the <br /> tag.
Cool! Glad it worked for you. I've stashed it away myself for tweaking in various ways.

Last edited by DiapDealer; 06-27-2012 at 04:46 PM. Reason: typo
Old 07-03-2012, 04:18 AM   #101
mrjoeyman
Junior Member
Posts: 7
Karma: 10
Join Date: Jul 2012
Device: Kindle Fire
reverse linking time consuming woes

<a href="../Text/notes.html#scrip1" id="backscrip1">This text is a link</a>

The above is some code in my file that I use to reverse link (or tag/anchor, whatever it is called). You click on a link in one file (in this case, clicking on the text "This text is a link" takes you to the "../Text/notes.html" file, where another link is designated as "scrip1", while the original link "This text is a link" is designated as "backscrip1"). So they go back and forth. When there are hundreds of reverse links, it takes me only a short time to list the main code, i.e....

<a href="../Text/scriptures.html#scrip1" id="backscrip1">This text is a link</a>
<a href="../Text/scriptures.html#scrip1" id="backscrip1">This text is a link</a>
<a href="../Text/scriptures.html#scrip1" id="backscrip1">This text is a link</a>
<a href="../Text/scriptures.html#scrip1" id="backscrip1">This text is a link</a>
<a href="../Text/scriptures.html#scrip1" id="backscrip1">This text is a link</a>

but now I have to go back and change the second occurrence of the linking code to "2" then "3" then "4", ie...

<a href="../Text/scriptures.html#scrip1" id="backscrip1">This text is a link</a>

<a href="../Text/scriptures.html#scrip2" id="backscrip2">This text is a link</a>

<a href="../Text/scriptures.html#scrip3" id="backscrip3">This text is a link</a>

<a href="../Text/scriptures.html#scrip4" id="backscrip4">This text is a link</a>

....you get the idea.

Is there a way to use the find and replace in such a way that it would search for this code and bump up the number for each occurrence, so I won't have to manually find each one and put in each number separately myself?

Old 07-03-2012, 05:48 AM   #102
Doitsu
Grand Sorcerer
Posts: 5,762
Karma: 24088559
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by mrjoeyman View Post
Is there a way to use the find and replace in such a way that it would search for this code and bump up the number for each occurrence, so I won't have to manually find each one and put in each number separately myself?
AFAIK, you cannot increment numbers using regular expressions. This kind of functionality can only be achieved with a scripting language.
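
As a rough idea of what such a script could look like, here is a minimal Python sketch. It assumes the links follow exactly the `#scripN" id="backscripN"` layout from the question; the function name is made up, and the matching anchors in the target file would need the same numbering.

```python
import re
from itertools import count

# Matches the numbered parts of one link: #scripN" id="backscripN"
LINK_NUMBERS = re.compile(r'(#scrip)\d+(" id="backscrip)\d+(")')


def renumber_links(html: str, start: int = 1) -> str:
    """Give each matching link the next number, in document order."""
    counter = count(start)

    def bump(match: re.Match) -> str:
        n = next(counter)
        return f'{match.group(1)}{n}{match.group(2)}{n}{match.group(3)}'

    return LINK_NUMBERS.sub(bump, html)


link = '<a href="../Text/scriptures.html#scrip1" id="backscrip1">This text is a link</a>'
print(renumber_links('\n'.join([link] * 3)))
```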
Old 07-03-2012, 06:04 AM   #103
mrjoeyman
Junior Member
Posts: 7
Karma: 10
Join Date: Jul 2012
Device: Kindle Fire
I was afraid of that. I guess the best thing would be to save it as a template and insert the text, but that still entails manually inserting each occurrence. Is there a quicker way of doing such a task that I just am not aware of yet? Thanks for the consideration.
Old 07-03-2012, 07:06 AM   #104
Jellby
frumious Bandersnatch
Posts: 7,570
Karma: 20150435
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
I don't know about Sigil, but this is what I do in vim:

I use a special symbol (¬, |, ¦ are useful for this) where I want the consecutive numbers:

Code:
<a href="../Text/scriptures.html#scrip¬" id="backscrip¬">This text is a link</a>
Once I have all the links like that, I run this command in vim:

Code:
: let n=1 | g/¬/s/¬/\=n/g | let n+=1
which replaces all ¬ in a line with the number n, and n is incremented by one every time a line with ¬ is found.
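
For those without vim, the same placeholder idea can be sketched in Python. This is an illustration under the same assumptions as the vim command (one link per line, ¬ used nowhere else in the file); the function name is made up.

```python
PLACEHOLDER = '¬'


def number_placeholders(text: str, start: int = 1) -> str:
    """Replace every placeholder on a line with n, then move on to n + 1."""
    n = start
    out = []
    # keepends=True preserves the original line endings.
    for line in text.splitlines(keepends=True):
        if PLACEHOLDER in line:
            line = line.replace(PLACEHOLDER, str(n))
            n += 1
        out.append(line)
    return ''.join(out)


template = '<a href="../Text/scriptures.html#scrip¬" id="backscrip¬">This text is a link</a>\n'
print(number_placeholders(template * 3), end='')
```

Lines without the placeholder pass through unchanged and do not advance the counter.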
Old 07-03-2012, 10:56 PM   #105
mrjoeyman
Junior Member
Posts: 7
Karma: 10
Join Date: Jul 2012
Device: Kindle Fire
Omg are you serious? I will have to give it a go! So how would I go about getting the code into Sigil afterward? That is the only way I know to convert it into epub.

Last edited by mrjoeyman; 07-03-2012 at 11:27 PM.

All times are GMT -4. The time now is 03:36 AM.

