07-18-2011, 05:14 AM   #1
paulfiera
Addict
Posts: 378
Karma: 3102
Join Date: Dec 2010
Location: EU
Device: Kobo Aura ONE, Kobo Libra H20
Regexp help - I think...

I'm working on fixing some EPUBs and have discovered that in some of them every page carries over 50 KB of inline style font definitions like this:

Quote:
@font-face {
font-family: Meiryo;
panose-1: 2 11 6 4 3 5 4 4 2 4;
mso-font-charset: 128;
mso-generic-font-family: swiss;
mso-font-pitch: variable;
mso-font-signature: -536870145 1791492095 18 0 131081 0
}
I'd like to remove everything inside the style tags during a conversion. I don't know whether I could use a regexp in Search & Replace for this, or whether calibre has a built-in feature or a plugin that does it.

So far, I've been exploding the EPUBs and manually removing all these font definitions with UltraEdit, but it would be really great if I could do it at conversion time.

Many thanks
PF
07-18-2011, 07:00 AM   #2
user_none
Sigil & calibre developer
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
If it's part of the HTML itself, search and replace will work. Search and replace, however, does not run over .css files. You could try:
Code:
(?mu)<style>.+</style>
as the search code. This should remove every style tag inline in the HTML.
07-18-2011, 07:27 AM   #3
paulfiera
Addict
Posts: 378
Karma: 3102
Join Date: Dec 2010
Location: EU
Device: Kobo Aura ONE, Kobo Libra H20
Many thanks, user_none.

Yes, it's inline style in the HTML/XHTML documents.

I've tried putting
Code:
(?mu)<style>.+</style>
as the first expression under Search Regular Expression, leaving Replacement Text blank, but I still end up with all the font declarations, in this format:

Quote:
@font-face {
font-family: DotumChe
}
It looks like the calibre conversion already strips many of the font attributes, but not the font declarations.

I also tried
Code:
(?mu)<style.+</style>
Note the missing closing angle bracket on the opening style tag; in the HTML files it's declared as
Code:
<style type="text/css">
but I get the same results.
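
One possible cause, offered as an assumption rather than a confirmed diagnosis: calibre's regular expressions follow Python syntax, where `.` does not match a newline unless the `s` (DOTALL) flag is set, so `.+` cannot span a style block that runs over several lines. The sketch below checks this against a made-up sample file (`SAMPLE` is not taken from a real book) and tries an adjusted pattern. The names `ORIGINAL`, `ADJUSTED` and `strip_style_blocks` are purely illustrative.

```python
import re

# Hypothetical sample of one HTML file from the book (invented for illustration).
SAMPLE = """<html><head>
<style type="text/css">
@font-face {
  font-family: Meiryo;
  panose-1: 2 11 6 4 3 5 4 4 2 4;
}
</style>
<title>Chapter 1</title>
</head><body><p>Text</p></body></html>"""

# The second pattern tried in the thread: without the "s" flag, "." stops
# at the end of the line, so the closing tag is never reached.
ORIGINAL = re.compile(r"(?mu)<style.+</style>")

# Adjusted pattern (an assumption, not a calibre-documented recipe):
#   (?s)    lets "." match newlines
#   (?i)    ignores tag case
#   [^>]*   allows attributes such as type="text/css"
#   .*?     stops at the first closing tag instead of the last one
ADJUSTED = re.compile(r"(?isu)<style\b[^>]*>.*?</style\s*>")


def strip_style_blocks(html: str) -> str:
    """Remove every inline <style>...</style> block from an HTML string."""
    return ADJUSTED.sub("", html)


if __name__ == "__main__":
    print("original pattern matches:", ORIGINAL.search(SAMPLE) is not None)
    print(strip_style_blocks(SAMPLE))
```

If this is indeed what happens, pasting the adjusted expression into the search field with an empty replacement would be the thing to try. Bear in mind that it removes all inline CSS in the file, not only the font definitions.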
07-20-2011, 01:34 AM   #4
cybmole
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
I remove that stuff with the find and replace feature in Sigil. Just paste the whole chunk into the find box and replace it with nothing. It works fine, even across multiple line breaks.
07-20-2011, 03:27 AM   #5
paulfiera
Addict
Posts: 378
Karma: 3102
Join Date: Dec 2010
Location: EU
Device: Kobo Aura ONE, Kobo Libra H20
Thanks, cybmole.

I guess I'll have to do that.
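
For anyone who would rather script the manual route than explode each book by hand, here is a minimal sketch using only Python's standard library. It treats the EPUB as the zip archive it is and strips inline style blocks from the HTML files. The function name `strip_styles_from_epub`, the list of file suffixes and the assumption that the content files are UTF-8 are all mine, not something from calibre or Sigil.

```python
import io
import re
import zipfile

# Same idea as pasting the block into a find box, but pattern-based:
# match any inline <style ...>...</style> block, across line breaks.
STYLE_BLOCK = re.compile(r"(?isu)<style\b[^>]*>.*?</style\s*>")

# Assumed set of content-file extensions inside the EPUB.
HTML_SUFFIXES = (".html", ".htm", ".xhtml")


def strip_styles_from_epub(data: bytes) -> bytes:
    """Return a copy of an EPUB (given as bytes) without inline <style> blocks."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out, "w") as dst:
        # infolist() keeps the original entry order, so "mimetype" stays first
        # if it was first in the source archive.
        for info in src.infolist():
            payload = src.read(info.filename)
            if info.filename.lower().endswith(HTML_SUFFIXES):
                text = payload.decode("utf-8")  # assumption: UTF-8 content
                payload = STYLE_BLOCK.sub("", text).encode("utf-8")
            # The "mimetype" entry is stored uncompressed; the rest is deflated.
            if info.filename == "mimetype":
                method = zipfile.ZIP_STORED
            else:
                method = zipfile.ZIP_DEFLATED
            dst.writestr(info.filename, payload, compress_type=method)
    return out.getvalue()
```

A typical use would be to read the book with `open(path, "rb").read()`, pass the bytes through the function, and write the result to a new file so the original stays untouched. It drops every inline style block, so any legitimate inline CSS goes too.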