#46 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Depends on whether it is copyrighted or not. Also, it depends on whether I have time to mess around with it or not, which is currently not looking too good. If the document is not copyrighted, though, you could attach it and hope that someone else tries.
|
#47 |
Member
Posts: 13
Karma: 12
Join Date: Jan 2011
Device: Samsung Galaxy Tab
|
OK, actually I think I have it working.
|
#48 |
Wizard
Posts: 4,553
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
|
|
#49 |
Member
Posts: 13
Karma: 12
Join Date: Jan 2011
Device: Samsung Galaxy Tab
|
Is it possible to chain regex codes in the header bar? E.g. set up a standard code that removes page numbers, abc, PDF transform etc. in one? Right now I can only do one at a time...
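One general way to fold several patterns into a single expression is alternation with `|`. The sketch below uses Python's `re` module only; whether a given Calibre field accepts the combined pattern is an assumption, and both sub-patterns and the sample markup are hypothetical, not taken from this thread.

```python
import re

# Hypothetical sub-patterns, made up for illustration.
page_number = r"<p>\s*\d+\s*</p>"   # a paragraph holding only a number
empty_anchor = r"<a[^>]*></a>"      # a link tag with nothing inside it

# Wrap each pattern in a non-capturing group and join them with "|",
# so one pass removes anything that matches either pattern.
combined = re.compile(f"(?:{page_number})|(?:{empty_anchor})")

# Hypothetical sample markup.
html = '<p>Text</p><p> 12 </p><a name="p12"></a><p>More</p>'

cleaned = combined.sub("", html)
print(cleaned)  # <p>Text</p><p>More</p>
```

The non-capturing groups keep each alternative self-contained, so adding a third pattern is just one more `|(?:...)`.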
|
#51 |
Member
Posts: 13
Karma: 12
Join Date: Jan 2011
Device: Samsung Galaxy Tab
|
One more question, sorry everyone.
I'm using (<p.*?><a.*?></a></p>) to get rid of the Abby stuff, and it highlights in yellow but doesn't remove it. Am I making a mistake? |
#52 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
That only matches the tags, not what's enclosed in them. Or, to be more precise, it matches the tags without anything enclosed in them, so it shouldn't even highlight anything from the Abby stuff. I'm beginning to think there's something seriously wonky with your Calibre.
|
#53 |
Member
Posts: 13
Karma: 12
Join Date: Jan 2011
Device: Samsung Galaxy Tab
|
Really? I was under the impression it would remove any page break with an HTML link inside, no matter what the HTML link is (or at least that's what I was going for, and what gets highlighted). Any help with it then?
|
#54 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
The <p> stuff is a paragraph, not a page break. Just saying because it's been driving me crazy.
Regexes only do string matching; they don't interpret what the strings are, do, or mean. There's no text to match between your opening and closing tags, so nothing should get matched except an empty link tag inside a paragraph tag. It seems to me that while you obviously know something about regular expressions, you may have missed the concept. Think about the string-matching part above a little. |
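To make the point concrete, here is a minimal sketch using Python's `re` module. The two sample strings are hypothetical, not taken from the poster's document; the pattern is the one posted earlier in the thread.

```python
import re

# Pattern from the earlier post: nothing is allowed between <a ...> and </a>.
empty_link = re.compile(r"<p.*?><a.*?></a></p>")

# Hypothetical samples: one link with text inside it, one without.
with_text = '<p class="x"><a href="#p1">Page 1</a></p>'
without_text = '<p class="x"><a name="p1"></a></p>'

print(empty_link.search(with_text))              # None
print(empty_link.search(without_text).group(0))  # <p class="x"><a name="p1"></a></p>
```

The first string has "Page 1" sitting between the opening and closing link tags, and the pattern has nothing that could match it, so the search fails.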
#55 |
Member
Posts: 13
Karma: 12
Join Date: Jan 2011
Device: Samsung Galaxy Tab
|
Thanks, I'd rather know... Sorry, it just made more sense for it to be a page break than a paragraph in the context of the links, but in a general context paragraph does make more sense, so thanks... I think my main problem is bringing my previous programming knowledge into regex.
I know the purpose is to match strings, but the tutorial makes references to matching all strings, no matter what the actual string is, as long as it's within a particular function (it's generally referred to as a problem, e.g. trying to remove a bold page number and removing every bold string in the document). Therefore I thought the same concept could be applied to the HTML links... I suppose I'm wrong, oh well. Thanks very much manichean, you've been really helpful even though I've been a really slow learner. |
#56 |
Member
Posts: 13
Karma: 12
Join Date: Jan 2011
Device: Samsung Galaxy Tab
|
Don't worry, got it working. It's supposed to be:
<a.*?>.*?</a> Thanks a lot, manichean |
#57 | |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Quote:
Yeah, that should work. Be aware, though, that this removes all links from the document. Depending on what you convert, that may not be desirable. |
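A rough sketch of the difference between removing every link and removing only a particular set, again with Python's `re`. The sample markup and the `class="abbyy"` attribute are made up for illustration; a real document would need its own distinguishing attribute.

```python
import re

# Hypothetical sample: one link worth keeping, one unwanted link.
html = (
    '<p><a href="ch1.html">Chapter 1</a></p>'
    '<p><a class="abbyy" name="bookmark0">junk</a></p>'
)

# The pattern from the thread: matches every link in the document.
all_links = re.compile(r"<a.*?>.*?</a>")

# Narrower sketch: only links whose opening tag carries the (made-up) class.
# [^>]* cannot run past the end of the opening tag, so it stays inside one link.
some_links = re.compile(r'<a[^>]*class="abbyy"[^>]*>.*?</a>')

print(all_links.sub("", html))   # <p></p><p></p>
print(some_links.sub("", html))  # <p><a href="ch1.html">Chapter 1</a></p><p></p>
```

Using `[^>]*` rather than `.*?` inside the opening tag matters here: with `.*?` the match could start at the first link and stretch forward until it finds the class, taking the wanted link with it.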
|
#58 |
Member
Posts: 13
Karma: 12
Join Date: Jan 2011
Device: Samsung Galaxy Tab
|
"thus we could remove everything between those tags using <b.*?>.*?</b>"
I know, but I wanted the code for a particular set... Thanks again. |
#59 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Notice, though, that there is a wildcard followed by a quantifier in between those two tags. They weren't there in the regex you posted earlier.
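A small sketch of what the wildcard (`.`) and quantifier (`*`, made lazy by `?`) contribute, using Python's `re` on a hypothetical line with two bold runs:

```python
import re

# Hypothetical sample line, not from any real document.
line = "<b>12</b> some text <b>keep</b>"

# No wildcard between the tags: only an empty <b></b> could match.
no_wildcard = re.findall(r"<b></b>", line)

# Lazy wildcard: each match stops at the nearest closing tag.
lazy = re.findall(r"<b>.*?</b>", line)

# Greedy wildcard: one match runs to the last closing tag on the line.
greedy = re.findall(r"<b>.*</b>", line)

print(no_wildcard)  # []
print(lazy)         # ['<b>12</b>', '<b>keep</b>']
print(greedy)       # ['<b>12</b> some text <b>keep</b>']
```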
|
#60 |
Book Geek
Posts: 596
Karma: 1499085
Join Date: Aug 2010
Location: Adelaide, Australia
Device: Kobo Touch, Asus MemPad 7" tablet, Nexus 5, Asus 10" tablet
|
I notice the "Remove header" and "Remove footer" options have gone from Calibre. Could someone point me in the right direction on how to do this very useful job? I presume there is a new way.
|