12-25-2011, 04:45 PM | #1 |
Junior Member
Posts: 1
Karma: 10
Join Date: Dec 2011
Device: Kindle Touch
|
Could use a bit of help with regular expressions to edit books on conversion
OK, so a whole bunch of my PDFs have this nasty watermark, which gets converted to text every time as exactly this:
Code:
<a href="http://www.abbyy.com/buy"><b>PDF Transform</b></a><br>
<a href="http://www.abbyy.com/buy"><b>PDF Transform</b></a><br>
<a href="http://www.abbyy.com/buy"><b>Y</b></a><br>
<a href="http://www.abbyy.com/buy"><b>Y</b></a><br>
<a href="http://www.abbyy.com/buy"><b>Y</b></a><br>
<a href="http://www.abbyy.com/buy"><b>er</b></a><br>
<a href="http://www.abbyy.com/buy"><b>Y</b></a><br>
<a href="http://www.abbyy.com/buy"><b>er</b></a><br>
<a href="http://www.abbyy.com/buy"><b>B</b></a><br>
<a href="http://www.abbyy.com/buy"><b>2</b></a><br>
<a href="http://www.abbyy.com/buy"><b>B</b></a><br>
<a href="http://www.abbyy.com/buy"><b>2</b></a><br>
<a href="http://www.abbyy.com/buy"><b>B</b></a><br>
<a href="http://www.abbyy.com/buy"><b>.0</b></a><br>
<a href="http://www.abbyy.com/buy"><b>B</b></a><br>
<a href="http://www.abbyy.com/buy"><b>.0</b></a><br>
<a href="http://www.abbyy.com/buy"><b>A</b></a><br>
<a href="http://www.abbyy.com/buy"><b>A</b></a><br>
<a href="http://www.abbyy.com/buy"><b>Click here to buy</b></a><br>
<a href="http://www.abbyy.com/buy"><b>Click here to buy</b></a><br>
<a href="http://www.abbyy.com/buy"><b>w</b></a><br>
<a href="http://www.abbyy.com/buy"><b>w</b></a><br>
<a href="http://www.abbyy.com/buy"><b>w</b></a><br>
<a href="http://www.abbyy.com/buy"><b>w</b></a><br>
<a href="http://www.abbyy.com/buy"><b>w .</b></a><br>
<a href="http://www.abbyy.com/buy"><b>w</b></a><br>
<a href="http://www.abbyy.com/buy"><b>A B B YY.com</b></a><br>
<a href="http://www.abbyy.com/buy"><b>.A B BYY.com</b></a><br>

I know I can remove most of it during a bulk conversion using these two:
<a href="http://www.abbyy.com/buy"><b>[a-zA-Z0-9]</b></a><br>
and
<a href="http://www.abbyy.com/buy"><b>[a-zA-Z0-9][a-zA-Z0-9]</b></a><br>
but the longer ones will obviously remain in there.
I tried reading through the "all about using regular expressions in calibre" thread, but it went over my head. If anyone can help me with a regular expression to remove all of that junk, I'd really appreciate it.
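Editor's note: a quick way to see why the two patterns above fall short is to run them over a few of the quoted watermark lines. This is a minimal sketch using Python's `re` module (calibre's search and replace is built on Python regular expressions); the `sample` text is just a small subset of the lines from the post, and the variable names are made up for illustration.

```python
import re

# A small subset of the watermark lines quoted in the post.
sample = (
    '<a href="http://www.abbyy.com/buy"><b>Y</b></a><br>\n'
    '<a href="http://www.abbyy.com/buy"><b>er</b></a><br>\n'
    '<a href="http://www.abbyy.com/buy"><b>PDF Transform</b></a><br>\n'
    '<a href="http://www.abbyy.com/buy"><b>.0</b></a><br>\n'
)

# The two patterns from the post: exactly one, or exactly two,
# alphanumeric characters between <b> and </b>.
one_char = r'<a href="http://www.abbyy.com/buy"><b>[a-zA-Z0-9]</b></a><br>'
two_char = r'<a href="http://www.abbyy.com/buy"><b>[a-zA-Z0-9][a-zA-Z0-9]</b></a><br>'

remaining = re.sub(one_char, "", sample)
remaining = re.sub(two_char, "", remaining)

# Whatever is still here was not matched by either pattern.
leftovers = [line for line in remaining.splitlines() if line]
for line in leftovers:
    print(line)
```

The "PDF Transform" line survives because it is longer than two characters, and the ".0" line survives because a period is not in `[a-zA-Z0-9]`.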
12-28-2011, 06:50 AM | #2 |
US Navy, Retired
Posts: 9,864
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
|
Quote:
Good Luck.
12-28-2011, 03:20 PM | #3 |
Evangelist
Posts: 416
Karma: 1045911
Join Date: Sep 2011
Location: Cape Town, South Africa
Device: Kindle 3
|
Mostly what the post above says. Failing that:
Code:
<a href="http://www.abbyy.com/buy">.*?</a><br>
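Editor's note: the pattern above relies on the non-greedy `.*?`, which stops at the first `</a><br>` after each ABBYY link instead of running on to the last one on the line. Here is a small sketch of that behaviour in Python's `re` module. The surrounding text and the `example.com` link are invented for illustration and are not from the original books.

```python
import re

# The pattern as posted: anything, as short as possible, between the
# ABBYY link's opening tag and the first following </a><br>.
pattern = re.compile(r'<a href="http://www.abbyy.com/buy">.*?</a><br>')

# Hypothetical converted text: two watermark entries sitting between
# real content, plus an unrelated link that should be left alone.
text = (
    'Chapter 1<br> '
    '<a href="http://www.abbyy.com/buy"><b>PDF Transform</b></a><br> '
    '<a href="http://www.abbyy.com/buy"><b>w .</b></a><br> '
    'Real text <a href="http://example.com/">a real link</a><br>'
)

cleaned = pattern.sub("", text)
print(cleaned)
```

Because the match is anchored on the ABBYY URL, other links are untouched. Note that `.` does not match a newline by default, so this assumes each watermark entry sits on a single line, as in the quoted sample.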
12-29-2011, 10:29 AM | #4 |
Groupie
Posts: 154
Karma: 2054094
Join Date: Apr 2011
Location: Boulder, CO
Device: Kindle Voyage, Samsung Galaxy Tab 10.1 (for PDFs)
|
<a href="http://www.abbyy.com/buy"><b>[a-zA-Z0-9\. ]+</b></a><br>
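Editor's note: this version is stricter than the `.*?` one, since it only removes entries whose bold text consists of letters, digits, periods, and spaces. The sketch below checks it against the distinct entries quoted in the first post, using Python's `re` module. The helper `make_line` and the "Buy-now" entry are hypothetical, added only to show the trade-off.

```python
import re

# The pattern as posted in this reply.
strict = re.compile(
    r'<a href="http://www.abbyy.com/buy"><b>[a-zA-Z0-9\. ]+</b></a><br>'
)

def make_line(label):
    # Hypothetical helper: wrap a label the way the watermark does.
    return '<a href="http://www.abbyy.com/buy"><b>%s</b></a><br>' % label

# The distinct labels that appear in the quoted watermark.
labels = ["PDF Transform", "Y", "er", "B", "2", ".0", "A",
          "Click here to buy", "w", "w .", "A B B YY.com", ".A B BYY.com"]

unmatched = [lab for lab in labels if not strict.fullmatch(make_line(lab))]
print(unmatched)

# A made-up label with a hyphen is NOT removed by this pattern.
hyphen_match = strict.fullmatch(make_line("Buy-now"))
print(hyphen_match)
```

Every quoted label is covered, so for this particular watermark both suggestions should give the same result. The stricter character class is less likely to eat something unexpected, at the cost of missing any entry that contains other punctuation.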