Old 01-11-2013, 10:01 AM   #1
sws
Device: Pocket Book Touch Lux
python regex: delete text in preprocessing

Hi all,

I am using a calibre recipe (Weltonline, a German daily newspaper) to fetch news every day. Everything works fine, but in the final EPUB file every news article is followed by a lot of web clutter that I want to get rid of. So I open the EPUB downloaded from my calibre server in Sigil and delete everything between two string groups:

Code:
(     <div class="calibre7">
                  © Axel Springer AG 2013. Alle Rechte vorbehalten)([\s\S ]*?)(Weitere Hinweise</a></li>)
Now I have seen that calibre can apply regular expressions as a preprocessing step while fetching the news. But I have also seen that Python takes a slightly different approach to regex, and I am neither an expert in regex dialects nor a Python programmer. It took me quite a while to figure out the regex above.

Could someone tell me how to use the above regex in the aforementioned recipe so that it is applied during preprocessing?

Thanks,
Sebastian
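
A minimal sketch of how the regex above might translate to Python, offered as an illustration under stated assumptions rather than a tested change to the Weltonline recipe. calibre recipes accept a `preprocess_regexps` list of (compiled pattern, replacement function) pairs; the snippet mimics that shape using only the standard `re` module. The `apply_regexps` helper and the sample HTML are hypothetical and exist only so the snippet runs outside calibre. The `<div[^>]*>` start marker is also an assumption: class names such as `calibre7` are typically generated during conversion, so they are probably absent from the raw page that preprocessing sees.

```python
import re

# With re.DOTALL, "." also matches newlines, so the [\s\S] workaround
# from the Sigil regex is not needed.
FOOTER = re.compile(
    r'<div[^>]*>\s*'                                         # assumed start marker
    r'©\s*Axel Springer AG 2013\. Alle Rechte vorbehalten'   # copyright line
    r'.*?'                                                   # non-greedy: stop at first end marker
    r'Weitere Hinweise</a></li>',                            # end marker from the original regex
    re.DOTALL,
)

# Same shape as a recipe's preprocess_regexps attribute:
# (compiled pattern, function that takes the match and returns the replacement).
preprocess_regexps = [
    (FOOTER, lambda match: ''),
]


def apply_regexps(html, rules):
    """Hypothetical helper: apply each (pattern, function) pair in order."""
    for pattern, repl in rules:
        html = pattern.sub(repl, html)
    return html


# Hypothetical sample page, only for demonstration.
sample = (
    '<p>Article text.</p>\n'
    '<div class="footer">\n'
    '      © Axel Springer AG 2013. Alle Rechte vorbehalten\n'
    '<ul><li><a href="#">Weitere Hinweise</a></li></ul></div>\n'
    '<p>Next article.</p>'
)

cleaned = apply_regexps(sample, preprocess_regexps)
print(cleaned)
```

In a recipe, the `preprocess_regexps` list would sit as a class attribute next to the other recipe settings. The marker strings should be checked against the source of the original web page, since it may differ from the markup in the finished EPUB, and the hard-coded year could be loosened to `\d{4}` if the footer changes annually.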


All times are GMT -4.