Old 11-09-2010, 08:31 AM   #1
Dasun
Junior Member
Posts: 6
Karma: 10
Join Date: Nov 2010
Device: Kobo
Can Calibre Strip HTML links when exporting to epub?

A long-time lurker's first post, so hi to all.

Now, I am trying to convert some (well, lots of!) PDFs to epub format and I need to strip the embedded HTML links. Can Calibre do this?

Thanks for any suggestions....
Old 11-09-2010, 08:45 AM   #2
Manichean
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
So, if I understand you correctly, you want to remove any parts that contain
Code:
<a href="whatever">text</a>
while preserving the text? You could just abuse the header/footer removal for that. For the given example, something like
Code:
<a href[^>]*>|</a>
ought to work. Remember to check your documents' source, though (the magic wand symbol).
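A quick way to try this kind of pattern outside Calibre is Python's `re` module. This is only a minimal sketch: it uses `<a href[^>]*>|</a>` (note the `*`, so the whole opening tag is matched whatever its length), and the sample HTML string and the function name `strip_link_tags` are made up for illustration. Real converted PDFs may have anchors with other attributes before `href`, so check the source first.

```python
import re

# Matches either an opening <a href ...> tag or a closing </a> tag.
# The text between the two tags is not part of any match.
LINK_TAGS = re.compile(r'<a href[^>]*>|</a>')

def strip_link_tags(html: str) -> str:
    """Remove link tags but keep the text they wrapped (hypothetical helper)."""
    return LINK_TAGS.sub('', html)

sample = '<p>See <a href="http://example.com">this page</a> for details.</p>'
result = strip_link_tags(sample)
print(result)  # <p>See this page for details.</p>
```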
Old 11-10-2010, 09:17 AM   #3
Dasun
Junior Member
Posts: 6
Karma: 10
Join Date: Nov 2010
Device: Kobo
Manichean,

Thanks for putting me on the right path....adding

<a href.*</a>

to the header/footer removal sections ripped out all the HTML links and the documents only suffered a little - table of contents is missing in action but I can live with that. As a former software engineer I am surprised I had forgotten how truly ugly regex syntax was until I started doing it again tonight!
Old 11-10-2010, 09:19 AM   #4
kovidgoyal
creator of calibre
Posts: 43,839
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
You could just put

a[href] { display: none }

into the extra css box.
Old 11-10-2010, 10:28 AM   #5
Manichean
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
Quote:
Originally Posted by Dasun
Manichean,

Thanks for putting me on the right path....adding

<a href.*</a>

to the header/footer removal sections ripped out all the HTML links and the documents only suffered a little - table of contents is missing in action but I can live with that. As a former software engineer I am surprised I had forgotten how truly ugly regex syntax was until I started doing it again tonight!
Well, yeah, small wonder the TOC got wiped out... be aware though that your way deletes the text that contained the hyperlink as well. My expression above should preserve that text.
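To make the difference concrete, here is a small Python sketch comparing the two expressions on a made-up line of HTML (the sample string is an assumption, not taken from any real book). The greedy `.*` runs from the first `<a href` to the last `</a>` on the line, so everything in between goes, which is also how a table of contents made of links disappears.

```python
import re

# Hypothetical TOC-like line with two links.
html = '<p><a href="#ch1">Chapter 1</a> and <a href="#ch2">Chapter 2</a></p>'

# Greedy version: removes the tags AND everything between first and last tag.
greedy = re.sub(r'<a href.*</a>', '', html)

# Tag-only version: removes just the opening and closing tags.
tags_only = re.sub(r'<a href[^>]*>|</a>', '', html)

print(greedy)     # <p></p>
print(tags_only)  # <p>Chapter 1 and Chapter 2</p>
```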
Old 03-22-2017, 05:27 PM   #6
Dipaksomak
Enthusiast
Posts: 26
Karma: 12092
Join Date: Jan 2010
Device: none
Quote:
Originally Posted by kovidgoyal
You could just put

a[href] { display: none }

into the extra css box.

Lifesaver. Thanks mate.
Old 03-03-2020, 02:47 AM   #7
greenpossum
Junior Member
Posts: 1
Karma: 10
Join Date: Mar 2020
Device: Kindle paperwhite
Quote:
Originally Posted by kovidgoyal
You could just put

a[href] { display: none }

into the extra css box.
Sorry for reviving an old thread but I tried this and unfortunately it strips the text inside the <a> element. After some searching I found this advice solved the problem:

https://stackoverflow.com/questions/...57625#27057625

and the command line enhancement to the conversion command is:

--extra-css 'a[href] { pointer-events: none; cursor: default; }'

Posting so that people can do it correctly.
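For anyone with lots of files, the command line can be assembled from a script. The following is a rough sketch, not a tested recipe: `build_convert_command` and `NO_LINK_CSS` are hypothetical names, and the output name is assumed to be the input name with an `.epub` extension. It applies `pointer-events: none` and `cursor: default` directly to `a[href]`, since CSS cannot assign a class to an element. Whether a given reader honours `pointer-events` is something to verify on the device.

```python
from pathlib import Path

# Assumption: CSS meant to make links inert while leaving their text visible.
NO_LINK_CSS = 'a[href] { pointer-events: none; cursor: default; }'

def build_convert_command(pdf_path: str, extra_css: str = NO_LINK_CSS) -> list[str]:
    """Build an ebook-convert argument list for one PDF (hypothetical helper)."""
    src = Path(pdf_path)
    dst = src.with_suffix('.epub')  # same name, .epub extension
    return ['ebook-convert', str(src), str(dst), '--extra-css', extra_css]

cmd = build_convert_command('book.pdf')
print(cmd)
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` once Calibre's command line tools are on the PATH; passing a list avoids any shell quoting trouble with the CSS string.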
All times are GMT -4.