Old 11-22-2010, 03:04 AM   #1
nkormanik
Connoisseur
nkormanik began at the beginning.
 
Posts: 86
Karma: 10
Join Date: Oct 2010
Location: Philippines
Device: DavidAcisAAJ
Question: Deleting html links to sites in document

I'm constructing a personal e-book containing a collection of HTML pages I gathered from the Internet.

I merged these pages into one large HTML document. The merged document contains many links to outside sites. When I open it in my web browser, I can see at the bottom of the screen a lot of downloading from those various links/sites.

My question is: What's the easiest way of removing or eliminating all of those links to the outside?

One way I found is to open the document in an editor and simply remove "http://" wherever it occurs.

Is that the best way?

Thanks,
Nicholas Kormanik
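
For reference, here is a minimal sketch of what the plain "remove http://" edit does to a document. The HTML fragment and the file location are made-up examples, not taken from the actual e-book. It suggests two things to watch for: "https://" links are left untouched, and a URL that loses its scheme turns into a relative path next to the document instead of disappearing.

```python
from urllib.parse import urljoin

# Hypothetical fragment of the merged document (assumption, for illustration only).
html = (
    '<img src="http://example.com/pic.png">'
    '<a href="https://example.org/page">more</a>'
)

# The editor approach: delete every occurrence of "http://".
stripped = html.replace("http://", "")
print(stripped)

# Hypothetical location of the merged file on disk (assumption).
base = "file:///home/user/book.html"

# A scheme-less value is now resolved relative to the document itself.
resolved = urljoin(base, "example.com/pic.png")
print(resolved)
```

So the browser would likely stop fetching that image remotely, but the tag itself stays in the document and now points at a local file that does not exist.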
Old 11-22-2010, 03:06 AM   #2
nkormanik
I was hoping Calibre would have an auto-stripping function for HTML import that would eliminate such links automatically.
Old 11-22-2010, 04:35 AM   #3
Manichean
Wizard
Manichean is the 'tall, dark, handsome stranger' all the fortune-tellers are referring to.
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
If you want to remove arbitrary content from your document, you can always use the header/footer removal regexes in structure detection.
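
As a rough sketch of the kind of pattern that could go into such a removal regex, the snippet below tries one out with Python's `re` module. The pattern and the sample HTML are assumptions for illustration; it only targets `<img>` tags with a double-quoted external `src`, and regexes on HTML are fragile in general, so it is worth testing on a copy of the real file first.

```python
import re

# Illustrative pattern (assumption): an <img> tag whose src starts with http:// or https://.
EXTERNAL_IMG = r'<img\b[^>]*\bsrc\s*=\s*"https?://[^"]*"[^>]*>'

# Made-up sample markup with one external and one local image.
sample = (
    '<p>Intro <img src="http://example.com/a.png" alt="a"> text '
    '<img src="local.png"></p>'
)

# Remove every match; the local image is not matched and stays in place.
cleaned = re.sub(EXTERNAL_IMG, "", sample, flags=re.IGNORECASE)
print(cleaned)
```

Removing only an opening `<a ...>` tag this way would leave a dangling `</a>` behind, so a pattern like this suits self-contained tags better than links.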
Old 11-22-2010, 12:06 PM   #4
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Just convert your HTML file; links pointing to non-local resources are not followed by calibre's conversion engine anyway.
Old 11-22-2010, 08:51 PM   #5
nkormanik
Glad to hear Calibre strips the links.

The problem, however, is that if there are lots of links, Calibre takes a long time to convert the file. It seems better to delete them up front from the HTML document, before bringing it into Calibre.
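
One possible way to do that cleanup up front is sketched below, using only Python's standard `html.parser`. This is not a Calibre feature; the class name `ExternalLinkStripper`, the helper `strip_external`, and the sample markup are all made up for illustration. It assumes the goal is to keep the link text, unwrap `<a>` tags that point to outside sites, and drop `<img>` and `<link>` tags that load outside resources. It does not touch `<script>`, `<iframe>`, or URLs inside CSS.

```python
from html.parser import HTMLParser

EXTERNAL = ("http://", "https://", "//")
# Tags without a closing tag that can trigger a download (assumed subset).
VOID_RESOURCE_TAGS = {"img", "link"}


class ExternalLinkStripper(HTMLParser):
    """Re-emit the HTML, minus external anchors and external img/link tags."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.anchor_dropped = []  # one flag per open <a>: True if it was dropped

    def _is_external(self, attrs, name):
        value = dict(attrs).get(name) or ""
        return value.strip().lower().startswith(EXTERNAL)

    def _drop(self, tag, attrs):
        if tag == "a":
            return self._is_external(attrs, "href")
        if tag in VOID_RESOURCE_TAGS:
            return self._is_external(attrs, "src") or self._is_external(attrs, "href")
        return False

    def handle_starttag(self, tag, attrs):
        drop = self._drop(tag, attrs)
        if tag == "a":
            self.anchor_dropped.append(drop)
        if not drop:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <img ... />
        if not self._drop(tag, attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        # Skip the </a> that belongs to an anchor whose start tag was dropped.
        if tag == "a" and self.anchor_dropped and self.anchor_dropped.pop():
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def strip_external(html):
    parser = ExternalLinkStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


# Made-up sample: one external link, one internal link, one external and one local image.
sample = (
    '<p>See <a href="http://example.com/x">this page</a> and '
    '<a href="#ch2">chapter 2</a>.<img src="https://example.com/i.png">'
    '<img src="cover.png"> &amp; more</p>'
)
result = strip_external(sample)
print(result)
```

Internal links such as `#ch2` and local images are kept, which preserves the formatting and navigation inside the merged document. Running it on a copy of the file and comparing the result in a browser before converting seems the safer route.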
Old 11-22-2010, 09:05 PM   #6
nkormanik
Things are rough enough for Calibre to handle. Cut the junk first.
Old 11-22-2010, 09:07 PM   #7
nkormanik
Converting to txt is not the answer, though, as too much formatting is given up.

All times are GMT -4.