11-22-2010, 03:04 AM | #1 |
Connoisseur
Posts: 86
Karma: 10
Join Date: Oct 2010
Location: Philippines
Device: DavidAcisAAJ
|
Question: Deleting HTML links to outside sites in a document
I'm constructing a personal e-book containing a collection of HTML pages I gathered from the Internet.
I merged these pages into one large HTML document. The merged document contains many links to outside sites. When I open it in my web browser, I can see at the bottom of the screen a lot of downloading from those various sites. My question is: what is the easiest way to remove all of those links to the outside? One way I found is to use an editor and simply delete "http://" wherever it occurs in the HTML document. Is that the best way? Thanks, Nicholas Kormanik |
11-22-2010, 03:06 AM | #2 |
Connoisseur
Posts: 86
Karma: 10
Join Date: Oct 2010
Location: Philippines
Device: DavidAcisAAJ
|
I was hoping that Calibre would have some auto-stripping function for HTML import that would eliminate such links automatically.
|
11-22-2010, 04:35 AM | #3 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
If you want to remove anything from your document, you can always use the header/footer removal regexes in Structure Detection.
|
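As an illustration of the regex approach suggested above, here is a minimal sketch of a pattern that unwraps external links while keeping their text. It is written against Python's `re` module so it can be tried outside calibre; the names `EXTERNAL_LINK` and `unwrap_external_links` are made up for this example, and whether a given calibre version accepts the same pattern in its removal fields is an assumption to verify. Regexes are brittle on messy HTML (nested tags, unquoted attributes), so treat this as a starting point only.

```python
import re

# Matches <a ...> whose href starts with http:// or https://, captures the
# link text, and matches the closing </a>. Local links such as href="#ch2"
# or href="page2.html" do not match and are left alone.
EXTERNAL_LINK = re.compile(
    r'<a\b[^>]*\bhref\s*=\s*["\']https?://[^"\']*["\'][^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)


def unwrap_external_links(html: str) -> str:
    """Replace each external link with just its inner text."""
    return EXTERNAL_LINK.sub(r"\1", html)


sample = (
    '<p>See <a href="http://example.com/x">this page</a> '
    'and <a href="#ch2">chapter 2</a>.</p>'
)
print(unwrap_external_links(sample))
```

Unlike deleting every "http://" string, this leaves the visible link text in place and does not turn outside addresses into broken local paths.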
11-22-2010, 12:06 PM | #4 |
creator of calibre
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Just convert your HTML file. Links pointing to non-local resources are not followed by calibre's conversion engine anyway.
|
11-22-2010, 08:51 PM | #5 |
Connoisseur
Posts: 86
Karma: 10
Join Date: Oct 2010
Location: Philippines
Device: DavidAcisAAJ
|
Glad to hear Calibre strips the links.
The problem, however, is that if there are lots of links, Calibre takes a long time to convert the file. It seems better to delete them up front, from the HTML document, before bringing it into Calibre. |
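For anyone who wants to clean the file before importing it, here is a rough sketch using only Python's standard `html.parser`. It assumes "outside" means a URL starting with `http://`, `https://`, or `//`; it unwraps such `<a>` links (the text stays) and drops `<img>` tags with an outside `src`. The class and function names are hypothetical, and other tags that can trigger downloads (`<script>`, `<link>`, `<iframe>`) are deliberately not handled here.

```python
from html.parser import HTMLParser

# Assumption: these prefixes are what counts as an "outside" URL.
EXTERNAL_PREFIXES = ("http://", "https://", "//")


def is_external(url):
    return bool(url) and url.strip().lower().startswith(EXTERNAL_PREFIXES)


class ExternalLinkStripper(HTMLParser):
    """Re-emits the HTML it is fed, minus external links and images."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &amp; untouched.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.anchor_dropped = []  # one flag per currently open <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            drop = is_external(attrs.get("href"))
            self.anchor_dropped.append(drop)
            if drop:
                return  # skip the opening tag, keep the link text
        elif tag == "img" and is_external(attrs.get("src")):
            return  # skip the whole image
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing form, e.g. <img ... />.
        attrs = dict(attrs)
        if tag == "img" and is_external(attrs.get("src")):
            return
        if tag == "a" and is_external(attrs.get("href")):
            return
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "a" and self.anchor_dropped and self.anchor_dropped.pop():
            return  # closing tag of an <a> that was dropped
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

    def handle_pi(self, data):
        self.out.append(f"<?{data}>")


def strip_external_links(html):
    parser = ExternalLinkStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


sample = (
    '<p>Read <a href="http://example.com/a">this</a>, then '
    '<a href="#ch2">chapter 2</a>.<img src="http://example.com/ad.png">'
    '<img src="cover.png"> A &amp; B</p>'
)
print(strip_external_links(sample))
```

To use it on a real file, read the merged document into a string, pass it to `strip_external_links`, and write the result to a new file, keeping the original as a backup until the output has been checked in a browser.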
11-22-2010, 09:05 PM | #6 |
Connoisseur
Posts: 86
Karma: 10
Join Date: Oct 2010
Location: Philippines
Device: DavidAcisAAJ
|
Things are rough enough for Calibre to handle. Cut the junk first.
|
11-22-2010, 09:07 PM | #7 |
Connoisseur
Posts: 86
Karma: 10
Join Date: Oct 2010
Location: Philippines
Device: DavidAcisAAJ
|
Converting to TXT is not the answer, though, as too much formatting is lost.
|
Tags |
eliminate links |
|