|  11-22-2010, 03:04 AM | #1 | 
| Connoisseur  Posts: 86 Karma: 10 Join Date: Oct 2010 Location: Philippines Device: DavidAcisAAJ | 
				
				Question:  Deleting html links to sites in document
			 
			
			I'm building a personal e-book from a collection of HTML pages I gathered from the Internet. I merged these pages into one large HTML document, and the merged document contains many links to outside sites. When I open it in my web browser, I can see at the bottom of the screen a lot of downloading from those various links/sites. My question is: what's the easiest way to remove all of those links to the outside? One way I found was to open the document in an editor and simply remove "http://" wherever it occurs. Is that the best way? Thanks, Nicholas Kormanik | 
|   |   | 
|  11-22-2010, 03:06 AM | #2 | 
| Connoisseur  Posts: 86 Karma: 10 Join Date: Oct 2010 Location: Philippines Device: DavidAcisAAJ | 
			
			I was hoping that Calibre would have some auto-stripping function for HTML import that would eliminate such links automatically.
		 | 
|   |   | 
|  11-22-2010, 04:35 AM | #3 | 
| Wizard            Posts: 3,130 Karma: 91256 Join Date: Feb 2008 Location: Germany Device: Cybook Gen3 | 
			
			If you want to remove arbitrary content from your document, you can always use the header/footer removal regexes in structure detection.
		 | 
|   |   | 
|  11-22-2010, 12:06 PM | #4 | 
| creator of calibre            Posts: 45,595 Karma: 28548962 Join Date: Oct 2006 Location: Mumbai, India Device: Various | 
			
			Just convert your HTML file; links pointing to non-local resources are not followed by calibre's conversion engine anyway.
		 | 
|   |   | 
|  11-22-2010, 08:51 PM | #5 | 
| Connoisseur  Posts: 86 Karma: 10 Join Date: Oct 2010 Location: Philippines Device: DavidAcisAAJ | 
			
			Glad to hear Calibre strips the links. The problem, however, is that if there are lots of links, Calibre takes a long time to convert the file. It seems better to delete them up front, from the HTML document, before bringing it into Calibre. | 
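
For anyone wanting to clean the file up front, here is a minimal Python sketch of one way it might be done. It is not a calibre feature, and the function name `strip_external` is made up for illustration. It assumes reasonably tidy HTML: it unwraps `<a>` tags whose `href` is an absolute or protocol-relative URL (keeping the link text), and drops `<img>`, `<link>`, `<script>` and `<iframe>` tags that load from outside, since those embedded resources, rather than the clickable links, are the likely cause of the downloading seen in the browser. Regexes are fragile on messy markup, so check the result on a copy of the file.

```python
import re

# An absolute http(s) URL or a protocol-relative one ("//host/...").
EXTERNAL = r'(?:https?:)?//[^"\'\s>]+'

# <a ... href="http://..."> text </a>: group 1 captures the link text.
ANCHOR = re.compile(
    r'<a\b[^>]*\bhref\s*=\s*["\']?' + EXTERNAL + r'["\']?[^>]*>(.*?)</a\s*>',
    re.IGNORECASE | re.DOTALL,
)

# Tags without a closing tag that fetch a remote resource.
VOID_RESOURCE = re.compile(
    r'<(?:img|link)\b[^>]*\b(?:src|href)\s*=\s*["\']?' + EXTERNAL + r'[^>]*>',
    re.IGNORECASE,
)

# Paired tags that fetch a remote resource; removed together with their body.
PAIRED_RESOURCE = re.compile(
    r'<(script|iframe)\b[^>]*\bsrc\s*=\s*["\']?' + EXTERNAL + r'[^>]*>.*?</\1\s*>',
    re.IGNORECASE | re.DOTALL,
)


def strip_external(html: str) -> str:
    """Return html with external resources removed and external links unwrapped."""
    html = PAIRED_RESOURCE.sub('', html)   # remote scripts and iframes
    html = VOID_RESOURCE.sub('', html)     # remote images and stylesheets
    return ANCHOR.sub(r'\1', html)         # keep link text, drop the <a> wrapper
```

To use it, read the merged document into a string, pass it through `strip_external`, and write the result to a new file before importing that file into Calibre. Local links such as `<a href="#ch2">` and local images are left alone, which is the advantage over deleting every "http://" by hand: that approach leaves broken tags behind instead of removing them.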
|   |   | 
|  11-22-2010, 09:05 PM | #6 | 
| Connoisseur  Posts: 86 Karma: 10 Join Date: Oct 2010 Location: Philippines Device: DavidAcisAAJ | 
			
			Things are rough enough for Calibre to handle.  Cut the junk first.
		 | 
|   |   | 
|  11-22-2010, 09:07 PM | #7 | 
| Connoisseur  Posts: 86 Karma: 10 Join Date: Oct 2010 Location: Philippines Device: DavidAcisAAJ | 
			
			Converting to txt is not the answer, though, as too much formatting is given up.
		 | 
|   |   | 