07-06-2013, 04:09 PM | #1 |
Groupie
Posts: 164
Karma: 31650
Join Date: May 2011
Location: Asuncion (Paraguay)
Device: Several Kindle 3 KB's
|
Removing links from text (Solved)
I am editing a text that was originally a web page and contains various links to other web pages.
Is there a simple way to remove the links without losing the rest of the page formatting? Last edited by rolgiati; 07-12-2013 at 08:36 AM. Reason: Solved |
07-06-2013, 04:35 PM | #2 |
mostly an observer
Posts: 1,515
Karma: 987654
Join Date: Dec 2012
Device: Kindle
|
Well, if it's only "various" links, it shouldn't take much effort simply to delete each opening <a href="xxxx.htm"> tag and its closing </a> tag in Code View.
Indeed, find & replace would wipe out every </a> in one go. |
07-06-2013, 05:50 PM | #3 |
Color me gone
Posts: 2,089
Karma: 1445295
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
|
You would have to be careful not to remove footnotes, which are links as well. To target the web links, you would search for http or www.
|
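Before deleting anything, it can help to list which links point to the web and which are internal (footnotes, for instance). A minimal Python sketch of that check follows; the sample HTML and its hrefs are made up for illustration, and it assumes every href is written in double quotes.

```python
import re

# Hypothetical sample text; the hrefs are invented for illustration.
html = ('<p>Read <a href="http://www.example.com/page">this page</a>'
        '<a href="#fn1">[1]</a> or <a href="www.example.org">that one</a>.</p>')

# Collect the href value of every anchor tag (double-quoted hrefs assumed).
hrefs = re.findall(r'<a\b[^>]*\bhref="([^"]*)"', html)

# Links starting with http or www. are treated as external; the rest as internal.
external = [h for h in hrefs if h.startswith(('http', 'www.'))]
internal = [h for h in hrefs if h not in external]

print(external)
print(internal)
```

Anything that lands in the internal list is a candidate to keep, so a removal pattern should be written so that it does not match those anchors.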
07-07-2013, 12:57 AM | #4 |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
Find <a .*</a>
Replace with nothing. This takes out all links, assuming there is nothing of value inside the anchor tags. Be careful, because I've had instances where chapter names/numbers are also inside those tags, and then a more complex expression is needed to preserve them. |
07-07-2013, 03:29 AM | #5 | |
Zealot
Posts: 114
Karma: 5246
Join Date: Jul 2010
Device: none
|
Quote:
Use <a .*?</a> to make the matching non-greedy.
@rolgiati: it always helps if you post a snippet of the HTML code you want to modify. |
|
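The difference between the greedy and non-greedy patterns shows up as soon as one line holds two links. The following is a small Python sketch, using Python's re module as a stand-in for the editor's search; the sample line is hypothetical.

```python
import re

# Hypothetical line containing two links.
html = ('<p>See <a href="http://example.com/a">first</a> and '
        '<a href="http://example.com/b">second</a> here.</p>')

# Greedy: .* runs to the LAST </a> on the line, so the text between the links goes too.
greedy = re.sub(r'<a .*</a>', '', html)

# Non-greedy: .*? stops at the FIRST </a>, so each link is removed separately.
lazy = re.sub(r'<a .*?</a>', '', html)

print(greedy)
print(lazy)
```

The greedy version also deletes the word "and" that sits between the two links, while the non-greedy version leaves it in place. Both versions delete the link text itself.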
07-07-2013, 03:50 AM | #6 |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
Good catch, but I think it worked OK for me anyway.
I often strip these because otherwise the chapter names/numbers appear blue and underlined, like hyperlinks, and I prefer plain black. My Sony reader insists on underlining anything in <a> tags, irrespective of styling. Sometimes there's a horrendous collection of spans and styles all on the same chapter header line, especially if the opening letter is styled differently, so it can get quite messy. |
07-12-2013, 06:38 AM | #7 | |
Groupie
Posts: 164
Karma: 31650
Join Date: May 2011
Location: Asuncion (Paraguay)
Device: Several Kindle 3 KB's
|
Quote:
looks like
<a class="pcalibre" href="http://books.google.com/books?id=98PYVYGFCIIC&pg=PA54&lpg=PA54& ;dq=dimethylcadmium&source=bl&ots=oK-NDQqout&sig=wDDF0qwknz3G-twxMl_6B8Di5lk&hl=en&sa=X&ei=IhSDUdyMD NW-4AO51oCYAw&ved=0CDMQ6AEwATgU#v=onepage&q=d imethylcadmium&f=false">extremely tedious work</a>, which allows you
In this example I would like to be able to remove everything from <a class to false">, and the </a>, without losing extremely tedious work. |
|
07-12-2013, 06:57 AM | #8 |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
Find <a class="pcalibre" href.*?>
Replace with <a class="pcalibre" >
The .*? should match all text up to the first >. Test it first! |
07-12-2013, 07:46 AM | #9 |
Zealot
Posts: 114
Karma: 5246
Join Date: Jul 2010
Device: none
|
Find:
<a class="pcalibre" href="http://books\.google\.com.+?>(.+?)</a>
Replace with:
\1 |
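To see what the capture group does, here is a minimal Python sketch of the same find and replace. It uses Python's re module rather than the editor's own search, and the URL is a shortened, hypothetical stand-in for the long Google Books link.

```python
import re

# Shortened, hypothetical stand-in for the long Google Books URL in the thread.
html = ('looks like <a class="pcalibre" href="http://books.google.com/books?id=EXAMPLE">'
        'extremely tedious work</a>, which allows you')

# .+?> lazily consumes the rest of the opening tag; (.+?) captures the link text.
pattern = r'<a class="pcalibre" href="http://books\.google\.com.+?>(.+?)</a>'

# \1 in the replacement puts the captured link text back in place of the whole link.
result = re.sub(pattern, r'\1', html)

print(result)
```

The whole anchor, opening tag and closing tag included, is swapped for the text it wrapped, so the surrounding sentence reads as before but without the link.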
07-12-2013, 07:55 AM | #10 |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
the correct answer depends on whether he still wants to apply class="pcalibre" to the text
|
07-12-2013, 08:35 AM | #11 |
Groupie
Posts: 164
Karma: 31650
Join Date: May 2011
Location: Asuncion (Paraguay)
Device: Several Kindle 3 KB's
|
My thanks for the suggestions.
I finally got the result by replacing <a class="pcalibre" href.*?> with nothing, then </a> with nothing.
The \1 replacement did not work: it put /1 in place of the link text! Possibly the find and replace does not understand a regex in the replacement text. |
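The two-step removal can be sketched in Python as below, again with a shortened, hypothetical URL. The last lines show one possible explanation for the /1 result, which the thread itself does not confirm: a replacement typed with a forward slash is ordinary text, not a reference to the captured group.

```python
import re

# Shortened, hypothetical stand-in for the long Google Books URL in the thread.
html = ('looks like <a class="pcalibre" href="http://books.google.com/books?id=EXAMPLE">'
        'extremely tedious work</a>, which allows you')

# Step 1: remove the opening tag. Step 2: remove the closing tag.
step1 = re.sub(r'<a class="pcalibre" href.*?>', '', html)
two_step = step1.replace('</a>', '')

# Assumption for illustration: the replacement was typed as /1 instead of \1.
pattern = r'<a class="pcalibre" href="http://books\.google\.com.+?>(.+?)</a>'
with_slash = re.sub(pattern, '/1', html)

print(two_step)
print(with_slash)
```

Note that the second step removes every </a> in the text, so any link left in on purpose, such as a footnote, would lose its closing tag.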
07-12-2013, 08:48 AM | #12 |
Groupie
Posts: 164
Karma: 31650
Join Date: May 2011
Location: Asuncion (Paraguay)
Device: Several Kindle 3 KB's
|
|
07-12-2013, 08:54 AM | #13 | |
Well trained by Cats
Posts: 29,800
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
old:
Code:
<a class="pcalibre" href="http://books.google.com/books?id=98PYVYGFCIIC&pg=PA54&lpg=PA54& ;dq=dimethylcadmium&source=bl&ots=oK-NDQqout&sig=wDDF0qwknz3G-twxMl_6B8Di5lk&hl=en&sa=X&ei=IhSDUdyMD NW-4AO51oCYAw&ved=0CDMQ6AEwATgU#v=onepage&q=d imethylcadmium&f=false">extremely tedious work</a>,
new:
Code:
extremely tedious work |
|
07-12-2013, 11:46 AM | #14 | |
Zealot
Posts: 114
Karma: 5246
Join Date: Jul 2010
Device: none
|
Quote:
: ) |
|
01-18-2015, 07:09 PM | #15 | |
Enthusiast
Posts: 39
Karma: 10
Join Date: Jul 2012
Device: none
|
Quote:
The regex would be to replace <a\x20[^\r\n<>]+http[^\r\n<>]+>([^\r\n<>]*)</a> with $1.
The link would be removed, and the text, if any, would be retained. The only possible problem is if the </a> is missing. If further explanation is wanted by anyone, let me know.
John |
|
|
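A short Python sketch of that last pattern shows which anchors it touches. The three sample strings are hypothetical, and Python's re module writes the captured group as \1 where the post uses $1.

```python
import re

# The pattern from the post above, unchanged.
pattern = re.compile(r'<a\x20[^\r\n<>]+http[^\r\n<>]+>([^\r\n<>]*)</a>')

# Hypothetical sample anchors; the URLs are placeholders.
plain = '<a class="pcalibre" href="http://example.com/x">extremely tedious work</a>'
footnote = '<a href="#fn1">[1]</a>'
nested = '<a href="http://example.com/y"><i>styled text</i></a>'

# Python's re module spells the first capture group \1 rather than $1.
results = [pattern.sub(r'\1', sample) for sample in (plain, footnote, nested)]

for line in results:
    print(line)
```

The plain web link is reduced to its text. The footnote anchor is left alone because its opening tag contains no http, and the third anchor is also left alone because the link text itself contains a tag, which the [^\r\n<>]* group refuses to match.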