Old 07-28-2012, 06:47 AM   #1
worley
Need Help with Search and Replace please!

Hello guys,

Forgive my ignorance, but I need someone's help, please. I don't know anything about HTML. I'm trying to remove the headers from a PDF ebook. I used the following expression to remove the header from page 6, which is the page where the headers begin.

<br> <hr> <A name=6></a>6 <b>•</b> J o ã o G u i m a r ã e s R o s a<br>

The expression above obviously found only one match in the file: the header in the middle of the following passage (marked in red in the original post):

Utilizamos ainda outras edi-<br>ções tanto para corrigir variações indevidas como para insistir<br>em outras. Essas grafias em desuso podem parecer simplesmen-<br>te uma questão de atualização ortográfica, mas, se essa atualiza-<br>ção já era exigida pela norma quando da publicação dos livros e<br> <hr> <A name=6></a>6 <b>•</b> J o ã o G u i m a r ã e s R o s a<br>de suas várias edições durante a vida do autor, partimos do prin-<br>cípio de que elas são provavelmente intencionais e devem, por-<br>tanto, ser mantidas. Para justificar essa decisão, lembramos aos<br>leitores que as antigas edições da obra de Guimarães Rosa apre-<br>sentavam uma nota alertando justamente para a grafia persona-<br>líssima do autor e que algumas histórias registram a sua teimosia<br>em acentuar determinadas palavras.

How on earth do I make it match every page? There are 608 pages. I'm sure it should be easy, but I become dyslexic when dealing with HTML. Again, I would appreciate someone's help! Thanks!
Old 07-28-2012, 07:08 AM   #2
theducks
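The '6' is probably a page number, and only one page is numbered 6, so the literal expression can match only once.

The regex pattern for any run of one or more consecutive digits is \d+, which lets the expression match every page number: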
Code:
<br> <hr> <A name=\d+></a>\d+ <b>•</b> J o ã o G u i m a r ã e s R o s a<br>
What looks odd to me is this part: <A name=6>. The value after the = should normally be in quotes, and, to be valid in an EPUB, it should start with a letter (it can't be just numbers).
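To try the pattern above outside calibre, here is a minimal Python sketch (calibre's search and replace is built on Python regular expressions, so the syntax should carry over). The function name `strip_headers` and the sample string are made up for illustration, and the second header (page 8) is invented to show more than one match. Replacing each header with a single `<br>` instead of nothing is my own assumption: the header sits between two `<br>` tags, so deleting it outright would glue the words on either side together.

```python
import re

# Pattern from the reply above: \d+ stands in for the page number.
HEADER = re.compile(
    r"<br> <hr> <A name=\d+></a>\d+ <b>•</b> "
    r"J o ã o G u i m a r ã e s R o s a<br>"
)


def strip_headers(html: str, replacement: str = "<br>") -> str:
    """Replace every running header with `replacement`.

    The default keeps one <br> so the surrounding lines stay separated
    (an assumption, not something the original post asked for).
    """
    return HEADER.sub(replacement, html)


# Hypothetical sample: the page-6 header from the post plus an invented page-8 one.
sample = (
    "dos livros e<br> <hr> <A name=6></a>6 <b>•</b> "
    "J o ã o G u i m a r ã e s R o s a<br>de suas várias<br> <hr> "
    "<A name=8></a>8 <b>•</b> J o ã o G u i m a r ã e s R o s a<br>edições"
)

cleaned = strip_headers(sample)
print(cleaned)
```

In calibre itself the equivalent would be to paste the pattern into the search field and put `<br>` (or nothing) in the replacement field. Pages whose header text differs from this one would need their own expression.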
Old 07-28-2012, 09:29 AM   #3
Gunnerp245
Quote:
Originally Posted by worley
... How on earth do I make it match every page, there are 608 pages? I'm sure it should be easy, ...
This post is an excellent introduction: An introduction to regular expressions
