04-26-2016, 12:22 PM   #1
chaot
Head of lunatic asylum
Posts: 349
Karma: 77620
Join Date: Jun 2012
Location: UTC +1
Device: Tolino Vision 3HD
Delete paragraphs in scanned books (S & R with regexes)



In scanned books, the viewer or e-reader often shows unwanted paragraph breaks at the places where the printed page numbers were.

The patterns look different from book to book, but within a book they occur en masse, so removing them with S & R and regexes would be a big help.
The parts marked in red in the screenshots should be deleted. Note that the page numbers are different every time (of course).

Where are our great regex masters!?

Some examples from different books:

[Image: Example 1.png]
Example 1
Code:
keine Anzeichen für körperliche Mängel zu erkennen. </p>

  <p class="calibre2">Normal? Der US-Geheimdienst OSS (Office of Strategic 169</p>

  <p class="calibre2"></p>

  <p class="calibre2">Studies, Vorläufer der CIA), oder genauer, der von ihm
[Image: Example 2.png]
Example 2 (note the hyphen, which should be deleted too)
Code:
derartigen Mangel hingewiesen hätten, aber die ärztlichen Feststel-170</p>

  <p class="calibre2"></p>

  <p class="calibre2">lungen lauteten nach dem Krieg nicht anders als
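
For Examples 1 and 2, one possible starting point could look like the sketch below. It is written in Python only for illustration (the post does not name a tool), the function name `join_across_page_break` is made up, and it assumes the paragraph class is always `calibre2` as in the snippets. It would also hit a genuine paragraph that ends in a number and is followed by an empty paragraph, and it joins every hyphenated word, including real compounds, so the matches should be reviewed before replacing.

```python
import re

# Assumption: the paragraph class is always "calibre2", as in the examples.
# Matches the page number at the end of a paragraph, the empty paragraph
# that follows it, and the opening tag of the continuation paragraph.
# Group 1 is set when a hyphen sits directly before the number (Example 2).
PAGE_BREAK = re.compile(
    r'(?:(-)|\s+)\d+\s*</p>\s*'
    r'<p class="calibre2">\s*</p>\s*'
    r'<p class="calibre2">'
)

def join_across_page_break(html):
    """Join the two halves of a paragraph split by a trailing page number."""
    # A hyphenated word is glued back together; otherwise one space remains.
    return PAGE_BREAK.sub(lambda m: "" if m.group(1) else " ", html)

example_1 = (
    'Office of Strategic 169</p>\n\n'
    '  <p class="calibre2"></p>\n\n'
    '  <p class="calibre2">Studies, '
)
example_2 = (
    'Feststel-170</p>\n\n'
    '  <p class="calibre2"></p>\n\n'
    '  <p class="calibre2">lungen lauteten'
)
print(join_across_page_break(example_1))
print(join_across_page_break(example_2))
```

The replacement is a function rather than a fixed string because the two cases need different results: a single space in Example 1, nothing at all in Example 2.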
[Image: Example 3.png]
Example 3
Code:
die natürlich ihre Blöße nicht deckten, denn es war </p>

  <p class="calibre2">17</p>

  <p class="calibre2"></p>

  <p class="calibre2">keiner anwesend (außer mir), der nicht mindestens seine
[Image: Example 3a.png]
Example 3a
Code:
das viel zu herb und zu modisch für sie ist, irgendein <b class="calibre3">19</b></p>

  <p class="calibre2"></p>

  <p class="calibre2">Zeug, das, glaube ich, Taiga heißt, noch in der Wohnung
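
Examples 3 and 3a differ only in where the number sits: in a paragraph of its own, or wrapped in a bold tag at the end of the previous paragraph. A rough Python sketch with one pattern per case follows. The names `NUMBER_OWN_P`, `NUMBER_BOLD` and `strip_page_number_markup` are invented for illustration, and the class names `calibre2` and `calibre3` are taken from the snippets and will probably differ in other books.

```python
import re

# Assumption: class names "calibre2" / "calibre3" as in the examples.
EMPTY_P = r'<p class="calibre2">\s*</p>\s*'
NEXT_P = r'<p class="calibre2">'

# Example 3: the page number is alone in its own paragraph.
NUMBER_OWN_P = re.compile(
    r'\s*</p>\s*<p class="calibre2">\s*\d+\s*</p>\s*' + EMPTY_P + NEXT_P
)
# Example 3a: the page number is in bold at the end of the paragraph.
NUMBER_BOLD = re.compile(
    r'\s*<b class="calibre3">\s*\d+\s*</b>\s*</p>\s*' + EMPTY_P + NEXT_P
)

def strip_page_number_markup(html):
    """Remove both page-number variants and leave a single space behind."""
    html = NUMBER_OWN_P.sub(" ", html)
    return NUMBER_BOLD.sub(" ", html)

example_3 = (
    'denn es war </p>\n\n'
    '  <p class="calibre2">17</p>\n\n'
    '  <p class="calibre2"></p>\n\n'
    '  <p class="calibre2">keiner anwesend'
)
example_3a = (
    'irgendein <b class="calibre3">19</b></p>\n\n'
    '  <p class="calibre2"></p>\n\n'
    '  <p class="calibre2">Zeug, das'
)
print(strip_page_number_markup(example_3))
print(strip_page_number_markup(example_3a))
```

Both patterns swallow the whitespace in front of the number and put back exactly one space, so the joined sentence does not end up with a double space.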
[Image: Example 4.png]
Example 4 (note the Roman rather than Arabic numerals!)
Code:
bewundernden Kommentare von westlichen Besuchern in Maos China, XVI </p>

  <p class="calibre2"></p>

  <p class="calibre2">dass Chinesen außerordentliche Menschen seien, die es
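
For the Roman numerals in Example 4, a plain character class such as `[IVXLCDM]+` would also catch ordinary capitalised fragments, so the sketch below uses the common stricter pattern for well-formed numerals up to 3999. It is again Python for illustration only, with invented names (`ROMAN`, `strip_roman_page_numbers`) and the `calibre2` class assumed from the snippet. A paragraph that really ends in something like "I" or "MIX" would still match, so this is a candidate for checking each hit by hand.

```python
import re

# Well-formed upper-case Roman numeral (1 to 3999); the lookahead keeps
# the pattern from matching the empty string.
ROMAN = (
    r'(?=[IVXLCDM])M{0,3}(?:CM|CD|D?C{0,3})'
    r'(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})'
)

# Assumption: the paragraph class is always "calibre2", as in the example.
ROMAN_PAGE_BREAK = re.compile(
    r'\s+' + ROMAN + r'\s*</p>\s*'
    r'<p class="calibre2">\s*</p>\s*'
    r'<p class="calibre2">'
)

def strip_roman_page_numbers(html):
    """Remove a trailing Roman page number plus the empty paragraph after it."""
    return ROMAN_PAGE_BREAK.sub(" ", html)

example_4 = (
    'in Maos China, XVI </p>\n\n'
    '  <p class="calibre2"></p>\n\n'
    '  <p class="calibre2">dass Chinesen'
)
print(strip_roman_page_numbers(example_4))
```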
[Image: Example 5.png]
Example 5
Code:
ihr Büro war für die [306] Sicherheit eines Parkabschnitts zuständig.
Interna: Ex1&2 Bedürftig (AHdAb), Ex3&3a Böll (AeC), Ex4&5 Chang (WS)
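
Example 5 is the simplest case, since the page number sits in square brackets in the middle of the running text and no tags are involved. A minimal Python sketch (the function name `strip_inline_page_numbers` is made up) might be the following. It assumes that every bracketed number in the book is a page number; if the book also uses `[12]`-style footnote references, those would be removed as well.

```python
import re

# A bracketed Arabic number together with the whitespace around it.
INLINE_PAGE_NUMBER = re.compile(r'\s*\[\d+\]\s*')

def strip_inline_page_numbers(text):
    """Replace ' [306] ' style page markers with a single space."""
    return INLINE_PAGE_NUMBER.sub(" ", text)

print(strip_inline_page_numbers('die [306] Sicherheit eines Parkabschnitts'))
```

Because the replacement is always one space, a marker at the very start or end of a paragraph leaves a stray space behind, and a marker inside a hyphenated word would need its own rule.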

Last edited by chaot; 06-02-2016 at 02:27 PM. Reason: add Interna, Example 3a