#1 |
Junior Member
Posts: 1
Karma: 10
Join Date: May 2015
Device: none
|
Calibre - How to erase page? numbers after heuristic processing
The heuristic processing worked great for unifying paragraphs when converting PDF to ePub. However, I am getting numbers at various intervals. Please see the example (24 and 25) below:
nonrealistic view suggested by quantum theory. 24 Einstein protested: “I cannot seriously believe in [the quantum theory] because it cannot be reconciled with the idea that physics should represent a reality in time and space, free from spooky actions at a distance.” 25 It was in a discussion of the EPR paper that Erwin Schrödinger first coined the term “entanglement.”
Any ideas how to omit these? Thanks.
Last edited by msshain; 05-06-2015 at 02:00 PM. Reason: Improve title
#2 |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
Sounds like page numbers. You will need to add a regex under Search and Replace to get rid of those.
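For the two numbers in the example post, a regex along these lines might do it. This is only a Python sketch run outside calibre, and it assumes the stray number always sits between the end of one sentence and the capitalised start of the next; `strip_inline_numbers` and `STRAY_NUMBER` are made-up names for illustration.

```python
import re

# Assumption: the stray number sits between the end of one sentence and the
# capitalised start of the next, as in the "24" and "25" of the example post.
STRAY_NUMBER = re.compile(r'(?<=[.!?”"])\s+\d{1,3}\s+(?=[A-Z“"])')


def strip_inline_numbers(text: str) -> str:
    """Replace each stray number (and the spaces around it) with one space."""
    return STRAY_NUMBER.sub(" ", text)


sample = (
    "nonrealistic view suggested by quantum theory. 24 Einstein protested: "
    "“I cannot seriously believe in [the quantum theory].” 25 It was in a "
    "discussion of the EPR paper"
)
print(strip_inline_numbers(sample))
```

A pattern this narrow will miss numbers that land mid-sentence, and it can still hit a legitimate number that happens to start a sentence, so check the matches before trusting it.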
#3 | |
Well trained by Cats
Posts: 30,932
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
This is a slightly tedious EDITOR job, not a conversion job. A regex in a conversion expects a FIXED pattern to the page number's appearance, for example:
Long Winded 56 57 Short Story
Long Winded 103 104 Short Story
When it is (semi) random, you need to step through each find. There will be many patterns to find, and you create a unique regex for each pattern you discover.
BTW, this is probably a case to NOT have Heuristics clean up. The page pattern might have been easier to discover before the attempt to join lines.
Every PDF is unique in the issues it presents (see the sticky about PDF).
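When the pattern really is fixed, like the running header in the example, one regex covers every occurrence. A rough Python sketch, assuming the header always reads "Long Winded", two page numbers, "Short Story" (the titles come from the example; `strip_headers` is a hypothetical helper name):

```python
import re

# Hypothetical fixed pattern taken from the example:
# "<left title> <page> <page> <right title>", with any surrounding spaces.
HEADER = re.compile(r"\s*Long Winded \d+ \d+ Short Story\s*")


def strip_headers(text: str) -> str:
    """Replace every running header with a single space."""
    return HEADER.sub(" ", text).strip()


sample = (
    "the cat sat Long Winded 56 57 Short Story on the mat and slept "
    "Long Winded 103 104 Short Story all day"
)
print(strip_headers(sample))
```

Each new pattern you discover in the book would need its own expression like `HEADER`.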
#4 |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
theducks, if you use S&R in the conversion settings, it operates before line unwrapping. Handy.
Of course, you lose the ability to step through each match and confirm. There are advantages either way.
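To illustrate why running before unwrapping helps: if the page number is still alone on its own line at that stage, a line-anchored pattern can catch it without touching numbers inside sentences. This Python sketch uses plain text with newlines as a stand-in; the real intermediate data in calibre is markup, so treat the input format here as an assumption.

```python
import re

# Assumption: before unwrapping, a page number occupies a whole line by itself.
# [ \t] is used instead of \s so the match never spills onto neighbouring lines.
PAGE_LINE = re.compile(r"^[ \t]*\d{1,4}[ \t]*\n", re.MULTILINE)

raw = "suggested by quantum\ntheory.\n24\nEinstein protested\n"
cleaned = PAGE_LINE.sub("", raw)
print(cleaned)
```

After the lines are joined, the same "24" would be surrounded by ordinary words and much harder to tell apart from real content.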
#6 |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
You do actually get to preview the parsed XHTML that is extracted from the PDF; it is part of the S&R wizard. So it isn't as dangerous as it could be.
#7 |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
Zapping numbers without manual checks is bad for dates, and guns, and times.
E.g. He was shot in '66 with a Colt 45. At 11am we called 911 but...
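One way to keep the manual check is to list every candidate with some context instead of deleting blindly. A small Python sketch of that idea, using a deliberately naive "any standalone number" pattern; `candidates` is a hypothetical helper, not anything built into calibre:

```python
import re

# Naive pattern: any run of digits standing alone as a word.
NUMBER = re.compile(r"\b\d+\b")


def candidates(text: str, width: int = 12) -> list[tuple[str, str]]:
    """Return (number, surrounding context) pairs for manual review."""
    found = []
    for m in NUMBER.finditer(text):
        start = max(0, m.start() - width)
        end = min(len(text), m.end() + width)
        found.append((m.group(), text[start:end]))
    return found


sample = "He was shot in '66 with a colt 45. At 11am We called 911 but"
for number, context in candidates(sample):
    print(f"{number:>5}  ...{context}...")
```

On the sentence above the naive pattern flags the year, the calibre of the gun and the phone number, none of which is a page number, which is the point of reviewing each hit.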
#8 |
null operator (he/him)
Posts: 21,657
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Actually they could be reference link numbers. Is there a numbered reference list at the back of the book, and do the numbers embedded in the text bear any relationship to the reference with the same number?
I suspect those quotes may be from the correspondence between Einstein and Schrödinger on the latter's thought experiments on cats in boxes and all that, and Einstein's statement claiming God doesn't play dice etc. If they are such, then you might want to 'fix' them in the editor by recreating the links.
BR
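If they do turn out to be reference numbers, recreating the links could look roughly like this. It is only a Python sketch: the `#ref24` / `back24` anchor naming is invented for illustration, the set of known reference numbers has to come from the book's own list, and the pattern assumes the number sits between two sentences as in the first post.

```python
import re

# Assumption: a reference number sits between the end of one sentence and the
# capitalised start of the next, as in the example from the first post.
NOTE = re.compile(r'(?<=[.!?”"])\s+(\d{1,3})\s+(?=[A-Z“"])')


def link_notes(text: str, known: set[int]) -> str:
    """Turn stray numbers that match a known reference into superscript links."""
    def repl(m: re.Match) -> str:
        n = int(m.group(1))
        if n not in known:
            return m.group(0)  # not in the reference list: leave untouched
        # Hypothetical anchor names; adjust to whatever ids the book uses.
        return f'<sup><a href="#ref{n}" id="back{n}">{n}</a></sup> '
    return NOTE.sub(repl, text)


sample = "quantum theory. 24 Einstein protested.” 25 It was in a discussion"
print(link_notes(sample, {24, 25}))
```

Numbers that are not in the reference list are left as they were, so they can still be reviewed by hand.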