#1 |
Connoisseur
Posts: 86
Karma: 10
Join Date: Aug 2013
Device: Kindle Fire HD
|
Stripping Author/Title and Page Number
I'm trying to remove alternating title/author as well as page numbers from the top of the page. (Title|Author) finds some instances of the title, but not all or even most, and doesn't find the author at all. [0-9] removes the numbers I want but also removes all paragraph spacing. Thanks to anyone for the help.
|
#2 |
Bibliophagist
Posts: 45,310
Karma: 168808723
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
It would help if you posted some examples of the author/title/page number strings.
Is there a consistent class associated with them? Have you tried using regex to locate and remove them in calibre's editor (AZW3 or EPUB books)? Are these books that you are converting?
#3 |
Connoisseur
|
I can do that, sorry, wasn't sure what info you would need.
I got it in epub format already. This is how the headers look:
<i class="calibre1">Red Sky At Morning </i> 5</p>
<p class="calibre3">4 <i class="calibre1">Melissa Good</i> </p>
So far, this is what I've tried:
[Red Sky At Morning]
[0-9]
[Melissa Good]
Page [0-9]+
(Title|Author)
Thank you.
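For anyone following along, here is a small sketch of why the bracketed attempts behave the way they do. It uses Python's standard `re` module rather than calibre's editor, so treat it as an illustration only. Square brackets create a character class, so `[Melissa Good]` matches one character at a time, not the name. The last step shows one plausible (unconfirmed) reason the bare `[0-9]` hurt paragraph spacing: it also eats the digits inside class names such as `calibre3`.

```python
import re

# Sample line copied from the post above.
sample = '<p class="calibre3">4 <i class="calibre1">Melissa Good</i> </p>'

# A bracketed pattern is a character class: every match is a single character.
single_chars = re.findall(r"[Melissa Good]", sample)

# Without the brackets, the pattern matches the whole phrase.
phrases = re.findall(r"Melissa Good", sample)

# [0-9] matches every digit, including the ones inside class names.
stripped = re.sub(r"[0-9]", "", sample)

print(single_chars[:5])
print(phrases)
print(stripped)
```

If the class names lose their digits, the paragraphs no longer match the rules in the stylesheet, which would look like lost spacing.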
#4 |
Well trained by Cats
Posts: 30,939
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
This is exactly how the code appears in the editor (on different lines)
First I would use the main Beautify (tulip icon) to try to get all the files more 'normal' (there can be non-visible differences). I do these as two separate S&Rs, because they are separate cases: the right page and the left page in a print book. Code:
<p class="calibre3"><i class="calibre1">Red Sky At Morning\s</i>\s*\d+</p>
I used \s* (zero or more whitespace characters) in places where a space MIGHT appear. The second (other side) is similar. Code:
<p class="calibre3">\d+\s+<i class="calibre1">Melissa Good</i></p>
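The two searches above can be tried outside calibre before running them on the book. The following is a rough sketch using Python's standard `re` module; the `html` string is a made-up sample modelled on the snippets in this thread (the `calibre2` paragraph is hypothetical), and the extra `\s*` before each closing tag is my own addition to tolerate the stray spaces seen in the posted markup.

```python
import re

# Hypothetical sample modelled on the snippets posted in the thread.
html = (
    '<p class="calibre3"><i class="calibre1">Red Sky At Morning </i> 5</p>\n'
    '<p class="calibre2">Some story text.</p>\n'
    '<p class="calibre3">4 <i class="calibre1">Melissa Good</i> </p>\n'
)

# Right-hand page: title followed by the page number.
title_header = re.compile(
    r'<p class="calibre3"><i class="calibre1">Red Sky At Morning\s*</i>\s*\d+\s*</p>'
)

# Left-hand page: page number followed by the author.
author_header = re.compile(
    r'<p class="calibre3">\s*\d+\s+<i class="calibre1">Melissa Good\s*</i>\s*</p>'
)

# Replace each running header with nothing; body paragraphs are left alone.
cleaned = author_header.sub("", title_header.sub("", html))
print(cleaned)
```

In the editor the equivalent is to put each pattern in the Find box, leave Replace empty, and step through a few matches before using Replace all.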
#5 |
Connoisseur
|
My source is Anna's Archive, with no info on whether it was OCR'd, but I'm guessing so. All versions I can find seem to have it. I'll try your suggestions, thank you very much for the help.
|
#6 |
Connoisseur
|
Ok, so I'm double checking: are these intended to be just regex, or regex + function? I tried creating a function, but I'm getting code errors, so maybe it's just regex. I'm definitely doing something wrong if that's the case, because I'm getting errors, see below:
calibre, version 8.4.0
ERROR: Unhandled exception: <b>error</b>:bad escape \s at position 60
I'd really like to understand, because it seems like this could be useful for multiple things, but I'm also struggling to understand the documentation (the manual, I mean) today. Thank you so much.
#7 |
Well trained by Cats
|
??? \s should be valid, but change it to \s*
Code:
<p class="calibre3"><i class="calibre1">Red Sky At Morning\s*</i>\s*\d+</p>
The second \s* is there because there appears to be a possible leading space before the single digit. If that does not work, try the 8.4.101 preview at https://download.calibre-ebook.com/preview/
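One possible explanation for the "bad escape \s" message, offered as a guess since the thread never confirms it: in Python's standard `re` module, `\s` is fine in a search pattern but is rejected in a replacement template. That would fit the expression having been pasted into the Replace box (or returned from a function) instead of the Find box. The sketch below uses plain `re`; calibre's own regex engine may behave differently.

```python
import re

pattern = r'<p class="calibre3"><i class="calibre1">Red Sky At Morning\s*</i>\s*\d+</p>'
text = '<p class="calibre3"><i class="calibre1">Red Sky At Morning </i> 5</p>'

# As a search pattern, \s and \d are valid and the header is found.
found = re.search(pattern, text) is not None

# As a replacement template, \s is not a recognised escape and raises re.error.
try:
    re.sub(pattern, r"\s", text)
    message = "no error"
except re.error as exc:
    message = str(exc)

print(found)
print(message)
```

If that is what happened, moving the expression to the Find box and leaving the Replace box empty should avoid the error.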
#8 |
Bibliophagist
|
Hmmm.... sorry but if your source is Anna's Archive, you are not going to get much help on MobileRead. Anna's Archive is a pirate site.
|
#9 |
Connoisseur
|
Apologies, I'll remember that.
|