12-09-2022, 03:40 AM | #1 |
Groupie
Posts: 171
Karma: 40000
Join Date: Oct 2013
Device: kindle
|
Bulk-check ebook collection for edited XHTML files
I have a big collection of epub ebooks, in the thousands let's say. Most of them are retail (original), and some of them have been surreptitiously edited, after being bought, to add some text within the flow of the book to advertise some website or service (this is obviously very annoying when reading).
One way to tell whether a book has been tampered with is to open it with WinRAR and check whether all the XHTML files in the "OEBPS/text" folder were last modified at the same date and time; the files that show a different date and time are the ones I'm targeting. I'm looking for a way to bulk-check the whole collection without having to check every single file. Is that possible? Thank you. |
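For anyone who wants to script the timestamp check described in this post, here is a minimal Python sketch. It is only an illustration, not a tool from this thread. It assumes the epubs are ordinary zip archives, that the per-file modification times stored in the zip are the ones WinRAR shows, and that the most common timestamp among the (X)HTML files is the "original" one. The function names `odd_timestamps` and `scan_collection` are made up for this example, and it looks at every .xhtml/.html entry rather than only those under "OEBPS/text", since folder layouts vary between books.

```python
import zipfile
from collections import Counter
from pathlib import Path


def odd_timestamps(epub_path, suffixes=(".xhtml", ".html")):
    """Return (name, date_time) pairs whose zip timestamp differs
    from the most common timestamp among the matching entries."""
    with zipfile.ZipFile(epub_path) as zf:
        entries = [
            (info.filename, info.date_time)
            for info in zf.infolist()
            if info.filename.lower().endswith(suffixes)
        ]
    if not entries:
        return []
    # Assumption: the timestamp shared by most files is the original one.
    common_time, _ = Counter(stamp for _, stamp in entries).most_common(1)[0]
    return [(name, stamp) for name, stamp in entries if stamp != common_time]


def scan_collection(root):
    """Walk a folder tree and print every epub entry with an odd timestamp."""
    for epub in sorted(Path(root).rglob("*.epub")):
        try:
            odd = odd_timestamps(epub)
        except (zipfile.BadZipFile, OSError) as exc:
            print(f"{epub}: unreadable ({exc})")
            continue
        for name, stamp in odd:
            print(f"{epub}: {name} last modified {stamp}")


# Example call (the path is a placeholder):
# scan_collection("/path/to/ebooks")
```

A book whose files all share one timestamp produces no output. Books that were re-zipped by a tool after editing may show identical timestamps everywhere, so a clean result does not prove the book is untouched.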
12-09-2022, 09:52 AM | #2 |
Well trained by Cats
Posts: 29,811
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
The OPF may have been tagged if the modification was done with an e-book aware tool.
e.g. <dc:date opf:event="modification">2022-12-09</dc:date>
As usual, these take on different forms, but are usually obvious to the human reading the code. I guess you could build a code library of the variations for your check tool. |
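A rough Python sketch of the OPF check suggested in this post, offered as one possible starting point rather than a finished tool. It assumes the epub has a standard META-INF/container.xml pointing at the OPF, and it looks for two variations only: the EPUB 2 style `<dc:date opf:event="modification">` shown above and the EPUB 3 style `<meta property="dcterms:modified">`. The function name `modification_dates` is hypothetical, and other tools may tag their edits differently or not at all.

```python
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "dc": "http://purl.org/dc/elements/1.1/",
    "opf": "http://www.idpf.org/2007/opf",
}
# ElementTree stores namespaced attributes as "{namespace-uri}name".
EVENT_ATTR = "{http://www.idpf.org/2007/opf}event"


def modification_dates(epub_path):
    """Return the modification date strings found in the epub's OPF."""
    with zipfile.ZipFile(epub_path) as zf:
        # container.xml tells us where the OPF file lives inside the zip.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find(".//c:rootfile", NS)
        if rootfile is None:
            return []
        opf = ET.fromstring(zf.read(rootfile.get("full-path")))

    found = []
    # EPUB 2 variation: <dc:date opf:event="modification">...</dc:date>
    for el in opf.iterfind(".//dc:date", NS):
        if el.get(EVENT_ATTR) == "modification":
            found.append((el.text or "").strip())
    # EPUB 3 variation: <meta property="dcterms:modified">...</meta>
    for el in opf.iterfind(".//opf:meta", NS):
        if el.get("property") == "dcterms:modified":
            found.append((el.text or "").strip())
    return found
```

Note that EPUB 3 requires a dcterms:modified entry, so its presence alone does not mean the book was edited after purchase; comparing the date against the purchase or publication date is left to the reader.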
01-19-2023, 04:40 AM | #3 |
Junior Member
Posts: 9
Karma: 326
Join Date: Nov 2021
Device: kobo
|
I know this is an old question, but it seemed interesting from a programming perspective. I wrote a small application for it; give it a try.
(Sorry for my bad English.) Last edited by oldboys; 01-19-2023 at 04:42 AM. |
01-19-2023, 09:05 AM | #4 |
Groupie
Posts: 171
Karma: 40000
Join Date: Oct 2013
Device: kindle
|
Wow, it works. Thank you very much for the great work!
(I got a lot of "epub metadata read error", not sure exactly what that means) Last edited by 1v4n0; 01-19-2023 at 09:08 AM. |
01-19-2023, 01:44 PM | #5 |
Junior Member
Posts: 9
Karma: 326
Join Date: Nov 2021
Device: kobo
|
Quote:
(I got a lot of "epub metadata read error", not sure exactly what that means)
It cannot read the metadata of the given epub. All kidding aside, check the epub file with Sigil, or use the "Add Folder" or "Add Folder with Subfolders" menu items. I'm glad I could help. |
11-22-2023, 03:12 AM | #6 |
Groupie
Posts: 171
Karma: 40000
Join Date: Oct 2013
Device: kindle
|
Quote:
|
|
11-22-2023, 06:54 AM | #7 |
the rook, bossing Never.
Posts: 11,166
Karma: 85874891
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
|
Ask in the Calibre subsection.
|