Old 07-09-2015, 08:00 PM   #976
eschwartz
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Quote:
Originally Posted by JSWolf View Post
I say dump them. I do it manually because as long as I have the ADE page numbers, I'm good to go. I don't need page numbers to match some printed edition or whatever someone feels they should be. So please, automate dumping GBS.
But what you need is immaterial. If there are accurate and useful page numbers, they should be left in -- it is beyond the purview of this plugin to strip features-that-some-people-don't-need-and-JSWolf-disapproves-of.


Assuming GBS anchors are automated trash, otherwise known as "artifacts", then they can be safely gotten rid of and it is appropriate to do so.
And everyone else knows this already. Please don't muddy the waters.
Old 07-10-2015, 03:26 AM   #977
DoctorOhh
US Navy, Retired
Posts: 9,897
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Kindle PaperWhite SE 11th Gen
Quote:
Originally Posted by eschwartz View Post
Assuming GBS anchors are automated trash, otherwise known as "artifacts", then they can be safely gotten rid of and it is appropriate to do so.
And everyone else knows this already. Please don't muddy the waters.
Whether the anchors are trash or valuable, it is always nice to have the option to get rid of them.
Old 07-11-2015, 07:51 AM   #978
Rev. Bob
Wizard
Posts: 1,760
Karma: 9918418
Join Date: Feb 2013
Location: Here on the perimeter, there are no stars
Device: Kobo H2O, iPad mini 3, Kindle Touch
Quote:
Originally Posted by DoctorOhh View Post
Whether the anchors are trash or valuable, it is always nice to have the option to get rid of them.
Aye, there's the rub: "option." Without getting too far into the weeds, here's the situation:

1. All GBS.*.* anchors look alike.
2. Only some of those anchors are used in pagemapping. The rest appear to be garbage except (possibly) in a Google ebook reader app.
3. The GBS-related pagemap appears to be garbage, with the same caveat.
4. My GBS pagemap data comes from a very small sample, from which I am reluctant to generalize.

Thus, it is difficult to separate the pagemapped GBS anchors from the rest, and that may not even be desirable - so, at present, the posted beta routine does not give the option to preserve any GBS anchors. (Well, aside from the "don't use the routine at all" option.)

That is why I'd like to get more feedback on that point before sanctifying this version as a proper release. I'm fine with removing Google app-specific bloat, but not actually useful code.
Old 07-11-2015, 11:04 AM   #979
JSWolf
Resident Curmudgeon
Posts: 79,763
Karma: 145864619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
I'm away for two weeks, but when I get back, I'll download all of my Google ePubs and test whatever the latest beta is.
Old 07-11-2015, 05:01 PM   #980
Rev. Bob
Wizard
I'm in no particular hurry; I have several projects on my plate right now. I'm particularly interested in these cases:

- Books with non-GBS pagemaps.
- GBS pagemaps that conform to physical page counts in some way.
- Books with multiple pagemaps, GBS or otherwise, if that's even possible.
- Any case where the GBS anchors or pagemap is actually useful, such that removing them is a Bad Idea.
- Any kind of false positive, where something is affected that should not be.
- Weird "unpretty" results, especially in EPUB3 books (since they have new tags).
Old 07-11-2015, 05:09 PM   #981
JSWolf
Resident Curmudgeon
Quote:
Originally Posted by Rev. Bob View Post
I'm in no particular hurry; I have several projects on my plate right now. I'm particularly interested in these cases:

- Weird "unpretty" results, especially in EPUB3 books (since they have new tags).
I've run Modify ePub on some ePub 3 books that don't really use much in the way of ePub 3 features, with no issues. I've not run it on a full-out ePub 3, as those aren't easy to come by, and even if they were, I wouldn't buy or download one, as I've no need for such.
Old 07-11-2015, 05:32 PM   #982
Rev. Bob
Wizard
Quote:
Originally Posted by JSWolf View Post
I've run Modify ePub on some ePub 3 books that don't really use much in the way of ePub 3 features, with no issues. I've not run it on a full-out ePub 3, as those aren't easy to come by, and even if they were, I wouldn't buy or download one, as I've no need for such.
I'm mainly concerned with how it works with the semantic EPUB 3 elements - SECTION and the like. Those are kind of "EPUB 2.5" books to me; a little more functionality, but nothing that breaks in an EPUB 2 application.

I wish I could think of a way to automatically detect and discard pointlessly-nested DIV elements. Several Baen books show this bug, in which the first chapter's contents are enclosed in one DIV, the next is wrapped in two DIVs, and by the end of the book, you've got thirty or more extraneous DIVs nested around the same kind of content. Removing them does no harm, and may improve performance in some situations.
Old 07-11-2015, 07:54 PM   #983
jackie_w
Grand Sorcerer
Posts: 6,252
Karma: 16544692
Join Date: Sep 2009
Location: UK
Device: ClaraHD, Forma, Libra2, Clara2E, LibraCol, PBTouchHD3
Quote:
Originally Posted by Rev. Bob View Post
I wish I could think of a way to automatically detect and discard pointlessly-nested DIV elements...
I don't have a way of automatically removing these useless DIVs but just in case it sparks a few ideas I've been getting rid of mine using the calibre Editor like this:
  1. Look through the css sheet and manually delete 'style-less' (IMO) styles e.g. .xyz {display: block} or .xyz {page-break-before:always}
  2. Run the Editor's 'Remove Unused css Rules & Classes' option to convert the useless DIV-with-class's to empty DIVs.
  3. Run DiapDealer's Editing Toolbag plugin to remove the empty DIVs.
Steps 2 & 3 should be easy enough to automate as they're already Python code but step 1 would be a bit more contentious to determine if/when a style should be considered 'useless'.
Old 07-11-2015, 09:56 PM   #984
Rev. Bob
Wizard
Oh, I can get rid of them easily in the editor - two regex operations take care of things nicely - but that's the thing. I have to manually identify the problem and check for breakage afterward. Details, details...

See, in this particular class of cases, the nesting spans the entire BODY element, and the DIVs have no attached classes. It's literally a matter of stripping out opening DIV tags that come immediately after the opening BODY tag, and doing likewise for the closing ones. Simple and painless, but not universally applicable.
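
A minimal sketch of that kind of stripping, in Python rather than as two editor regexes. It assumes the wrappers are bare <div> tags with no attributes, as described above, and it adds a pairing check so that sibling DIVs are left alone. The function names are hypothetical and this is not the plugin's actual code.

```python
import re

# Matches an opening <div ...> (group 1 is "/" when self-closing) or a closing </div>.
DIV_TOKEN = re.compile(r"<div\b[^>]*?(/?)>|</div\s*>", re.IGNORECASE)
BODY = re.compile(r"(<body\b[^>]*>)(.*)(</body\s*>)", re.DOTALL | re.IGNORECASE)


def wraps_everything(inner):
    """True if inner is one bare <div> whose matching </div> ends the text."""
    text = inner.strip()
    if not (text.startswith("<div>") and text.endswith("</div>")):
        return False
    depth = 0
    for token in DIV_TOKEN.finditer(text):
        if token.group(0).startswith("</"):
            depth -= 1
            if depth == 0:
                # The first <div> just closed: it is a wrapper only if
                # nothing follows its closing tag.
                return token.end() == len(text)
        elif token.group(1) != "/":  # a self-closing <div/> opens nothing
            depth += 1
    return False


def strip_body_wrapper_divs(html):
    """Peel class-less DIV layers that enclose the whole BODY content."""
    body = BODY.search(html)
    if body is None:
        return html
    inner = body.group(2)
    while wraps_everything(inner):
        text = inner.strip()
        inner = text[len("<div>"):-len("</div>")]
    return html[:body.start(2)] + inner + html[body.end(2):]
```

The pairing check is the "check for breakage" step done up front: two sibling DIVs inside BODY also start with <div> and end with </div>, but removing the outer tags there would unbalance the file, so that case is left untouched.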

Last edited by Rev. Bob; 07-11-2015 at 09:59 PM.
Old 07-12-2015, 06:21 AM   #985
chrisridd
Guru
Posts: 977
Karma: 2209358
Join Date: Nov 2011
Location: London, UK
Device: Kobo Aura, Kobo Aura ONE, PocketBook InkPad Color 3
I'm testing the plugin in a clean library with copies of the original Google Play books with stripped DRM. I am going through them systematically to see what the plugin is doing.

I found 2 books with a pageList and a GBS pagemap, both using different anchors (id="page-.*" and id="GBS.*") and in both cases the new plugin correctly removed the GBS pagemap and anchors, but left the pageList intact.
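
For anyone repeating this check, here is a rough sketch of how the two anchor families could be counted in a file before and after a run. It assumes the ids really do start with "GBS" and "page-" as in the two books above; the function name and the sample markup are made up for illustration.

```python
import re
from collections import Counter

# Assumption: anchors are identified purely by their id attribute value.
ID_ATTR = re.compile(r'\sid="([^"]*)"')


def classify_anchor_ids(xhtml):
    """Count ids that look like GBS anchors versus pageList anchors."""
    counts = Counter()
    for value in ID_ATTR.findall(xhtml):
        if value.startswith("GBS"):
            counts["gbs"] += 1
        elif value.startswith("page-"):
            counts["pagelist"] += 1
    return counts


# Hypothetical before/after snippets, not taken from a real book.
before = '<p><a id="GBS.0001.01"/>One<span id="page-1"/></p>'
after = '<p>One<span id="page-1"/></p>'
print(classify_anchor_ids(before))
print(classify_anchor_ids(after))
```

If the plugin behaves as reported, the "gbs" count should drop to zero after the run while the "pagelist" count stays the same.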

So far so good :-)

However I'm not sure about what multiple pagemaps look like, so don't know how to check these.
Old 07-12-2015, 05:25 PM   #986
Rev. Bob
Wizard
Quote:
Originally Posted by chrisridd View Post
So far so good :-)

However I'm not sure about what multiple pagemaps look like, so don't know how to check these.
Encouraging news!

With multiple pagemaps, my main concern is what happens when a book that already has a pagemap gets picked up by Google Play. Does the old one get junked in favor of the new, or is it still there in some form? If it's junked, that's a bad thing, but it's on Google; nothing I can do about it. If it's still there, though, it ought to be restored if possible...
Old 08-20-2015, 01:57 AM   #987
chrisridd
Guru
Unfortunately I've just found a book that the test plugin has slightly mangled, producing invalid HTML in two files.

I'm attaching a zip containing the original files, which look like they are nearly XHTML except I thought empty XHTML elements had a start tag ending with " />". These just end "/>", i.e. without the space. But if you're using a real XML parser that shouldn't matter.
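
A quick way to confirm the point about the missing space, using Python's standard XML parser. The sample markup is invented for the test and is not taken from the attached files.

```python
import xml.etree.ElementTree as ET

# The same fragment written with "/>" and with " />" on the empty elements.
tight = '<div><a id="x"/><br/></div>'
spaced = '<div><a id="x" /><br /></div>'


def canonical(markup):
    """Parse the markup and re-serialize it so the two forms can be compared."""
    return ET.tostring(ET.fromstring(markup))


# A real XML parser yields the same tree for both spellings.
print(canonical(tight) == canonical(spaced))
```

The space before "/>" only matters to code that treats the file as text, such as a regex-based routine, which is worth keeping in mind when hunting for the cause of the mangling.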

In both cases the plugin leaves an open <div> element at the end of the text. Sigil complains, and the FlightCrew validator complains.

On the plus side, at least the bogus GBS anchors are gone.
Attached Files
File Type: zip broken.zip (1.3 KB, 173 views)
Old 08-21-2015, 05:43 AM   #988
Rev. Bob
Wizard
I have a couple of suspicions about what's going on, but could you post an "after" ZIP of those two files as well, so I can see exactly what the routine's doing to them?

Meanwhile, as a workaround, you might try running the book through the "unpretty" routine first. I have a hunch that the <div></div><div/> line is causing some of the trouble, and passing the book through "unpretty" should split that up so it can be handled correctly.

It's a hack, but it may get the job done until I'm back on my feet and able to look at it in depth.
Old 08-22-2015, 02:38 AM   #989
chrisridd
Guru
Yes, your hunch was right. If I do a modify run to "depretty" the book before a second modify run to remove the Kobo/Google gunk, the output files are valid.

(If I do a single modify run to depretty *and* remove the Kobo/Google gunk, I still get mangling.)

Attached are the two files from the original mangled run.
Attached Files
File Type: zip after.zip (1.1 KB, 165 views)
Old 08-25-2015, 03:16 AM   #990
Rev. Bob
Wizard
Quote:
Originally Posted by chrisridd View Post
Attached are the two files from the original mangled run.
I think I see the problem, and it should be an easy fix. Not to get too detailed, but I believe the current code uses one line to look for three patterns, with the middle one mandatory and the outer ones optional. The trouble is, the outer ones are supposed to be optional together, not separately - and in these cases, one of the outer patterns (the closing /DIV) gets detected without the other (the opening DIV). The code then removes the anchor and closing tag, thus creating a mismatch.

If that's correct, I just need to adjust the code to make it two lines instead of one: one with the anchor enclosed by the DIV tags (not optional), followed by one with just the anchor. Since neither new pattern allows for a mismatch, the problem should be solved.
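
To illustrate the difference, here is a small Python sketch of both approaches. The patterns and the anchor markup (an empty <a> with an id starting "GBS.") are assumptions made for the example; the plugin's real expressions are not shown in this thread.

```python
import re

# Assumed shape of a GBS anchor: <a id="GBS.xxxx.xx"/>
ANCHOR = r'<a\s+id="GBS\.[^"]*"\s*/>'

# One-line version: each outer DIV tag is optional on its own,
# so a closing </div> can be consumed without its opening <div>.
BUGGY = re.compile(r'(?:<div>)?' + ANCHOR + r'(?:</div>)?')

# Two-line version: first the anchor with BOTH DIV tags, then the bare anchor.
WRAPPED = re.compile(r'<div>' + ANCHOR + r'</div>')
BARE = re.compile(ANCHOR)


def remove_buggy(html):
    """Single pass with independently optional DIV tags."""
    return BUGGY.sub("", html)


def remove_fixed(html):
    """Two passes, neither of which can match a lone DIV tag."""
    return BARE.sub("", WRAPPED.sub("", html))


# Hypothetical input: the anchor sits at the end of a DIV that has other content.
sample = '<div><p>Text</p><a id="GBS.0001.01"/></div>'
print(remove_buggy(sample))  # <div><p>Text</p>
print(remove_fixed(sample))  # <div><p>Text</p></div>
```

In the sample, the one-line pattern eats the closing tag that belongs to the content DIV, which matches the "open <div> at the end of the text" symptom in the report. When an anchor is the only thing inside its DIV, both versions remove the whole wrapper.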

I should get a chance to look into that tomorrow (er, later today) sometime - thanks for the useful bug report!