Old 06-20-2011, 11:16 PM   #121
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
I've been scouring my epubs programmatically to see just how margins are used on body and @page; the results were interesting, to say the least.
  • Nearly 50% of my epubs have some sort of margins specified on body or @page
  • 60% of these specify margins in em (generally the most egregious offender: why use em margins on devices that allow the font size to be increased?!)
  • 27% have margins specified in px
  • 13% specify margins using pt, typically 5pt

Note that many have some combination of px/pt/em; I didn't count those variations.

The margins in pt were generally the most reasonable. I also saw a lot of right hand margins being specified in px to get around Adobe's page number rendering.
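
A minimal sketch of this kind of scan, assuming the CSS has already been read out of each epub as text. The function name and the regular expressions are illustrative assumptions, not the script that produced the figures above, and a regex pass like this will miss grouped or qualified selectors such as `html, body` or `@page :first`.

```python
import re
from collections import Counter

# Matches a plain `body { ... }` or `@page { ... }` rule and captures its declarations.
RULE_RE = re.compile(r'(?<![\w.#-])(body|@page)\s*\{([^}]*)\}', re.IGNORECASE)
# Matches margin, margin-top, margin-right, etc. and captures the value.
MARGIN_RE = re.compile(
    r'margin(?:-(?:top|right|bottom|left))?\s*:\s*([^;]+)', re.IGNORECASE)
# Matches a length in one of the three units counted in the post.
UNIT_RE = re.compile(r'\d*\.?\d+\s*(em|px|pt)\b', re.IGNORECASE)


def margin_units(css_text):
    """Count the units used by margin declarations on body and @page."""
    units = Counter()
    for _selector, block in RULE_RE.findall(css_text):
        for value in MARGIN_RE.findall(block):
            for unit in UNIT_RE.findall(value):
                units[unit.lower()] += 1
    return units


sample = (
    "@page { margin-bottom: 5pt; margin-top: 5pt }\n"
    "body { margin: 0 2em 0 25px; font-size: 1em }\n"
    "p { margin: 1em }"
)
counts = margin_units(sample)
```

Margins on other selectors (the `p` rule in the sample) and non-margin properties (the `font-size`) are deliberately ignored, since only body and @page are being surveyed.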

I've been thinking more about how to programmatically replace/insert margins. This is probably a very reasonable thing to do, at least when considering users of older Adobe renderers that may want to create a slightly larger right margin. Aside from the matter of successfully injecting them, which is solvable, this brings up other questions for Quality Check, as one would want to be able to exclude whatever margin Modify Epub was configured to insert from a search...

Last edited by ldolse; 06-21-2011 at 09:46 PM.
Old 06-21-2011, 02:16 AM   #122
Agama
Guru
Posts: 667
Karma: 436517
Join Date: Jul 2010
Location: UK
Device: PRS-300 (R.I.P.), PW2, Nexus7
Quote:
Originally Posted by ldolse View Post
Note this only removes margins - replacing with a user configurable margin isn't quite so easy, so I'm just going to leave that alone for now. I've just tested on a handful of epubs, so use at your own risk.
How about simply removing the entire @page declaration and replacing it with a new one defined by the user (in the plugin)? This removes the need for any clever parsing.
Old 06-21-2011, 02:26 AM   #123
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
Quote:
Originally Posted by Agama View Post
How about simply removing the entire @page declaration and replacing it with a new one defined by the user (in the plugin)? This removes the need for any clever parsing.
The theme of kiwidude's plugin is to do minimal damage/modification, which I fully agree with. @page could potentially specify things other than margins, and I wouldn't want to delete such things with a wholesale replacement. Aside from that, replacing an existing @page declaration is different logic from inserting one that doesn't exist. The way the function works at the moment, it only modifies things which already exist (and removes them from the CSS if they're empty after removing margins).
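
The behaviour described here (touch only what exists, keep non-margin declarations, drop the rule if nothing is left) might look roughly like the sketch below. It is an assumption-laden illustration, not the plugin's actual function: the name `strip_page_margins` is made up, and the regex only handles a bare `@page { ... }` rule.

```python
import re

# Matches a bare `@page { ... }` rule and captures its declarations.
PAGE_RE = re.compile(r'@page\s*\{([^}]*)\}', re.IGNORECASE)


def strip_page_margins(css_text):
    """Remove margin declarations from @page rules; drop rules left empty."""
    def rewrite(match):
        declarations = [d.strip() for d in match.group(1).split(';')]
        kept = [d for d in declarations
                if d and not d.lower().startswith('margin')]
        if not kept:
            return ''  # nothing but margins was declared, so remove the rule
        return '@page { ' + '; '.join(kept) + ' }'
    return PAGE_RE.sub(rewrite, css_text)


only_margins = strip_page_margins("@page { margin: 5pt }\np { margin: 1em }")
mixed = strip_page_margins("@page { size: 5in 8in; margin-top: 1em; }")
```

A stylesheet with no @page rule passes through unchanged, which matches the "only modifies things which already exist" approach.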
Old 06-21-2011, 03:51 AM   #124
kiwidude
calibre/Sigil Developer
Posts: 4,228
Karma: 1334002
Join Date: Oct 2010
Location: London, UK
Device: Kindle Paperwhite 3G, iPad 3, iPad Air
Quote:
Originally Posted by ldolse View Post
I updated some of the dialog text based on silence being tacit approval to leave this tied with the xpgt margin removal, simple enough to change later if desired.
Sorry, I was working on some other plugins so I hadn't yet looked at or responded on this. Certainly without the ability to set the margins I would want the two as separate options: I don't want my Calibre 5pt l/r body margins obliterated, but I do want xpgt ones removed.

However, as I have other changes in my version of this plugin, I need to integrate all of yours, so it is easiest if yours are minimally isolated, which sticking it in with the xpgt stuff for now would achieve. Though if you are still working on this, in terms of now considering support for specifying margins (which will involve more intrusive changes unless you hardcode something for now), it is more problematic to keep merging them. So either I wait until you are done and merge it with mine, or I integrate now and then effectively play pass the parcel with you until you are 100% satisfied this feature is done. I don't want to confuse people with forked versions on this thread or accidentally miss some change you make to it.

And yes, I will also look at adding a matching check to Quality Check at some point, though as you point out it will likely not be as simple as just checking for the presence of a margin declaration, since any book converted by Calibre will by default have 5pt l/r margins. The same deal applies: if you can do the work of writing the function, I will add it to the plugin.
Old 06-21-2011, 04:05 AM   #125
kiwidude
calibre/Sigil Developer
Posts: 4,228
Karma: 1334002
Join Date: Oct 2010
Location: London, UK
Device: Kindle Paperwhite 3G, iPad 3, iPad Air
Quote:
Originally Posted by ldolse View Post
I've been thinking more about how to programmatically replace/insert margins. This is probably a very reasonable thing to do, at least when considering users of older Adobe renderers that may want to create a slightly larger right margin. Aside from the matter of successfully injecting them, which is solvable, this brings up other questions for Quality Check, as one would want to be able to exclude whatever margin Modify Epub was configured to insert from a search...
In terms of the replace/insert: as you pointed out, blindly replacing the @page declaration with a new one containing just margins is a no-no. Though could you not take the old one, parse it by splitting on ; and then strip out just the margin-related elements to insert your own, then put the whole @page element back in the file? I'm asking out of ignorance (which is why I didn't take on this feature myself!).

I don't have an answer for Quality Check however. Good luck...
Old 06-21-2011, 04:21 AM   #126
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
One of the things I was considering was that we just decide on an accepted hard-coded standard shared between this and Quality Check. 5pt seems to be pretty widely accepted; additionally, a right margin of around 25px seems to be common to keep ADE page numbers from overlapping the text. I'm thinking the 5pt could be completely hard-coded, and the user would need to opt in if they care about the right-hand margin.

Any pt-based margins which aren't 5pt (along with any variations of em/px margins) would be flagged in Quality Check. Modify ePub would convert px margins it encounters to 5pt; I suppose other types of margins could be converted rather than deleted as well.
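
As a rough sketch of the flagging rule proposed here, assuming the hard-coded standard really is 5pt and nothing else: the function name and the three result labels are hypothetical, and how a unitless `0` or a keyword like `auto` should be treated is not settled in the thread, so those are reported as 'unknown' rather than guessed at.

```python
import re

# Matches a single CSS length in one of the units discussed in the thread.
LENGTH_RE = re.compile(r'^(\d*\.?\d+)(em|px|pt)$', re.IGNORECASE)

ACCEPTED_PT = 5.0  # assumed hard-coded standard (5pt), per the proposal above


def classify_margin(value):
    """Return 'ok', 'flag' or 'unknown' for one margin length."""
    match = LENGTH_RE.match(value.strip())
    if match is None:
        return 'unknown'  # e.g. '0', 'auto', percentages: not decided here
    number = float(match.group(1))
    unit = match.group(2).lower()
    if unit == 'pt' and number == ACCEPTED_PT:
        return 'ok'
    return 'flag'  # any other pt value, and every em/px value


results = [classify_margin(v) for v in ('5pt', '10pt', '2em', '25px', 'auto')]
```

The optional 25px right-hand margin would need its own exemption on top of this, gated on whatever setting the user enables.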

The alternative to hard coding might be to use Calibre's conversion pref as a central place to store the desired settings. I'll have to look at how that's stored.

I don't have a problem hard-coding it though; it makes the development effort easier. Not sure what others think.

One other item: I copied the initial function from the xpgt function, but capnm discovered that it currently gives up after the first matching file it encounters. Is there an option to continue through the whole manifest? Some epubs have multiple CSS files.
Old 06-21-2011, 04:26 AM   #127
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
Quote:
Originally Posted by kiwidude View Post
In terms of the replace/insert: as you pointed out, blindly replacing the @page declaration with a new one containing just margins is a no-no. Though could you not take the old one, parse it by splitting on ; and then strip out just the margin-related elements to insert your own, then put the whole @page element back in the file? I'm asking out of ignorance (which is why I didn't take on this feature myself!).

I don't have an answer for Quality Check however. Good luck...
That's exactly what my function does (splitting by ; ), so for CSS files where @page exists this would be trivial to add. But not every CSS file has an @page, so I would need an extra function to insert it, along with keeping track of whether the margin was already set in the body tag etc. All doable, though; I just want to try and come up with the right strategy before I dive in.
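
The two paths described here (swap the margins when @page exists, insert a rule when it doesn't) could be sketched as below. This is an illustration under assumptions, not the plugin's code: `set_page_margins` and the `margins` dict are hypothetical, only the first bare `@page { ... }` rule is considered, and the body-tag bookkeeping mentioned above is left out.

```python
import re

# Matches a bare `@page { ... }` rule and captures its declarations.
PAGE_RE = re.compile(r'@page\s*\{([^}]*)\}', re.IGNORECASE)


def set_page_margins(css_text, margins):
    """Replace the margins of the first @page rule, or prepend a new rule.

    `margins` is a hypothetical dict such as {'margin-right': '25px'}.
    """
    new_decls = ['%s: %s' % (name, value)
                 for name, value in sorted(margins.items())]
    match = PAGE_RE.search(css_text)
    if match is None:
        # No @page at all: insert one at the top of the stylesheet.
        return '@page { ' + '; '.join(new_decls) + ' }\n' + css_text
    # Keep every existing declaration that is not a margin.
    kept = [d.strip() for d in match.group(1).split(';')
            if d.strip() and not d.strip().lower().startswith('margin')]
    rule = '@page { ' + '; '.join(kept + new_decls) + ' }'
    return css_text[:match.start()] + rule + css_text[match.end():]


inserted = set_page_margins("p { color: red }",
                            {'margin-left': '5pt', 'margin-right': '5pt'})
replaced = set_page_margins("@page { size: 5in 8in; margin: 2em }",
                            {'margin-right': '25px'})
```

Everything outside the matched rule is copied through untouched, in keeping with the minimal-modification theme of the plugin.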
Old 06-21-2011, 07:17 AM   #128
Agama
Guru
Posts: 667
Karma: 436517
Join Date: Jul 2010
Location: UK
Device: PRS-300 (R.I.P.), PW2, Nexus7
Quote:
Originally Posted by ldolse View Post
additionally a right margin of around 25px seems to be common to keep ADE page numbers from overlapping the text.
I use 10px for this purpose, as a compromise between overlap vs. wasted space. ADE on Sony 300's cannot fully justify text so it is rare to find text close to the right margin.

I read a recent thread in this forum indicating that the latest version of ADE (on the Sony 350/650 readers) no longer displays these numbers, so this requirement may soon be obsolete. (This newer ADE can also apply full text justification.)

I think that as soon as you hard-code any of these values you may well find that people then need to manually tweak the CSS, because the hard-coded values don't suit them! So, storing user values as preferences sounds like a better option.
Old 06-21-2011, 08:51 AM   #129
capnm
Groupie
Posts: 150
Karma: 10001
Join Date: Feb 2011
Device: sony
Quote:
Originally Posted by kiwidude View Post
I don't want my Calibre 5pt l/r body margins obliterated but I do want xpgt ones removed.
This won't touch Calibre generated margins (as they are currently done) --
Calibre follows a different practice.

It doesn't use body for L/R, and T/B are in @page statements but they're not in the stylesheet, they're inline, where this plugin won't find them.
Old 06-21-2011, 09:06 AM   #130
kiwidude
calibre/Sigil Developer
Posts: 4,228
Karma: 1334002
Join Date: Oct 2010
Location: London, UK
Device: Kindle Paperwhite 3G, iPad 3, iPad Air
Calibre does indeed - but if the epub hasn't been converted by Calibre then it might have been created with similar 5pt margins.

Funnily enough I just hit an ePub which had margin-top and bottom set in @page directives in each html file. Utter filth to manually remove, as apart from Notepad++'s stupid inability to do multi-line regex, it screwed up the quote encoding on the files if you tried to edit them.

Unfortunately this feature, which only looks in the CSS files, wouldn't be able to help me there either.
Old 06-21-2011, 09:32 AM   #131
capnm
Groupie
Posts: 150
Karma: 10001
Join Date: Feb 2011
Device: sony
Quote:
Originally Posted by kiwidude View Post
Funnily enough I just hit an ePub which had margin-top and bottom set in @page directives in each html file. Utter filth to manually remove, as apart from Notepad++'s stupid inability to do multi-line regex, it screwed up the quote encoding on the files if you tried to edit them.
That's what Calibre does.
Old 06-21-2011, 10:06 AM   #132
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
There's no reason this same function couldn't grab the styles safely from the individual xhtml files using xpath and rewrite them - the search pattern and rewrite function would be exactly the same. Though I've only used xpath to extract information from xhtml... Someone might need to point me in the right direction to insert the updated styles back into the style tag.

This would also require being able to walk through every item in the manifest as I noted before.
Old 06-21-2011, 10:10 AM   #133
kiwidude
calibre/Sigil Developer
Posts: 4,228
Karma: 1334002
Join Date: Oct 2010
Location: London, UK
Device: Kindle Paperwhite 3G, iPad 3, iPad Air
@ldolse - I just downloaded your code and the problem is with your break statement. The xpgt function you copied from has an inner for loop to do with iterating through the xpgt file itself. Yours does not, hence you are breaking out of the "for name in container.name_map" loop.
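
To illustrate the break problem in isolation: the sketch below uses a plain dict as a stand-in for container.name_map (an assumption, as are both function names), and simply collects stylesheet names instead of rewriting them. With no inner loop, the break exits the manifest loop itself, so only the first CSS file is ever seen.

```python
def css_names_buggy(name_map):
    """Stops at the first stylesheet: `break` exits the only loop there is."""
    found = []
    for name in name_map:
        if name.lower().endswith('.css'):
            found.append(name)
            break  # intended for an inner loop that no longer exists
    return found


def css_names_fixed(name_map):
    """Visits every stylesheet by letting the manifest loop run to the end."""
    found = []
    for name in name_map:
        if name.lower().endswith('.css'):
            found.append(name)
    return found


# Hypothetical manifest: file names mapped to paths inside an unpacked epub.
manifest = {
    'styles/main.css': '/tmp/book/styles/main.css',
    'text/ch1.xhtml': '/tmp/book/text/ch1.xhtml',
    'styles/extra.css': '/tmp/book/styles/extra.css',
}
buggy = css_names_buggy(manifest)
fixed = css_names_fixed(manifest)
```

Removing the break (or replacing it with continue where an early skip is wanted) is enough to reach the second stylesheet.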

EDIT: with regards to the xhtml files. I could be completely wrong here, but I would be reluctant to go down the xpath route, as surely that will mean re-writing the xhtml files using the xpath tostring() which means you are at the mercy of the content being reformatted, encoding issues etc?

Last edited by kiwidude; 06-21-2011 at 10:22 AM. Reason: Add comment about xpath for xhtml
Old 06-21-2011, 10:48 AM   #134
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
Ah, good, sorry about that - I completely forgot to read the rest of the leftover code after I got my test cases working.

I'm thinking of using the configured prefs to decide what to discard/rewrite/flag in Quality Check, but right now I'm struggling a bit trying to figure out how to read Calibre's prefs. When I've done it before I was able to get it as a conversion option, or by calling self.prefs in a metadata plugin, but it doesn't seem like those are options here. What needs to happen to be able to look up a pref?

edit: I figured out how to get the user prefs using load_defaults, all good there.

Last edited by ldolse; 06-21-2011 at 12:10 PM.
Old 06-21-2011, 11:45 AM   #135
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
Quote:
Originally Posted by kiwidude View Post
EDIT: with regards to the xhtml files. I could be completely wrong here, but I would be reluctant to go down the xpath route, as surely that will mean re-writing the xhtml files using the xpath tostring() which means you are at the mercy of the content being reformatted, encoding issues etc?
I'm not sure - it could certainly be done by string manipulation just grabbing everything between the <style></style> tags if that was a concern. I just figured xpath should be safe because by definition epub has to contain valid xhtml. Encoding shouldn't be a problem, epub spec also requires UTF-8. Reformatting might be a valid concern though - not sure any tidying happens with the tostring() function.
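
A sketch of the string-manipulation route, under the assumption that the CSS rewrite itself is supplied as a callable: only the text between the style tags is handed to it, and every other byte of the xhtml is copied through as-is, which sidesteps the tostring() reformatting concern. The names `rewrite_inline_styles` and `rewrite_css` are made up for illustration.

```python
import re

# Captures the opening tag, the CSS text, and the closing tag separately.
STYLE_RE = re.compile(r'(<style\b[^>]*>)(.*?)(</style>)',
                      re.IGNORECASE | re.DOTALL)


def rewrite_inline_styles(xhtml_text, rewrite_css):
    """Apply rewrite_css to the contents of each <style> element only."""
    def replace(match):
        return match.group(1) + rewrite_css(match.group(2)) + match.group(3)
    return STYLE_RE.sub(replace, xhtml_text)


doc = ('<html><head><style type="text/css">@page { margin: 5pt }</style>'
       '</head><body><p>Hi</p></body></html>')
out = rewrite_inline_styles(
    doc, lambda css: css.replace('margin: 5pt', 'margin: 0'))
```

The same callable that rewrites standalone CSS files could be passed in here, so the search pattern and rewrite logic stay in one place.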