Old 07-06-2011, 03:55 AM   #61
Ortep
Fanatic
Posts: 527
Karma: 470
Join Date: Sep 2007
Location: The Netherlands
Device: Kindle Oasis
Quote:
Originally Posted by burbleburble View Post
Updated to version 0.0.5

Due to a lack of user feedback (interest?) when I posted requests for suggestions in certain areas,
Definitely NOT lack of interest. I really like any method to clean up eBooks. I'm not really interested in ePub; it is too chaotic and does not play well with my Kindle and my wife's Bebook. So the approach of using HTMLZ as an 'in-between' format is probably a good one. Simply work on a general format, and anyone who wants another format can create it from there.

Perhaps you can use Calibre to 'automate' the creation of the HTMLZ if not available as a first step.
Old 07-06-2011, 12:39 PM   #62
burbleburble
Connoisseur
Posts: 52
Karma: 38
Join Date: Jun 2011
Device: Kindle 3
@jackie_w

In the next update, in about half a week, what you want should be rather easy to do (and I'll be happy to explain in depth how then; it's just that the current version can't really do it easily). The next update will be based on html source code editing/tools, with only the previewer using webkit. But all you would need is a very basic knowledge of html, and obviously you have more than that. Also, I don't think the next update will automatically support adding external stylesheets yet, so you'll just have to add that one line <link type="text/css"...> yourself.

Anyways the next update will have FAR more features and abilities; including special tools covering a wide range of needs (auto smallcap title, strip certain formattings, uppercase, titlecase, TOC building and saving to epub, etc; all with an easy chooser for making sure it doesn't happen to the wrong things).


@Ortep

Glad there's some interest! I actually would love to implement an option for automatic conversion; it's just that I find it quite hard to work out how to interface with calibre's classes. If someone could provide me with a relatively clear framework of what methods and options are needed, and how to use them, I would be more than happy to add it immediately.
Old 07-06-2011, 01:33 PM   #63
theducks
Well trained by Cats
Posts: 29,798
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
@Burble
The problem is we don't know what to gripe and moan about until we have used the tool.
AFAIK your reason for creating the tool is in step with the Calibre creator's reason for writing the tools that became Calibre: what was out there did not do what he wanted.

Write something and post it.
The Gripes will follow
Old 07-09-2011, 04:14 PM   #64
Calibrefan
Enthusiast
Posts: 49
Karma: 12
Join Date: Feb 2011
Device: Kobo Aura, Sony PRS-350 and PRS-T1
Quote:
The long: Amongst the many plans, I do plan on implementing (early on) several methods of search/replace; whether by class / between____and_____ / regex /lineThatStartsWith (I'll figure out the exact breakdown later). For your issue, I would probably create a search for 'lineThatStartsWith' = '------------', provide a list of results (to remove any matches that may not be page breaks for some reason), then replace/remove. However, currently, the plugin only does a basic reformatting/restructuring of the epub based on pattern matching heuristics.

Can't gripe either but just applaud your work and keep my fingers crossed that you manage to find the time to implement the above mentioned items.
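For anyone curious what the quoted 'lineThatStartsWith' idea might look like, here is a minimal sketch in plain Python. It is not the plugin's code; the function names and the sample text are made up for illustration. It finds candidate lines, leaves room for a review step, and then removes only the confirmed ones.

```python
def find_lines_starting_with(text, prefix):
    """Return (line_number, line) pairs for lines that begin with prefix."""
    return [(n, line) for n, line in enumerate(text.splitlines(), start=1)
            if line.lstrip().startswith(prefix)]


def remove_lines(text, line_numbers):
    """Drop the confirmed 1-based line numbers and return the remaining text."""
    drop = set(line_numbers)
    kept = [line for n, line in enumerate(text.splitlines(), start=1)
            if n not in drop]
    return "\n".join(kept)


# Hypothetical sample: two dashed lines, only one of which is a page break.
sample = ("End of page one.\n"
          "------------\n"
          "Start of page two.\n"
          "------------ not a break?\n"
          "The end.")

matches = find_lines_starting_with(sample, "------------")

# A user would review `matches` here. As a stand-in for that review,
# keep only lines made up entirely of dashes and spaces.
confirmed = [n for n, line in matches if line.strip("- ") == ""]
cleaned = remove_lines(sample, confirmed)
```

Keeping the search and the removal as two separate steps is what makes the "list of results" review possible in between.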

Old 07-11-2011, 01:19 AM   #65
davidfor
Grand Sorcerer
Posts: 24,907
Karma: 47303748
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
Dialog size

I am just starting to try Cleaner, but have a problem with the dialog size.

When I open Cleaner, the dialog is too large for my screen. The top of the dialog is at the top edge of the screen and the bottom extends off the bottom of the screen. On the "Open/Save" tab, the top edge of the "Open Htmlz" button is just visible above the Windows taskbar. If I change the taskbar to autohide I can see the "Basic Cleaning" checkbox is against the bottom of the screen.

I have had a look at the source code, and do not immediately see where the dialog sizing is done. I don't code in Python (C#, C++, Perl and lots of older languages) so I am probably missing something obvious. The only thing close I can find is the call to "setRowMinimumHeight" in main.py. I might play with that and see what happens.

Edit: Changing from 500 to 400 in the "setRowMinimumHeight" call sized the dialog so that it was usable. And I can now say: Nice plugin and very handy.

This is on a laptop with Windows 7 and a 1280*800 resolution screen. Calibre is V0.8.9 and the Cleaner is 0.0.6.

If you need any more info, I will be happy to supply it.

David

Last edited by davidfor; 07-11-2011 at 02:04 AM. Reason: Tested a change to the code
Old 07-11-2011, 05:38 AM   #66
burbleburble
Connoisseur
Posts: 52
Karma: 38
Join Date: Jun 2011
Device: Kindle 3
@Calibrefan
I have almost finished adding a series of new features to the next update, and I have begun looking into search and replace. After brainstorming for a little while, I discovered that without examples to test it on, it will be very hard to make a good search and replace with specific tools oriented toward page-break lines, page numbers, footers and headers. So, if you can provide me with a sample containing some of what you are looking to remove, and some surrounding context, I will try to work out an approach. (Same for @Under the Covers, I did respond earlier, but got no sample.)

@davidfor
Thanks for pointing out the issue. I actually wanted to fix this, but I ran into two 'unknowns' that I haven't had the time to google and research:
  1. How do I discover the resolution of the current user's screen, and then force the plugin to fit to it?
  2. How do I add a 'full screen' button in the upper right corner, as most windows have (this would force it to fit, I imagine)?
So, if you have an answer, I can add it to the next update.
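One possible direction for the first question, as a rough sketch only: keep the sizing logic in a small pure-Python helper and feed it whatever usable screen area the toolkit reports. The helper name, the margin, and the example sizes are assumptions for illustration; the Qt calls are shown only as comments and have not been tested against this plugin.

```python
def fit_to_screen(wanted_w, wanted_h, avail_w, avail_h, margin=40):
    """Clamp a wanted dialog size to the usable screen area, minus a margin."""
    max_w = max(avail_w - margin, 1)
    max_h = max(avail_h - margin, 1)
    return min(wanted_w, max_w), min(wanted_h, max_h)


# In a PyQt plugin the usable area (screen minus taskbar) could be read
# roughly like this (untested sketch, not taken from the plugin source):
#     rect = QApplication.desktop().availableGeometry()
#     w, h = fit_to_screen(900, 850, rect.width(), rect.height())
#     dialog.resize(w, h)
# For the second question, Qt has a maximize-button window flag:
#     dialog.setWindowFlags(dialog.windowFlags() | Qt.WindowMaximizeButtonHint)

# Hypothetical numbers: a 900x850 dialog on a 1280x800 laptop screen where
# the taskbar leaves about 760 pixels of usable height.
size = fit_to_screen(900, 850, 1280, 760)
```

A hard-coded minimum row height would still need to be lowered or removed, otherwise the layout could refuse to shrink to the clamped size.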

@Kovid, anyone
I am continuing to try to use webkit with contenteditable. But webkit manages to produce some horribly twisted and convoluted markup when pressing the enter key and splitting an element, or when deleting larger amounts of text/across elements. For the life of me I don't know what the designers were attempting to accomplish. All the same, is there a way to work around this, either options to set or some basic javascript to catch the event and reimplement it?

(Edit: while searching for solutions, I came across a great line: 'WYSIWYG Editors should be for the most part called WYDSIAGDM (What You Don’t See Is A Gosh Darn Mess)')

Last edited by burbleburble; 07-11-2011 at 07:36 AM.
Old 07-11-2011, 09:34 AM   #67
Japes
Addict
Posts: 303
Karma: 1033852
Join Date: Jun 2011
Device: Sony PRS-350,Sony PRS-950,Pocketbook 360+,B&N Nook Simple Touch Reader
This might be one of the most useful plugins, if done right, of all the plugins out there. I find that a lot of the time (25% or so...roughly), the conversion to epub, by Calibre, is done with some formatting that causes issues for me (probably not Calibre's fault, but rather, the fault of the source file).

I have a Sony PRS-950 and 350 with PRS+, whereby I have custom stylesheets for some fonts. What I'm finding is that, as stated above, about 75% of the time, my books all look the same (as they are supposed to). Some 25% of the time or so, however, the font sizing is off, or the line height/line spacing is off, or margins are off. That is normally a result of the epub itself having these values in there somewhere, and those values are NOT being overridden by my custom CSS.

If there was some way to clean up the CSS on the epubs where there was NO reference made to font sizing (in the main body of the book), or line heights, or margins, that would leave me free to format it as I wanted to, with my custom CSS.

I'm keeping my fingers crossed.
Old 07-11-2011, 10:03 AM   #68
burbleburble
Connoisseur
Posts: 52
Karma: 38
Join Date: Jun 2011
Device: Kindle 3
@Japes
Correct me if I have misunderstood and the following feature wouldn't help (so that I can try to create one that does...). In the next update the initial cleaner will have an option to strip all but basic formatting (i.e. italic, bold, justification?). Of course it will also have an option to discover patterns (which should make it easier for you to quickly apply your own formatting, as all (or almost all) 'Pattern13' will be titles, all 'Pattern#' will be normal text... etc.). As you are not the first person who wishes to utilize his own standardized css, I will add an option to import an external stylesheet.

The next update may take a week or two, I am busy 'catching events' so that I can avoid webkit's twisted wysiwyg editor output code, and use a simple and clean approach.

@Anyone, everyone:
Because of the complexity involved in dealing with more types and levels of tags, is there any pressing reason why a clean ebook should need divs or i or b or em etc.? Are p, span, lists enough? (I mean, it's an ebook, not a pdf or interactive webpage...)
Old 07-11-2011, 10:57 AM   #69
Japes
Addict
Posts: 303
Karma: 1033852
Join Date: Jun 2011
Device: Sony PRS-350,Sony PRS-950,Pocketbook 360+,B&N Nook Simple Touch Reader
No, I don't think you understand. Your idea of stripping all but basic formatting is EXACTLY what I'm looking for, because I do NOT wish to implement an external stylesheet, UNTIL the book gets on the reader, and, on my reader, I can switch between different stylesheets "on the go."

So, basically, what you had in mind of stripping ALL but the basic formatting (italic, bold, justification), would work perfectly.

Now, regarding that, I had a question. Sometimes, stripping font size, for example, would not be a good idea if it refers to Chapter headings, for example. In other words, sometimes specific font sizes need to be specified for things OTHER than the main body of the book. How will you handle those circumstances?
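To make the question concrete, here is one way the "strip sizing, but spare the headings" idea could be sketched in Python. This is only an illustration, not how the plugin does it: the property list, the heading test, and the sample CSS are all assumptions, and the regex ignores comments and nested at-rules such as @media.

```python
import re

# Properties to remove from body-text rules (an assumed list).
STRIP_PROPS = {"font-size", "line-height", "margin", "margin-left",
               "margin-right", "margin-top", "margin-bottom"}
# Rough heuristic: any selector that mentions h1..h6 counts as a heading rule.
HEADING = re.compile(r"\bh[1-6]\b", re.IGNORECASE)
# Matches flat "selector { declarations }" rules only.
RULE = re.compile(r"([^{}]+)\{([^{}]*)\}")


def strip_body_formatting(css):
    """Remove sizing/spacing declarations, except inside heading rules."""
    def clean_rule(match):
        selector, body = match.group(1).strip(), match.group(2)
        decls = [d.strip() for d in body.split(";") if d.strip()]
        if not HEADING.search(selector):
            decls = [d for d in decls
                     if d.split(":", 1)[0].strip().lower() not in STRIP_PROPS]
        inner = "; ".join(decls)
        return "%s { %s }" % (selector, inner) if inner else "%s {}" % selector
    return "\n".join(clean_rule(m) for m in RULE.finditer(css))


# Hypothetical stylesheet from a converted book.
sample_css = ("p { font-size: 14pt; text-indent: 1em; margin: 0 } "
              "h1.chapter { font-size: 2em; margin: 1em 0 } "
              ".calibre1 { line-height: 1.2 }")
cleaned_css = strip_body_formatting(sample_css)
```

With the sample above, the paragraph rule keeps only its text-indent, the chapter heading keeps its size and margin, and the .calibre1 rule ends up empty. A real tool would probably want the heading test to be configurable, since converted books often style headings through classes rather than h1..h6.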
Old 07-11-2011, 11:32 AM   #70
jackie_w
Grand Sorcerer
Posts: 6,212
Karma: 16534894
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
Quote:
Originally Posted by burbleburble View Post
Because of the complexity involved in dealing with more types and levels of tags, is there any pressing reason why a clean ebook should need divs or i or b or em etc.? Are p, span, lists enough? (I mean, it's an ebook, not a pdf or interactive webpage...)
Some feedback (IMO only, of course): I've never needed a <div> tag, but I do prefer an <i> or <em> tag to a <span class="italic">, mainly because if I'm cleaning up using regex in a text editor the closing tags </i> and </em> are specific. The </span> tag could be closing anything, which is particularly difficult when spans are nested several layers deep. The shorter <i>, <em> tags are also less distracting in a text editor.
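The nested </span> problem is the reason a plain search/replace struggles here. As a rough sketch (not part of the plugin, and the class names "italic" and "bold" are just assumed examples), a small stack can remember what each opening span was rewritten to, so the right closing tag is emitted:

```python
import re

SPAN_TAG = re.compile(r'<span\b([^>]*)>|</span\s*>', re.IGNORECASE)
CLASS_TAGS = {"italic": "i", "bold": "b"}  # hypothetical class names


def spans_to_inline_tags(html):
    """Replace <span class="italic"> with <i>, matching closers via a stack."""
    out, pos, stack = [], 0, []
    for m in SPAN_TAG.finditer(html):
        out.append(html[pos:m.start()])
        pos = m.end()
        if m.group(0).startswith("</"):
            # Close with whatever the matching opener was rewritten to.
            out.append(stack.pop() if stack else m.group(0))
        else:
            cls = re.search(r'class\s*=\s*"([^"]*)"', m.group(1))
            tag = CLASS_TAGS.get(cls.group(1).strip()) if cls else None
            if tag:
                out.append("<%s>" % tag)
                stack.append("</%s>" % tag)
            else:
                # Unknown span: leave it alone, but remember its closer.
                out.append(m.group(0))
                stack.append("</span>")
    out.append(html[pos:])
    return "".join(out)


sample_html = ('<p><span class="calibre3">A <span class="italic">very</span>'
               ' odd</span> book</p>')
converted = spans_to_inline_tags(sample_html)
```

Only the inner span becomes <i>...</i>; the outer calibre3 span keeps its own </span>. Spans carrying several classes at once are not handled by this sketch.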

Of course, if your new utility is really good at cleaning html then I may not need to use regex in a text editor nearly so often

On the matter of stripping CSS... I would hope that it would be optional. If a <blockquote> tag has been used or the equivalent CSS, to achieve equal left and right indented margins, then I do not want this info stripped out. Similarly CSS margin settings when used for scene-breaks or verse. I also don't want text-indent settings removed.

@Japes,
I find a Calibre conversion to epub usually does a good job of sorting out the font-size of the main bodytext, by putting a font-size in the <body> CSS class and omitting it from the main bodytext class. Calibre also changes all instances of font-sizes small, medium, large etc to em-equivalents. I've found this helps to make my PRS+ CSS work more reliably. I do find that each different font-face has its own 'best font-size' but that's easily added to the relevant PRS+ CSS file. I always convert using Base_font_size=12pt, which puts font-size:1em in the body CSS class, then let PRS+ take care of the rest.

[Edit:] Just thought of something else regarding simplification. This is a finer detail point, but if an html file has a specially formatted first-character of a chapter then it would be good to retain it. I wouldn't care what the formatting was but if it always got cleaned up to:
Code:
<span class="dropcap">T</span>his is the first...
then a standard linked CSS file could format it to taste.

Well you did ask for feedback...

Last edited by jackie_w; 07-11-2011 at 11:47 AM. Reason: additional feedback
Old 07-11-2011, 06:20 PM   #71
Under the Covers
Night Reader
Posts: 127
Karma: 4314
Join Date: Oct 2010
Location: Rocky Mountains (US)
Device: Sony PRS-650
Quote:
Originally Posted by burbleburble View Post
... So, if you can provide me with a sample containing some of what you are looking to remove, and some surrounding context, I will try to work out an approach. (Same for @Under the Covers, I did respond earlier, but got no sample.) ...
I've been really overloaded at work recently and way too exhausted in off hours to mess with pdf-to-epub conversions to show you examples. But you can pretty much take any pdf book with title/author/chapter/page headers and/or footers, do a Calibre conversion to epub, and see the problem. Calibre also gives you the opportunity to view the code in pdf or in epub format, as well, in the Search & Replace dialog.

Sorry for bringing up the problem but being unable to participate, other than a quick forum posts check ...
Old 07-12-2011, 08:22 AM   #72
burbleburble
Connoisseur
Posts: 52
Karma: 38
Join Date: Jun 2011
Device: Kindle 3
Something has come up in my personal life, and for the next month or two I don't see myself having very much time to work on this plugin. I will try to get back into it afterwards.
In the meantime, here is the most recent version of the plugin. As I am leaving off in-between versions, it is not really fully functional. If anyone else wishes to take it up (for the time being at least) and make improvements/write their own version, you are more than welcome.
Attached Files
File Type: zip plugin 0.0.8.zip (145.5 KB, 261 views)
Old 07-14-2011, 07:11 AM   #73
burbleburble
Connoisseur
Posts: 52
Karma: 38
Join Date: Jun 2011
Device: Kindle 3
Whew! I take that last post back. I got the time again, and am back on track with this plugin. Hope to have this update functional soon. Am currently working on cleaner code generation for the wysiwyg editor. (For some reason webkit's deletion of multiple paragraphs can be awfully slow, while my hard-coded approach takes no noticeable time. Go figure...)

Last edited by burbleburble; 07-14-2011 at 09:06 AM.
Old 07-14-2011, 11:43 AM   #74
jackie_w
Grand Sorcerer
Posts: 6,212
Karma: 16534894
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
Glad things have worked out for you
Old 07-19-2011, 01:21 AM   #75
therealjoeblow
Zealot
Posts: 106
Karma: 52102
Join Date: Jun 2010
Device: Samsung Android Tablet w/Moon+ Pro Reader
Quote:
Originally Posted by burbleburble View Post
Ebook Cleaner

About:
Many ebooks have messy and inconsistent formatting.
  • <snip>
  • Broken paragraphs/sentences, missing punctuation...


Plans:
  • <snip>
  • a spell checker using heuristics to avoid wasting time on names and places created for that book
  • a punctuation checker finding broken paragraphs/sentences/punctuation - (the ones guaranteed to need your attention, not every possible grammar...)
I am *REALLY* looking forward to trying this out when the features noted above are working. Personally, I could care less about the rest of the features as I've figured out how to manually fix most of them relatively easily with notepad++, but the punctuation, broken paragraphs and general spelling mistakes from bad OCR are killing me!
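As a guess at what a broken-paragraph check could look like (a sketch only, and certainly not the plugin's planned implementation): flag a paragraph when it does not end in terminal punctuation and the next one starts with a lowercase letter, then offer to join the two. The names and sample text below are made up.

```python
import re

# Terminal punctuation, optionally followed by closing quotes or brackets.
TERMINAL = re.compile(r'[.!?:;]["\')\]]*$')


def find_broken_paragraphs(paragraphs):
    """Return indexes of paragraphs that look cut off mid-sentence.

    Heuristic: the paragraph lacks terminal punctuation AND the next one
    starts with a lowercase letter.
    """
    suspects = []
    for i in range(len(paragraphs) - 1):
        current = paragraphs[i].rstrip()
        following = paragraphs[i + 1].lstrip()
        if not current or not following:
            continue
        if not TERMINAL.search(current) and following[0].islower():
            suspects.append(i)
    return suspects


def join_broken(paragraphs):
    """Merge each suspect paragraph with the one that follows it."""
    suspects = set(find_broken_paragraphs(paragraphs))
    merged, carry = [], None
    for i, para in enumerate(paragraphs):
        text = para.strip() if carry is None else carry + " " + para.strip()
        if i in suspects:
            carry = text          # hold it until the sentence is complete
        else:
            merged.append(text)
            carry = None
    return merged


# Hypothetical OCR output with one paragraph split mid-sentence.
paras = ["It was a dark and stormy",
         "night when the scan went wrong.",
         "Chapter Two",
         'The next morning he said "Go."']
suspects = find_broken_paragraphs(paras)
fixed = join_broken(paras)
```

Requiring both conditions keeps titles such as "Chapter Two" from being glued onto the text that follows, which fits the "only the ones guaranteed to need attention" goal better than flagging every unpunctuated line.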

Cheers,
The REAL Joe