07-19-2011, 08:57 AM | #76 |
Connoisseur
Posts: 52
Karma: 38
Join Date: Jun 2011
Device: Kindle 3
|
To those who have posted in the last few days: Thanks for the feedback and interest. Though I have not responded to all the specifics, I am keeping them in mind. Please bug me again if I don't manage to deal with them in the next functional update.
@davidfor I worked out a 'QSplitter', which should allow easy manual resizing of the display objects in the next update, so I removed the setMinimum line of code. It will also have a maximize/resize button in the upper right corner.
@Kovid, anyone: I have begun working on a search and replace / punctuation checker, but I can't figure out how to easily match across paragraphs, etc. Does JavaScript regex provide a way to easily match across tags, especially paragraph tags, and to include them in an expression? I googled it but came up with nothing helpful.
Last edited by burbleburble; 07-19-2011 at 09:34 AM. |
07-19-2011, 10:01 AM | #77 |
Connoisseur
Posts: 52
Karma: 38
Join Date: Jun 2011
Device: Kindle 3
|
@Kovid (New post in case you already looked at the last one, as I see you are logged on.) I am having strange issues when running the plugin in calibre (it works fine in PyScripter). I recoded the delete, backspace, and enter key responses for webkit. In calibre, delete won't remove the next character. Not only that, for all of these keys, when dealing with a fairly long paragraph, it will insert some very weird characters elsewhere in the paragraph. Please, if you have any suggestions as to the issue, or could take a look, I would much appreciate it. I attached the current version (the one that is having the issues) below. The actual code is in main.Text.Browser under keypress() etc. |
07-19-2011, 10:12 AM | #78 |
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Sorry, I am swamped at the moment.
|
07-19-2011, 02:49 PM | #79 |
Sigil & calibre developer
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
Admittedly I did not read through this entire thread, but this is news to me. The non-CSS translation doesn't support all styles, but all of the other output versions should support all HTML tags and styles. All HTMLZ does is translate links and put multiple HTML files together. The only things I can think of that would be lost are per-page background colors / images. Which ones are you having issues with?
|
07-19-2011, 03:04 PM | #80 |
Connoisseur
Posts: 52
Karma: 38
Join Date: Jun 2011
Device: Kindle 3
|
@user_none
I converted the 'The Princess and the Goblin' epub (by George MacDonald, from gutenberg.org, for copyright-free testing purposes) to htmlz. If I recall correctly, the epub had a list tag system for the table of contents, and this was converted to spans alone. I will try to double check this tomorrow.
@anyone, everyone: If anyone can please answer the question posted in #76, it is rather important for implementing a good search and replace.
Last edited by burbleburble; 07-19-2011 at 03:07 PM. |
07-19-2011, 09:19 PM | #81 | |
Wannabe Connoisseur
Posts: 425
Karma: 2516674
Join Date: Apr 2011
Location: Geelong, Australia
Device: Kobo Libra 2, Kobo Aura 2, Sony PRS-T1, Sony PRS-350, Palm TX
|
Quote:
Note that I know regexes in general, but absolutely nothing about JavaScript. Cheers, Simon. |
|
07-20-2011, 03:47 AM | #82 |
Connoisseur
Posts: 52
Karma: 38
Join Date: Jun 2011
Device: Kindle 3
|
Examples:
Code:
<p>This is an example</p><p>of a broken paragraph</p>
Code:
<p><span>Bob <span> went to</span> the <span> market</span>. </span></p>
I've gotta assume someone's figured these issues out. I can't imagine how it hasn't come up before in the world of programming! But I googled and searched and couldn't find answers... Thanks for any help! |
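One possible way to search the second example as if the inline tags were not there is to build a tag-free copy of the text together with a map back to the original offsets, run the regex on the copy, and translate each match back. The sketch below is only an illustration in Python, not code from the plugin: the function names are hypothetical, and the simple tag pattern assumes well-formed markup with no `>` inside attribute values and no entity decoding.

```python
import re

# Assumption: tags are simple "<...>" runs; attribute values containing ">"
# and HTML entities are not handled in this sketch.
TAG = re.compile(r"<[^>]+>")

def visible_text_with_offsets(html):
    """Return (text, offsets); offsets[i] is the index in html of text[i]."""
    chars, offsets = [], []
    pos = 0
    for tag in TAG.finditer(html):
        for i in range(pos, tag.start()):
            chars.append(html[i])
            offsets.append(i)
        pos = tag.end()
    for i in range(pos, len(html)):
        chars.append(html[i])
        offsets.append(i)
    return "".join(chars), offsets

def find_across_tags(pattern, html):
    """Search the tag-free text; report each match with its span in html."""
    text, offsets = visible_text_with_offsets(html)
    results = []
    for m in re.finditer(pattern, text):
        if m.start() == m.end():
            continue  # skip empty matches, which have no source span
        start = offsets[m.start()]
        end = offsets[m.end() - 1] + 1
        results.append((m.group(0), start, end))
    return results

html = "<p><span>Bob <span> went to</span> the <span> market</span>. </span></p>"
hits = find_across_tags(r"went\s+to\s+the\s+market", html)
```

The returned span covers everything between the first and last matched character in the source, including any tags that sit in between, so a replacement still has to decide what to do with those tags.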
07-20-2011, 04:23 AM | #83 | |
Grand Sorcerer
Posts: 11,742
Karma: 6997045
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
Quote:
In some regexp systems you can use back references to match what you matched before, but in limited ways such as fixed counts. In others you must use programmatic matching (in effect, recursive regexps) to dynamically modify the regexp. This latter scheme can be used to solve the palindrome problem. It is, however, very system dependent and rather complicated.
If you have a limited number of cases, it would probably be easier to code these using string functions instead of regexps. That way you can handle both arbitrary numbers of matches and the necessary recursion.
What does javascript have to do with this? Are you really running javascript inside a calibre (python) plugin? |
|
07-20-2011, 05:52 AM | #84 |
Connoisseur
Posts: 52
Karma: 38
Join Date: Jun 2011
Device: Kindle 3
|
Well, by general regex I figured as much, but thanks for the confirmation.
As for why I am using javascript - the ebook editor uses webkit, which has limited python bindings. Javascript handles most tasks involving the internal dom/cursor/editing much more easily (if not more clearly) and faster. I hoped that since js is dedicated to the html dom, it must have some regex search method that can take the tags/structure into account. |
07-20-2011, 05:47 PM | #85 | |
Fanatic
Posts: 527
Karma: 470
Join Date: Sep 2007
Location: The Netherlands
Device: Kindle Oasis
|
Hi, I'm not sure if it is helpful to you, but when I use, for example, Word to find 'extra' paragraph breaks, I look for a ^p without the following characters in front of it
Quote:
Of course you can't find characters that are not there, so I turn it around. I look for a ^p with one of those characters in front of it and change it to that character followed by <<PAR>>, a string you probably won't find in a text. This marks the 'real' paragraphs. I'm not sure if you can do that in a one-step regex, but you can always do it in four separate ones.
Then I change all the ^p that are left to a space. Those are the ones that aren't at the end of a sentence. In the next step I replace all the <<PAR>> with ^p.
This process effectively removes all paragraph breaks that do not come at the end of a sentence and leaves the ones that do. You probably want to first replace every ^p that has a space in front of it with a single ^p, because sometimes there is a space between the end-of-sentence character and the ^p.
It is not a perfect process, but it will catch at least 95% of your problems.
Last edited by Ortep; 07-20-2011 at 06:00 PM. |
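The mark, join, and restore steps described above could be sketched in Python roughly as follows. This is an illustration only: a newline stands in for Word's ^p, the function name is hypothetical, and the set of sentence-ending characters is an assumption, since the quoted list of characters is not preserved in the post.

```python
import re

# Assumption: the actual character list from the post is not available;
# these four are placeholders chosen for illustration.
SENTENCE_END = '.!?"'
MARK = "<<PAR>>"  # marker string from the post, unlikely to occur in a text

def join_broken_paragraphs(text, par="\n"):
    """Remove paragraph breaks that do not follow a sentence-ending character."""
    p = re.escape(par)
    # Preliminary step: drop spaces sitting directly in front of a break.
    text = re.sub(" +" + p, lambda m: par, text)
    # Mark the 'real' breaks: those preceded by a sentence-ending character.
    ends = "[" + re.escape(SENTENCE_END) + "]"
    text = re.sub("(" + ends + ")" + p, lambda m: m.group(1) + MARK, text)
    # Every break still left is mid-sentence: turn it into a space.
    text = text.replace(par, " ")
    # Put the real breaks back.
    return text.replace(MARK, par)

sample = "This is an example \nof a broken paragraph.\nA second one!\nEnd"
cleaned = join_broken_paragraphs(sample)
```

A character class lets the marking step run as a single substitution rather than one pass per character; like the manual procedure, it will misjudge breaks after abbreviations or unpunctuated headings.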
|
07-21-2011, 10:35 AM | #86 |
Connoisseur
Posts: 52
Karma: 38
Join Date: Jun 2011
Device: Kindle 3
|
Thanks. That's a neat approach.
After giving it some thought, I decided it might be more robust to first replace the 'p' tags with, say, a null byte or '\n', and record the original tags and their positions in a list of tuples. Then: search, mark the results with some tags, put the original p tags back, and proceed from there. |
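A minimal sketch of that idea in Python, assuming '\n' as the placeholder and a list of (position, tag) tuples; the function names are hypothetical and this is not the plugin's actual code. Restoring works as written only while the flattened text keeps its length; once a replacement changes the length, the recorded positions after that point would need to be shifted.

```python
import re

# Matches <p>, <p class="...">, and </p>, but not <pre> or <param>.
P_TAG = re.compile(r"</?p\b[^>]*>")

def strip_p_tags(html, placeholder="\n"):
    """Swap each p tag for placeholder; return (flat, tags).

    tags is a list of (position_in_flat, original_tag) tuples.
    """
    parts, tags = [], []
    pos = 0
    flat_len = 0
    for m in P_TAG.finditer(html):
        chunk = html[pos:m.start()]
        parts.append(chunk)
        flat_len += len(chunk)
        tags.append((flat_len, m.group(0)))
        parts.append(placeholder)
        flat_len += len(placeholder)
        pos = m.end()
    parts.append(html[pos:])
    return "".join(parts), tags

def restore_p_tags(flat, tags, placeholder="\n"):
    """Put the recorded tags back at their recorded positions."""
    out = []
    pos = 0
    for position, tag in tags:
        out.append(flat[pos:position])
        out.append(tag)
        pos = position + len(placeholder)
    out.append(flat[pos:])
    return "".join(out)

html = "<p>This is an example</p><p>of a broken paragraph</p>"
flat, tags = strip_p_tags(html)
match = re.search(r"example\s+of a", flat)  # now matches across the break
```

Because positions are recorded rather than inferred, newlines already present in the source are not mistaken for paragraph tags on the way back.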
08-26-2011, 02:44 AM | #87 |
Connoisseur
Posts: 52
Karma: 38
Join Date: Jun 2011
Device: Kindle 3
|
It's been a while, but I'm back.
Um, I've written a fully functional program, and have used it myself for more than 100 books. Problem is, I gave up on working out the kinks when interfacing with Calibre's python 2.7 (mainly unicode vs ascii issues), since I wrote it originally in 3.2. So, it's currently a standalone program.
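The post does not say which unicode vs ascii issues came up. One common pattern for code that has to run on both Python 2.7 and 3.x is to decode bytes to text once at the boundary and encode again only on output. The helpers below are a hypothetical sketch of that pattern, with UTF-8 as an assumed encoding, not a description of ECleaner's code.

```python
def to_text(value, encoding="utf-8"):
    """Return value as a text string (unicode on 2.7, str on 3.x)."""
    # On Python 2.7, bytes is an alias of str, so plain byte strings are decoded.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

def to_bytes(value, encoding="utf-8"):
    """Return value as a byte string, encoding text if needed."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)

raw = b"caf\xc3\xa9"   # UTF-8 bytes as they might arrive from a file
text = to_text(raw)    # work on text internally
back = to_bytes(text)  # encode again only when writing out
```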
|
09-02-2011, 11:58 AM | #88 |
Fanatic
Posts: 527
Karma: 470
Join Date: Sep 2007
Location: The Netherlands
Device: Kindle Oasis
|
Maybe just another stupid idea, but can you use the 'open with' plugin to open a file with your program? Not completely integrated, but you can launch from within Calibre.
I'd like to test it |
09-14-2011, 07:15 AM | #89 |
Connoisseur
Posts: 52
Karma: 38
Join Date: Jun 2011
Device: Kindle 3
|
Sorry about the delay.
Ortep: Sounds okay. What with school and SATs, I don't have time just now to learn how to integrate it with the 'Open With' plugin, plus it still wouldn't open ebooks directly from calibre without some work. (I really would like to port it to Python 2.7 to reintegrate it with calibre, but testing all the unicode conversions will take a lot of time: it primarily manipulates unicode text, and py2.7 and py3+ differ greatly in this area.)
Meanwhile [I hope this is okay to post here, as I do look forward to re-integrating it, as it was before, and you asked to test its current state]: Because it's an independent package running off Python 3 + PyQt4 + lxml, it's a 13 MB rar package. I have uploaded it to megaupload:
Note: This program is currently optimized/designed for a large screen! It will appear cluttered and probably be awkward to use on a small screen!
ECleaner v1.0.6 Program Spoiler:
ECleaner v1.0.6 Instructions Spoiler:
|
09-24-2011, 10:40 AM | #90 |
Fanatic
Posts: 527
Karma: 470
Join Date: Sep 2007
Location: The Netherlands
Device: Kindle Oasis
|
Hi, I was busy for a while and had no time to check everything. Last night I started playing with the cleaner. Your program looks great, and I am able to start it from within Calibre using the 'Open With' plugin. The only thing I could not do was open the HTMLZ automatically. I'll keep on playing.
|
|