MobileRead Forums > E-Book Software > Sigil
Old 10-24-2014, 03:46 PM   #16
Notjohn
mostly an observer
Posts: 1,518
Karma: 987654
Join Date: Dec 2012
Device: Kindle
If Write can save in *.doc format, you can still try word2cleanhtml dot com online.
Old 10-24-2014, 04:02 PM   #17
kbanelas
Member
Posts: 13
Karma: 1000
Join Date: Oct 2014
Device: bq cervantes
Hi Doitsu!!
You have told me what I was afraid of.
It's a very simple script that removes all HTML tags except bold, italic, underline and lists.
But now I understand how the plugins work.
Thanks, I will look at DiapDealer's Smarten Punctuation plugin.
Regards

edit:

Hey Notjohn!!

Now it's a matter of pride.

Creating is fantastic, awesome, and the happiness of creating something that works is sublime.

Cheers!!

Last edited by kbanelas; 10-24-2014 at 04:12 PM.
Old 10-24-2014, 04:21 PM   #18
Doitsu
Grand Sorcerer
Posts: 5,680
Karma: 23983815
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by kbanelas View Post
You have told me what I was afraid of.
It's a very simple script that removes all HTML tags except bold, italic, underline and lists.
You don't need a plugin for that, you can clean up a file by using several regular expression searches one after another.

For example:

You could use the following search and replace expressions in a search group:

Find: <p[^>]+>
Replace: <p>

Replaces all <p class="xx"> with <p>

Find: </*span[^>]*>
Replace: (leave empty)

Deletes all <span> tags.
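
The same two find/replace steps can also be tried outside Sigil. Below is a minimal Python sketch, assuming the standard `re` module; the sample HTML and the name `clean_html` are made up for illustration. One caveat: `<p[^>]+>` as written would also match tags such as `<pre>`, so the sketch uses a slightly stricter variant (`<p\s[^>]*>`), which is an assumption and not the pattern from the post.

```python
import re

# Ordered (pattern, replacement) pairs, applied one after another
# like the entries of a search group.
SEARCH_GROUP = [
    # <p class="xx"> -> <p>; the \s keeps tags like <pre ...> from matching
    (re.compile(r"<p\s[^>]*>"), "<p>"),
    # pattern from the post: removes opening and closing <span> tags
    (re.compile(r"</*span[^>]*>"), ""),
]


def clean_html(html):
    """Apply every search/replace pair in order and return the result."""
    for pattern, replacement in SEARCH_GROUP:
        html = pattern.sub(replacement, html)
    return html


sample = '<p class="xx">Some <span style="color:red">red</span> text</p><pre class="c">x</pre>'
print(clean_html(sample))
# prints: <p>Some red text</p><pre class="c">x</pre>
```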

For more examples see the Regular Expression thread.
Old 10-25-2014, 04:43 PM   #19
blackest
Connoisseur
Posts: 67
Karma: 10
Join Date: Sep 2014
Device: sony prs 2
http://www.cssout.com/

Might be useful: HTML in, HTML plus a separate CSS file out. Anyone fancy trying it?
Old 10-26-2014, 05:10 PM   #20
kbanelas
Member
Posts: 13
Karma: 1000
Join Date: Oct 2014
Device: bq cervantes
Hello guys!!!

Doitsu, I'm sorry, but what you're suggesting is drudgery. I'd rather spend one day writing a little script than use several regular expressions for the rest of my life.

blackest, I visited the link, and it's interesting, but the idea could be improved, at least on that website. Perhaps another website works better.

Regards!!
Old 10-26-2014, 06:07 PM   #21
Doitsu
Grand Sorcerer
Posts: 5,680
Karma: 23983815
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by kbanelas View Post
Doitsu, I'm sorry, but what you're suggesting is drudgery. I'd rather spend one day writing a little script than use several regular expressions for the rest of my life.
Actually, you can use regular expressions in scripts also. They might look rather intimidating at first, but once you learn the basics they can save you hours of manual work.
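
As a rough illustration of a regular expression inside a script, the sketch below strips every tag except a small whitelist (bold, italic, underline, lists and headings, the set mentioned earlier in the thread). It is not the smoothRemove code; the `KEEP` set and the name `strip_tags` are assumptions, and a regex-based approach like this ignores edge cases such as comments or a `>` inside an attribute value.

```python
import re

# Tags to preserve (assumed whitelist; adjust as needed).
KEEP = {"b", "i", "u", "ul", "ol", "li", "h1", "h2", "h3", "h4", "h5", "h6"}

# Matches an opening or closing tag and captures the tag name.
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)\b[^>]*>")


def strip_tags(html):
    """Remove all tags not in KEEP; kept tags lose their attributes."""
    def replace(match):
        slash, name = match.group(1), match.group(2).lower()
        if name in KEEP:
            return "<{}{}>".format(slash, name)
        return ""
    return TAG_RE.sub(replace, html)


print(strip_tags('<div><p class="x">An <i class="y">old</i> <span>book</span></p></div>'))
# prints: An <i>old</i> book
```

Applied to a whole XHTML file, this would also remove structural tags such as `<html>` and `<body>`, so it is only meant to show the technique.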

However, as many other posters have repeatedly suggested, you might save even more time by resaving your original file as a .docx or .odt file and exporting it as a "filtered .html file."

BTW, Calibre can convert a .docx file to a halfway decent epub with very little clutter.
Old 10-31-2014, 01:46 PM   #22
kbanelas
Member
Posts: 13
Karma: 1000
Join Date: Oct 2014
Device: bq cervantes
Hello!!

I know I may have been stubborn in wanting to write this plugin, but now the work is done. I'm uploading my little script, smoothRemove, for everybody.
Version 0.1 of the plugin removes all formatting while preserving the italic, underline, bold, list and heading tags.
Remember that this is a special kind of formatting removal: all other tags will be removed.

ATTENTION: Don't use this plugin on the cover. If your book has one, don't run smoothRemove on all files; select only the files you are interested in.

See you soon
Attached Files
File Type: zip smoothRemove.zip (1.8 KB, 148 views)
Old 10-31-2014, 02:12 PM   #23
KevinH
Sigil Developer
Posts: 8,478
Karma: 5703586
Join Date: Nov 2009
Device: many
Hi,

You seem to be working only on a set of files without an opf (or with an opf that has no entries).

The calls you make to read and write the files are only correct for files ***not*** part of the manifest in the opf.

Code:
        def applySelect(self):

                if self.flagAll.get() == True:
                        for (ID, href) in self.bk.text_iter():
                                html = self.bk.readotherfile(ID)
                                r = Remove(html)
                                html_new = r.get_html()
                                self.bk.writeotherfile(ID,html_new)
If your html files are actually included in the manifest of the opf, the correct way to access these files is via the calls set up to do just that:

Code:
    def readfile(self, id):
        # returns the contents of the file with manifest id  (text files are utf-8 encoded)
        return self._w.readfile(id)

    def writefile(self, id, data):
        # writes data to a currently existing file pointed to by the manifest id
        self._w.writefile(id, data)
whereas you used code to read and write files that are **not** part of the manifest.

Code:
# reading / writing / adding / deleting other ebook files that DO NOT exist in the opf manifest

    def readotherfile(self, book_href):
        # returns the contents of the file pointed to by the ebook href
        return self._w.readotherfile(book_href)

    def writeotherfile(self, book_href, data):
        # writes data to a currently existing file pointed to by the ebook href
        self._w.writeotherfile(book_href, data)
The ONLY way this could work is if your opf file is broken, non-existent, or not being parsed properly.

So if that code truly works for you, and if the files you edit are actually part of the manifest in the opf, then something is very broken that needs to be fixed. Could you please post just the opf file for a book this code works on, so that I can track down why the opf is not being parsed properly and get it fixed in the next release of Sigil?

Thanks,

KevinH

Last edited by KevinH; 10-31-2014 at 02:15 PM.
Old 10-31-2014, 03:05 PM   #24
Doitsu
Grand Sorcerer
Posts: 5,680
Karma: 23983815
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by KevinH View Post
The ONLY way this could work is if your opf file is broken, non-existent, or not being parsed properly.
Actually, his plugin works just fine with a regular epub with files manifested in the .opf. I haven't got the faintest idea why, but it does.

@kbanelas: Congratulations on your first working plugin!
Old 10-31-2014, 03:16 PM   #25
KevinH
Sigil Developer
Posts: 8,478
Karma: 5703586
Join Date: Nov 2009
Device: many
Quote:
Originally Posted by Doitsu View Post
Actually, his plugin works just fine with a regular epub with files manifested in the .opf. I haven't got the faintest idea why, but it does.

@kbanelas: Congratulations on your first working plugin!
Hi,
Wow! I certainly didn't design it that way. Somehow the manifest ids must be matching the book href, or ... something else is going on. Typically manifest ids do not match hrefs.

I will look into this some more. If you change his calls to use the correct way, does it still work?

KevinH
Old 10-31-2014, 05:02 PM   #26
kbanelas
Member
Posts: 13
Karma: 1000
Join Date: Oct 2014
Device: bq cervantes
Hi!!

Quote:
Originally Posted by KevinH View Post

So if that code truly works for you, and if the files you edit are actually part of the manifest in the opf, then something is very broken that needs to be fixed. Could you please post just the opf file for a book this code works on, so that I can track down why the opf is not being parsed properly and get it fixed in the next release of Sigil?

Thanks,

KevinH
Thanks to you.

This code works fine with all the ebooks I have. Could the error be somewhere else?
I'm at your disposal.

Quote:
Originally Posted by Doitsu View Post
@kbanelas: Congratulations on your first working plugin!
Thanks, but the plugin works by chance. Or so it seems...
This plugin worked fine from the start, so I didn't ask any questions.
Anyhow, I'll look at the code again, though not today; tomorrow.
Thanks again.
Cheers!!
Old 10-31-2014, 05:27 PM   #27
KevinH
Sigil Developer
Posts: 8,478
Karma: 5703586
Join Date: Nov 2009
Device: many
Hi,
It will work the way you have it now, but it will stop working in the future. The only reason it works now is that I recently changed the code to use a dictionary to map both manifest ids and book hrefs to filepaths. This change allowed bk.writeotherfile and bk.readotherfile to work by accident if you pass in a valid manifest id in place of an href from the book root. The old code did not use this. The next version will change this back to the old way of doing things.
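
To make the accident concrete, here is a toy model of that kind of lookup. It is not Sigil's actual code; the class, the id, the href and the path prefix are all invented. When a single dictionary is keyed by both manifest ids and book hrefs, a lookup intended for hrefs also succeeds for ids:

```python
# Toy model only: names and data are hypothetical, not taken from Sigil.
class ToyContainer:
    def __init__(self, manifest):
        # manifest: list of (manifest_id, book_href) pairs
        self.paths = {}
        for manifest_id, href in manifest:
            filepath = "/tmp/ebook/" + href
            self.paths[manifest_id] = filepath  # keyed by manifest id
            self.paths[href] = filepath         # and also by book href

    def path_for_other_file(self, book_href):
        # Meant to receive an href, but any key in the dict resolves.
        return self.paths[book_href]


toy = ToyContainer([("ch1", "OEBPS/Text/ch1.xhtml")])
print(toy.path_for_other_file("OEBPS/Text/ch1.xhtml"))
print(toy.path_for_other_file("ch1"))  # an id "works" by accident
```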

So simply replace your use of bk.writeotherfile with bk.writefile and bk.readotherfile with bk.readfile, and your plugin should continue to work as expected even after the next release of Sigil.
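
A sketch of what the corrected loop might look like, based on the `applySelect` excerpt quoted earlier in the thread. `bk.text_iter()`, `bk.readfile()` and `bk.writefile()` are the calls named in this thread; the `FakeBook` stand-in, the `clean_all_text_files` helper and the `transform` callback are hypothetical and exist only so the snippet runs outside Sigil.

```python
def clean_all_text_files(bk, transform):
    """Run transform() over every manifested text file in the book."""
    for manifest_id, href in bk.text_iter():
        html = bk.readfile(manifest_id)               # read by manifest id
        bk.writefile(manifest_id, transform(html))    # write back by manifest id


class FakeBook:
    """Hypothetical stand-in for Sigil's book container, for testing only."""
    def __init__(self, files):
        self.files = dict(files)  # manifest id -> (href, contents)

    def text_iter(self):
        # Iterate over a snapshot so writes during the loop are safe.
        for manifest_id, (href, _contents) in list(self.files.items()):
            yield manifest_id, href

    def readfile(self, manifest_id):
        return self.files[manifest_id][1]

    def writefile(self, manifest_id, data):
        href = self.files[manifest_id][0]
        self.files[manifest_id] = (href, data)


book = FakeBook({"ch1": ("Text/ch1.xhtml", "<p>Hello</p>")})
clean_all_text_files(book, str.upper)
print(book.readfile("ch1"))
# prints: <P>HELLO</P>
```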

Thanks,

KevinH
Old 11-01-2014, 05:22 AM   #28
kbanelas
Member
Posts: 13
Karma: 1000
Join Date: Oct 2014
Device: bq cervantes
Hi!
I have fixed the code!!
I think everything is right, but if you could take a look at it, I would be grateful.
Cheers!!
Attached Files
File Type: zip smoothRemove.zip (1.9 KB, 127 views)
Old 11-01-2014, 09:42 AM   #29
KevinH
Sigil Developer
Posts: 8,478
Karma: 5703586
Join Date: Nov 2009
Device: many
Hi kbanelas,

The use of the edit bookcontainer calls looks exactly right. Very nice job indeed!

If you like, you could rename your zip file to include the version information. In other words, you could rename smoothRemove.zip to smoothRemove_v010.zip (no other changes needed, i.e. it should still unpack into a smoothRemove folder).
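
One possible way to build such a versioned archive from a script, assuming the plugin lives in a folder named `smoothRemove`. The helper names are hypothetical, and the `v010` tag format (version digits with the dots removed) is an assumption drawn from the example file name above.

```python
import os
import zipfile


def versioned_zip_name(plugin, version):
    """'smoothRemove', '0.1.0' -> 'smoothRemove_v010.zip'"""
    return "{}_v{}.zip".format(plugin, version.replace(".", ""))


def zip_plugin(plugin_dir, version, out_dir="."):
    """Zip plugin_dir so that it unpacks into a folder of the same name."""
    plugin = os.path.basename(os.path.normpath(plugin_dir))
    parent = os.path.dirname(os.path.abspath(plugin_dir))
    zip_path = os.path.join(out_dir, versioned_zip_name(plugin, version))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(plugin_dir):
            for name in files:
                full = os.path.join(root, name)
                # Store paths relative to the parent, e.g. smoothRemove/plugin.py
                zf.write(full, os.path.relpath(full, parent))
    return zip_path


print(versioned_zip_name("smoothRemove", "0.1.0"))
# prints: smoothRemove_v010.zip
```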

That way if you wanted to change it or add new features in the future, people would not be confused as to which version is newer.

Take care,

KevinH
All times are GMT -4.