MobileRead Forums > E-Book Formats > Workshop

Old 08-17-2015, 08:22 AM   #1
Nick_1964
Bookworm
Posts: 975
Karma: 768585
Join Date: Aug 2010
Location: Netherlands
Device: Sony prs-650, Kobo Glo HD (2x), Kobo Glo
Is there Epub cleaning software (so much unneeded code inside)

Dear friends,
When I opened a commercial ebook that I bought (social DRM), I noticed that the book is really slow.
So I looked at the code, and it was terrible. There is no way I can clean this by hand. Is there software (free, I hope, because so far it is for only one book) or an online tool that can clean it for me?
See the spoiler for an example.
In reality, these are only a couple of lines from the book.
Spoiler:

<p class="dlct-000"><span class="dlct-007">'Ik weet het,</span><span class="dlct-007">'</span> <span class="dlct-007">zei Ro zacht. 'Ik heb alles gezien</span><span class="dlct-007">. I</span><span class="dlct-007">k</span> <span class="dlct-007">zat onder de tafel verstopt</span><span class="dlct-007">.</span></p>

<p class="dlct-000"><span class="dlct-007">Elena keek hem</span> <span class="dlct-007">verbijsterd</span> <span class="dlct-007">aan.</span></p>

<p class="dlct-000"><span class="dlct-007">'Mijn moeder zei dat ze mij een afkoelperiode wilde geven en dat ik dan wel weer redelijk zou worden. Ze stopte mij in een kamer</span><span class="dlct-007">,</span> <span class="dlct-007">maar ik kon ontsnappen</span><span class="dlct-007">.</span> <span class="dlct-007">Joxy</span><span class="dlct-007">,</span> <span class="dlct-007">mijn privé</span><span class="dlct-007">-</span> <span class="dlct-007">bediende heb ik kunnen overhalen en z</span><span class="dlct-007">ij</span> <span class="dlct-007">verstopte mij onder de trolley die ze naar de grote zaal bracht. Toen ik zag dat daar van alles stond te gebeuren besloot ik mij</span> <span class="dlct-007">onder tafel</span> <span class="dlct-007">te ver</span><span class="dlct-007">bergen</span><span class="dlct-007">. Ik heb alles gezien en nu kom ik je halen. We moeten weg hier. We moeten William en Charlotte bevrijden. Dat</span><span class="dlct-007">… wezen</span> <span class="dlct-007">is mijn moeder niet meer!'</span></p>

<p class="dlct-000"><span class="dlct-007">Elena knikte en ze schreef</span> <span class="dlct-007">opnieuw iets op het</span> <span class="dlct-007">papier.</span> <span class="dlct-007">'Wat wil je doen? Samis heeft nu de twee boeken. Alles en iedereen is hier aangepast, gekruist of gemaakt</span><span class="dlct-007">.</span> <span class="dlct-007">Greysdale is bewoond door zombies, Utopalta is half vergaan en Willowjinx is weg. Waar moeten we naartoe? Ik kan niets zo!' Elena</span> <span class="dlct-007">wees naar haar</span> <span class="dlct-007">keel.</span></p>

<p class="dlct-000"><span class="dlct-007">'Shit- ja!'</span> <span class="dlct-007">zei Ro net iets te hard</span><span class="dlct-007">. M</span><span class="dlct-007">eteen sloeg Elena een hand voor zijn mond.</span></p>

<p class="dlct-000"><span class="dlct-007">Ro haalde haar hand weg</span><span class="dlct-007">.</span> <span class="dlct-007">'Sorry,'</span> <span class="dlct-007">fluisterde hij</span><span class="dlct-007">. 'Ik weet het niet</span><span class="dlct-007">,</span> <span class="dlct-007">maar we moeten iets doen.'</span></p>

<p class="dlct-000"><span class="dlct-007">Met tegenzin schudde Elena haar hoofd. Ze pakte het papiertje en</span> <span class="dlct-007">schreef verder</span><span class="dlct-007">.</span> <span class="dlct-007">'We hebben alleen het huis in het bos over. Misschien dat de professor ons kan helpen. Aan de Codex Trias hebben we niets. Die is veilig bij Bia en Luxis. Hier</span> <span class="dlct-007">kunnen we niet</span> <span class="dlct-007">blijven.</span> <span class="dlct-007">We</span> <span class="dlct-007">moeten</span> <span class="dlct-007">William en Charlotte</span> <span class="dlct-007">bevrijden</span> <span class="dlct-007">en gaan dan naar het bos.'</span></p>

<p class="dlct-000"><span class="dlct-007">'</span><span class="dlct-007">Goed plan</span><span class="dlct-007">,' fluisterde Ro</span><span class="dlct-007">.</span><span class="dlct-007">&nbsp;</span></p>

<p class="dlct-000"><span class="dlct-007">Elena pakte het papier en gooide het in de open haard. Ro</span> <span class="dlct-007">nam haar bij de</span> <span class="dlct-007">hand en trok haar door de deur waar hij vandaa</span><span class="dlct-007">n was gekomen</span><span class="dlct-007">. Ze kwam in een gelijksoortige kamer als die van haar alleen was het hier een stuk rommeliger.&nbsp;</span></p>
Old 08-17-2015, 08:33 AM   #2
Ripplinger
350 Hoarder
Posts: 3,574
Karma: 8281267
Join Date: Dec 2010
Location: Midwest USA
Device: Sony PRS-350, Kobo Glo & Glo HD, PW2
This looks to be a typical paragraph from your selection, and if it's the same throughout the book, I think you could clean it by hand pretty easily using Sigil's Find and Replace:
Quote:
<p class="dlct-000"><span class="dlct-007">Ro haalde haar hand weg</span><span class="dlct-007">.</span> <span class="dlct-007">'Sorry,'</span> <span class="dlct-007">fluisterde hij</span><span class="dlct-007">. 'Ik weet het niet</span><span class="dlct-007">,</span> <span class="dlct-007">maar we moeten iets doen.'</span></p>
I'd search for <p class="dlct-000"> and replace all instances of it with <p class="calibre"> (or whatever class name you decide to use and set up in the stylesheet).

Then search for <span class="dlct-007"> and leave the replace box blank so it deletes all instances of it. Note the number of instances it finds here for the next step.

Then search for </span> and do the same thing: leave the replace box blank so they'll all be deleted. Check that the number of instances found matches the previous step; if it doesn't, there are other span classes somewhere, some of which you might want to keep. You can find those after this step is done by just searching for "span" and seeing what comes up.

If there are a lot of other instances of "dlct-###" with various numbers, you could use regex the same way and get them all.

Then just let Sigil clean up the stylesheet to delete any unused styles and try it.

I've never found any automated software that can make the proper decisions about what stays and what should go.
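For readers who would rather script these steps than click through them, here is a minimal Python sketch of the same three find-and-replace passes, including the count check. The function name `clean_by_steps` is hypothetical, and the class name "calibre" is only the placeholder used in the post above; this is an illustration under those assumptions, not a tested tool for whole books.

```python
def clean_by_steps(html: str, new_class: str = "calibre") -> tuple[str, int, int]:
    """Apply the three manual find/replace steps to one HTML string."""
    # Step 1: rename the paragraph class.
    html = html.replace('<p class="dlct-000">', f'<p class="{new_class}">')
    # Step 2: delete the opening span tags and remember how many there were.
    opens = html.count('<span class="dlct-007">')
    html = html.replace('<span class="dlct-007">', '')
    # Step 3: delete every closing span tag and count those too.
    closes = html.count('</span>')
    html = html.replace('</span>', '')
    return html, opens, closes


sample = ('<p class="dlct-000"><span class="dlct-007">Ro haalde haar hand weg</span>'
          '<span class="dlct-007">.</span></p>')
cleaned, opens, closes = clean_by_steps(sample)
# If opens != closes, spans with other classes exist and need a closer look.
print(cleaned, opens, closes)
```

If the two counts differ, the closing tags of other span classes have been deleted as well, so it is safer to run this on a copy of the file.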
Old 08-17-2015, 08:47 AM   #3
Nick_1964
Bookworm
Quote:
Originally Posted by Ripplinger View Post
If there are a lot of other instances of "dlct-###" with various numbers, you could use regex the same way and get them all.

Then just let Sigil clean up the stylesheet to delete any unused styles and try it.

I've never found any automated software that can make the proper decisions about what stays and what should go.
I tried with Sigil, which I use the most, but the changing "dlct" classes are everywhere, and after removing a lot of spans I always get the warning that the HTML is not well formed and needs to be cleaned.

Because of my dyslexia I really don't understand the regex way of searching.
In the very beginning Sigil worked with wildcards like dlct*, but that isn't the case anymore.
Also, it doesn't use a separate .css file; all the rules are in the HTML:
Spoiler:

p.dlct-085 {line-height:110%; margin-bottom:3pt; margin-right:3pt; margin-top:12pt; text-align:justify}
p.dlct-060 {line-height:110%; margin-bottom:3pt; margin-right:2.35pt; margin-top:12pt; text-align:justify}
p.dlct-046 {line-height:110%; margin-bottom:3pt; margin-right:0pt; margin-top:12pt; text-align:justify}
p.dlct-083 {line-height:110%; margin-bottom:3pt; margin-right:3.15pt; margin-top:12pt; text-align:justify}
p.dlct-091 {font-size:1.27em; line-height:110%; margin-bottom:3pt; margin-right:0pt; margin-top:12pt; text-align:justify}
p.dlct-058 {line-height:110%; margin-bottom:3pt; margin-right:2.5pt; margin-top:12pt; text-align:justify}
p.null {line-height:0.8249999em; margin-bottom:0pt; text-align:center}
p.dlct-050 {line-height:110%; margin-bottom:3pt; margin-right:3.45pt; margin-top:12pt; text-align:justify}
p.dlct-035 {line-height:110%; margin-bottom:3pt; margin-right:3.1pt; margin-top:12pt; text-align:justify}
p.dlct-051 {line-height:110%; margin-bottom:3pt; margin-right:3.4pt; margin-top:12pt; text-align:justify}
p.dlct-086 {line-height:110%; margin-bottom:3pt; margin-right:2pt; margin-top:12pt; text-align:justify}
p.null {line-height:0.8249999em; margin-bottom:0pt; text-indent:11pt}
p.dlct-061 {line-height:110%; margin-bottom:3pt; margin-right:2.05pt; margin-top:12pt; text-align:justify}
p.dlct-056 {line-height:110%; margin-bottom:3pt; margin-right:2.2pt; margin-top:12pt; text-align:justify}
p.dlct-090 {font-size:1.09em; line-height:110%; margin:12pt 0pt 3pt 17.1pt; text-align:justify}

p.null {line-height:0.8249999em; margin-bottom:0pt}
p.dlct-078 {line-height:110%; margin:12pt 0pt 3pt 5.5pt; text-align:justify}
p.dlct-031 {text-align:justify}
p.dlct-055 {line-height:110%; margin-bottom:3pt; margin-right:3.05pt; margin-top:12pt; text-align:justify}
p.dlct-076 {line-height:110%; margin-bottom:3pt; margin-right:2.45pt; margin-top:12pt; text-align:justify}
p.dlct-020 {font-size:0.91em; line-height:115%; margin-bottom:0pt; text-align:justify}
p.dlct-053 {line-height:110%; margin-bottom:3pt; margin-right:2.9pt; margin-top:12pt; text-align:justify}
p.dlct-029 {font-size:1.09em; line-height:110%; margin-bottom:3pt; margin-right:2.35pt; margin-top:12pt; text-align:justify}
p.dlct-088 {line-height:110%; margin-bottom:3pt; margin-right:2.6pt; margin-top:12pt; text-align:justify}
p.null {line-height:0.9916666em; margin-bottom:0pt; margin-top:0.85pt; text-align:center}
p.dlct-084 {line-height:110%; margin-bottom:3pt; margin-right:3.3pt; margin-top:12pt; text-align:justify}
p.dlct-022 {font-size:1.27em; line-height:110%; margin-bottom:3pt; margin-top:12pt; text-align:justify}
p.dlct-057 {line-height:110%; margin-bottom:3pt; margin-right:2.1pt; margin-top:12pt; text-align:justify}
p.null {line-height:0.92499995em; margin-bottom:0pt; margin-left:5.8pt; margin-right:4.75pt; text-align:justify}
p.dlct-077 {line-height:110%; margin-bottom:3pt; margin-right:2.25pt; margin-top:12pt; text-align:justify}
p.null {line-height:0.7833333em; margin-bottom:0pt; margin-top:0.35pt; text-align:center}
p.dlct-074 {line-height:110%; margin-bottom:3pt; margin-right:2.3pt; margin-top:12pt; text-align:justify}
p.dlct-059 {line-height:110%; margin-bottom:3pt; margin-right:2.4pt; margin-top:12pt; text-align:justify}
p.dlct-094 {font-size:0.91em; line-height:110%; margin-bottom:3pt; margin-top:12pt; text-align:justify}
p.dlct-079 {line-height:110%; margin:12pt 2.4pt 3pt 5.8pt; text-align:justify}
p.dlct-103 {font-size:1.09em; line-height:110%; margin-bottom:3pt; margin-right:0pt; margin-top:12pt; text-align:justify}
p.dlct-096 {line-height:110%; margin-bottom:3pt; margin-right:3.35pt; margin-top:12pt; text-align:justify}
p.dlct-032 {line-height:110%; margin-bottom:3pt; margin-right:2.15pt; margin-top:12pt; text-align:justify}
p.dlct-089 {line-height:110%; margin:12pt 0pt 3pt 17.1pt; text-align:justify}
p.dlct-105 {font-size:1.09em; line-height:110%; margin-bottom:3pt; margin-right:2.3pt; margin-top:12pt; text-align:justify}
p.dlct-034 {font-size:1.09em; line-height:110%; margin-bottom:3pt; margin-right:3.1pt; margin-top:12pt; text-align:justify}
p.dlct-080 {line-height:110%; margin-bottom:3pt; margin-right:2.65pt; margin-top:12pt; text-align:justify}
p.null {line-height:0.9333333em; margin-bottom:0pt; margin-left:5.8pt; margin-right:2.7pt; text-align:justify}
p.dlct-010 {font-size:0.91em; line-height:normal; margin-bottom:0pt; margin-left:5.8pt; margin-right:223.3pt; text-align:justify}
p.null {line-height:0.9083333em; margin-bottom:0pt; margin-left:5.5pt; margin-top:0.1pt}
p.dlct-082 {line-height:110%; margin-bottom:3pt; margin-right:3.25pt; margin-top:12pt; text-align:justify}
p.dlct-027 {font-size:1.64em; line-height:110%; margin-bottom:3pt; margin-right:0pt; margin-top:12pt; text-align:justify}
span.dlct-102 {font-family:Garamond, serif}
span.dlct-064 {font-family:Verdana, sans-serif; font-style:italic; letter-spacing:-0.1pt}
span.dlct-018 {font-family:'Times New Roman', serif; font-size:0.91em; letter-spacing:-0.2pt}
span.dlct-003 {font-size:0.86em}
span.dlct-013 {font-family:'Times New Roman', serif; font-size:0.91em; letter-spacing:-0.1pt}
span.dlct-049 {font-family:Verdana, sans-serif; letter-spacing:0.15pt}
span.dlct-065 {font-family:Verdana, sans-serif; font-style:italic; letter-spacing:0.05pt}

This is only a small part of one HTML file. They seem to use margin-right to position the words. Maybe it is better to put it all in Calibre, convert it to TXT, and then rebuild it properly with Sigil, if that is possible.
I really don't know what the publisher is trying to do with it.

Yes, I really tried to understand the regex method, but it just doesn't stay in the part of my brain I reserved for it.
And imagine what happens with all the classes when I send it through the KoboTouchExtended driver to a kepub.

Last edited by Nick_1964; 08-17-2015 at 08:53 AM.
Old 08-17-2015, 08:58 AM   #4
MikeB1972
Gnu
Posts: 1,222
Karma: 15625359
Join Date: Jul 2009
Location: UK
Device: BeBook,JetBook Lite,PRS-300-350-505-650,+ran out of space to type
In Sigil use the Regex mode for find/replace
In the find box
<span class="dlct-007">(.*?)</span>
in the replace box
\1
Replace all
Old 08-17-2015, 09:08 AM   #5
Nick_1964
Bookworm
Quote:
Originally Posted by MikeB1972 View Post
In Sigil use the Regex mode for find/replace
In the find box
<span class="dlct-007">(.*?)</span>
in the replace box
\1
Replace all
I'm going to try it, thank you, but first I'm going to reread what you wrote to understand what it does. I'm sure it will help, but I want to understand how Regex mode works. At first glance it replaces the classes, but what does the \1 mean?
dlct-007 is just one of them. They start at 1 and end... I don't even know where. Every one is different and in use, some containing larger text for chapters, so the layout would be gone anyway. I would rather scan a book with ABBYY than deal with this mess.
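Since it is unclear how many dlct classes exist and how often each one is used, a small inventory can help decide which ones are safe to strip. The following Python sketch counts every `<p>` and `<span>` class of the form dlct-NNN in a string of HTML. The function name `count_dlct_classes` is hypothetical, and the pattern assumes the tags look exactly like the ones in the spoiler (a single class attribute, double quotes).

```python
import re
from collections import Counter


def count_dlct_classes(html: str) -> Counter:
    """Count (tag, class) pairs such as ("span", "dlct-007")."""
    # findall returns one (tag, class) tuple per opening tag that matches.
    return Counter(re.findall(r'<(p|span) class="(dlct-\d{3})">', html))


sample = ('<p class="dlct-000"><span class="dlct-007">Elena keek hem</span> '
          '<span class="dlct-007">verbijsterd</span> '
          '<span class="dlct-007">aan.</span></p>')
usage = count_dlct_classes(sample)
for (tag, cls), n in usage.most_common():
    print(tag, cls, n)
```

Classes that appear thousands of times are probably the body-text default; rare ones (chapter titles, italics) are the ones worth checking against the style rules before deleting anything.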
Old 08-17-2015, 09:19 AM   #6
MikeB1972
Gnu
Quote:
Originally Posted by Nick_1964 View Post
I'm going to try it, thank you, but first I'm going to reread what you wrote to understand what it does. I'm sure it will help, but I want to understand how Regex mode works. At first glance it replaces the classes, but what does the \1 mean?
dlct-007 is just one of them. They start at 1 and end... I don't even know where. Every one is different and in use, some containing larger text for chapters, so the layout would be gone anyway. I would rather scan a book with ABBYY than deal with this mess.
To break it down:
Search for this: <span class="dlct-007">
then any run of characters, as short as possible: (.*?)
ending with this: </span>

The brackets around .*? say "save this text for later use" (a capture group).

To use it in the replace box, start with \ followed by the number of the saved group. As it's the first (and only) group in your case, use this for the replace:
\1

From the look of the sample you posted, I would just try replacing 007 for now, as it seems to be the default style for body text and removing it won't affect the layout.
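The same find/replace pair can be tried outside Sigil. Below is a minimal Python sketch using the standard `re` module, where `\1` in the replacement stands for whatever the first pair of brackets captured. The variable names are illustrative only, and the sample line is taken from the spoiler in the first post.

```python
import re

# Same pattern as in the find box: opening tag, captured text, closing tag.
pattern = re.compile(r'<span class="dlct-007">(.*?)</span>')

line = ('<p class="dlct-000"><span class="dlct-007">Elena keek hem</span> '
        '<span class="dlct-007">verbijsterd</span> '
        '<span class="dlct-007">aan.</span></p>')

# \1 puts back only the captured text, so the tags around it disappear.
unwrapped = pattern.sub(r'\1', line)
print(unwrapped)
```

One caveat: in Python, `.` does not match a line break unless `re.DOTALL` is set, so a span that runs across two lines would be skipped by this sketch.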
Old 08-17-2015, 09:40 AM   #7
Nick_1964
Bookworm
Quote:
Originally Posted by MikeB1972 View Post
To break it down:
Search for this: <span class="dlct-007">
then any run of characters, as short as possible: (.*?)
ending with this: </span>

The brackets around .*? say "save this text for later use" (a capture group).

To use it in the replace box, start with \ followed by the number of the saved group. As it's the first (and only) group in your case, use this for the replace:
\1

From the look of the sample you posted, I would just try replacing 007 for now, as it seems to be the default style for body text and removing it won't affect the layout.
I'm going to study this, but dlct-007 is only in the lines I copied; two paragraphs below, it changes to dlct-008. I tried to open the next HTML file and Sigil crashed. There is just too much.
Old 08-17-2015, 10:00 AM   #8
HarryT
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
Moved to the "Workshop" forum, where such questions belong.
Old 08-17-2015, 10:07 AM   #9
rubeus
Banned
Posts: 272
Karma: 1224588
Join Date: Sep 2014
Device: Sony PRS 650
Delete <span class="dlct-\d\d\d"> in Sigil and let Tidy do the rest.
Old 08-17-2015, 10:08 AM   #10
Nick_1964
Bookworm
Quote:
Originally Posted by HarryT View Post
Moved to the "Workshop" forum, where such questions belong.
Excuse me, I hadn't found that one.

Quote:
Originally Posted by rubeus View Post
Delete <span class="dlct-\d\d\d"> in Sigil and let Tidy do the rest.
67712 replacements done, but then I get an HTML error: expected end of tag 'p'. The automatic fix doesn't work, and there are a whole lot of </span> tags left in there, about 67712 I guess.
But I am trying to figure out how regex works, so examples like this are gold for me.
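The leftover </span> tags appear because only the opening tags were deleted. One way around that, sketched here in Python under the assumption that every unwanted span has a class of the form dlct-NNN, is to remove each opening tag together with its closing tag in the same replacement. The names `INNERMOST` and `unwrap_dlct_spans` are hypothetical. Note that this strips all dlct spans, including ones that carry real formatting such as the italic span.dlct-064 rule shown earlier, so it is a sketch to adapt rather than a finished cleaner.

```python
import re

# An innermost dlct span: its body may not contain another opening <span ...>.
INNERMOST = re.compile(
    r'<span class="dlct-\d{3}">((?:(?!<span\b).)*?)</span>', re.DOTALL)


def unwrap_dlct_spans(html: str) -> tuple[str, int]:
    """Remove dlct spans pairwise, keeping their text; return (html, pairs removed)."""
    total = 0
    while True:
        # Each pass unwraps the innermost spans; repeat until none are left.
        html, n = INNERMOST.subn(r'\1', html)
        if n == 0:
            return html, total
        total += n


sample = ('<p class="dlct-000"><span class="dlct-007">Goed plan</span>'
          '<span class="dlct-008">, fluisterde Ro</span>'
          '<span class="dlct-007">.</span></p>')
result, removed = unwrap_dlct_spans(sample)
print(result, removed)
```

Because opening and closing tags are always removed as a pair, the paragraph stays well formed and there is nothing left for an automatic repair step to guess at.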

Last edited by Nick_1964; 08-17-2015 at 10:29 AM.
Old 08-17-2015, 10:43 AM   #11
Toxaris
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
If you want, Nick, I can clean it up for you. Just give me a sign.
Old 08-17-2015, 10:54 AM   #12
dickloraine
Guru
Posts: 631
Karma: 7544080
Join Date: Apr 2013
Location: Berlin
Device: PRS 350, Kobo Aura
Is it a novel? If so, maybe use calibre to convert it to, for example, TXT with Markdown and then back to EPUB. Or convert it to DOCX and use Toxaris's Word add-in. Good luck, it really looks terrible.
Old 08-17-2015, 10:57 AM   #13
Nick_1964
Bookworm
Quote:
Originally Posted by Toxaris View Post
If you want, Nick, I can clean it up for you. Just give me a sign.
I'll keep that as a backup, thank you very much, because I want to learn. And the books are not even mine (I paid for them, but they are for the girl next door; she uses one of my readers and brings me her pocket money and a list of books she wants).

But I am a bit further now, only to find out that between almost every pair of lines there is a blank one.
When I remove them by adding a rule to a .css file I added, they are gone, but so are the paragraph breaks that do belong there. Even visually I can't tell which blank line comes from bad HTML and which is a real paragraph break: almost all of them start with <p class="dlct-025">, and the places where I expect a paragraph break do too. The files are not split into sections where a new chapter begins (I always do that); instead the chapters are separated by a bunch of line breaks (</br>). Man oh man.

Even worse, all the hyphens (-) are still there in the middle of lines (I suppose they are there in the real paper book). But some text is also set off like - text here -, so I can't just remove all the hyphens. And guess what: she also asked me to buy the two other parts, and they are made exactly the same way.

Quote:
Originally Posted by dickloraine View Post
Is it a novel? If so, maybe use calibre to convert it to, for example, TXT with Markdown and then back to EPUB. Or convert it to DOCX and use Toxaris's Word add-in. Good luck, it really looks terrible.
That is what I was thinking before, and I may end up doing this, but now that so many people have added suggestions I first want to try it with Sigil, then try it with Calibre, and compare the results.

They are three children's books. As far as I can see, it is a story about a couple of kids who are bookkeepers (I don't know, it seems a bit odd compared to the mess the books are in) and have to fight dragons and other things in an underworld. I don't know why or to what end, and I don't want to know either.

Last edited by Nick_1964; 08-17-2015 at 11:15 AM.
Old 08-17-2015, 01:41 PM   #14
rubeus
Banned
Quote:
Originally Posted by Nick_1964 View Post
The automatic fix doesn't work
That's not what I suggested.
Old 08-17-2015, 01:45 PM   #15
Nick_1964
Bookworm
Quote:
Originally Posted by rubeus View Post
That's not what I suggested.
Nope, but Tidy isn't an option anymore in the new Sigil; it now works as the automatic repair function.