Old 06-04-2020, 12:31 AM   #1
DNSB
Bibliophagist
Posts: 47,971
Karma: 174315098
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Candidate for worst formatted epub

If anyone is interested in hanging Sigil on startup for several minutes and making the whole program crawl, check out the epub version of Radix by A. A. Attanasio which is currently a freebie from Arc Manor/Phoenix Pick. Downloadable from Free Ebooks | Publisher's Pick. This abomination has a single 2MB text file with nothing to be seen but inline styles.
Attached image: Radix_epub.png
Old 06-04-2020, 02:38 AM   #2
najgori
Klak
Posts: 174
Karma: 150374
Join Date: Sep 2011
Location: Belgrade, Serbia
Device: many
Edit: The header names the "generator" as Aspose.Words ... which is probably responsible for the strange code.

Last edited by najgori; 06-04-2020 at 10:42 AM. Reason: DiapDialer
Old 06-04-2020, 06:21 AM   #3
JSWolf
Resident Curmudgeon
Posts: 80,665
Karma: 150249619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by DNSB View Post
If anyone is interested in hanging Sigil on startup for several minutes and making the whole program crawl, check out the epub version of Radix by A. A. Attanasio which is currently a freebie from Arc Manor/Phoenix Pick. Downloadable from Free Ebooks | Publisher's Pick. This abomination has a single 2MB text file with nothing to be seen but inline styles.
I saw that mess the first time it was a freebie. Someone said to get the Mobi version, use Calibre to convert it to ePub, and clean up from there. That is still not nice, but it's better than the ePub.
Old 06-04-2020, 08:52 AM   #4
KevinH
Sigil Developer
Posts: 9,070
Karma: 6361556
Join Date: Nov 2009
Device: many
Thank you for pointing out this horrible test case. It is not just one big file, it is one super big line!!! There are NO line breaks anywhere, and Sigil is a line-based editor.

It actually opens better in older versions (0.9.14) when mend on open is used so that at least that huge line is split into multiple lines.

Perhaps I could add a "bad epub" input plugin that does what old Sigil did (force at least some line breaks into the file after block-level tags). I might also be able to add some ability to collect inline styles out into a separate CSS file.

That is one messed up epub, and therefore a wonderful test case for improving Sigil. I will look into it.
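
The "force some line breaks after block-level tags" idea can be sketched in a few lines of Python. This is only an illustration, not Sigil's actual Mend code: the BLOCK_TAGS list and the add_line_breaks name are assumptions made for the example, and it works on the raw text with a regex rather than a real parser.

```python
import re

# Assumed set of block-level tags; the list Sigil really uses may differ.
BLOCK_TAGS = ("p", "div", "h1", "h2", "h3", "h4", "h5", "h6",
              "blockquote", "li", "ul", "ol", "table", "tr")

# A closing block-level tag that is not already followed by a newline.
_BLOCK_END = re.compile(
    r"(</(?:%s)\s*>)(?!\r?\n)" % "|".join(BLOCK_TAGS),
    re.IGNORECASE,
)


def add_line_breaks(xhtml: str) -> str:
    """Insert a newline after each block-level closing tag that lacks one."""
    return _BLOCK_END.sub(r"\1\n", xhtml)
```

Running it a second time changes nothing, because tags already followed by a newline are skipped. Text inside pre elements, or anything styled with "white-space: pre", is sensitive to added newlines, so a real implementation would have to be more careful there.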

Last edited by KevinH; 06-04-2020 at 08:57 AM.
Old 06-04-2020, 08:56 AM   #5
KevinH
Sigil Developer
To make clean-up less painful in Sigil, I did the following:

1. Open Sigil, turn off Preview
2. Use Sigil to load the epub
3. Go make coffee while Sigil loads the one huge line :-)
4. Immediately after it finally opens, do nothing except:
Right-click in the window and select Mend
5. Go refill your coffee :-)

Sigil should now be able to at least function a bit more rapidly now that the file is not just one giant single line.

6. Use Find and Replace to insert a Sigil Split Marker
Code:
<hr class="sigil_split_marker" />
immediately before each "<h1", and then split on split markers.

7. Turn Preview back on

You are finally back to something still very horrible but workable.
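
For anyone who prefers to script step 6 outside Sigil, here is a minimal Python sketch of the same replacement. The function name insert_split_markers is made up for this example; inside Sigil itself, plain Find and Replace as described above does the same job.

```python
import re

SPLIT_MARKER = '<hr class="sigil_split_marker" />'


def insert_split_markers(xhtml: str) -> str:
    """Put a Sigil split marker immediately before every opening <h1 tag."""
    # The lookahead keeps the pattern from matching names such as <h10>.
    return re.sub(r"<h1(?=[\s>])", lambda m: SPLIT_MARKER + m.group(0), xhtml)
```

The markers only mark the places to cut; the file is still split afterwards in Sigil by splitting on the split markers.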

Last edited by KevinH; 06-04-2020 at 09:14 AM.
Old 06-04-2020, 09:08 AM   #6
DiapDealer
Grand Sorcerer
Posts: 28,855
Karma: 207000000
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by najgori View Post
Aspose.Words for .NET is an advanced document processing API
I'm hesitant to reprimand a long-standing member for what looks to me like a straight-up spam post, so I'll give you some time to explain the relevance of the link you posted to the ongoing discussion before I delete it. Are you suggesting the program you linked to was used to create the awful epub, or what?

Last edited by DiapDealer; 06-04-2020 at 09:20 AM.
Old 06-04-2020, 04:42 PM   #7
Turtle91
A Hairy Wizard
Posts: 3,394
Karma: 20212733
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 15/11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
I used a non-Sigil technique to make it usable in Sigil: I used Notepad++ to replace every </p> tag with </p>\n, which gives the file line breaks. After that it was fairly spry when opening in Sigil. Then, of course, I did all the other steps Kevin mentioned.

So perhaps that could be a simple file integrity check when Sigil opens a file: the number of characters divided by the number of lines MUST be below a certain threshold; otherwise additional line breaks (\n) are added automatically. That doesn't adversely affect the document at all...
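
A minimal Python sketch of that check, purely for illustration. The function names and the threshold of 2000 characters per line are assumptions for the example, not anything Sigil does today.

```python
def needs_line_breaks(text: str, max_avg_line_len: int = 2000) -> bool:
    """Return True when characters divided by lines exceeds the threshold."""
    line_count = text.count("\n") + 1
    return len(text) / line_count > max_avg_line_len


def break_after_paragraphs(text: str) -> str:
    """The Notepad++ replacement from above: </p> becomes </p>\\n."""
    return text.replace("</p>", "</p>\n")


def repair_if_needed(text: str, max_avg_line_len: int = 2000) -> str:
    """Add line breaks only to files that fail the characters-per-line check."""
    if needs_line_breaks(text, max_avg_line_len):
        return break_after_paragraphs(text)
    return text
```

Files that already have sensible line lengths pass through untouched, so the check costs little on well-formed epubs.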
Old 06-04-2020, 05:11 PM   #8
DiapDealer
Grand Sorcerer
I've seen epubs created with Apple's Pages where the files were all one single line without breaks as well.

Large swathes of code (often the entire body) with "white-space: pre" applied via CSS in some InDesign output also create a similar problem.
Old 06-04-2020, 05:25 PM   #9
KevinH
Sigil Developer
We will have to add some code to ImportEpub to analyze each html file and either run mend on it to inject newlines after each block-level tag, or do the equivalent of what Turtle91 suggested. The new policy of trying not to touch each file on load until absolutely necessary has hurt us a bit here. Older Sigil would forcibly inject the newlines, since "mend" on open was effectively always done during the move to our standard directory structure.
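
A rough sketch of what such an import-time analysis might look like, written in Python rather than Sigil's own C++. Nothing here is Sigil's ImportEpub code: find_long_line_files, the suffix list and the 100,000-character threshold are all hypothetical.

```python
import zipfile

HTML_SUFFIXES = (".xhtml", ".html", ".htm")


def find_long_line_files(epub, max_line_len=100_000):
    """Map each HTML member of an epub to its longest line, if over the limit.

    `epub` is a path or file-like object, as accepted by zipfile.ZipFile.
    """
    offenders = {}
    with zipfile.ZipFile(epub) as zf:
        for name in zf.namelist():
            if not name.lower().endswith(HTML_SUFFIXES):
                continue
            text = zf.read(name).decode("utf-8", errors="replace")
            longest = max((len(line) for line in text.splitlines()), default=0)
            if longest > max_line_len:
                offenders[name] = longest
    return offenders
```

Only the files reported here would need mend or injected newlines on load; everything else could stay untouched, in keeping with the newer policy.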

Thoughts?
Old 06-04-2020, 05:51 PM   #10
JSWolf
Resident Curmudgeon
Both the code and formatting are just awful. I have this abomination and it's really bad. An example is 3em indents.

Last edited by JSWolf; 06-04-2020 at 05:58 PM.
Old 06-04-2020, 05:56 PM   #11
JSWolf
Resident Curmudgeon
Quote:
Originally Posted by Turtle91 View Post
I used a non-Sigil technique to make it usable in Sigil: I used Notepad++ to replace every </p> tag with </p>\n, which gives the file line breaks. After that it was fairly spry when opening in Sigil. Then, of course, I did all the other steps Kevin mentioned.

So perhaps that could be a simple file integrity check when Sigil opens a file: the number of characters divided by the number of lines MUST be below a certain threshold; otherwise additional line breaks (\n) are added automatically. That doesn't adversely affect the document at all...
Try loading it into Calibre's editor and use the Beautify Current File feature and see how that goes.
Old 06-05-2020, 04:12 AM   #12
najgori
Klak
Quote:
Originally Posted by KevinH View Post
Thoughts?
Let Sigil inform the user about the crappy code and let them decide whether to wait for the program to show it or to fix it.
Old 06-05-2020, 12:52 PM   #13
DNSB
Bibliophagist
Quote:
Originally Posted by najgori View Post
Let Sigil inform the user about the crappy code and let them decide whether to wait for the program to show it or to fix it.
That might work if the file opened fast enough that the user doesn't have the opportunity to think the program has crashed. Not to mention once the file is open, the incredible slowness of trying to do anything.

Basically, I did what Turtle91 did and used Notepad++ to add the line breaks (though I used </p>\r\n), restructured the epub to Sigil Norm, and used the RemoveInLineStyles plugin to move the styles to a stylesheet. Then I spent some time cleaning up that stylesheet while muttering about people who use absolute values and their probable destination, dubious ancestry, repugnant morals, etc.
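
For readers wondering what "moving the styles to a stylesheet" involves, here is a small Python sketch of the general idea. It is not the RemoveInLineStyles plugin's code; extract_inline_styles and the generated class names (s1, s2, ...) are invented for the example, and it assumes double-quoted attributes and style values without embedded semicolons.

```python
import re

# An opening (or self-closing) tag whose attributes are all double-quoted.
_TAG = re.compile(r'<([A-Za-z][\w:-]*)((?:\s+[\w:-]+\s*=\s*"[^"]*")*)\s*(/?)>')
_ATTR = re.compile(r'([\w:-]+)\s*=\s*"([^"]*)"')


def extract_inline_styles(xhtml, prefix="s"):
    """Swap style="..." attributes for generated classes.

    Returns (new_xhtml, css_text). Identical declarations share one class.
    """
    classes = {}  # normalized declaration text -> class name

    def rewrite(match):
        name, attr_text, slash = match.groups()
        attrs = _ATTR.findall(attr_text)
        style = next((v for k, v in attrs if k == "style"), None)
        if style is None or not style.strip():
            return match.group(0)  # nothing to move, leave the tag alone
        decl = "; ".join(p.strip() for p in style.split(";") if p.strip())
        cls = classes.setdefault(decl, "%s%d" % (prefix, len(classes) + 1))
        out, has_class = [], False
        for k, v in attrs:
            if k == "style":
                continue  # drop the inline style
            if k == "class":
                v, has_class = (v + " " + cls).strip(), True
            out.append('%s="%s"' % (k, v))
        if not has_class:
            out.append('class="%s"' % cls)
        return "<%s %s%s>" % (name, " ".join(out), " /" if slash else "")

    new_xhtml = _TAG.sub(rewrite, xhtml)
    css = "\n".join(".%s { %s; }" % (c, d) for d, c in classes.items())
    return new_xhtml, css
```

Class rules have lower specificity than inline styles, so the result can render differently if the book already has a stylesheet. The generated classes also still need the manual cleanup described above.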

Last edited by DNSB; 06-05-2020 at 01:03 PM.
Old 06-05-2020, 01:50 PM   #14
mikeayers
Zealot
Posts: 136
Karma: 432377
Join Date: Nov 2010
Location: USA
Device: Kindle PW 10thGen, Kobo Clara HD
I don't remember the order I did things in, but with calibre I split the file on <H1> tags and ran beautify all files.
Then I ran check book and it complained that some of the files were too large, so I manually split them where there were "***" in the files until I got tired of doing them and it stopped complaining about the error...
It does open faster now.
I should read the book, but I am tired of looking at it for now. I will get back to it later.
Old 06-08-2020, 11:58 PM   #15
AlanHK
Guru
Posts: 681
Karma: 929286
Join Date: Apr 2014
Device: PW-3, iPad, Android phone
The publisher advertises http://www.arcmanor.com/
Quote:
ArcManor Typesetting Services
An award-winning service for self-publishers and independent publishers.
Give your book that distinctive professional look.
Would like to know what award that was.

They publish some interesting and diverse books, but the layout and artwork are at best enthusiastic amateur.

It's very hard for small publishers, if they aren't a ripoff vanity press or reformatting public domain text they've scraped from Gutenberg or Internet Archive. Or genre-of-the-month stuff like zombies/gay werewolves/LitRPG/survivalist gun porn. So still give them kudos for services to literature. But do your typesetting elsewhere.

Last edited by AlanHK; 06-09-2020 at 12:04 AM.