MobileRead Forums


karyan 03-01-2012 11:17 AM

How NOT to use a stylesheet?
 
My apologies if this has already been asked and answered. (I have done some searches but...)

I really like using Sigil for cleaning up badly converted documents. Most need very minimal formatting, leaving display decisions to the reading device, so I like to clean up using the code view. One thing that irks me is how Sigil often replaces the simple HTML with stylesheet references or adds code where none is needed.

Is there a way to stop Sigil from doing this?

Thanks.

Rob Lister 03-01-2012 11:20 AM

Quote:

Originally Posted by karyan (Post 1986677)
...where none is needed.

example?

karyan 03-01-2012 11:26 AM

Quote:

Originally Posted by Rob Lister (Post 1986683)
example?

Adding <span> ... </span> around text that has no formatting.

DiapDealer 03-01-2012 11:29 AM

Turning off HtmlTidy will help in many instances, but you can't completely stop it from ever "helping." I know I don't like it when it sometimes consolidates/stacks/creates CSS classes.

Quote:

Originally Posted by karyan (Post 1986695)
Adding <span> ... </span> around text that has no formatting.

I've never had it do that, myself. I most often see stuff like that in ePubs that were automatically converted from a different format.

Rob Lister 03-01-2012 11:45 AM

Quote:

Originally Posted by karyan (Post 1986695)
Adding <span> ... </span> around text that has no formatting.

No formatting? That in itself is a problem that should be addressed somehow. Can you provide a cut-and-paste example?

Keroberos 03-01-2012 11:49 AM

I think a lot of these issues are dependent upon the source HTML. The cleaner the HTML being imported to Sigil is, the fewer problems like these you will have. Unfortunately, if the HTML has had any editing done in Microsoft Word, you will have a lot of wonkiness in the resulting HTML. Saving the HTML as "Web Page, Filtered" in Word can help to alleviate some of these problems.

karyan 03-01-2012 11:50 AM

Quote:

Originally Posted by DiapDealer (Post 1986701)
Turning off HtmlTidy will help in many instances, but you can't completely stop it from ever "helping." I know I don't like it when it sometimes consolidates/stacks/creates CSS classes.


I've never had it do that, myself. I most often see stuff like that in ePubs that were automatically converted from a different format.

Thanks! I didn't even realise HtmlTidy was turned on. I'd be very happy if Sigil was just a little less "helpful."

Most of the <span></span> were introduced (I believe) by the original conversion process, but I also had this happening in files that I had previously "cleaned" up. Turning off HtmlTidy seems to be preventing this. *fingers crossed*

Hitch 03-01-2012 03:24 PM

If you exported material from Word, you'll have spans everywhere that have nothing whatsoever to do with Sigil. Word outputs--as nicely as I can say it--gobbledygook--and Sigil attempts to rectify it. If you have something--ANYTHING--in a para that is odd, Sigil will attempt to span it, for lack of any other instruction.

Just cleaning your html in a plain html editor will eliminate a lot of that.

If you exported your html from PDF, even with Acrobat, gods help you. You're better off, IME, exporting to XML and working from there.

HTH,
Hitch

Toxaris 03-02-2012 03:24 AM

Exactly, that is the reason why I wrote my HTML export macro in Word to get some decent, clean HTML output. It is not perfect, but good enough.

huebi 03-02-2012 04:50 AM

Never saw an empty span created by Sigil. What I dislike is the transformation of <p><i>...</i></p> into <p class="sgc-n">...</p>. Well, I have to live with that.

karyan 03-02-2012 09:19 AM

Keroberos and Hitch, I didn't do the initial conversion - have no idea how it was done - although it appears to have originated from a scanned physical document. There is not much I can do about the initial state of the documents, I just want to be able to "clean" them and have them stay clean - which seems to be happening now, with DiapDealer's suggestion.

huebi, I had the same problem, but it doesn't seem to be happening now that I've turned off HtmlTidy.

Rob Lister, I no longer have examples - and I'm not eager to reproduce by turning HtmlTidy back on - but one situation I saw was <p><span>...</span></p> where the ... represents plain text. I would delete the span, save, and then have the <span>...</span> reappear when I reopened the file. Maybe I am overlooking some steps on my part that contributed to my problems, but I'm very satisfied with the solution.
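
For anyone who would rather strip those wrappers outside Sigil, here is a minimal sketch in Python of removing attribute-less <span> tags (the <p><span>...</span></p> case above) while leaving styled spans alone. It is not part of Sigil or HtmlTidy; the names BareSpanStripper and strip_bare_spans are made up for illustration, and it uses only the standard library's html.parser. It has not been tried on real Sigil output and does not preserve CDATA sections, so treat it as a starting point and work on a copy of the file.

```python
from html.parser import HTMLParser


class BareSpanStripper(HTMLParser):
    """Re-emit markup unchanged, except <span> tags that carry no attributes.

    Hypothetical helper for illustration; not a Sigil or HtmlTidy feature.
    """

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.span_stack = []  # True for each open span that was dropped

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            bare = not attrs
            self.span_stack.append(bare)
            if bare:
                return  # drop the opening tag of an attribute-less span
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br /> pass through untouched.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "span" and self.span_stack and self.span_stack.pop():
            return  # drop the closing tag that matches a dropped span
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

    def handle_pi(self, data):
        self.out.append(f"<?{data}>")


def strip_bare_spans(html):
    """Return html with attribute-less <span> wrappers removed."""
    parser = BareSpanStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(strip_bare_spans("<p><span>plain text</span></p>"))
```

A stack is used rather than a simple search-and-replace so that a bare span wrapped around a styled one (for example <span>a<span class="x">b</span>c</span>) loses only its own opening and closing tags.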

Oldpilot 03-07-2012 12:29 PM

Quote:

Originally Posted by Toxaris (Post 1987736)
Exactly, that is the reason why I wrote my HTML export macro in Word to get some decent, clean HTML output. It is not perfect, but good enough.

You have a macro? Wow. Is it available?

Toxaris 03-07-2012 01:43 PM

Yep, an old version is up here somewhere. I'll try to find it and upload the latest version.

It can probably be improved upon, but it works for me.

