08-05-2009, 03:18 PM | #1 |
Guru
Posts: 718
Karma: 1085610
Join Date: Mar 2009
Location: Bristol, England
Device: PRS-T1, 1825PT, Galaxy Tab, One X, TF700T, Aura HD, Nexus 7
|
Remove Paragraph Formatting
Hi Valloric,
I was wondering whether you were planning to add an option to remove all formatting from a paragraph or page. The reason I ask is that I have lots of ebooks in HTML format with a wide variety of formatting, and most of that formatting is a mess. An easy option to strip the formatting from a paragraph and revert it to a plain <p> </p> paragraph would make my life much easier. An option to do the same for the whole file, reverting all paragraphs to plain form in one go, would also be good, but I would like the individual-paragraph option as a minimum. Both options would be fantastic. If you hadn't planned it, then I'll make it a feature request. |
08-05-2009, 04:15 PM | #2 | |
Created Sigil, FlightCrew
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
|
Quote:
Then again, I'm far from a CSS master. I could be missing something. If some of the other members have ideas on how this could be implemented, I'd like their input. EDIT: If all you want is for Sigil to remove the classes from your elements, along with the styles given to them through references to their explicit IDs, then that would be different. But inherited styles would still be a problem. Last edited by Valloric; 08-05-2009 at 04:18 PM. |
08-05-2009, 04:45 PM | #3 |
creator of calibre
Posts: 44,353
Karma: 23661992
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Welcome to my world. The way calibre handles this is to "flatten" the CSS, i.e. it computes the style applying to every element. Once that's done, it's relatively easy to reset the computed values to some defaults.
Of course, I doubt this approach is suitable for Sigil, since you don't want to mess with the user-specified CSS structure. But perhaps if the user asks you to do this, you can warn them that it will involve flattening the CSS and then do it. |
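To make the "flattening" idea concrete, here is a minimal Python sketch. It is not calibre's code: the dict-based element tree, the small `INHERITED` set and the `flatten` and `reset` helpers are all assumptions made for illustration, and real CSS resolution (selectors, specificity, the cascade) is far more involved.

```python
# Toy model: each element is a dict with a tag, its own declared styles, and children.
# INHERITED is a small illustrative subset of the CSS properties that inherit.
INHERITED = {"color", "font-family", "font-size", "font-style", "font-weight", "text-align"}


def flatten(element, parent_style=None):
    """Return (tag, computed_style) pairs for the element and all its descendants."""
    parent_style = parent_style or {}
    # Start from what the parent passes down, then apply the element's own declarations.
    computed = {prop: value for prop, value in parent_style.items() if prop in INHERITED}
    computed.update(element.get("style", {}))
    result = [(element["tag"], computed)]
    for child in element.get("children", []):
        result.extend(flatten(child, computed))
    return result


def reset(computed, defaults):
    """Replace each computed property that has a known default with that default."""
    return {prop: defaults.get(prop, value) for prop, value in computed.items()}


book = {
    "tag": "body",
    "style": {"font-weight": "bold", "margin": "2em"},
    "children": [{"tag": "p", "style": {"text-align": "center"}, "children": []}],
}
flat = flatten(book)
plain = reset(flat[1][1], {"font-weight": "normal", "text-align": "left"})
```

Once every element carries its own computed style, resetting a paragraph no longer depends on what its ancestors declare, which is the point of flattening first.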
08-05-2009, 05:11 PM | #4 |
creator of calibre
Posts: 44,353
Karma: 23661992
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Actually, thinking about it a little...
WebKit already computes the styles for you; you can get the computed values using a JavaScript-to-Python bridge. Then just set a high-specificity CSS selector for that element that overrides the computed values where they differ from the defaults. |
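A rough Python sketch of the override step, assuming the computed values have already been fetched from the rendering engine as a plain dict. The `override_rule` helper, the `p#para-12` selector and the default values are hypothetical; an ID selector is used only because it carries high specificity.

```python
def override_rule(selector, computed, defaults):
    """Build a CSS rule that resets each property whose computed value differs from its default."""
    changed = {
        prop: defaults[prop]
        for prop in sorted(defaults)
        if computed.get(prop, defaults[prop]) != defaults[prop]
    }
    if not changed:
        return ""  # nothing differs, so no rule is needed
    body = " ".join(f"{prop}: {value};" for prop, value in changed.items())
    return f"{selector} {{ {body} }}"


# Hypothetical values, as they might come back from the browser for one paragraph.
computed = {"font-weight": "bold", "text-indent": "0px", "text-align": "center"}
defaults = {"font-weight": "normal", "text-indent": "0px", "text-align": "left"}
rule = override_rule("p#para-12", computed, defaults)
```

Only the properties that actually differ end up in the rule, so the user's stylesheet is left alone apart from one extra selector per reset paragraph.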
08-05-2009, 05:26 PM | #5 |
Created Sigil, FlightCrew
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
|
Hm... this could work. How would one get this information in javascript? I already have a JS-to-C++ bridge in Qt so that won't be a problem.
|
08-05-2009, 05:49 PM | #6 |
creator of calibre
Posts: 44,353
Karma: 23661992
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Examining the attributes of the style object should do the trick. I suggest embedding a JavaScript library like jQuery and using that to do this easily. See the calibre viewer source for how to do this (calibre uses it to implement the reference mode, bookmarks, etc.). If you need more clarification, feel free to ask. I'm just a little hassled today (as you saw in the thread on covers), so I may take a little while to give you a detailed response.
EDIT: oops, I should have said JavaScript-to-C bridge (I forgot Sigil was not written in Python). |
08-06-2009, 05:57 AM | #7 |
Guru
Posts: 718
Karma: 1085610
Join Date: Mar 2009
Location: Bristol, England
Device: PRS-T1, 1825PT, Galaxy Tab, One X, TF700T, Aura HD, Nexus 7
|
Seeing as my programming skills are quite basic, I didn't realise how hard this was to implement.
Does this look doable? Should I raise this as a feature request? Also, thanks to you (Valloric) and kovidgoyal for looking at this, and to both of you for creating such awesome software for epub users like me. Between the two of you, I think you are providing everything I need for ebooks (except the e-ink reader I'm after, which I'll gladly accept if you want to send me one), and it's freeware to boot! Cheers!!! |
08-06-2009, 06:12 AM | #8 |
Resident Curmudgeon
Posts: 75,901
Karma: 134368292
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
If you have paragraphs as just <p>text</p> then you can easily edit the CSS to do whatever you'd like with them.
|
08-06-2009, 07:29 AM | #9 |
Created Sigil, FlightCrew
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
|
|
08-06-2009, 09:30 AM | #10 |
Guru
Posts: 718
Karma: 1085610
Join Date: Mar 2009
Location: Bristol, England
Device: PRS-T1, 1825PT, Galaxy Tab, One X, TF700T, Aura HD, Nexus 7
|
Ok, have raised it as an enhancement, issue #56.
|
08-07-2009, 12:32 AM | #11 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
While there could be many levels of inherited styles, practically speaking each paragraph in most books is surrounded by <p> and/or <span> tags which contain the primary style settings for that paragraph. If the feature just cleared the style from those specific tags and let any others remain, that would probably be sufficient in most cases.
Trying to 'unset' all the style settings that may exist further up in the hierarchy would require a whole new style to be created to counteract those settings, wouldn't it? It looks like GhostyJack is asking to just clear the style tags from that specific paragraph, so computing and creating a whole new style goes beyond that, and may not actually be desirable. |
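The simpler paragraph-level approach could look roughly like this in Python. This is only a sketch, not how Sigil does or would do it: `plain_paragraph` is a made-up helper, and regex-based tag handling assumes well-behaved markup (for example, no ">" inside attribute values).

```python
import re


def plain_paragraph(html):
    """Strip attributes from <p> tags and unwrap <span> tags in one paragraph's markup."""
    # Drop the span wrappers but keep the text they enclose.
    html = re.sub(r"</?span\b[^>]*>", "", html, flags=re.IGNORECASE)
    # Remove class, id, style and any other attributes from the opening <p> tag.
    html = re.sub(r"<p\b[^>]*>", "<p>", html, flags=re.IGNORECASE)
    return html


messy = '<p class="calibre1" style="font-weight:bold"><span class="c2">Hello <i>world</i></span></p>'
clean = plain_paragraph(messy)
```

Other inline tags such as <i> are left alone here, and styles inherited from elements further up the hierarchy would still apply, as noted earlier in the thread.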
08-07-2009, 03:33 AM | #12 |
Guru
Posts: 718
Karma: 1085610
Join Date: Mar 2009
Location: Bristol, England
Device: PRS-T1, 1825PT, Galaxy Tab, One X, TF700T, Aura HD, Nexus 7
|
You've hit the nail on the head, Idolse. For formatting removal at paragraph level, that's what I'm asking for: just remove all the extra tags from the paragraph at the press of a button.
I imagine though this wouldn't work at the file level if I'm understanding the earlier comments correctly, but implementing only at paragraph level would still speed up the tidying-up work on the files. |
08-07-2009, 03:47 AM | #13 |
Banned
Posts: 475
Karma: 796
Join Date: Sep 2008
Location: Honolulu
Device: Nokia 770 (fbreader)
|
It could totally work at the file-level too. Think of Book Designer files, that force bold onto every paragraph. No need for that; boom! they're all gone.
It would merely leave the CSS applying styles to non-existent classes. No harm done, especially if a CSS validator comes into being: it might flag the unnecessary classes so you could clean them up. Truthfully, you want as few classes and IDs as you can make work effectively and logically. m a r |
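For the file-level case, a small Python sketch of stripping every class attribute and then listing the stylesheet classes that no longer match anything. The helper names and sample data are invented for illustration, and the regexes only cope with simple, well-formed markup and CSS.

```python
import re

# Matches a class attribute with a single- or double-quoted value.
CLASS_ATTR = re.compile(r"""\s+class\s*=\s*(["'])(.*?)\1""", re.IGNORECASE)


def strip_classes(html):
    """Remove every class attribute from the markup."""
    return CLASS_ATTR.sub("", html)


def unused_classes(css, html):
    """Return class names used in CSS selectors that no element in html carries."""
    selectors = " ".join(re.findall(r"([^{}]+)\{", css))  # text before each "{"
    declared = set(re.findall(r"\.([A-Za-z_][\w-]*)", selectors))
    used = set()
    for match in CLASS_ATTR.finditer(html):
        used.update(match.group(2).split())
    return sorted(declared - used)


css = "p.bold { font-weight: bold; } .indent { text-indent: 1.5em; } h1 { margin: 0; }"
html = '<h1>Title</h1><p class="bold">One</p><p class="bold indent">Two</p>'
stripped = strip_classes(html)
leftover = unused_classes(css, stripped)
```

After stripping, `leftover` names the rules that now style nothing, which is the kind of report a CSS validator could use for cleanup.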
08-07-2009, 07:58 AM | #14 |
Created Sigil, FlightCrew
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
|
That should be simple then, if you're content with the aforementioned limitations of this approach.
|
08-07-2009, 08:12 AM | #15 |
Guru
Posts: 718
Karma: 1085610
Join Date: Mar 2009
Location: Bristol, England
Device: PRS-T1, 1825PT, Galaxy Tab, One X, TF700T, Aura HD, Nexus 7
|
I don't have any problem with the methods mentioned, as I only wanted to remove the existing formatting so I can apply my own without having to go through the code and remove it manually, line by line.
If it gets implemented, we can see whether it works and how popular it is. If it doesn't work, it may need to be modified or removed. |