MobileRead Forums - View Single Post - Cleaning ePubs: automatically, fast and with as many generic rules as possible

ibu · 08-07-2013, 08:22 AM

@DiapDealer
Yes, you are right. I'm looking for a tool that parses a valid XHTML document. Only then is safe and easy cleaning (without complex and risky regexes) possible.
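As a rough illustration of what parser-based cleaning could look like, here is a minimal Python sketch using only the standard library. The rule it applies (dropping every inline `style` attribute) and the function name `strip_inline_styles` are assumptions chosen for the example, not tasks or tools named in this thread. Because the document is parsed into a tree, the rule cannot accidentally match text content the way a regex might.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Serialize XHTML elements with a default namespace instead of "ns0:" prefixes.
ET.register_namespace("", XHTML_NS)


def strip_inline_styles(xhtml_text):
    """Hypothetical cleaning rule: remove every inline style attribute."""
    root = ET.fromstring(xhtml_text)  # raises ParseError if not well-formed
    for element in root.iter():
        element.attrib.pop("style", None)
    return ET.tostring(root, encoding="unicode")


sample = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<p style="margin:0" class="x">Text with '
    '<span style="color:red">noise</span>.</p>'
    '</body></html>'
)
cleaned = strip_inline_styles(sample)
print(cleaned)
```

One caveat: the standard-library parser does not read the XHTML DTD, so named entities such as `&nbsp;` raise a parse error unless they are replaced with numeric references first. It also does not preserve the XML declaration or DOCTYPE, so a real tool would need to handle those separately.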

The HTML authoring tool Dreamweaver, for example, offers some commands in its GUI that perform some of the tasks I mentioned.
But I don't want to unzip an EPUB, edit all the files in Dreamweaver, pack it again as an EPUB, and then do the rest of the cleaning inside Sigil (generate the TOC, edit the OPF, ...).
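The unzip, edit, repack round trip can in principle be scripted. The following Python sketch assumes a hypothetical helper `rewrite_epub` that copies an EPUB to a new file and passes each (X)HTML file through a caller-supplied `transform` function (bytes in, bytes out). It is not an existing tool, just an outline of the idea; the only EPUB-specific detail it relies on is that the `mimetype` entry must come first in the archive and be stored uncompressed.

```python
import zipfile

# File extensions treated as content documents (an assumption for this sketch).
HTML_SUFFIXES = (".xhtml", ".html", ".htm")


def rewrite_epub(src_path, dst_path, transform):
    """Copy an EPUB, applying transform(bytes) -> bytes to each (X)HTML file."""
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        # The mimetype entry has to be the first one and must not be compressed.
        dst.writestr("mimetype", src.read("mimetype"),
                     compress_type=zipfile.ZIP_STORED)
        for info in src.infolist():
            if info.filename == "mimetype" or info.is_dir():
                continue
            data = src.read(info.filename)
            if info.filename.lower().endswith(HTML_SUFFIXES):
                data = transform(data)
            dst.writestr(info.filename, data,
                         compress_type=zipfile.ZIP_DEFLATED)
```

A call such as `rewrite_epub("book.epub", "book-clean.epub", my_cleaner)` (file names and `my_cleaner` are placeholders) would leave the original untouched and write the cleaned copy next to it. TOC and OPF edits would still have to happen elsewhere, for example in Sigil.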

I understand all your arguments about "not important enough".
My hope was that, in the community of EPUB friends, there are many others looking for ways to clean existing EPUBs, because cluttered source code is quite often the cause of many presentation problems.

And there's no hope at all that the producers will deliver quality code.

Post #7