| 
			
			 | 
		#1 | 
| 
			
			
			
			 A Hairy Wizard 
			
			Posts: 3,397 
				Karma: 20212733 
				Join Date: Dec 2012 
				Location: Charleston, SC today 
				
				
				Device: iPhone 15/11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire 
				
				
				 | 
	
	
	
		
		
			
			 
				
				Formalize current functionality?
			 
			
			
			Soooo... I'm cleaning up a book with 145000 chapters/files. I make a change - a very simple one that shouldn't be any problem - and when I try to save, Sigil pukes and says it can't save because there is some malformed code... somewhere.

	I could let Sigil fix it automatically... but I would rather not. Who knows what would happen then?? So I go file by file, opening them one at a time and looking for the red error box in the Preview pane. Did I mention there are 145000 chapters?? It takes FOREVER. I can't even run a report to see if that would help, because of the malformed HTML.

	I've lived with this for a couple of years... I took it as my just deserts for making the mistake in the first place... but JUST THIS WEEK I found a way for Sigil to tell me which file actually has the error in it. "How?" you ask. I select all the files in the Text folder, then right-click and attempt to link a stylesheet. It doesn't let me do this either, but the error message includes the name of the first file with an error!! Hallelujah!!

	Of course, it only lists one errant file at a time... so my request is: can we formalize that functionality into an error listing? Something along the lines of the "Report" page, but with a list of offending files that could be clicked on to open. That would make my blundering so much more enjoyable.....

	Thanks,  | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#2 | |
| 
			
			
			
			 Wizard 
			
			Posts: 2,306 
				Karma: 13057279 
				Join Date: Jul 2012 
				
				
				
				Device: Kobo Forma, Nook 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			I agree, a few of those error dialogs could use some tweaking.

			You could always click on the little checkmark icon as well to "Validate EPUB with Flight Crew". That would also tell you which files are malformed, and it should be able to point you to the general vicinity of the line number of each error.
	
  
		Last edited by Tex2002ans; 03-21-2015 at 08:18 AM.  | 
|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#3 | 
| 
			
			
			
			 Guru 
			
			Posts: 878 
				Karma: 2457540 
				Join Date: Nov 2011 
				
				
				
				Device: none 
				
				
				 | 
	
	|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#4 | 
| 
			
			
			
			 Well trained by Cats 
			
			Posts: 31,267 
				Karma: 61916422 
				Join Date: Aug 2009 
				Location: The Central Coast of California 
				
				
				Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			I will admit to having seen this issue more than once, and to being frustrated that I had to attempt to touch each HTML file in Book View. (Trying to go to a bad file in BV will flip over to CV and show the dreaded pink box.)

	Is it possible to at least color the lines red in the File Browser for problem files (the ones that will have pink boxes)?
		 | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#5 | 
| 
			
			
			
			 Sigil Developer 
			
			Posts: 9,072 
				Karma: 6361556 
				Join Date: Nov 2009 
				
				
				
				Device: many 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			Hi  
		
	
		
		
		
		
		
		
		
		
		
		
		
			I think I could throw together a Python plugin that would walk the complete set of xhtml files and build up a report of any files that are not well-formed, with a description of at least the first error in each such file. Would this do the trick?

			BTW: We have already removed Tidy and will use Google's gumbo-parser to automatically clean up any improperly formed files in the future. Gumbo implements the true HTML5 parsing spec and will handle the HTML exactly as browsers do. Gumbo is basically like Beautiful Soup, but written in C and really fast.

			Last edited by KevinH; 03-21-2015 at 06:34 PM.  | 
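For illustration only, the kind of report described in this post could be sketched in plain Python roughly as follows. This is not the plugin's code: it relies on the standard library's expat parser, the names `first_error` and `report` and the name-to-bytes mapping are hypothetical, and how the files are pulled out of Sigil is deliberately left out.

```python
from xml.parsers import expat


def first_error(data: bytes):
    """Return (line, column, message) for the first well-formedness
    error in one file, or None if expat parses it cleanly."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
    except expat.ExpatError as err:
        return err.lineno, err.offset, expat.ErrorString(err.code)
    return None


def report(files):
    """files: hypothetical mapping of file name -> raw bytes.
    Returns one (name, line, column, message) tuple per bad file."""
    problems = []
    for name in sorted(files):
        err = first_error(files[name])
        if err is not None:
            problems.append((name, *err))
    return problems


if __name__ == "__main__":
    sample = {
        "Section0001.xhtml": b"<html><body><p>fine</p></body></html>",
        "Section0002.xhtml": b"<html><body><p>oops</body></html>",
    }
    for name, line, col, msg in report(sample):
        print(f"{name}: line {line}, column {col}: {msg}")
```

Because each file gets its own parser, one bad file does not stop the walk, which is the point of the request: every offending file ends up in the list rather than only the first one.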
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#6 | 
| 
			
			
			
			 A Hairy Wizard 
			
				
				
				 | 
	
	|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#7 | |
| 
			
			
			
			 Resident Curmudgeon 
			
			Posts: 80,782 
				Karma: 150249619 
				Join Date: Nov 2006 
				Location: Roslindale, Massachusetts 
				
				
				Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3 
				
				
				 | 
	
	
	
		
		
		
		
		 Quote: 
	
  | 
|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#8 | 
| 
			
			
			
			 Wizard 
			
			Posts: 4,520 
				Karma: 121692313 
				Join Date: Oct 2009 
				Location: Heemskerk, NL 
				
				
				Device: PRS-T1, Kobo Touch, Kobo Aura 
				
				
				 | 
	
	|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#9 | 
| 
			
			
			
			 Sigil Developer 
			
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			Which? The plugin I offered to write, or the version of Sigil without Tidy and with Gumbo?

	If the former, I will try to drum something up later this week or early next if I get a few free moments.

	If the latter, Sigil master already has Tidy gone and Gumbo in place, but it is in very rough shape as we have been tearing Xerces out of it, along with FlightCrew (which will become a plugin). It will come with Python 3.4 embedded in it when it stabilizes. It will need lots of testing, but I would guess that within a month or so you may see an alpha or a beta. We have already begun changing the OPF parser to allow and keep epub3 features.

	Take care,
	KevinH  | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#10 | 
| 
			
			
			
			 Resident Curmudgeon 
			
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			Thanks for the information. It is indeed the latter (Sigil without Tidy). 
		
	
		
		
		
		
		
		
		
		
		
		
	
	So with the next Sigil, will it be possible to not have the structure changed?  | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#11 | 
| 
			
			
			
			 Sigil Developer 
			
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			Hi, 
		
	
		
		
		
		
		
		
		
		
		
		
	
	No Tidy, so no playing with your classes and no need to remove duplicate classes, just a really robust parser that tries its best to come up with something useful out of any HTML soup. Plus we get the inherent benefit of recognizing all HTML5 tags.

	Still lots and lots to do before any release, as we also replaced the strict XML processor Xerces with Gumbo node trees and parsing for all xhtml files. We are replacing the remaining Xerces use on pure XML (OPF and NCX) with Python and lxml. So lots of code had to be rewritten and redesigned and still needs some work. I'm sure lots of new bugs will have to be tracked down and fixed.

	But if we want to support both epub2 and epub3, there was no other way than to overhaul the tools Sigil used. This is the first step. Once this is complete we can start adding in support for epub3.

	KevinH  | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#12 | 
| 
			
			
			
			 Zealot 
			
			Posts: 119 
				Karma: 64428 
				Join Date: Aug 2011 
				
				
				
				Device: none 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			Once Xerces is gone, will Sigil still require SSE2 hardware?
		 
		
	
		
		
		
		
		
		
		
		
		
		
		
			Last edited by signum; 03-23-2015 at 03:17 AM. Reason: forgot which group I was in  | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#13 | 
| 
			
			
			
			 Grand Sorcerer 
			
			Posts: 28,891 
				Karma: 207182180 
				Join Date: Jan 2010 
				
				
				
				Device: Nexus 7, Kindle Fire HD 
				
				
				 | 
	
	|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#14 | |
| 
			
			
			
			 Sigil Developer 
			
				
				
				 | 
	
	
	
		
		
			
			 
				
				Sigil Plugin: SanityChecker_v0.1.0.zip
			 
			
			
			Hi, 
		
	
		
		
			This is now updated to version 0.1.0, which also provides the line and column of the last open start tag when tag nesting mismatches occur.

			Quote:

			Attached is a quick and dirty Sigil validation plugin that will do a rough (and I mean rough!) sanity check of all xhtml files in an ebook and report back the first nesting error, mismatched attribute quotes, and similar problems that will prevent a file from being parsed by an XML parser. It will also detect basic structure errors.

			It is NOT in any way meant to replace FlightCrew or EpubCheck. But it will detect gross errors that would prevent a pure XML parser (without a DTD) from loading the file. Nicely, the gumbo HTML5 parser would happily eat any of this for lunch and fix it automatically on the fly.

			Give it a try and see if it will cut your list of files to worry about down to a manageable level. We can improve it if anyone feels it is useful enough to want something that will do a bit more. But for real validation, you really need epubcheck.

			I have attached it. Hope this helps.

			KevinH

			Last edited by KevinH; 05-07-2015 at 04:34 PM. Reason: update to newer release  | 
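To make the "last open start tag" idea concrete, here is a rough sketch of how such a report could be produced. It is not how SanityChecker itself is written (the plugin's source is not shown in this thread); it assumes the standard library's expat parser, and `check_nesting` and the keys of the returned dict are hypothetical names.

```python
from xml.parsers import expat


def check_nesting(data: bytes):
    """Return None if the file parses, otherwise a dict describing the
    first error plus the most recently opened, still unclosed tag."""
    open_tags = []  # stack of (tag name, line, column) for unclosed tags
    parser = expat.ParserCreate()

    def on_start(name, attrs):
        # Record where this start tag begins so it can be reported later.
        open_tags.append(
            (name, parser.CurrentLineNumber, parser.CurrentColumnNumber)
        )

    def on_end(name):
        # Only reached for correctly matched end tags.
        open_tags.pop()

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    try:
        parser.Parse(data, True)
    except expat.ExpatError as err:
        return {
            "line": err.lineno,
            "column": err.offset,
            "message": expat.ErrorString(err.code),
            "last_open": open_tags[-1] if open_tags else None,
        }
    return None


if __name__ == "__main__":
    broken = b"<html>\n<body>\n<p>text</body>\n</html>"
    print(check_nesting(broken))
```

Knowing the last open start tag is often more useful than the error position alone, since the parser only notices the mismatch at the closing tag, which can be far from the tag that was actually left open.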
|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#15 | 
| 
			
			
			
			 A Hairy Wizard 
			
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			Wow, that was fast!! 
		
	
		
		
		
		
		
		
		
		
		
		
	
	Thanks again Kevin, I'll give it a whirl and report back!  | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 