| 
			
			 | 
		#1 | 
| 
			
			
			
			 Wizard 
			
			Posts: 3,720 
				Karma: 1759970 
				Join Date: Sep 2010 
				
				
				
				Device: none 
				
				
				 | 
	
	
	
		
		
			
			 
				
				azw3 to epub - structure detection bug ?
			 
			
			
			I converted a retail AZW3 to EPUB with my usual preference settings (heuristics OFF)  
		
	
		
		
		
		
		
		
		
		
		
		
	
	but the conversion took a very long time, and the resulting EPUB took even longer to open in Sigil. The reason seemed to be that some chapters had been split into many, many XHTML files, typically with only one sentence per file. I re-ran the conversion with all structure detection settings blanked out and it was fine. That is, I removed this: //*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part|prologue|epilogue\s+', 'i')) or @class = 'chapter'] and I also removed this: //*[name()='h1' or name()='h2']. With those removed I ended up with one file per chapter and "normal" conversion and load times. I don't do a lot of AZW3 to EPUB conversions, so I don't know whether this was a specific issue with just one book or a more general problem. I don't know how to inspect the source format to see what could have caused this. I used calibre 1.19, 64-bit version. I am just flagging this as something that may need more investigation, in case anyone else has had similar conversion experiences.  | 
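For anyone wondering what the text test inside the first expression accepts, here is a rough Python sketch. It assumes that `re:test(., pattern, 'i')` behaves like an unanchored, case-insensitive `re.search`; the helper name `heading_matches` and the sample headings are made up for illustration. It only looks at the regex part and says nothing about which clause caused the splitting in this particular book.

```python
import re

# Pattern copied verbatim from the structure detection expression quoted above.
# Because of alternation precedence, the trailing \s+ applies only to "epilogue".
CHAPTER_PATTERN = re.compile(
    r'chapter|book|section|part|prologue|epilogue\s+', re.IGNORECASE
)


def heading_matches(text: str) -> bool:
    """Approximate re:test(., pattern, 'i') with an unanchored search (assumption)."""
    return CHAPTER_PATTERN.search(text) is not None


# Hypothetical heading texts, not taken from any real book.
for sample in ["Chapter 1", "PART TWO", "Departure", "Epilogue", "Epilogue ", "Afterword"]:
    print(repr(sample), heading_matches(sample))
```

Two things stand out under that assumption: the search is unanchored, so a heading such as "Departure" matches because it contains "part", and a bare "Epilogue" with no trailing whitespace does not match at all.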
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#2 | 
| 
			
			
			
			 US Navy, Retired 
			
			Posts: 9,897 
				Karma: 13806776 
				Join Date: Feb 2009 
				Location: North Carolina 
				
				
				Device: Icarus Illumina XL HD, Kindle PaperWhite SE 11th Gen 
				
				
				 | 
	
	|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#3 | 
| 
			
			
			
			 Wizard 
			
			Posts: 3,720 
				Karma: 1759970 
				Join Date: Sep 2010 
				
				
				
				Device: none 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			I don't know how to create and post an extract from an AZW3, so I will leave it for now. I am happy to arrange to provide a copy via PM for investigation. It could be that the whole book is needed to reproduce the "bug". 
		
	
		
		
		
		
		
		
		
		
		
		
	
	Also, I am not 100% sure whether my structure detection XPath expressions match the defaults or whether I have previously tinkered with them, but they have never caused anything like this before.  | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#4 | |
| 
			
			
			
			 US Navy, Retired 
			
			Posts: 9,897 
				Karma: 13806776 
				Join Date: Feb 2009 
				Location: North Carolina 
				
				
				Device: Icarus Illumina XL HD, Kindle PaperWhite SE 11th Gen 
				
				
				 | 
	
	
	
		
		
		
		
		 Quote: 
	
  | 
|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#5 | 
| 
			
			
			
			 Wizard 
			
			Posts: 3,720 
				Karma: 1759970 
				Join Date: Sep 2010 
				
				
				
				Device: none 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			Yes, I know, and that means creating a new bug reporting account, as I've lost my previous credentials, then finding that I probably can't attach the book anyway because of the size restriction.  
		
	
		
		
		
		
		
		
		
		
		
		
	
	Frankly, it is far too much hassle for something that may be a one-book-only glitch. I went to the trouble of posting what happened in case anyone else found it helpful. I am happy to arrange to forward the book as previously stated, but that's it. If I have cause to convert another AZW3 that gives the same problems then I'll reconsider, but otherwise, life's too short....  | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#6 | 
| 
			
			
			
			 Junior Member 
			
			Posts: 1 
				Karma: 10 
				Join Date: Feb 2014 
				
				
				
				Device: Tolino Shine 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			Hi there. 
		
	
		
		
		
		
		
		
		
		
		
		
	
	I'm new to this forum and had the same problem; I guess it's actually not a bug. I had previously converted books from AZW3 to EPUB perfectly, but the latest one gave me over 3000 pages instead of ~700. I'm not experienced with HTML, but comparing the HTML from a book that worked fine with the faulty one, it appears that the AZW3 downloaded from Amazon has <p class="chapter"> repeated many times throughout the text. That produces a page break at each one, both in the calibre viewer and on the physical e-book reader. Anyway, thanks cybmole, the solution from your first post worked fine for me...  | 
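To check whether a book is affected before converting, one could count the elements that the `@class = 'chapter'` clause would select. The sketch below is a minimal illustration using only the Python standard library; the sample markup and the function name `count_chapter_marks` are invented, and real XHTML from a book would normally carry namespaces that this simplified fragment leaves out.

```python
import xml.etree.ElementTree as ET

# Invented fragment imitating the pattern described above: ordinary body
# paragraphs that carry class="chapter" even though they are not headings.
SAMPLE = """<body>
  <h1>Chapter 1</h1>
  <p class="chapter">First sentence.</p>
  <p class="chapter">Second sentence.</p>
  <p>Ordinary paragraph.</p>
  <p class="chapter">Third sentence.</p>
</body>"""


def count_chapter_marks(xhtml: str) -> int:
    """Count elements whose class attribute is exactly 'chapter'."""
    root = ET.fromstring(xhtml)
    # ElementTree supports only a subset of XPath, but attribute tests work.
    return len(root.findall(".//*[@class='chapter']"))


print(count_chapter_marks(SAMPLE))
```

If the count is in the hundreds or thousands for a single book, each of those elements is a candidate split point for the expression in the first post, which would fit the one-sentence-per-file result reported there.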
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#7 | |
| 
			
			
			
			 null operator (he/him) 
			
			Posts: 22,018 
				Karma: 30277294 
				Join Date: Mar 2012 
				Location: Sydney Australia 
				
				
				Device: none 
				
				
				 | 
	
	
	
		
		
		
		
		 Quote: 
	
 BR  | 
|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 