08-04-2009, 05:22 AM   #1
shilpa
Junior Member
Posts: 8
Karma: 10
Join Date: Aug 2009
Device: sony prs-505
TOC not identifying all chapters

Hi,

I have just bought a Sony PRS-505 and have downloaded the calibre software. My books are all in PDF, and I have converted some of them to EPUB. Because I am new to the software, I left the conversion settings at their defaults. I am quite happy with the results, apart from the chapters. When I do a conversion, it picks up chapters correctly up to a certain number and then it does not. For example, in a book with 51 chapters I get chapters 1-20 (so that is fine), but from 20 onwards I only get chapter 30, chapter 40, and chapter 50. It does this for most of my books, so it misses the chapters between 30 and 40, between 40 and 50, and so on. Does anybody know why this is happening? Is there a setting that I need to change?
08-05-2009, 12:07 AM   #2
Catire
Lord of the Universe
Posts: 670
Karma: 737849
Join Date: Jan 2008
Location: Maturin , Venezuela
Device: Sony Reader PRS-505 / PSP
Try modifying the "table of contents" and/or the "look and feel" settings. For that you'll need to read the XPath tutorial.



If that doesn't help, let me know.
If the book is not in copyright, you might find it here or on another site already formatted for your Sony. If it's not in copyright and you can't find it, post it here and I'll test it.


Btw, welcome to MobileRead.
08-05-2009, 04:28 AM   #3
shilpa
OK, thanks Catire. I will try that.
08-05-2009, 08:34 AM   #4
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
PDF conversion is error-prone. At one point during conversion, a regex attempts to identify the chapter headings and marks them up so that the XPath expression will catch them later. Editing the XPath from its default won't help with PDF input.

You could open a bug on this and attach your PDF; I can look into enhancing the regex if possible. That said, this will continue to happen with other PDFs, as the conversion can cause numerous variations in the output text.

Unfortunately, the best way to fix these in general is by hand. Now that Sigil is out, it is trivial to scroll through the book and mark the missed chapters so that they're added to the TOC. Note that if you're using calibre 0.6.x, you'll need to wait a couple of days until a related crash fix is released for Sigil; watch for version 0.1.1.
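
As a rough illustration of the two stages described at the top of this post (a regex marks up headings, then a heading query collects them), here is a minimal Python sketch. It is not calibre's code: the sample HTML, the simplified pattern `CHAPTER_RE`, and the `mark_heading` helper are all hypothetical, and the standard library's ElementTree stands in for the real XPath step.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical HTML as it might come out of a PDF conversion.
# The third heading is deliberately garbled ("Chap ter").
html = (
    "<body>"
    "<p>Chapter 1</p><p>It was a dark night.</p>"
    "<p>Chapter 2</p><p>Morning came.</p>"
    "<p>Chap ter 3</p><p>Nobody noticed.</p>"
    "</body>"
)

# Stage 1: a simplified stand-in for the chapter-detection regex.
CHAPTER_RE = re.compile(r"<p>\s*(?P<chap>Chapter\s+\d+)\s*</p>", re.IGNORECASE)

def mark_heading(match):
    # Hypothetical replacement: promote the matched paragraph to a heading.
    return '<h2 class="chapter">' + match.group("chap") + "</h2>"

marked = CHAPTER_RE.sub(mark_heading, html)

# Stage 2: a heading-based query, standing in for the TOC XPath.
root = ET.fromstring(marked)
toc = [h.text for h in root.findall(".//h2[@class='chapter']")]
print(toc)
```

The garbled third heading is never promoted, so it stays an ordinary paragraph and a heading-based query has nothing to find. That is the sense in which changing the XPath alone cannot recover a chapter the regex stage missed.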

Last edited by ldolse; 08-05-2009 at 08:39 AM.
08-05-2009, 09:58 AM   #5
shilpa
I will wait until the crash fix for Sigil comes out.

Also, I was wondering how you enhance the regex for PDF files. Which file do you open, and what is it that you edit?
08-05-2009, 10:35 AM   #6
ldolse
It's in preprocess.py. You'd need to download a local copy, edit it, and then update your local install.

On OS X I'd use the following command from the directory containing the source file to update my calibre installation (fixed according to Kovid's input):
Code:
calibre-debug --update module calibre.ebooks.preprocess ./preprocess.py
The directories vary a bit by platform; everything else is the same.

These are the regexes doing the chapter detection there:
Code:
                  # Detect Chapters to match default XPATH in GUI
                  (re.compile(r'(?=<(/?br|p))(<(/?br|p)[^>]*)?>\s*(?P<chap>(<i><b>|<i>|<b>)?(Chapter|Epilogue|Prologue|Book|Part)\s*(\d+|\w+)?(</i></b>|</i>|</b>)?)(</?p[^>]*>|<br[^>]*>)\n?((?=(<i>)?\s*\w+(\s+\w+)?(</i>)?(<br[^>]*>|</?p[^>]*>))((?P<title>(<i>)?\s*\w+(\s+\w+)?(</i>)?)(<br[^>]*>|</?p[^>]*>)))?', re.IGNORECASE), chap_head),
                  (re.compile(r'(?=<(/?br|p))(<(/?br|p)[^>]*)?>\s*(?P<chap>([A-Z \'"!]{5,})\s*(\d+|\w+)?)(</?p[^>]*>|<br[^>]*>)\n?((?=(<i>)?\s*\w+(\s+\w+)?(</i>)?(<br[^>]*>|</?p[^>]*>))((?P<title>.*)(<br[^>]*>|</?p[^>]*>)))?'), chap_head),
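
One way to see how the first pattern behaves is to run it against a few heading lines. The sketch below copies that pattern verbatim from the code above; the sample lines are hypothetical, and a real PDF conversion may produce quite different markup. It is offered as a guess at one possible cause of skipped chapters (headings whose number is spelled out in more than one word), not as a diagnosis of any particular book.

```python
import re

# First chapter-detection pattern, copied verbatim from the post above.
CHAPTER_RE = re.compile(
    r'(?=<(/?br|p))(<(/?br|p)[^>]*)?>\s*(?P<chap>(<i><b>|<i>|<b>)?(Chapter|Epilogue|Prologue|Book|Part)\s*(\d+|\w+)?(</i></b>|</i>|</b>)?)(</?p[^>]*>|<br[^>]*>)\n?((?=(<i>)?\s*\w+(\s+\w+)?(</i>)?(<br[^>]*>|</?p[^>]*>))((?P<title>(<i>)?\s*\w+(\s+\w+)?(</i>)?)(<br[^>]*>|</?p[^>]*>)))?',
    re.IGNORECASE,
)

# Hypothetical heading lines (assumed markup, not taken from a real conversion).
samples = [
    "<p>Chapter 21</p>",
    "<p>Chapter Twenty</p>",
    "<p>Chapter Twenty One</p>",
    "<p>Chapter Thirty</p>",
    "<p>Chapter Thirty-One</p>",
]

def detected(line):
    """Return the matched heading text, or None if the pattern misses the line."""
    m = CHAPTER_RE.search(line)
    return m.group("chap") if m else None

results = {line: detected(line) for line in samples}
for line, chap in results.items():
    print(f"{line!r:32} -> {chap!r}")
```

With these assumed inputs, the pattern picks up "Chapter 21", "Chapter Twenty" and "Chapter Thirty" but misses "Chapter Twenty One" and "Chapter Thirty-One", because `(\d+|\w+)?` allows only a single number or word before the closing tag. If a book's headings look like that, only the round tens would be detected after chapter twenty.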

Last edited by ldolse; 08-05-2009 at 01:20 PM.
08-05-2009, 11:29 AM   #7
kovidgoyal
creator of calibre
Posts: 43,844
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Actually in 0.6 the correct syntax is

Code:
calibre-debug --update module calibre.ebooks.preprocess ./preprocess.py
IIRC the old syntax will no longer work, though I'm not sure.
All times are GMT -4.