Old 09-01-2008, 05:08 PM   #1
alexxxm
Addict
Posts: 223
Karma: 356
Join Date: Aug 2007
Device: Rocket; Hiebook; N700; Sony 505; Kindle DX ...
chapter detection, basic functionality

I encountered some problems in chapter detection while testing the bookit plugin, and found out that I don't understand exactly how it works.

Quite simply, I created a small HTML file with "<h1>some text</h1>" here and there, to represent chapter headings.
I found that the html2lrf switches --add-chapters-to-toc --chapter-regex="." (or --chapter-regex="h1", or --chapter-regex 'h1', or various other variations) accomplish exactly nothing: the LRF is created, but without any TOC.
Can you please explain what I'm doing wrong?

Thanks!

alessandro
Old 09-01-2008, 11:02 PM   #2
kovidgoyal
creator of calibre
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Download the calibre beta and use:
Code:
--chapter-attr h1,none,
Old 09-02-2008, 02:25 AM   #3
alexxxm
Seems like a sneaky way to find beta testers!

Joking aside, I was a bit afraid, but if it's not too unstable I'll give it a try.

However, this functionality seems to be present already in the stable version (at least according to the man pages), and the switch does not raise an error. It still generates an LRF file without a TOC, as usual.

alessandro
Old 09-02-2008, 03:37 AM   #4
alexxxm
Well, I jumped in, so I'll report any problems I find with the beta (none so far, apart from what follows).

I tried your suggestion, but I got:

[root@lambda2 Booklets in process]# html2lrf literature.htm --add-chapters-to-toc --chapter-attr h1,none,
Processing u'literature.htm'
Parsing HTML...
Converting to BBeB...
Traceback (most recent call last):
File "<string>", line 2026, in <module>
File "<string>", line 2020, in main
File "<string>", line 1910, in process_file
File "<string>", line 273, in __init__
File "<string>", line 395, in add_file
File "<string>", line 509, in parse_file
File "<string>", line 723, in process_children
File "<string>", line 1762, in parse_tag
File "<string>", line 723, in process_children
File "<string>", line 1449, in parse_tag
AttributeError: 'unicode' object has no attribute 'pattern'
/usr/bin/html2lrf: line 6: 32591 Segmentation fault ./html2lrf "$@"


whereas previously I usually didn't get any error (apart from the missing TOC).
Then it occurred to me that the new html2lrf was surely not in /usr/bin anymore, and sure enough it was in /opt/calibre.

Then I got this:

[root@lambda2 Booklets in process]# /opt/calibre/html2lrf literature.htm --add-chapters-to-toc --chapter-attr h1,none,
Traceback (most recent call last):
File "<string>", line 7, in <module>
File "/home/kovid/build/pyinstaller/iu.py", line 346, in importHook
ImportError: No module named PyQt4


Now, I have this on my machine:
[root@lambda2]# rpm -qa|grep -i pyqt
PyQt-3.17.4-1.fc8
PyQt-devel-3.17.4-1.fc8
PyQt4-4.3.3-1.fc8

I wonder if it's related to the warning I got recently with the updates, which tells me this:
WARNING: You need PyQt >= 4.4.2 for the GUI. You have 4.3.3
You may experience crashes or other strange behavior.


alessandro
Old 09-02-2008, 10:34 AM   #5
kovidgoyal
oops typo, will be fixed in the next beta release.

/usr/bin/html2lrf is the correct way to run it.
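
For readers puzzled by the AttributeError in post #4: that message is what Python raises when code reads the .pattern attribute of a plain string instead of a compiled regular expression. The sketch below only illustrates that class of error in Python 3 (where the message names 'str' rather than 'unicode'). It is not calibre's code and makes no claim about what the actual typo was; the helper names as_pattern and pattern_text are hypothetical.

```python
import re


def as_pattern(value):
    """Return a compiled regex, whether given a string or a compiled pattern.

    Hypothetical helper: normalising the input like this is one way to
    avoid calling regex-only attributes on a plain string.
    """
    if isinstance(value, str):
        return re.compile(value)
    return value


def pattern_text(value):
    """Read the source text of a compiled regex.

    Raises AttributeError if 'value' is a plain string, because strings
    have no 'pattern' attribute.
    """
    return value.pattern


compiled = as_pattern("h1")               # string -> compiled pattern
unchanged = as_pattern(re.compile("h2"))  # already compiled, returned as-is

try:
    pattern_text("h1")                    # plain string: no .pattern
    failed = False
except AttributeError:
    failed = True
```

Passing the option value through something like as_pattern before use keeps both forms working.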
Old 09-03-2008, 02:43 AM   #6
alexxxm
Quote:
Originally Posted by kovidgoyal
oops typo, will be fixed in the next beta release.
good, I'm waiting for it!
Anyway, coming back to my original question, I still have a basic doubt.

I believed that the default setup of the chapter-recognition engine was to look for anything inside <h1></h1> (I don't know about h2, h3, ...), so I only needed the default functionality, at most activated with --add-chapters-to-toc.
Does the need to use --chapter-attr h1,none, and to install the beta mean that the default chapter detection does not work?

Sorry for the questions, but the man pages are not very clear about it ...


alessandro
Old 09-03-2008, 11:15 AM   #7
kovidgoyal
The default is to search for the string chapter, section or book inside either h1 or h2 tags and, if one is found, to mark that heading as a chapter.
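
As a rough illustration of the default just described, here is a minimal Python sketch that collects h1 and h2 headings and keeps those whose text contains "chapter", "section" or "book". It is not calibre's implementation: the regular expression, the case-insensitive matching, and the names HeadingCollector and detect_chapters are assumptions made for the example.

```python
import re
from html.parser import HTMLParser

# Assumed stand-in for the default pattern; the real default may differ.
DEFAULT_CHAPTER_RE = re.compile(r"chapter|section|book", re.IGNORECASE)


class HeadingCollector(HTMLParser):
    """Collect (tag, text) pairs for every h1 and h2 element."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._tag = None      # heading tag currently open, if any
        self._parts = []      # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2"):
            self._tag = tag
            self._parts = []

    def handle_data(self, data):
        if self._tag is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            self.headings.append((tag, "".join(self._parts).strip()))
            self._tag = None


def detect_chapters(html, pattern=DEFAULT_CHAPTER_RE):
    """Return the text of h1/h2 headings whose text matches 'pattern'."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return [text for _tag, text in collector.headings if pattern.search(text)]


sample = (
    "<h1>Chapter 1</h1><p>text</p>"
    "<h1>some text</h1>"
    "<h2>Book Two</h2>"
    "<h3>Chapter 9</h3>"
)
chapters = detect_chapters(sample)
```

Under these assumptions, a heading such as <h1>some text</h1> is skipped because its text contains none of the three words, and the h3 is skipped because only h1 and h2 are examined.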
Old 09-04-2008, 02:56 AM   #8
alexxxm
Quote:
Originally Posted by kovidgoyal
The default is to search for the string chapter, section or book inside either h1 or h2 tags and, if one is found, to mark that heading as a chapter.
Ah, maybe I understand now: the strings chapter, section or book must be inside the tag!
But where? In the id attribute (<h1 id="chapter 1">chapter title</h1>)?

If so, it may be harder than I thought to create a TOC easily from the Bookit editor. I believed that html2lrf was able to "harvest" the chapters just by looking at <h1>chapter title</h1> text, so that in the Bookit editor I could simply apply h1 to the selected text.




alessandro
Old 09-04-2008, 10:28 AM   #9
kovidgoyal
--chapter-regex matches the text inside heading tags, i.e. <h1>chapter 1</h1> will match.

If you want to match on tag names or tag attributes, use --chapter-attr.
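
The difference between the two options can be sketched in Python. This is not calibre's code. It assumes that the --chapter-attr value has three comma-separated fields (a tag-name regex, an attribute name where "none" means "match on the tag name only", and an attribute-value regex), and that the tag-name regex must match the whole tag name. The function names match_by_text and match_by_attr are hypothetical.

```python
import re
from html.parser import HTMLParser


class ElementCollector(HTMLParser):
    """Record tag name, attributes and text for each element."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self._stack = []      # elements that are currently open

    def handle_starttag(self, tag, attrs):
        entry = {"tag": tag, "attrs": dict(attrs), "text": ""}
        self.elements.append(entry)
        self._stack.append(entry)

    def handle_data(self, data):
        for entry in self._stack:
            entry["text"] += data

    def handle_endtag(self, tag):
        # Close everything up to and including the matching open element.
        while self._stack:
            if self._stack.pop()["tag"] == tag:
                break


def collect(html):
    parser = ElementCollector()
    parser.feed(html)
    parser.close()
    return parser.elements


def match_by_text(elements, regex):
    """Text-based detection: test the regex against heading text only."""
    pattern = re.compile(regex, re.IGNORECASE)
    return [e["text"].strip() for e in elements
            if re.fullmatch(r"h[1-6]", e["tag"]) and pattern.search(e["text"])]


def match_by_attr(elements, spec):
    """Tag/attribute detection using an assumed 'tag,attr,value' spec."""
    tag_re, attr_name, value_re = spec.split(",", 2)
    hits = []
    for e in elements:
        if not re.fullmatch(tag_re, e["tag"]):
            continue
        if attr_name.lower() == "none":
            hits.append(e["text"].strip())       # tag name alone decides
        elif attr_name in e["attrs"] and re.search(
                value_re, e["attrs"][attr_name] or ""):
            hits.append(e["text"].strip())
    return hits


doc = '<h1>some text</h1><p>body</p><h1 class="title">Chapter 1</h1>'
elements = collect(doc)
by_text_h1 = match_by_text(elements, "h1")        # regex sees text, not tags
by_text_chapter = match_by_text(elements, "chapter")
by_tag = match_by_attr(elements, "h1,none,")
by_class = match_by_attr(elements, "h1,class,title")
```

In this sketch a text regex of "h1" finds nothing, because neither heading's text contains "h1", while the spec "h1,none," selects both headings by tag name regardless of their text.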