Old 01-13-2019, 11:34 AM   #1
thiago.eec
Connoisseur
Posts: 79
Karma: 71600
Join Date: Dec 2016
Location: Goiânia - Brazil
Device: iPad, Kindle Paperwhite
CFI Parser

Hi, @Kovid.

I want to modify the ACE plugin so that it outputs its results directly in the editor. That way the user could click on a result and jump straight to where the error is (as the EPUBCheck plugin does).

But there is one thing in the way: ACE uses non-standard EPUB CFI output. The starting point is not the spine; the CFI only addresses a location inside the file, like this:

Code:
toc.xhtml#epubcfi(/4[idParaDest-e]/2/4[toc])
Doc.xhtml#epubcfi(/4[idParaDest-b]/2/2/4[tit]/4)
The idea was to use your CFI parser, but I am having trouble dealing with this difference.

I thought maybe I could inject the first part of the reference, before passing it to parser.py, making it whole again. Like this:

Spine item to ref:
Code:
	<spine toc="ncx">
		<itemref idref="Cover"/>
		<itemref idref="Doc"/>
		<itemref idref="Doc-1"/>
		<itemref idref="Doc-2"/>
		<itemref idref="Doc-3"/>
		<itemref idref="toc"/>
		..
		<itemref idref="Doc-38"/>
		<itemref idref="Doc-39"/>
	</spine>
Injected ref:
Code:
epubcfi(/6/12!/4[idParaDest-e]/2/4[toc])
This is just an idea, and I don't know how to parse the spine anyway... maybe using bs4.

Can you help me with that? If I could make the CFI reference to work with your parser, then I will work on the rest of the code to achieve the goal of clickable references.
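A minimal sketch of the injection idea above, using only the Python standard library instead of bs4. The function names `spine_cfi_prefix` and `inject_prefix` and the sample OPF are hypothetical, made up for illustration. It assumes the OPF uses the standard OPF namespace and that CFI steps count element children in pairs (the Nth element child is step 2N).

```python
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Hypothetical, shortened OPF used only for this illustration.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <metadata/>
  <manifest/>
  <spine toc="ncx">
    <itemref idref="Cover"/>
    <itemref idref="Doc"/>
    <itemref idref="toc"/>
  </spine>
</package>"""


def spine_cfi_prefix(opf_xml, idref):
    """Build the '/<spine step>/<itemref step>!' part of a full CFI."""
    root = ET.fromstring(opf_xml)
    spine = root.find("opf:spine", OPF_NS)
    if spine is None:
        raise ValueError("no <spine> element found")
    # The Nth element child corresponds to CFI step 2 * N.
    spine_step = 2 * (list(root).index(spine) + 1)
    for pos, itemref in enumerate(spine.findall("opf:itemref", OPF_NS), start=1):
        if itemref.get("idref") == idref:
            return "/%d/%d!" % (spine_step, 2 * pos)
    raise KeyError(idref)


def inject_prefix(local_cfi, prefix):
    """Turn 'epubcfi(/4/2)' into 'epubcfi(<prefix>/4/2)'."""
    if not (local_cfi.startswith("epubcfi(") and local_cfi.endswith(")")):
        raise ValueError("not an epubcfi(...) string: %r" % local_cfi)
    return "epubcfi(" + prefix + local_cfi[len("epubcfi("):-1] + ")"


prefix = spine_cfi_prefix(SAMPLE_OPF, "toc")
print(inject_prefix("epubcfi(/4[idParaDest-e]/2/4[toc])", prefix))
```

In this shortened sample `toc` is the third spine item, so the prefix comes out as `/6/6!`; in the spine quoted above it is the sixth item, which gives `/6/12!`.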

Last edited by thiago.eec; 01-13-2019 at 07:43 PM.
Old 01-13-2019, 11:29 PM   #2
kovidgoyal
creator of calibre
Posts: 33,906
Karma: 10253972
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Not sure what you are asking. The parser does not care whether the first step is a spine reference or not. For instance, to parse

Code:
epubcfi(/4[idParaDest-e]/2/4[toc])
you simply call parse_epubcfi()

which will give you:

Code:
({u'steps': [{u'id': u'idParaDest-e', u'num': 4},
   {u'num': 2},
   {u'id': u'toc', u'num': 4}]},
 {},
 {},
 u'')
Read the steps from the first returned value.
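As a small illustration of reading the steps, the sketch below works on a literal copy of the tuple shown above instead of calling the parser, so it runs without calibre. The helper name `element_path` is hypothetical. It assumes the usual CFI convention that an even step number N addresses the (N // 2)-th element child, counting from 1.

```python
# Result tuple copied from the parser output quoted above.
parsed = (
    {'steps': [{'id': 'idParaDest-e', 'num': 4},
               {'num': 2},
               {'id': 'toc', 'num': 4}]},
    {}, {}, '',
)


def element_path(parsed):
    """Return (child_position, id_or_None) pairs, one per step.

    child_position is 1-based: step /4 is the second element child.
    """
    steps = parsed[0].get('steps', [])
    return [(step['num'] // 2, step.get('id')) for step in steps]


print(element_path(parsed))
# [(2, 'idParaDest-e'), (1, None), (2, 'toc')]
```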
kovidgoyal is offline   Reply With Quote
Old Yesterday, 05:13 AM   #3
thiago.eec
Hi, Kovid.

According to the spec, an EPUB CFI reference starts at the spine (at least that is what I understood).

Let's suppose I call the parser with just this:
Code:
epubcfi(/4[idParaDest-e]/2/4[toc])
How would it know this reference is about "toc.xhtml"?
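One way to look at this question, as a sketch: the file name is not encoded in the CFI itself, but ACE reports it in front of the `#`, so it can be split off before the CFI is handed to a parser. The function name `split_ace_reference` is hypothetical, and the sketch assumes every ACE reference has the `name#epubcfi(...)` shape of the two examples earlier in the thread.

```python
def split_ace_reference(ref):
    """Split 'toc.xhtml#epubcfi(...)' into ('toc.xhtml', 'epubcfi(...)')."""
    name, sep, fragment = ref.partition('#')
    if not sep or not (fragment.startswith('epubcfi(') and fragment.endswith(')')):
        raise ValueError('unexpected reference format: %r' % ref)
    return name, fragment


name, cfi = split_ace_reference('toc.xhtml#epubcfi(/4[idParaDest-e]/2/4[toc])')
print(name)  # toc.xhtml
print(cfi)   # epubcfi(/4[idParaDest-e]/2/4[toc])
```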

-------------------------

Another question: how can I use the parser to get the line and column? I thought the parser would take the CFI reference and convert it to (line, col).
Old Yesterday, 05:28 AM   #4
kovidgoyal
All the parser does is convert a serialized CFI reference into a form that is easier to access programmatically; it does not know anything about spines, lines and columns. In fact, CFI cannot be used to go to a particular line and column. The best you can do in general is go to the line that contains the start of the tag referenced by the CFI. I don't know why the ACE tool chose CFI as a way to report error locations; it is not suited to that task at all. It's a way to reference locations in a rendered HTML tree, not in HTML source.
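To make "the line that contains the start of the tag" concrete, here is a rough standard-library sketch. It is not calibre's code and not how the editor works internally. The function name `line_of_cfi_element` and the sample markup are hypothetical. It assumes the file is well-formed XML without undeclared entities such as `&nbsp;`, ignores ID assertions and character offsets, and stops at the first odd step, since odd steps address text rather than elements.

```python
import xml.parsers.expat


def line_of_cfi_element(xhtml, steps):
    """Return the 1-based source line of the start tag of the element
    addressed by the leading even-numbered CFI steps, or None."""
    # An even step N means "the (N // 2)-th element child".
    target = []
    for step in steps:
        if step['num'] % 2:
            break
        target.append(step['num'] // 2)

    path = []          # child positions of the open elements below the root
    child_counts = []  # element children seen so far, per open element
    found = []

    parser = xml.parsers.expat.ParserCreate()

    def start(name, attrs):
        if child_counts:  # every element except the root is someone's child
            child_counts[-1] += 1
            path.append(child_counts[-1])
        child_counts.append(0)
        if not found and path == target:
            found.append(parser.CurrentLineNumber)

    def end(name):
        child_counts.pop()
        if child_counts:  # the root never pushed a position onto path
            path.pop()

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(xhtml, True)
    return found[0] if found else None


# Hypothetical sample document for illustration.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>t</title></head>
<body id="idParaDest-e">
<div>
<p>one</p>
<p id="toc">two</p>
</div>
</body>
</html>"""

steps = [{'id': 'idParaDest-e', 'num': 4}, {'num': 2}, {'id': 'toc', 'num': 4}]
print(line_of_cfi_element(SAMPLE, steps))  # 6
```

Counting elements in the source only matches the CFI when the source tree and the rendered tree have the same element structure, which is the caveat raised above.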
Old Yesterday, 05:45 AM   #5
thiago.eec
Quote:
Originally Posted by kovidgoyal View Post
The best you can do in general is go to the line that contains the start of the tag referenced by the CFI.
OK, that would be good enough. Can you give me some guidance on how to do that?

P.S.: I thought it would be possible to get line and column after reading this post by KevinH. He used your parser to achieve that.


Quote:
Originally Posted by kovidgoyal View Post
I don't know why the ACE tool chose CFI as a way to report error locations; it is not suited to that task at all. It's a way to reference locations in a rendered HTML tree, not in HTML source.
It surely would be easier if it gave a (line, col), like EPUBCheck.
Old Yesterday, 05:59 AM   #6
kovidgoyal
I don't see how that post achieves that. All it does is return the line and col of the containing tag, as far as I can tell.

As for converting a CFI to the containing tag, there is code to do that for the viewer (it uses CFI internally for bookmarking), but not for the editor; it would need to be added.
Old Yesterday, 06:21 AM   #7
thiago.eec
Quote:
Originally Posted by kovidgoyal View Post
I don't see how that post achieves that. All it does is return the line and col of the containing tag, as far as I can tell.
With line and col, maybe I could call something like:

Code:
editor.go_to_line(item.line, item.col)
Quote:
Originally Posted by kovidgoyal View Post
As for converting a CFI to the containing tag, there is code to do that for the viewer (it uses CFI internally for bookmarking), but not for the editor; it would need to be added.
Is it possible to add this, so that I could pass the CFI and get back the line and col?

Last edited by thiago.eec; Yesterday at 07:01 AM.
Old Yesterday, 09:04 AM   #8
kovidgoyal
There you go:

https://github.com/kovidgoyal/calibr...655c8f66957c9b
Old Yesterday, 10:08 AM   #9
thiago.eec
Thanks a lot, Kovid!

I'll check this when I get home.