Old 01-13-2019, 10:34 AM   #1
thiago.eec
Connoisseur
Posts: 93
Karma: 71600
Join Date: Dec 2016
Location: Goiânia - Brazil
Device: iPad, Kindle Paperwhite
CFI Parser

Hi, @Kovid.

I want to modify the ACE plugin so that it outputs its results directly in the Editor. That way the user could click on a line and jump straight to where the error is (as the EPUBCheck plugin does).

But there is one thing in the way: ACE uses a non-standard EPUB CFI output. The starting point is not the spine. It uses CFI references only inside the file, like this:

Code:
toc.xhtml#epubcfi(/4[idParaDest-e]/2/4[toc])
Doc.xhtml#epubcfi(/4[idParaDest-b]/2/2/4[tit]/4)
The idea was to use your CFI parser, but I am having trouble dealing with this difference.
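For illustration, a minimal sketch of separating the two parts of such a reference before any parsing happens. The function name `split_ace_ref` is hypothetical, and the sketch assumes ACE always emits the `file#epubcfi(...)` shape shown above:

```python
def split_ace_ref(ref):
    """Split 'file#epubcfi(...)' into (file name, file-local CFI path)."""
    name, sep, fragment = ref.partition("#")
    prefix = "epubcfi("
    # Assumption: the fragment is always wrapped as epubcfi(...).
    if not sep or not fragment.startswith(prefix) or not fragment.endswith(")"):
        raise ValueError("not an ACE-style CFI reference: %r" % ref)
    return name, fragment[len(prefix):-1]


name, local_cfi = split_ace_ref("toc.xhtml#epubcfi(/4[idParaDest-e]/2/4[toc])")
```

The file name identifies which document to open, and the remaining path is what gets resolved inside that document.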

I thought maybe I could inject the first part of the reference, before passing it to parser.py, making it whole again. Like this:

Spine item to ref:
Code:
	<spine toc="ncx">
		<itemref idref="Cover"/>
		<itemref idref="Doc"/>
		<itemref idref="Doc-1"/>
		<itemref idref="Doc-2"/>
		<itemref idref="Doc-3"/>
		<itemref idref="toc"/>
		..
		<itemref idref="Doc-38"/>
		<itemref idref="Doc-39"/>
	</spine>
Injected ref:
Code:
epubcfi(/6/12!/4[idParaDest-e]/2/4[toc])
This is just an idea, and I don't know how to parse the spine anyway ... maybe using bs4.

Can you help me with that? If I can make the CFI reference work with your parser, I will work on the rest of the code to achieve the goal of clickable references.
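A minimal sketch of the injection idea, using only the standard library rather than bs4. The names `spine_prefix`, `inject_spine` and `SAMPLE_OPF` are hypothetical, the sample package document is shortened for illustration, and the sketch assumes the file name reported by ACE matches the manifest `href` exactly:

```python
import xml.etree.ElementTree as ET

OPF_NS = "{http://www.idpf.org/2007/opf}"

# Hypothetical, shortened package document used only for illustration.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <metadata/>
  <manifest>
    <item id="Cover" href="Cover.xhtml" media-type="application/xhtml+xml"/>
    <item id="Doc" href="Doc.xhtml" media-type="application/xhtml+xml"/>
    <item id="toc" href="toc.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="Cover"/>
    <itemref idref="Doc"/>
    <itemref idref="toc"/>
  </spine>
</package>"""


def spine_prefix(opf_xml, href):
    """Return the package-level CFI prefix (such as '/6/4!') for a manifest href."""
    root = ET.fromstring(opf_xml)
    manifest = root.find(OPF_NS + "manifest")
    spine = root.find(OPF_NS + "spine")
    # Map each file name to its manifest id; raises KeyError for unknown files.
    item_id = {item.get("href"): item.get("id") for item in manifest}[href]
    # CFI numbers the n-th child element of a parent as step 2*n.
    spine_step = 2 * (list(root).index(spine) + 1)
    for n, itemref in enumerate(spine, start=1):
        if itemref.get("idref") == item_id:
            return "/{}/{}!".format(spine_step, 2 * n)
    raise KeyError(href)


def inject_spine(opf_xml, href, local_cfi):
    """Prepend the spine steps to a file-local CFI path such as '/4[x]/2'."""
    return "epubcfi({}{})".format(spine_prefix(opf_xml, href), local_cfi)


full_cfi = inject_spine(SAMPLE_OPF, "toc.xhtml", "/4[idParaDest-e]/2/4[toc]")
```

In this sample the spine is the third child of the package and `toc` is the third itemref, so the prefix comes out as `/6/6!`. In the spine quoted above, `toc` is the sixth itemref, which would correspond to `/6/12!` if the spine is also the third child there.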

Last edited by thiago.eec; 01-13-2019 at 06:43 PM.
Old 01-13-2019, 10:29 PM   #2
kovidgoyal
creator of calibre
Posts: 34,334
Karma: 10323932
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Not sure what you are asking. The parser does not care whether the first step is a spine reference or not. For instance, to parse

Code:
epubcfi(/4[idParaDest-e]/2/4[toc])
you simply call parse_epubcfi()

which will give you:

Code:
({u'steps': [{u'id': u'idParaDest-e', u'num': 4},
   {u'num': 2},
   {u'id': u'toc', u'num': 4}]},
 {},
 {},
 u'')
Read the steps from the first returned value.
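As a rough illustration of what "reading the steps" can look like, here is a sketch that follows a step list of the shape shown above through a document tree. It does not call calibre at all: `resolve_steps` is a hypothetical helper, the parsed tuple is copied from the output above, and the sample document is invented to match it.

```python
import xml.etree.ElementTree as ET


def resolve_steps(root, steps):
    """Follow even-numbered CFI steps from the root element to the target element."""
    node = root
    for step in steps:
        num = step["num"]
        if num < 1:
            raise ValueError("invalid CFI step: %r" % num)
        if num % 2:
            # Odd steps address text between elements; stop at the container.
            break
        # Step 2*n selects the n-th child element (raises IndexError if absent).
        node = list(node)[num // 2 - 1]
        wanted = step.get("id")
        if wanted is not None and node.get("id") != wanted:
            raise ValueError("id assertion failed: expected %r" % wanted)
    return node


# Parsed form as shown above (first item of the returned tuple holds the steps).
parsed = ({"steps": [{"id": "idParaDest-e", "num": 4},
                     {"num": 2},
                     {"id": "toc", "num": 4}]}, {}, {}, "")

# Invented document whose structure matches those steps.
doc = ET.fromstring(
    '<html><head/><body id="idParaDest-e"><div><p/><nav id="toc"/></div></body></html>'
)
target = resolve_steps(doc, parsed[0]["steps"])
```

Here `/4` selects the second child of the root (`body`), `/2` its first child, and `/4` that element's second child, which is the `nav` with id `toc`.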
Old 01-14-2019, 04:13 AM   #3
thiago.eec
Connoisseur
Hi, Kovid.

According to the spec, an EPUB CFI reference starts at the spine (at least that is what I understood).

Let's suppose I call the parser with just this:
Code:
epubcfi(/4[idParaDest-e]/2/4[toc])
How would it know this reference is about "toc.xhtml"?

-------------------------

Another question: how can I use the parser to get the line and column? I thought the parser would take the CFI reference and convert it to (line, col).
Old 01-14-2019, 04:28 AM   #4
kovidgoyal
creator of calibre
All the parser does is convert a serialized CFI reference into a form that is easier to access programmatically; it does not know anything about spines, lines, or columns. In fact, CFI cannot be used to go to a particular line and column. The best you can do in general is go to the line that contains the start of the tag referenced by the CFI. I don't know why the ACE tool chose CFI as a way to report error locations; it is not suited to that task at all. It's a way to reference locations in a rendered HTML tree, not HTML source.
Old 01-14-2019, 04:45 AM   #5
thiago.eec
Connoisseur
Quote:
Originally Posted by kovidgoyal
The best you can do in general is go to the line that contains the start of the tag referenced by the CFI.
OK. That would be good enough. Can you give me some guidance on how to do that?

P.S.: I thought it would be possible to get the line and column after reading this post by KevinH. He used your parser to achieve that.


Quote:
Originally Posted by kovidgoyal
I don't know why the ACE tool chose CFI as a way to report error locations; it is not suited to that task at all. It's a way to reference locations in a rendered HTML tree, not HTML source.
It surely would be easier if it gave a (line, col), like EPUBCheck.
Old 01-14-2019, 04:59 AM   #6
kovidgoyal
creator of calibre
I don't see how that post achieves that. All it does is return the line and column of the containing tag, as far as I can tell.

As for converting a CFI to the containing tag, there is code to do that for the viewer (it uses CFI internally for bookmarking), but not for the editor; it would need to be added.
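Independently of calibre's own code, the general idea of mapping CFI steps to the source line of the containing tag can be sketched with the standard library's expat bindings, which report the position of each start tag. This is only an illustration under stated assumptions: `find_tag_position` is a hypothetical name, the input must be well-formed XML, and the sample document is invented.

```python
import xml.parsers.expat


def find_tag_position(source, steps):
    """Return (line, column) of the start tag addressed by CFI steps, or None."""
    target = []
    for step in steps:
        if step["num"] % 2:
            break  # odd steps address text; stop at the containing element
        target.append(step["num"] // 2)  # 1-based element index at this depth

    path = []      # element indices of the open elements below the root
    counters = []  # child elements seen so far, one counter per open element
    found = []
    parser = xml.parsers.expat.ParserCreate()

    def start(name, attrs):
        if counters:  # every element except the root gets an index
            counters[-1] += 1
            path.append(counters[-1])
            if path == target and not found:
                # Lines are 1-based, columns 0-based, at the tag's '<'.
                found.append((parser.CurrentLineNumber, parser.CurrentColumnNumber))
        counters.append(0)

    def end(name):
        counters.pop()
        if path:
            path.pop()

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(source, True)
    return found[0] if found else None


# Invented sample document; the steps mirror /4[idParaDest-e]/2/4[toc].
SAMPLE = """<html>
<head><title>t</title></head>
<body id="idParaDest-e">
<div>
<p>one</p>
<nav id="toc">x</nav>
</div>
</body>
</html>"""
steps = [{"id": "idParaDest-e", "num": 4}, {"num": 2}, {"id": "toc", "num": 4}]
position = find_tag_position(SAMPLE, steps)
```

The returned pair is the position of the tag's opening `<`, which matches the limitation described above: the start of the containing tag, not an arbitrary point inside it.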
Old 01-14-2019, 05:21 AM   #7
thiago.eec
Connoisseur
Quote:
Originally Posted by kovidgoyal
I don't see how that post achieves that. All it does is return the line and column of the containing tag, as far as I can tell.
With the line and column, maybe I could call something like:

Code:
editor.go_to_line(item.line, item.col)
Quote:
Originally Posted by kovidgoyal
As for converting a CFI to the containing tag, there is code to do that for the viewer (it uses CFI internally for bookmarking), but not for the editor; it would need to be added.
Is it possible to add this, so that I could pass the CFI and get back the line and column?

Last edited by thiago.eec; 01-14-2019 at 06:01 AM.
Old 01-14-2019, 08:04 AM   #8
kovidgoyal
creator of calibre
There you go:

https://github.com/kovidgoyal/calibr...655c8f66957c9b
Old 01-14-2019, 09:08 AM   #9
thiago.eec
Connoisseur
Thanks a lot, Kovid!

I'll check this when I get home.
Old 01-17-2019, 10:51 AM   #10
thiago.eec
Connoisseur
Hi, Kovid.

Thank you very much for the code. It's working perfectly.
However, I've noticed an odd behavior, and I don't know whether it is by design.

Steps to reproduce:

1) My Editor is set to wrap long lines.
2) When I click on a partial CFI, it goes to the beginning of the referenced line, as expected.
3) When the CFI has an ID for the target tag, it tries to place the cursor on that ID. Nice.
4) But... it only works if the ID is in the first wrapped "line".
5) If I resize the code window so that the ID attribute moves to the second wrapped "line", the cursor is placed at the beginning of the real line, as if there were no ID.
Old 01-17-2019, 07:46 PM   #11
kovidgoyal
creator of calibre
Probably a performance optimization -- open a ticket for it and I will look into it, see if it can be implemented for at least a couple of lines without too much impact.
Old 01-18-2019, 05:18 AM   #12
thiago.eec
Connoisseur
Quote:
Originally Posted by kovidgoyal
Probably a performance optimization
Hi, Kovid.

I didn't notice any impact on performance. Running ACE with the plugin takes roughly the same time as with the command line tool (on my system, the difference is around 0.3s).

When I click on a message (which uses the CFI function), the file opens instantaneously, with the cursor on the right line.

Quote:
Originally Posted by kovidgoyal
see if it can be implemented for at least a couple of lines without too much impact.
I'm attaching the plugin in case you want to test it yourself (if you have the ACE tool installed, anyway).
I've incorporated your code into the plugin for older versions of calibre, and let it run normally on versions 3.38 and later.

Take a look at the go_to_line() function.


--------------

As for the issue itself, it's not about the position of the ID, as I suspected before. It affects any attribute which spans 2 wrapped "lines". The cursor will be placed at the end of that attribute value.


P.S.: In case you need a book to test with, use the attached file (scrambled). The issue is most easily noticed on the footnote references (when ACE complains about the corresponding ARIA role for them, doc-noteref).
Attached Files
File Type: zip ACE.zip (67.8 KB, 22 views)
File Type: epub Sample_scrambled.epub (439.1 KB, 24 views)
Old 01-18-2019, 11:45 AM   #13
thiago.eec
Connoisseur
Quote:
Originally Posted by kovidgoyal
Probably a performance optimization -- open a ticket for it and I will look into it, see if it can be implemented for at least a couple of lines without too much impact.
Ticket: https://bugs.launchpad.net/calibre/+bug/1812400
All times are GMT -4.


MobileRead.com is a privately owned, operated and funded community.