Old 10-10-2019, 11:30 AM   #31
j.p.s
Grand Sorcerer
j.p.s ought to be getting tired of karma fortunes by now.
 
Posts: 5,275
Karma: 98804578
Join Date: Apr 2011
Device: pb360
Quote:
Originally Posted by Luca2903
Hello man, the format is .Kfx.

Thanks.
I don't know of any tool to extract the highlighted text from KFX books, only the amazon encoded highlight location information.

You can get your books on your kindle in KF8 (azw3) by doing the following.

Go to the "Your Content and Devices" (or some such wording) page on amazon.com and find the books you want. Select download and transfer over USB. Also, remove the book and its .sdr folder from your kindle and put it in airplane mode.

Copy the downloaded azw3 file to your kindle, open the book and page around a bit and make some bookmarks, then close the book and leave airplane mode, then do a sync. If all goes well, then you should be able to use the tools in this thread to get your highlights.
Old 10-29-2021, 03:14 PM   #32
BlackWolf1994
Junior Member
BlackWolf1994 began at the beginning.
 
Posts: 4
Karma: 10
Join Date: Oct 2021
Device: Kindle Paperwhite
Quote:
Originally Posted by j.p.s
I'm attaching a PDF of a book with inserted highlights and notes to this post along with associated files to make it. It turns out that the utility html2ps does not choke on the XML in a rawml file like my web browser does, so there was no need to comment out the XML. What is also surprising to me is that the TOC in the PDF works. This is not meant to be a book with highlights and notes, but rather the highlights and notes shown in context.

The source book is EPUB of The Humbugs of the World by P T Barnum from the Mobileread Library. I used kindlegen to make a dual mobi and used kindleunpack to extract the rawml and azw3, which I copied to a kindle and quickly made 9 highlights with bogus notes.

Then I copied the azw3r and dumped the notes, which also gives the start and end of each highlight. Next I used the notes_insert.pl from the first post to modify the rawml, then html2ps and ps2pdf. You can search the PDF for '[HL]' or '[Note:' to find the highlights and notes.
Hello,

I generate my AZW3 files by converting EPUBs using Calibre. Then I copy those over to the Kindle (Kindle Paperwhite IV), read, and add notes/highlights. Afterwards, I copy the book's .sdr folder (including the .azw3r and .azw3f files) to my computer.

I used your scripts from here:

https://github.com/jps-e/azw3r

in the following way:

1. Extract notes:

perl azw3r.pl -n -i <azw3r_file_path> > out.notes

2. Extract highlights:

perl azw3r.pl -h -i <azw3r_file_path> -r <dat_file_path> > out.highlights

I generated the DAT file by using KindleUnpack from here:

https://github.com/kevinhendricks/KindleUnpack

with this command:

python3 kindleunpack.py -d <azw3_file_path> out.dat

The dat file I used for the -r parameter was out.dat/mobi8/assembled_text.dat

The step I don't understand is generating the PDF file. Based on your comment, I should be able to use the notes_insert.pl script for this, but I don't know how to call it so that it does this.

Can you help?

Thanks.
Old 10-29-2021, 03:54 PM   #33
j.p.s
Quote:
Originally Posted by BlackWolf1994
The step I don't understand is generating the PDF file. Based on your comment, I should be able to use the notes_insert.pl script for this, but I don't know how to call it so that it does this.

Can you help?

Thanks.
Do you want to make a PDF with only the notes and highlighted text or a PDF with all the text in the book with the highlights marked and the notes inserted (and somewhat rough formatting and no images)?
Old 10-30-2021, 01:42 AM   #34
BlackWolf1994
Quote:
Originally Posted by j.p.s
Do you want to make a PDF with only the notes and highlighted text or a PDF with all the text in the book with the highlights marked and the notes inserted (and somewhat rough formatting and no images)?
If I understood correctly, I would want to achieve the latter - Have a PDF file that has all the text from the book + highlights and notes inserted at the appropriate places (with the [HL]/[Note:] as you previously explained).
Old 10-30-2021, 03:22 PM   #35
j.p.s
Quote:
Originally Posted by BlackWolf1994
If I understood correctly, I would want to achieve the latter - Have a PDF file that has all the text from the book + highlights and notes inserted at the appropriate places (with the [HL]/[Note:] as you previously explained).
Nobody asked before, and my instructions were incomplete.

It turns out that the output of azw3r.pl needs to be sorted. In addition, notes_insert.pl will not work correctly if both highlights and notes are in its input file.

So assuming the notes or highlights file has been sorted, the usage for notes_insert.pl is:
Code:
perl notes_insert.pl -r assembled_text.dat < notes.txt > annotated_book.html
ebook-convert annotated_book.html annotated_book.pdf
ebook-convert is the calibre ebook converter. I assume the calibre GUI would work as well.

Of course, you can use any file names you like for input and output files.
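
For the sorting step, a minimal Python sketch is below. It assumes each record in the azw3r.pl output sits on one line that begins with its numeric start location; that layout is an assumption, so check your own out.notes or out.highlights file first. sort_records is a made-up helper name, not part of the azw3r scripts.

```python
import re


def sort_records(lines):
    """Sort annotation records by their leading numeric location.

    Assumption: one record per line, starting with an integer offset.
    """
    def start_offset(line):
        match = re.match(r"\s*(\d+)", line)
        # Lines without a leading number sort first and keep their relative order.
        return int(match.group(1)) if match else -1

    return sorted(lines, key=start_offset)


def sort_file(in_path, out_path):
    """Read a notes or highlights file and write a sorted copy."""
    with open(in_path, encoding="utf-8") as src:
        records = src.readlines()
    with open(out_path, "w", encoding="utf-8") as dst:
        dst.writelines(sort_records(records))
```

The sort has to be numeric: a plain text sort would put location 1000 ahead of location 20.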
Old 10-31-2021, 04:59 AM   #36
BlackWolf1994
Quote:
Originally Posted by j.p.s
Nobody asked before, and my instructions were incomplete.

It turns out that the output of azw3r.pl needs to be sorted. In addition, notes_insert.pl will not work correctly if both highlights and notes are in its input file.

So assuming the notes or highlights file has been sorted, the usage for notes_insert.pl is:
Code:
perl notes_insert.pl -r assembled_text.dat < notes.txt > annotated_book.html
ebook-convert annotated_book.html annotated_book.pdf
ebook-convert is the calibre ebook converter. I assume the calibre GUI would work as well.

Of course, you can use any file names you like for input and output files.
Amazing, that worked! I managed to generate separate PDFs with highlights and notes.

Big thanks!

Based on your comment, I gather that it is impossible to generate a single PDF file which would have both HLs and notes? I tried generating one of the HTMLs first and using it as input for the notes_insert script, but that doesn't work, it places the notes or highlights at wrong places, which makes sense since we added new text so the locations shifted. I have no other ideas. Would it maybe be possible to make it work if the input file has both HLs and Notes? Would that file have to include all HLs sorted followed by all Notes sorted or would it be a mix of HLs and Notes sorted?
Old 10-31-2021, 06:23 PM   #37
j.p.s
Quote:
Originally Posted by BlackWolf1994
Amazing, that worked! I managed to generate separate PDFs with highlights and notes.

Big thanks!
You're welcome.

Quote:
Based on your comment, I gather that it is impossible to generate a single PDF file which would have both HLs and notes?
Impossible with the scripts as is. They might work if there is no overlap between notes and highlights, maybe by tweaking the location assigned to any note associated with a highlight.

Quote:
I tried generating one of the HTMLs first and using it as input for the notes_insert script, but that doesn't work, it places the notes or highlights at wrong places, which makes sense since we added new text so the locations shifted. I have no other ideas. Would it maybe be possible to make it work if the input file has both HLs and Notes? Would that file have to include all HLs sorted followed by all Notes sorted or would it be a mix of HLs and Notes sorted?
The reason the notes and highlights files need to be sorted is that the insertion method is very simple, which also minimizes memory requirements and code complexity. The scripts could run on a home computer from the 80's with less than 1 MB RAM for a book many megabytes in size. notes_insert.pl simply copies assembled_text.dat from the current position to just before the start of the next note or highlight, writes some formatting, a label, then any text to be inserted, some more formatting, then repeats for the next note or highlight.
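
The copy-and-insert loop described above can be sketched in Python as follows. This is not a port of notes_insert.pl: the function name, the (start, text) tuple layout, the use of byte offsets, and the <b>...</b> markup are all assumptions chosen for illustration. Only the '[Note:' label comes from this thread.

```python
import io


def insert_annotations(src, dst, annotations, label="Note"):
    """Single-pass insertion into a binary stream.

    src: readable binary file object (e.g. assembled_text.dat)
    dst: writable binary file object
    annotations: iterable of (start_offset, text), sorted by start_offset
    """
    pos = 0
    for start, text in annotations:
        if start < pos:
            raise ValueError("annotations must be sorted by start offset")
        # Copy the book text from the current position up to the annotation.
        dst.write(src.read(start - pos))
        pos = start
        # Hypothetical formatting: bold label plus the inserted text.
        dst.write(f"<b>[{label}: {text}]</b>".encode("utf-8"))
    # Copy whatever is left after the last annotation.
    dst.write(src.read())


demo_out = io.BytesIO()
insert_annotations(io.BytesIO(b"Hello world"), demo_out, [(5, "hi")])
```

Because the input is only ever read forward, an unsorted list cannot be handled, which is why the sketch raises an error instead of seeking backwards.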

They could be rewritten to do what you want by building a table of locations and shift sizes and using that to do the copying and inserting. That would be way too tedious and time consuming for me to consider. For someone else, it might be a piece of cake to bang out or a challenge worth taking on.
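
A rough Python sketch of the table idea, for anyone who wants to pick it up. It is not part of the azw3r scripts, and ShiftTable and insert_all are hypothetical names. Every insertion is given as an offset into the original assembled_text.dat; the table remembers where text has already been added and how much, so later insertions land in the right spot regardless of order. Unlike the streaming approach, it holds the whole book in memory.

```python
from bisect import bisect_right


class ShiftTable:
    """Tracks insertions so original offsets can be mapped to current ones."""

    def __init__(self):
        self.locations = []  # original offsets that received insertions (sorted)
        self.sizes = []      # number of bytes inserted at each location

    def shifted(self, offset):
        """Return where an original offset now sits in the edited text."""
        i = bisect_right(self.locations, offset)
        return offset + sum(self.sizes[:i])

    def record(self, offset, size):
        """Remember that `size` bytes were inserted at original `offset`."""
        i = bisect_right(self.locations, offset)
        self.locations.insert(i, offset)
        self.sizes.insert(i, size)


def insert_all(text, insertions):
    """Apply (original_offset, payload_bytes) insertions given in any order."""
    out = bytearray(text)
    table = ShiftTable()
    for offset, payload in insertions:
        at = table.shifted(offset)
        out[at:at] = payload
        table.record(offset, len(payload))
    return bytes(out)
```

With this layout, highlights and notes could go into one list of insertions; two insertions at the same original offset appear in the order they were given.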

The reason the thread title ends in "info" instead of "tool" and is in the Kindle Formats forum instead of the Kindle forum is that its purpose is to provide information for others to pick up and take further without having to work out how to process azw3r files.
Old 11-01-2021, 01:04 AM   #38
BlackWolf1994
Quote:
Originally Posted by j.p.s
You're welcome.


Impossible with the scripts as is. They might work if there is no overlap between notes and highlights, maybe by tweaking the location assigned to any note associated with a highlight.
I see. From what I've seen while playing with this, HLs and Notes always overlap and are matched, i.e. every note is essentially a highlight as well. If the range for a highlight is A - B, then the note matched with it is always C - B, where C > A.
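
If that pattern holds, notes can be attached to their highlights by matching on the shared end location. A small Python sketch, assuming the ranges have already been parsed into tuples (the tuple layout and the function name pair_notes_with_highlights are made up for illustration, and the pattern itself is only an observation from a few books):

```python
def pair_notes_with_highlights(highlights, notes):
    """Match each highlight with the note that ends at the same location.

    highlights: list of (start, end)
    notes: list of (start, end, text)
    Returns a list of ((start, end), note_text_or_None), in highlight order.
    """
    # Index notes by their end location, which they share with their highlight.
    note_by_end = {end: text for _start, end, text in notes}
    return [((start, end), note_by_end.get(end)) for start, end in highlights]
```

Highlights without a note come back paired with None, so plain highlights and annotated ones can be handled in a single pass.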

Quote:
Originally Posted by j.p.s
The reason the notes and highlights files need to be sorted is that the insertion method is very simple, which also minimizes memory requirements and code complexity. The scripts could run on a home computer from the 80's with less than 1 MB RAM for a book many megabytes in size. notes_insert.pl simply copies assembled_text.dat from the current position to just before the start of the next note or highlight, writes some formatting, a label, then any text to be inserted, some more formatting, then repeats for the next note or highlight.
I see, got it!

Quote:
Originally Posted by j.p.s
They could be rewritten to do what you want by building a table of locations and shift sizes and using that to do the copying and inserting. That would be way too tedious and time consuming for me to consider. For someone else, it might be a piece of cake to bang out or a challenge worth taking on.
Understood. Thanks immeasurably for what you have done with this nonetheless; this is way more than I would have ever expected to be able to achieve. Personally, I have no knowledge of Perl, so I would have a hard time reverse engineering your scripts and modifying them to achieve this goal. Not to mention the lack of free time. Maybe someday.

Quote:
Originally Posted by j.p.s
The reason the thread title ends in "info" instead of "tool" and is in the Kindle Formats forum instead of the Kindle forum is that its purpose is to provide information for others to pick up and take further without having to work out how to process azw3r files.
Understood again, that's brilliant.

Thanks for all your work and help!

Best regards!
Tags
azw3r, highlights, highlights and notes, notes
