Go Back   MobileRead Forums > E-Book Readers > Amazon Kindle

Old 10-20-2023, 02:01 AM   #1
archz2
Connoisseur
Posts: 73
Karma: 4102
Join Date: Jun 2018
Device: Kindle Paperwhite (10th gen) - 6", OnePlus 9RT - MoonReader Pro
Simplifying Kindle notebook exports

How can I simplify my exported highlights?
Every highlight has a location number and chapter number.

The simplified, easier-to-read version I need looks like this:


Chapter 1

Highlight 1
Highlight 2
Highlight 3
Highlight 4



Chapter 2

Highlight 1
Highlight 2
Highlight 3
Highlight 4


Removing the location numbers by hand is tedious. Is there a free tool that can do the cleanup?
Old 10-20-2023, 12:55 PM   #2
mergen3107
Wizard
Posts: 1,552
Karma: 5000046
Join Date: Feb 2012
Location: Cape Canaveral
Device: Kindle Scribe
Try KindleMate. It has flexibility on what parameters you want to export.
Old 10-20-2023, 04:48 PM   #3
tomsem
Grand Sorcerer
Posts: 7,003
Karma: 27060353
Join Date: Apr 2009
Location: USA
Device: iPhone 15PM, Kindle Scribe, iPad mini 6, PocketBook InkPad Color 3
This looks like the HTML export from the Kindle app, not from Kindle?

I would just open it with Word (or anything that imports HTML) and spend a few minutes editing it. With some scripting skills you could probably automate the editing, since the HTML has consistent structure and tagging.

I've never used KindleMate (it is Windows only), but it relies on the Clippings file, which does not log chapter headings at all (nor does direct export from Kindle). I assume that's why you are using the Kindle app export.
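
As a rough illustration of the scripting route, the sketch below uses only Python's standard library to turn an export into the plain "chapter, then highlights" layout asked for at the top of the thread. It assumes each note is a div with class noteHeading followed by a div with class noteText, and that the chapter name sits between ' - ' and ' > ' in the heading. The class name noteText, the sample HTML, and the names NoteCollector and group_highlights are assumptions made for illustration, not taken from a real export.

```python
import re
from html.parser import HTMLParser

# Assumed heading shape: "Highlight (yellow) - Chapter 1 > Location 100"
CHAPTER_RE = re.compile(r".*? - (.*?) > ", re.DOTALL)


class NoteCollector(HTMLParser):
    """Collects [css_class, text] pairs for noteHeading and noteText divs."""

    def __init__(self):
        super().__init__()
        self.notes = []   # one [css_class, text] entry per collected div
        self.depth = 0    # > 0 while inside a div we are collecting

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            self.depth += 1   # nested div inside a collected div
            return
        css_class = dict(attrs).get("class")
        if css_class in ("noteHeading", "noteText"):
            self.notes.append([css_class, ""])
            self.depth = 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.notes[-1][1] += data


def group_highlights(html: str) -> str:
    collector = NoteCollector()
    collector.feed(html)
    collector.close()
    chapters = {}   # chapter title -> list of highlight texts, in file order
    current = "Unknown chapter"
    for css_class, text in collector.notes:
        if css_class == "noteHeading":
            found = CHAPTER_RE.match(text)
            current = found.group(1).strip() if found else "Unknown chapter"
        else:
            chapters.setdefault(current, []).append(" ".join(text.split()))
    blocks = [title + "\n\n" + "\n".join(items) for title, items in chapters.items()]
    return "\n\n".join(blocks)


# Made-up sample in the assumed export structure.
sample = """
<div class="noteHeading">Highlight (<span class="highlight_yellow">yellow</span>) - Chapter 1 > Location 100</div>
<div class="noteText">First highlight</div>
<div class="noteHeading">Highlight (<span class="highlight_yellow">yellow</span>) - Chapter 1 > Location 120</div>
<div class="noteText">Second highlight</div>
<div class="noteHeading">Highlight (<span class="highlight_yellow">yellow</span>) - Chapter 2 > Location 300</div>
<div class="noteText">Third highlight</div>
"""
print(group_highlights(sample))
```

Headings that carry no chapter string end up under "Unknown chapter" in this sketch, which is one reason it would need adjusting for books whose chapter names are stored elsewhere in the file.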

Last edited by tomsem; 10-20-2023 at 04:55 PM.
Old 10-21-2023, 04:53 AM   #4
archz2
Connoisseur
Quote:
Originally Posted by mergen3107 View Post
Try KindleMate. It has flexibility on what parameters you want to export.
Thanks! Quite useful. How do I also include the chapter and subheading in the export? I couldn't find an option for it, and their user guide and FAQ don't cover this either.

Old 10-21-2023, 04:55 AM   #5
archz2
Connoisseur
Quote:
Originally Posted by tomsem View Post
This looks like the HTML export from the Kindle app, not from Kindle?

I would just open it with Word (or anything that imports HTML) and spend a few minutes editing it. With some scripting skills you could probably automate the editing, since the HTML has consistent structure and tagging.

I've never used KindleMate (it is Windows only), but it relies on the Clippings file, which does not log chapter headings at all (nor does direct export from Kindle). I assume that's why you are using the Kindle app export.
Yes, exactly. It's an HTML export from the app.
I'm using the Kindle app export because no share-highlights button appears on my Kindle device. This has happened with many books that I sent through the Send to Kindle service. The export highlights option shows up only in the app.

There is also one book that I sideloaded over USB. Surprisingly, the export highlights button does show up for it. Unfortunately, it only exports a blank page with the name, author, and picture of the book at the top.

Last edited by archz2; 10-21-2023 at 05:00 AM.
Old 10-23-2023, 02:33 AM   #6
archz2
Connoisseur
Is there any tool that can clean up the HTML file so that each chapter heading and subheading appears only once, with the highlights grouped under it?
Old 10-24-2023, 06:56 PM   #7
tomsem
Grand Sorcerer
It appears this happens with ToCs that have more than one level. Single-level ToCs seem to work okay: the chapter names are in sectionHeading.

Last edited by tomsem; 10-24-2023 at 08:42 PM.
Old 10-25-2023, 11:30 AM   #8
archz2
Connoisseur
Quote:
Originally Posted by tomsem View Post
It appears this happens with ToCs that have more than one level. Single-level ToCs seem to work okay: the chapter names are in sectionHeading.
In the screenshot I shared, the chapter name appears with every single highlight. The chapter name is a first-level category, right?

How can I make it appear only once, with all the highlights organized under a single chapter heading?
Old 10-25-2023, 03:50 PM   #9
tomsem
Grand Sorcerer
Quote:
Originally Posted by archz2 View Post
In the screenshot I shared, the chapter name appears with every single highlight. The chapter name is a first-level category, right?

How can I make it appear only once, with all the highlights organized under a single chapter heading?
Yes, it's clear what you are asking for. But in my experiments, some books export the way you want. The ones that don't are books with an additional level of hierarchy (for example collected works, omnibus editions, etc.).
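
To make the extra-level case concrete, here is a small sketch of how a note heading might be split into its hierarchy levels and its location. The heading format (levels joined by ' > ' after the first ' - ', with the location last) is an assumption inferred from this thread, and split_heading and the sample strings are hypothetical.

```python
def split_heading(heading: str):
    """Return (levels, location) for a note heading.

    Assumed format: "<note type> - <level 1> > ... > <location>".
    A heading with no hierarchy ("<note type> - <location>") yields no levels.
    """
    _, sep, rest = heading.partition(" - ")
    if not sep:
        # No " - " separator at all: treat the whole string as the location.
        return [], heading.strip()
    parts = [part.strip() for part in rest.split(" > ")]
    return parts[:-1], parts[-1]


# Hypothetical headings: one without a chapter string, one with it.
for sample in (
    "Highlight (yellow) - Location 100",
    "Highlight (yellow) - Chapter 1 > Location 100",
):
    levels, location = split_heading(sample)
    print(levels, location)
```

A cleanup tool could use the number of levels to decide whether a heading needs rewriting at all.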
Old 10-25-2023, 05:25 PM   #10
DNSB
Bibliophagist
Posts: 47,944
Karma: 174315098
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Quote:
Originally Posted by archz2 View Post
In the screenshot I shared, the chapter name appears with every single highlight. The chapter name is a first-level category, right?

How can I make it appear only once, with all the highlights organized under a single chapter heading?
I've seen quite a few books where the book name is wrapped in h1 tags, the parts are wrapped in h2 tags and the chapters are wrapped in h3 tags.
Old 10-25-2023, 08:06 PM   #11
tomsem
Grand Sorcerer
Okay, here is a first take at a script that adds a chapter section header and removes the chapter strings from the note headers. It adds a bullet character to the chapter titles so they are set off from the 'higher level' section header. It writes new HTML files with '-new' appended to the original filename.

If it does not find the chapter pattern in the note headings, it leaves them as they are.

You can use calibre to run it:

Code:
[path to calibre executables]calibre-debug gather_chapter_notes.py html1 [html2, ...]
gather_chapter_notes.py
Code:
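from pathlib import Path
from re import match, DOTALL
from sys import argv

from bs4 import BeautifulSoup, NavigableString

# Captures the chapter title between ' - ' and ' > ' in a note heading.
chapter_pattern = r".*? - (.*?)( > ).*"


def gather_chapter_notes(html: str) -> str:
    soup = BeautifulSoup(html, 'html.parser')
    title_insert = {}  # chapter title -> first note heading that mentions it
    for note_heading in soup.find_all('div', class_='noteHeading'):
        # The chapter string is expected in the heading's trailing text node.
        content = note_heading.contents[-1] if note_heading.contents else None
        if not isinstance(content, NavigableString):
            continue
        if matches := match(chapter_pattern, content, flags=DOTALL):
            title = matches.group(1)
            title_insert.setdefault(title, note_heading)
            # Cut 'title > ' out of this heading only, so identical text
            # elsewhere in the file (e.g. inside a highlight) is left alone.
            trimmed = content[:matches.start(1)] + content[matches.end(2):]
            content.replace_with(NavigableString(trimmed))

    # Insert one bulleted section heading before the first note of each chapter.
    for title, node in title_insert.items():
        title_section = soup.new_tag('div', attrs={'class': 'sectionHeading'})
        title_section.string = f'● {title}'
        node.insert_before(title_section)

    return str(soup)


for arg in argv[1:]:
    path = Path(arg)
    # UTF-8 is assumed for the export; otherwise the platform default applies.
    html_text = path.read_text(encoding='utf-8')

    new_html = gather_chapter_notes(html=html_text)

    # 'notes.html' -> 'notes-new.html'; the input file is never overwritten.
    path.with_name(f'{path.stem}-new{path.suffix}').write_text(new_html, encoding='utf-8')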
Attached Thumbnails: Screenshot 2023-10-25 at 5.01.15 PM.png, Screenshot 2023-10-25 at 5.01.34 PM.png
Attached Files: gather_chapter_notes.py.zip (1.0 KB)
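
For readers who want to see what the script keys on, the snippet below applies the same chapter_pattern to two made-up heading texts. The sample strings and the helper name chapter_of are assumptions for illustration; real exports may differ, and the script itself only looks at the trailing text of each note heading.

```python
from re import match, DOTALL

# Same pattern as in gather_chapter_notes.py.
chapter_pattern = r".*? - (.*?)( > ).*"


def chapter_of(heading_text: str):
    """Return the chapter title found in a note heading, or None."""
    found = match(chapter_pattern, heading_text, flags=DOTALL)
    return found.group(1) if found else None


# Hypothetical heading texts: one with a chapter string, one without.
print(chapter_of("Highlight (yellow) - Chapter 1 > Location 100"))
print(chapter_of("Highlight (yellow) - Location 100"))
```

The first call finds "Chapter 1"; the second finds nothing because there is no ' > ' after the ' - ', which is the case where the script leaves a heading alone.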
Old 10-26-2023, 07:10 AM   #12
archz2
Connoisseur
Thanks. How do I use this script to clean the html file? Can you please share the steps?
Old 10-26-2023, 09:56 AM   #13
Quoth
Still reading
Posts: 14,901
Karma: 110507267
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
Quote:
Originally Posted by archz2 View Post
Thanks. How do I use this script to clean the html file? Can you please share the steps?
It's in the post you quoted

[path to calibre executables]calibre-debug gather_chapter_notes.py html1 [html2, ...]
Old 11-02-2023, 05:34 AM   #14
archz2
Connoisseur
I don't understand. If I double click the file titled 'gather_chapter_notes.py' in the zip, nothing happens.

I don't know any coding language.
Old 11-02-2023, 06:22 AM   #15
Quoth
Still reading
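You don't double-click it. It's a console/terminal command: you type it into a terminal window (Command Prompt or PowerShell on Windows, Terminal on macOS or Linux).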