|  10-20-2023, 02:01 AM | #1 | 
| Connoisseur            Posts: 74 Karma: 4102 Join Date: Jun 2018 Device: Kindle Paperwhite (10th gen) - 6", OnePlus 9RT - MoonReader Pro | 
				
				Simplifying Kindle notebook exports
			 
			
			How can I simplify highlights exported from Kindle? Every highlight carries a location number and a chapter number. The simplified version I'm after looks like this:
			Chapter 1
			Highlight 1
			Highlight 2
			Highlight 3
			Highlight 4
			Chapter 2
			Highlight 1
			Highlight 2
			Highlight 3
			Highlight 4
			Manually cleaning out the location numbers is tedious. Is there any free tool to do the cleanup? | 
|   |   | 
|  10-20-2023, 12:55 PM | #2 | 
| Wizard            Posts: 1,552 Karma: 5000046 Join Date: Feb 2012 Location: Cape Canaveral Device: Kindle Scribe | 
			
			Try KindleMate. It lets you choose which parameters to export.
		 | 
|   |   | 
|  10-20-2023, 04:48 PM | #3 | 
| Grand Sorcerer            Posts: 7,004 Karma: 27060353 Join Date: Apr 2009 Location: USA Device: iPhone 15PM, Kindle Scribe, iPad mini 6, PocketBook InkPad Color 3 | 
			
			This looks like the HTML export from the Kindle app, not from a Kindle device? I would just open it with Word (or anything that imports HTML) and spend a few minutes editing it. With some scripting skills you could probably automate editing the HTML, since it has consistent structure/tagging. I've never used KindleMate (it is Windows only), but it relies on the Clippings file, which does not log chapter headings at all (nor does direct export from Kindle). I assume that's why you are using the Kindle app export. Last edited by tomsem; 10-20-2023 at 04:55 PM. | 
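As a rough illustration of the scripting idea in the post above, here is a minimal sketch that turns an export into the plain chapter/highlight outline asked for in the first post. It uses only the Python standard library. The class names `sectionHeading` and `noteHeading` are mentioned elsewhere in this thread; `noteText` as the class holding the highlight text is an assumption, and the names `OutlineParser` and `outline` are made up for this example.

```python
from html.parser import HTMLParser

# Classes whose text is kept. 'noteText' is an assumed class name.
KEEP = {"sectionHeading", "noteText"}


class OutlineParser(HTMLParser):
    """Collect chapter headings and highlight text, dropping note headings."""

    def __init__(self):
        super().__init__()
        self.lines = []            # finished outline lines
        self._kind = None          # class of the element currently being read
        self._buffer = []          # text pieces of the current element
        self._last_heading = None  # used to print a repeated heading only once

    def _flush(self):
        # Join the buffered text and normalize whitespace.
        text = " ".join("".join(self._buffer).split())
        self._buffer = []
        if not text:
            return
        if self._kind == "sectionHeading":
            if text != self._last_heading:
                if self.lines:
                    self.lines.append("")  # blank line between chapters
                self.lines.append(text)
                self._last_heading = text
        elif self._kind == "noteText":
            self.lines.append(text)

    def handle_starttag(self, tag, attrs):
        # Any element with a class attribute starts a new block. End tags are
        # ignored on purpose, so mismatched closing tags do not matter.
        css = dict(attrs).get("class")
        if css:
            self._flush()
            self._kind = css.split()[0]

    def handle_data(self, data):
        if self._kind in KEEP:
            self._buffer.append(data)

    def close(self):
        super().close()
        self._flush()


def outline(html: str) -> str:
    """Return the export as plain text: each chapter once, then its highlights."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.lines)
```

Something like `print(outline(open(path, encoding='utf-8').read()))` would then print the outline for one exported file, where `path` is whatever file you exported. Location numbers disappear because the note headings are simply never collected.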
|   |   | 
|  10-21-2023, 04:53 AM | #4 | 
| Connoisseur            Posts: 74 Karma: 4102 Join Date: Jun 2018 Device: Kindle Paperwhite (10th gen) - 6", OnePlus 9RT - MoonReader Pro | |
|   |   | 
|  10-21-2023, 04:55 AM | #5 | |
| Connoisseur            Posts: 74 Karma: 4102 Join Date: Jun 2018 Device: Kindle Paperwhite (10th gen) - 6", OnePlus 9RT - MoonReader Pro | Quote: 
 I'm using the Kindle app export because no share highlights button appears on my Kindle device. This has happened with many books that I sent through the send-to-kindle service. The export highlights option shows up only in the app. There is also one book that I sideloaded through USB. Surprisingly, the export highlights button does show up for it, but unfortunately it only exports a blank page with the name, author and cover image of the book at the top. Last edited by archz2; 10-21-2023 at 05:00 AM. | |
|   |   | 
|  10-23-2023, 02:33 AM | #6 | 
| Connoisseur            Posts: 74 Karma: 4102 Join Date: Jun 2018 Device: Kindle Paperwhite (10th gen) - 6", OnePlus 9RT - MoonReader Pro | 
			
			Is there any tool to clean the HTML file so that each heading appears only once and the highlights are grouped under one chapter and one subheading?
		 | 
|   |   | 
|  10-24-2023, 06:56 PM | #7 | 
| Grand Sorcerer            Posts: 7,004 Karma: 27060353 Join Date: Apr 2009 Location: USA Device: iPhone 15PM, Kindle Scribe, iPad mini 6, PocketBook InkPad Color 3 | 
			
			It appears this happens with ToCs that have more than one level. Single-level ToCs seem to work okay: the chapter names are in sectionHeading.
		 Last edited by tomsem; 10-24-2023 at 08:42 PM. | 
|   |   | 
|  10-25-2023, 11:30 AM | #8 | |
| Connoisseur            Posts: 74 Karma: 4102 Join Date: Jun 2018 Device: Kindle Paperwhite (10th gen) - 6", OnePlus 9RT - MoonReader Pro | Quote: 
 How can I make it appear only once, with all the highlights organized under a single chapter heading? | |
|   |   | 
|  10-25-2023, 03:50 PM | #9 | 
| Grand Sorcerer            Posts: 7,004 Karma: 27060353 Join Date: Apr 2009 Location: USA Device: iPhone 15PM, Kindle Scribe, iPad mini 6, PocketBook InkPad Color 3 | 
			
			Yes, it's clear what you are asking for. But in my experiments, some books export as you would wish, and the ones that do not are those with an additional level of hierarchy (for example collected works, omnibus editions etc.).
		 | 
|   |   | 
|  10-25-2023, 05:25 PM | #10 | 
| Bibliophagist            Posts: 48,001 Karma: 174315100 Join Date: Jul 2010 Location: Vancouver Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos | 
			
			I've seen quite a few books where the book name is wrapped in h1 tags, the parts are wrapped in h2 tags and the chapters are wrapped in h3 tags.
		 | 
|   |   | 
|  10-25-2023, 08:06 PM | #11 | 
| Grand Sorcerer            Posts: 7,004 Karma: 27060353 Join Date: Apr 2009 Location: USA Device: iPhone 15PM, Kindle Scribe, iPad mini 6, PocketBook InkPad Color 3 | 
			
			Okay here is a first take at a script to add a chapter section header and remove the chapter strings from the note headers. It adds a bullet character to the chapter strings so they are set off from the 'higher level' section header. It will output new html files with '-new' appended to the original filename. If it does not find 'chapter pattern' in the note headings, it will not do anything. You can use calibre to run it:
Code:
[path to calibre executables]calibre-debug gather_chapter_notes.py html1 [html2, ...]
Code:
from re import match, DOTALL
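from sys import argv
from bs4 import BeautifulSoup, NavigableString

# Captures the chapter string that sits between " - " and the first " > ".
chapter_pattern = r".*? - (.*?)( > ).*"


def gather_chapter_notes(html: str):
    soup = BeautifulSoup(html, 'html.parser')
    title_insert = {}  # chapter title -> first note heading of that chapter
    for note_heading in soup.find_all('div', class_='noteHeading'):
        # The chapter string sits in the trailing text node of the heading.
        content = note_heading.contents[-1] if note_heading.contents else None
        if not isinstance(content, NavigableString):
            continue
        if matches := match(chapter_pattern, content, flags=DOTALL):
            title = matches.group(1)
            if title not in title_insert:
                title_insert[title] = note_heading
            # Edit the text node directly: str(soup) escapes '>' as '&gt;',
            # so a string replace on the serialized HTML would not find ' > '.
            content.replace_with(content.replace(f'{title} > ', '', 1))
    # Insert a bulleted section heading before the first note of each chapter.
    for title, node in title_insert.items():
        title_section = soup.new_tag('div', attrs={'class': 'sectionHeading'})
        title_section.string = f'● {title}'
        node.insert_before(title_section)
    return str(soup)


for arg in argv[1:]:
    # UTF-8 is set explicitly so the bullet character can be written on any platform.
    with open(arg, encoding='utf-8') as f:
        html_text = f.read()
    new_html = gather_chapter_notes(html=html_text)
    with open(arg.replace('.html', '-new.html'), 'w', encoding='utf-8') as f:
        f.write(new_html)
		 | 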
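To see what the `chapter_pattern` from the script above actually picks up, here is a small sketch that runs it on a few heading strings. The sample strings are invented for illustration and are only a guess at what the text after the colour label looks like in a real export; `chapter_of` is a hypothetical helper name.

```python
from re import match, DOTALL

# Same pattern as in the script above.
chapter_pattern = r".*? - (.*?)( > ).*"


def chapter_of(heading_text: str):
    """Return the string the pattern captures, or None if there is no ' > '."""
    found = match(chapter_pattern, heading_text, flags=DOTALL)
    return found.group(1) if found else None


# Invented sample headings (assumptions, not taken from a real export).
one_level = ") - Chapter 1 > Location 120"
no_chapter = ") - Location 120"
two_levels = ") - Part One > Chapter 3 > Location 450"

print(chapter_of(one_level))
print(chapter_of(no_chapter))
print(chapter_of(two_levels))
```

Because the capture group is non-greedy, the pattern stops at the first ` > `. For a heading with two levels it would therefore capture only the outer one (here `Part One`), so books with a deeper hierarchy may need a different pattern.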
|   |   | 
|  10-26-2023, 07:10 AM | #12 | |
| Connoisseur            Posts: 74 Karma: 4102 Join Date: Jun 2018 Device: Kindle Paperwhite (10th gen) - 6", OnePlus 9RT - MoonReader Pro | Quote: 
 | |
|   |   | 
|  10-26-2023, 09:56 AM | #13 | 
| Still reading            Posts: 14,926 Karma: 110908135 Join Date: Jun 2017 Location: Ireland Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper | |
|   |   | 
|  11-02-2023, 05:34 AM | #14 | 
| Connoisseur            Posts: 74 Karma: 4102 Join Date: Jun 2018 Device: Kindle Paperwhite (10th gen) - 6", OnePlus 9RT - MoonReader Pro | 
			
			I don't understand. If I double click the file titled 'gather_chapter_notes.py' in the zip, nothing happens.  I don't know any coding language. | 
|   |   | 
|  11-02-2023, 06:22 AM | #15 | 
| Still reading            Posts: 14,926 Karma: 110908135 Join Date: Jun 2017 Location: Ireland Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper | 
			
			You don't double click it. It's a console/terminal command.
		 | 
|   |   | 