Old 03-03-2020, 11:17 AM   #1
gol8erl8
Member
Join Date: Feb 2020
Device: Kindle Paperwhite 4
Clear metadata more completely

I'd like to clear the metadata from many ebooks (.EPUB), keeping only certain fields: title, author, and possibly the identifier and source that correspond to the Project Gutenberg book number. I am currently doing this with Edit metadata in bulk, using these settings:

Rating : Not rated : Apply rating
Publisher : (Blank) : Clear pub
Remove tags : (Blank) : Remove all
Series : (Blank) : Clear series
Date : Undefined : Apply date (Clicked remove button)
Published : Undefined : Apply date (Clicked remove button)
Languages : (Blank) : Remove all
Change cover : Remove cover
Set the comments for all selected books : (Blank)


However, when I open the book in Edit book, I still find unwanted elements in the content.opf file between the <metadata> tags, including <dc:rights> and <dc:date opf:event="conversion">. Moreover, when I convert from .EPUB to .AZW3, I lose the <dc:identifier opf:scheme="URI" id="id"> and <dc:source> tags, which I may want to keep or modify. Here is an example of the metadata from A Tale of Two Cities by Charles Dickens, found at https://www.gutenberg.org/ebooks/98.epub.images :

Original
Code:
<metadata>
    <dc:title>A Tale of Two Cities</dc:title>
    <dc:creator opf:role="aut" opf:file-as="Dickens, Charles">Charles Dickens</dc:creator>
    <dc:rights>Public domain in the USA.</dc:rights>
    <dc:identifier opf:scheme="URI" id="id">http://www.gutenberg.org/98</dc:identifier>
    <dc:date>1994-01-01T05:00:00+00:00</dc:date>
    <dc:date opf:event="conversion">2020-03-01T08:37:22.360047+00:00</dc:date>
    <dc:source>https://www.gutenberg.org/files/98/98-h/98-h.htm</dc:source>
    <dc:subject>Historical fiction</dc:subject>
    <dc:subject>France -- History -- Revolution</dc:subject>
    <dc:subject>1789-1799 -- Fiction</dc:subject>
    <dc:subject>London (England) -- History -- 18th century -- Fiction</dc:subject>
    <dc:subject>War stories</dc:subject>
    <dc:subject>Executions and executioners -- Fiction</dc:subject>
    <dc:subject>French -- England -- London -- Fiction</dc:subject>
    <dc:subject>Lookalikes -- Fiction</dc:subject>
    <dc:subject>British -- France -- Paris -- Fiction</dc:subject>
    <dc:subject>Paris (France) -- History -- 1789-1799 -- Fiction</dc:subject>
    <dc:language>en</dc:language>
    <dc:identifier opf:scheme="calibre">7c7b2a30-5b0b-41e9-a1fa-b1109aa06373</dc:identifier>
    <meta name="cover" content="item33"/>
    <meta name="calibre:title_sort" content="Tale of Two Cities, A"/>
    <meta name="calibre:author_link_map" content="{&quot;Charles Dickens&quot;: &quot;&quot;}"/>
</metadata>


After Edit metadata in bulk
Code:
<metadata>
    <dc:title>A Tale of Two Cities</dc:title>
    <dc:creator opf:role="aut" opf:file-as="Dickens, Charles">Charles Dickens</dc:creator>
    <dc:rights>Public domain in the USA.</dc:rights>
    <dc:identifier opf:scheme="URI" id="id">http://www.gutenberg.org/98</dc:identifier>
    <dc:date>0101-01-01T00:00:00+00:00</dc:date>
    <dc:date opf:event="conversion">2020-03-01T08:37:22.360047+00:00</dc:date>
    <dc:source>https://www.gutenberg.org/files/98/98-h/98-h.htm</dc:source>
    <dc:identifier opf:scheme="calibre">7c7b2a30-5b0b-41e9-a1fa-b1109aa06373</dc:identifier>
    <meta name="cover" content="item33"/>
    <meta name="calibre:title_sort" content="Tale of Two Cities, A"/>
    <meta name="calibre:author_link_map" content="{&quot;Charles Dickens&quot;: &quot;&quot;}"/>
</metadata>
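
One way to express the desired cleanup outside calibre is a keep-list filter over the children of <metadata>. The following is only a minimal sketch using Python's standard xml.etree.ElementTree; the names keep and strip_metadata are hypothetical and not part of calibre. It assumes the dc: and opf: namespace declarations are present on <metadata> or on the enclosing <package> (the snippet as pasted above omits them), and the keep-list simply mirrors the fields named in this post.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
OPF = "http://www.idpf.org/2007/opf"

# Register prefixes so serialized output uses dc:/opf: rather than ns0:/ns1:.
ET.register_namespace("dc", DC)
ET.register_namespace("opf", OPF)


def keep(el):
    """Return True for elements on the keep-list (an assumption taken from the post)."""
    if el.tag in (f"{{{DC}}}title", f"{{{DC}}}creator", f"{{{DC}}}source"):
        return True
    if el.tag == f"{{{DC}}}identifier":
        # Keep only the Project Gutenberg URI identifier, not the calibre UUID.
        return el.get(f"{{{OPF}}}scheme") == "URI"
    return False


def strip_metadata(metadata):
    """Remove every direct child of a <metadata> element that keep() rejects."""
    for child in list(metadata):  # copy the list, since we mutate while iterating
        if not keep(child):
            metadata.remove(child)
    return metadata


if __name__ == "__main__":
    sample = """<metadata xmlns:dc="http://purl.org/dc/elements/1.1/"
          xmlns:opf="http://www.idpf.org/2007/opf">
    <dc:title>A Tale of Two Cities</dc:title>
    <dc:creator opf:role="aut">Charles Dickens</dc:creator>
    <dc:rights>Public domain in the USA.</dc:rights>
    <dc:identifier opf:scheme="URI" id="id">http://www.gutenberg.org/98</dc:identifier>
    <dc:date opf:event="conversion">2020-03-01T08:37:22.360047+00:00</dc:date>
    <dc:source>https://www.gutenberg.org/files/98/98-h/98-h.htm</dc:source>
    <dc:identifier opf:scheme="calibre">7c7b2a30-5b0b-41e9-a1fa-b1109aa06373</dc:identifier>
</metadata>"""
    cleaned = strip_metadata(ET.fromstring(sample))
    print(ET.tostring(cleaned, encoding="unicode"))
```

Note that the OPF 2.0 specification requires a <dc:language> element, so a stricter keep-list than the one in the post may produce files that validators complain about; adding dc:language to keep() would be a small change.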


After subsequently using Polish books and Convert books (from .EPUB to .AZW3)
Code:
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:calibre="http://calibre.kovidgoyal.net/2009/metadata" xmlns:opf="http://www.idpf.org/2007/opf">
    <dc:title>A Tale of Two Cities</dc:title>
    <dc:creator opf:role="aut" opf:file-as="Unknown">Charles Dickens</dc:creator>
    <dc:contributor opf:role="bkp" opf:file-as="calibre">calibre (4.10.1) [https://calibre-ebook.com]</dc:contributor>
    <dc:identifier id="calibre_id" opf:scheme="calibre">7c7b2a30-5b0b-41e9-a1fa-b1109aa06373</dc:identifier>
    <dc:date>0100-12-31T19:00:00-05:00</dc:date>
    <dc:language>eng</dc:language>
    <dc:identifier opf:scheme="MOBI-ASIN">7c7b2a30-5b0b-41e9-a1fa-b1109aa06373</dc:identifier>
    <dc:rights>Public domain in the USA.</dc:rights>
</metadata>
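
To check which elements a conversion drops or adds, two <metadata> dumps can be compared by tag name and opf:scheme. This is a rough sketch with hypothetical helper names (element_keys, lost_and_gained). It assumes both snippets carry their namespace declarations, as the converted dump above does, and it ignores element text and all other attributes.

```python
import xml.etree.ElementTree as ET
from collections import Counter

OPF_SCHEME = "{http://www.idpf.org/2007/opf}scheme"


def element_keys(metadata_xml):
    """Count (local tag name, opf:scheme) pairs among the children of <metadata>."""
    root = ET.fromstring(metadata_xml)
    keys = Counter()
    for child in root:
        local = child.tag.split("}")[-1]  # drop the "{namespace}" prefix
        keys[(local, child.get(OPF_SCHEME))] += 1
    return keys


def lost_and_gained(before_xml, after_xml):
    """Return (lost, gained) sets of keys between two metadata snippets."""
    before = element_keys(before_xml)
    after = element_keys(after_xml)
    # Counter subtraction keeps only keys with a positive remaining count.
    return set(before - after), set(after - before)
```

Calling lost_and_gained(before, after) on the bulk-edited and converted dumps would be one way to confirm that <dc:source> and the URI identifier are among the lost elements.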


Is there a more thorough way to clear metadata that is faster than opening each book in Edit book and modifying the metadata by hand?
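
For context, the kind of batch pass being asked about could look roughly like the sketch below, which works on EPUB files directly with Python's standard zipfile module rather than through calibre. All function names (find_opf_path, rewrite_opf, process_folder) are hypothetical, and transform is an assumed callback: any function that takes the OPF file as bytes and returns edited bytes. It writes cleaned copies to a separate output folder and does not handle AZW3, which is not a ZIP container.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"


def find_opf_path(zf):
    """Read META-INF/container.xml to locate the package (.opf) file."""
    container = ET.fromstring(zf.read("META-INF/container.xml"))
    rootfile = container.find(f".//{{{CONTAINER_NS}}}rootfile")
    return rootfile.get("full-path")


def rewrite_opf(src, dst, transform):
    """Copy the EPUB at src to dst, passing the OPF bytes through transform()."""
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        opf_path = find_opf_path(zin)
        # The mimetype entry must be first in the archive and stored uncompressed.
        zout.writestr("mimetype", zin.read("mimetype"),
                      compress_type=zipfile.ZIP_STORED)
        for info in zin.infolist():
            if info.filename == "mimetype" or info.is_dir():
                continue
            data = zin.read(info.filename)
            if info.filename == opf_path:
                data = transform(data)
            zout.writestr(info.filename, data,
                          compress_type=zipfile.ZIP_DEFLATED)


def process_folder(folder, out_folder, transform):
    """Apply rewrite_opf to every .epub in folder, writing into out_folder."""
    out = Path(out_folder)
    out.mkdir(parents=True, exist_ok=True)
    for epub in sorted(Path(folder).glob("*.epub")):
        rewrite_opf(epub, out / epub.name, transform)
```

Keeping the metadata edit behind a callback means the ZIP handling stays the same whichever fields are stripped. The output folder should differ from the input folder, since each source file is still open while its copy is written.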