Old 02-11-2014, 06:00 AM   #1
Jpax
NEED Help! <idx:entry> tagging

Hi,

Does anyone here know of a program that can automate <idx:entry> tagging?

I have 4,000 word entries to tag, and it would be crazy to tag them all manually.

From this code:
<p class="rev"><span class="textStyle12">aluminum</span> <span class="textStyle25">알루미늄</span> <span class="textStyle13">441</span></p>

<p class="rev"><span class="textStyle12">apartment</span> <span class="textStyle25">아파트</span> <span class="textStyle13">430</span></p>


I want to produce this:
<idx:entry>
<p class="rev"><span class="textStyle12"><idx:orth>aluminum</idx:orth></span> <span class="textStyle25">알루미늄</span> <span class="textStyle13">441</span></p>
<idx:key key="aluminum">
</idx:entry>
<idx:entry>
<p class="rev"><span class="textStyle12"><idx:orth>apartment</idx:orth></span> <span class="textStyle25">아파트</span> <span class="textStyle13">430</span></p>
<idx:key key="apartment">
</idx:entry>
Old 02-11-2014, 06:03 AM   #2
HarryT
Moved to the Mobi file format forum.
Old 02-11-2014, 07:18 AM   #3
Doitsu
Quote:
Originally Posted by Jpax
Does anyone here know of a program that can automate <idx:entry> tagging?
You can use any text editor with regular expression support, for example Notepad++.
Select "Regular expression" as the search mode and use the following parameters:

Find what:

Code:
<p class="rev"><span class="textStyle12">(.*?)</span>\s+<span class="textStyle25">(.*?)</span> <span class="textStyle13">(.*?)</span></p>
Replace with:

Code:
<idx:entry>\n<p class="rev"><span class="textStyle12"><idx:orth>\1</idx:orth></span> <span class="textStyle25">\2</span> <span class="textStyle13">\3</span></p>\n<idx:key key="\1">\n</idx:entry>
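For a large file, the same find/replace can also be scripted. The following is only a minimal sketch in Python, assuming every entry has exactly the three-span layout shown in the first post; the function name `tag_entries` is made up for illustration. Python's `re.sub` accepts the same `\1`, `\2`, `\3` and `\n` escapes in the replacement template.

```python
import re

# The "Find what" pattern from above, split across lines for readability.
PATTERN = re.compile(
    r'<p class="rev"><span class="textStyle12">(.*?)</span>\s+'
    r'<span class="textStyle25">(.*?)</span> '
    r'<span class="textStyle13">(.*?)</span></p>'
)

# The "Replace with" template from above; re.sub turns \n into a newline
# and \1..\3 into the captured groups.
REPLACEMENT = (
    r'<idx:entry>\n'
    r'<p class="rev"><span class="textStyle12"><idx:orth>\1</idx:orth></span> '
    r'<span class="textStyle25">\2</span> '
    r'<span class="textStyle13">\3</span></p>\n'
    r'<idx:key key="\1">\n'
    r'</idx:entry>'
)


def tag_entries(html: str) -> str:
    """Wrap every matching <p class="rev"> line in an <idx:entry> block."""
    return PATTERN.sub(REPLACEMENT, html)


sample = (
    '<p class="rev"><span class="textStyle12">aluminum</span> '
    '<span class="textStyle25">알루미늄</span> '
    '<span class="textStyle13">441</span></p>'
)
print(tag_entries(sample))
```

To process a whole file, read it as UTF-8, pass the text through `tag_entries`, and write the result to a new file so the original stays untouched.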
Old 02-12-2014, 02:09 AM   #4
Jpax
Woohoo! Thanks a lot, Doitsu!

But what if the entry has two words and I only want the first word for my idx:key?

Example:
<p class="rev"><span class="textStyle12">aluminum foil</span> <span class="textStyle25">알루미늄</span> <span class="textStyle13">441</span></p>


<idx:entry>
<p class="rev"><span class="textStyle12"><idx:orth>aluminum foil</idx:orth></span> <span class="textStyle25">알루미늄</span> <span class="textStyle13">441</span></p>
<idx:key key="aluminum">
</idx:entry>
Old 02-12-2014, 02:15 AM   #5
eschwartz
Instead of:
Code:
<p class="rev"><span class="textStyle12">(.*?)</span>\s+<span class="textStyle25">(.*?)</span> <span class="textStyle13">(.*?)</span></p>
use:
Code:
<p class="rev"><span class="textStyle12">([^<> ]*?)</span>\s+<span class="textStyle25">([^<> ]*?)</span> <span class="textStyle13">([^<> ]*?)</span></p>
This changes the "." (match any character) to "[^<> ]", which matches any character except a space or the angle brackets that start and end tags.
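One caveat worth checking before running this on all entries: a character class that excludes spaces has to cover the whole span content, so it cannot match a two-word headword such as "aluminum foil" at all. If the goal is to keep the full headword inside <idx:orth> and use only its first word as the key, a replacement function is one way to do it. This is a rough sketch in Python under the same three-span assumption as the posts above; the names `ENTRY`, `wrap` and `tag_entries_first_word_key` are made up for illustration.

```python
import re

# Same three-span layout as above; [^<>] keeps each group inside its own
# span but still allows spaces, so "aluminum foil" is captured whole.
ENTRY = re.compile(
    r'<p class="rev"><span class="textStyle12">(?P<head>[^<>]*?)</span>\s+'
    r'<span class="textStyle25">(?P<gloss>[^<>]*?)</span> '
    r'<span class="textStyle13">(?P<page>[^<>]*?)</span></p>'
)

# For comparison: the space-excluding class applied to the headword span.
NO_SPACE = re.compile(r'<span class="textStyle12">([^<> ]*?)</span>')


def wrap(match):
    """Build one <idx:entry> block; the key is the first word of the headword."""
    head = match.group("head")
    words = head.split()
    key = words[0] if words else head
    return (
        '<idx:entry>\n'
        f'<p class="rev"><span class="textStyle12"><idx:orth>{head}</idx:orth></span> '
        f'<span class="textStyle25">{match.group("gloss")}</span> '
        f'<span class="textStyle13">{match.group("page")}</span></p>\n'
        f'<idx:key key="{key}">\n'
        '</idx:entry>'
    )


def tag_entries_first_word_key(html):
    return ENTRY.sub(wrap, html)


two_words = (
    '<p class="rev"><span class="textStyle12">aluminum foil</span> '
    '<span class="textStyle25">알루미늄</span> '
    '<span class="textStyle13">441</span></p>'
)
print(NO_SPACE.search(two_words))  # None: the space blocks the match
print(tag_entries_first_word_key(two_words))
```

Single-word headwords go through the same function unchanged, since the first word is then the whole headword.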
Old 02-12-2014, 04:09 AM   #6
Jpax
Thanks, eschwartz.

I used (\w+) and it also works. By the way, do you have any idea what this error means?

"This XML file does not appear to have any style information associated with it. The document tree is shown below."

These are the first few lines of my file:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns:idx="wwww.mobipocket.com" xmlns:mbp="www.mobipocket.com" xmlns:xlink="http://www.w3.org/1999/xlink">
<head>
<link href="../Styles/XXXXX.css" rel="stylesheet" type="text/css" />
<title></title>
<style type="text/css">
</style>
</head>