Old 06-24-2016, 08:42 PM   #1
Hopkins
Enthusiast
Posts: 38
Karma: 10
Join Date: Jun 2016
Location: Minnesota USA
Device: Amazon Paperwhite 3G
[Editor Plugin] Traditional<->Simplified Chinese Convertor

Currently, the Chinese language is written with two different standardized character sets. The Chinese mainland and Singapore officially use the simplified set, while other areas (such as Taiwan and Hong Kong) continue to largely use the traditional set. This plugin allows users to convert EPUB and AZW3 files between the two character sets.

If only text format changes are desired (such as flow direction or quotation mark types), character set changes can be omitted. This allows changes to non-Chinese texts such as Japanese.

Main Features
  • Convert eBooks written in traditional characters into simplified characters
  • Convert eBooks written in simplified characters into traditional characters
  • Convert regional words and idioms used in the source material to those words and idioms used in the destination material
  • Convert individual sections or the entire book
  • Update metadata and table of contents
  • Convert text direction to vertical or horizontal
  • Provides command line processing for batch operations
  • This is an editor plugin so users can make changes in case the conversion is not perfect. Conversions from simplified to traditional should always be proofread.

Testing Platforms
  • Windows 10 (64 bit) - Calibre version 6.10

Note:
Github repository link

Command Line Interface (CLI)
Details:

Spoiler:
Example: convert all EPUB and AZW3 files in the current directory from Mainland simplified into Taiwan traditional, switching to East Asian quotation marks and vertical text orientation, updating punctuation to match, appending a "V" suffix to each file name, and writing the results to the "out" directory instead of overwriting the originals:
calibre-debug --run-plugin "Chinese Text Conversion" -- -ol tw -il cn -d s2t -od out -qt e -td v -up -a V *.epub *.azw3
Example: convert all EPUB files in a directory from Taiwan traditional into Mainland simplified in place, but run in test mode (-t) so that the conversion runs and nothing is actually written:
calibre-debug --run-plugin "Chinese Text Conversion" -- -ol cn -il tw -d t2s -t my_chinese_epub_dir/*.epub
Code:
usage: calibre-debug.exe [-h] [-il {cn,hk,tw,jp}] [-ol {cn,hk,tw,jp}] [-d {t2s,s2t,t2t,none}] [-p]
                         [-qt {w,e,no_change}] [-td {h,v,no_change}] [-up] [-v] [-t] [-q] [-od OUTDIR_OPT]
                         [-a APPEND_SUFFIX_OPT] [-f] [-s]
                         ebook-filepath [ebook-filepath ...]

Convert Chinese characters between traditional/simplified types and/or change text style. Generally run as: calibre-
debug --run-plugin "Chinese Text Conversion" -- [options] ebook-filepath Plugin Version: 3.0.0

positional arguments:
  ebook-filepath        One or more epub and/or azw3 ebook filepaths - UNIX style wildcards accepted

options:
  -h, --help            show this help message and exit
  -il {cn,hk,tw,jp}, --input-locale {cn,hk,tw,jp}
                        Set to the ebook origin locale if known (Default: cn)
  -ol {cn,hk,tw,jp}, --output-locale {cn,hk,tw,jp}
                        Set to the ebook target locale (Default: cn)
  -d {t2s,s2t,t2t,none}, --direction {t2s,s2t,t2t,none}
                        Set to the ebook conversion direction (Default: none)
  -p, --phrase_convert  Convert phrases to target locale versions (Default: False)
  -qt {w,e,no_change}, --quotation-type {w,e,no_change}
                        Set to Western or East Asian (Default: no_change)
  -td {h,v,no_change}, --text-direction {h,v,no_change}
                        Set to the ebook origin locale if known (Default: no_change)
  -up, --update_punctuation
                        Update punctuation to match direction change (Default: False)
  -v, --verbose         Print out details as the conversion progresses (Default: False)
  -t, --test            Run conversion operations without saving results (Default: False)
  -q, --quiet           Do not print anything, ignore warnings - this option overrides the -s option (Default: False)
  -od OUTDIR_OPT, --output-dir OUTDIR_OPT
                        Set to the ebook output file directory (Default: overwrite existing ebook file)
  -a APPEND_SUFFIX_OPT, --append_suffix APPEND_SUFFIX_OPT
                        Append a suffix to the output file basename (Default: )
  -f, --force           Force processing by ignoring warnings (e.g. allow overwriting files with no prompt)
  -s, --show            Show the settings based on user cmdline options and exit (Default: False)
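
For batch jobs driven from a script, an invocation like the examples above can be assembled programmatically. The following is a minimal Python sketch and not part of the plugin: the helper name build_command and its parameters are hypothetical, and only flags listed in the help text above are used. It builds the argument list without running anything, so the command can be inspected first.

```python
from typing import List, Sequence


def build_command(files: Sequence[str], direction: str = "s2t",
                  input_locale: str = "cn", output_locale: str = "tw",
                  outdir: str = "", test: bool = False) -> List[str]:
    """Assemble a calibre-debug argument list for the plugin's CLI.

    Hypothetical helper: only flags documented in the help text are used.
    """
    if direction not in {"t2s", "s2t", "t2t", "none"}:
        raise ValueError(f"unknown direction: {direction}")
    for locale in (input_locale, output_locale):
        if locale not in {"cn", "hk", "tw", "jp"}:
            raise ValueError(f"unknown locale: {locale}")
    cmd = ["calibre-debug", "--run-plugin", "Chinese Text Conversion", "--",
           "-il", input_locale, "-ol", output_locale, "-d", direction]
    if outdir:
        cmd += ["-od", outdir]   # write results here instead of overwriting
    if test:
        cmd.append("-t")         # run the conversion without saving results
    cmd += list(files)
    return cmd


if __name__ == "__main__":
    print(build_command(["book.epub"], outdir="out", test=True))
```

Once calibre-debug is on the PATH, the returned list could be handed to subprocess.run; passing a list rather than a single string avoids shell quoting problems with the spaces in the plugin name.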


Installation Steps:
Download the attached zip file, install the plugin, add it to the context menu or toolbar, and restart Calibre as described in the Introduction to plugins.

Operation:
From the main Calibre window, select a book and then press the "Edit book" icon on the toolbar. The editor will open. Click "Plugins" on the editor toolbar and select the plugin.

Special Notes:
  • Requires calibre v6.0 or higher
  • No testing has been done on OS X systems
  • Keep a copy of the original file. Round trip conversions (i.e. traditional->simplified->traditional) will probably not recover the original version. Also, since characters are being replaced, it's possible the font in your eBook reader may not have all the necessary glyphs
  • Metadata changes made via the GUI do not update the main Calibre database. They will be overwritten once the editor is re-opened. Consider using the 'Save a copy' option
  • Calibre Version 5.0 and later support the reading of vertical text. Earlier versions did not.
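
The round-trip caveat above follows from the fact that simplification merges characters. As a toy illustration (the small tables below are a deliberately tiny stand-in, not the plugin's dictionaries), traditional 發 and 髮 both simplify to 发, so a character-by-character reverse mapping has to pick one of them:

```python
# Toy tables, assumed for illustration only; the plugin uses far larger
# OpenCC-derived dictionaries and can also convert phrases.
T2S = {"頭": "头", "發": "发", "髮": "发"}   # two traditional forms merge into 发
S2T = {"头": "頭", "发": "發"}               # the reverse table must pick one target


def convert(text: str, table: dict) -> str:
    """Replace each character found in the table; leave others unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


original = "頭髮"                       # "hair"
simplified = convert(original, T2S)
restored = convert(simplified, S2T)
print(original, simplified, restored, restored == original)
```

Phrase-level dictionaries reduce this kind of error but cannot remove it entirely, which is why proofreading after a simplified-to-traditional conversion is recommended.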

Version History:
Spoiler:
  • Version 1.0.0 - 24 Jun 2016 - Initial release
  • Version 1.1.0 - 27 Jun 2016 - Improved speed
  • Version 1.2.0 - 29 Jun 2016 - Correct conversion, turn on compression for the plugin zip file
  • Version 2.0.0 - 10 Nov 2016 - Added command line processing, now also update TOC and metadata, updated conversion dictionaries
  • Version 2.0.1 - 24 Jan 2017 - Updated conversion dictionaries to latest at OpenCC project. Modified using chihchun's changes to allow plugin to work with more Calibre versions. Corrected minimum version.
  • Version 2.1.0 - 19 Feb 2017 - Added option to also convert quotation mark style to match target. Not yet added to the command line version.
  • Version 2.1.1 - 5 Aug 2017 - Correct exception that occurred when processing an entire book. See Github issue #3 for details.
  • Version 2.2.1 - 22 Aug 2017 - Added vertical text orientation and epub quotation mark optimization for Readium and Kindle viewers. Kindle Previewer 3 must be used to convert epub files into Kindle mobi files.
  • Version 2.2.2 - 30 Aug 2017 - Corrected CSS file for EPUB->AZW3 conversion. See Github issue #5 for details.
  • Version 2.2.3 - 31 Aug 2017 - Improved speed for vertical text conversion optimization. See Github issue #5 for details.
  • Version 2.2.4 - 15 Sep 2017 - Fix some CSS issues with vertical text. See Github issue #5 for details.
  • Version 2.3.0 - 4 Oct 2017 - Add full support for AZW3 files. See Github issue #6 for details.
  • Version 2.3.1 - 13 Nov 2017 - Allow the settings dialog to resize by adding scroll bars.
  • Version 2.3.2 - 24 Nov 2018 - Improve conversion speed. Default to convert entire book
  • Version 2.3.3 - 12 Jan 2019 - Switch from cssutils to css-parser to match Calibre 3.37 and later releases. Users will need to update to Calibre 3.37 or later.
  • Version 2.3.4 - 17 Apr 2019 - Bug fix to avoid error when an item does not have a title.
  • Version 2.4.0 - 25 Sep 2020 - Add Python 3 operation. Warning - Command line is not fully tested.
  • Version 3.0.0 - 30 Dec 2022 - Updated conversion dictionaries. Added hanzi to kanji conversion. Added ability to only convert a small section. Uses a completely new HTML parser.
  • Version 3.0.1 - 4 Jan 2023 - Fixed conversion error from mainland simplified to Taiwan traditional.
  • Version 3.0.2 - 27 Mar 2023 - Fixed processing of character references.
Attached Thumbnails: PluginDialogPicture.png, PluginConversionChinese.png, PluginConversionJapanese2.png
Attached Files
File Type: zip TradSimpChinese_3_0_2.zip (471.6 KB, 18208 views)

Last edited by Hopkins; 03-27-2023 at 12:59 PM. Reason: Fix Version: 3.0.1
Old 06-24-2016, 09:43 PM   #2
kovidgoyal
creator of calibre
Posts: 43,776
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Thanks, I have added it to the plugin index.
Old 08-25-2016, 10:21 AM   #3
GameMonsters
Junior Member
Posts: 9
Karma: 100
Join Date: Jun 2011
Device: Nook, Nook Color, ASUS, Galaxy
Wow. Thank you. I finally can check out simplified Chinese books from my library. I have been waiting for a useful tool like this. Thank you SO MUCH.
Old 10-18-2016, 11:56 PM   #4
howardtang
Junior Member
Posts: 1
Karma: 10
Join Date: Oct 2016
Device: Kindle PaperWhite
This plugin is really useful, and I use it all the time. However, I have two small suggestions:

1) The plugin can't convert the Chinese text inside the metadata and table of contents.
2) It would be perfect if there were a bulk convert function.

I look forward to the next update~
Old 10-21-2016, 10:54 PM   #5
Hopkins
Those changes look like they would be very useful. I have spare time, so I think I can take a quick cut at it next week.
Old 11-10-2016, 07:34 PM   #6
Hopkins
Update to 2.0.0

Plugin updated:

- Added command line processing to support batch processing
- Updated dictionary (txt) files based on OpenCC changes
- Plugin now updates Table of Contents (TOC) and Content metadata. The GUI updates both only if the "Entire eBook" option is selected. The command line always processes the entire book.
Old 01-10-2017, 11:36 PM   #7
el_dheeb
Junior Member
Posts: 2
Karma: 10
Join Date: Jan 2017
Device: kindle
I have tried installing this plugin multiple times, both through Calibre and manually. Each time I get a message that the plugin is installed and that I should restart the machine. However, once I restart the machine, the plugin is listed as not installed. Here is the log after I install it.
Quote:
calibre Debug log
calibre 2.76 embedded-python: True is64bit: False
Windows-8-6.2.9200 Windows ('32bit', 'WindowsPE')
32bit process running on 64bit windows
('Windows', '8', '6.2.9200')
Python 2.7.9
Windows: ('8', '6.2.9200', '', 'Multiprocessor Free')
Successfully initialized third party plugins: DeDRM (6, 0, 8) && FanFicFare (2, 7, 0) && Overdrive Link (2, 7, 0) && Chinese Text Conversion (2, 0, 0)
devicePixelRatio: 1.0
logicalDpi: 96.0 x 96.0
physicalDpi: 100.861627907 x 101.07357513
Starting up...
FFF: INFO: 2017-01-10 22:21:23,918: calibre_plugins.fanficfare_plugin.prefs(201): Attempting to read settings from predecessor--FFDL
FFF: INFO: 2017-01-10 22:21:23,918: calibre_plugins.fanficfare_plugin.prefs(206): Using default settings
FFF: DEBUG: 2017-01-10 22:21:24,094: calibre_plugins.fanficfare_plugin.fff_plugin(207): Plugin FanFicFare macmenuhack file_path:C:\Users\el_dheeb\AppData\Roaming\calibre\plugins\fanficfare_macmenuhack.txt
Started up in 57.20 seconds with 69 books
Downloading plugin zip attachment: https://code.calibre-ebook.com/plugins/275572.zip
Installing plugin: C:\Users\el_dheeb\AppData\Local\Temp\calibre_iunpoz\fawk5u.zip
Downloading plugin zip attachment: https://code.calibre-ebook.com/plugins/275572.zip
Installing plugin: C:\Users\el_dheeb\AppData\Local\Temp\calibre_iunpoz\irrznj.zip
Downloading plugin zip attachment: https://code.calibre-ebook.com/plugins/275572.zip
Installing plugin: C:\Users\el_dheeb\AppData\Local\Temp\calibre_iunpoz\4bjcyc.zip
Starting debug executable: C:\Program Files (x86)\Calibre2\calibre-debug.exe
Old 01-11-2017, 10:12 AM   #8
Hopkins
I just noticed I have the same problem. When I look at the "User Plugins" dialog window in the main program, I see that the plugin is in the "Not Installed" list. But when I open the editor, I see that the plugin is available (and works) under the "Plugins" menu.

Try this:
  1. Select a book in the main Calibre program
  2. Open the editor by clicking on the "Edit book" icon
  3. Click on the "Plugins" menu item in the editor

Check if there is an entry called "Convert Chinese Text Simplified/Traditional".

It's possible that the plugin is actually installed, but not properly registering with the main library application. Let me know what happens.

Edit:

Also, try this in the main Calibre window:
  1. Click on the "Preferences" icon
  2. Click on the "Plugins" icon which is at the bottom of the newly opened "Preferences" dialog
  3. Click on the "Edit Book Tool plugins" pull-down arrow

The plugin shows up in the list in this dialog in my case.

Last edited by Hopkins; 01-11-2017 at 10:22 AM.
Old 01-11-2017, 06:00 PM   #9
davidfor
Grand Sorcerer
Posts: 24,908
Karma: 47303748
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
@Hopkins: I think this means you have a name mismatch somewhere in the plugin. From memory, when I did this with one of my plugins, I had to uninstall it before the correct name would take.
Old 01-12-2017, 10:01 AM   #10
Hopkins
I think I see the issue. The name given in the plugin index and displayed via the "User Plugins" dialog is "Traditional<->Simplified Chinese Converter", whereas the name I used in the plugin is "Chinese Text Conversion". I believe I changed the name at one point because the command shell processor and command line parser got annoyed when I used characters such as "<", "-", and ">".

I think the name in the plugin index may need to be changed.

For anyone writing plugins that may eventually use the command line interface:
  • Avoid whitespace in the name
  • Avoid using any character that might get interpreted by the shell or parser (e.g. <, >, -, |, !, $, #, ?, /, \)
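
As a rough sketch of that advice, a plugin author could check a candidate name before publishing. The function name and the exact character set below are assumptions for illustration, taken from the list above; different shells treat different characters specially.

```python
# Characters from the list above that a shell or argument parser may interpret.
RISKY_CHARS = set("<>-|!$#?/\\")


def risky_name_parts(name: str) -> list:
    """Return the distinct characters in a plugin name that may cause CLI trouble."""
    return sorted({ch for ch in name if ch.isspace() or ch in RISKY_CHARS})


print(risky_name_parts("Traditional<->Simplified Chinese Converter"))
print(risky_name_parts("ChineseTextConversion"))
```

An empty list means the name passes both rules; anything else shows which characters to drop or replace.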
Old 01-12-2017, 08:41 PM   #11
el_dheeb
You're right, it did install. Thank you very much.
Old 01-23-2017, 09:50 AM   #12
chihchun
Junior Member
Posts: 3
Karma: 10
Join Date: Jan 2017
Device: Kindle
Hi,

I tested with calibre 2.55.0+dfsg-1 on Ubuntu 16.04, and the editor could not load the plugin.
I'm not sure whether I missed a dependency.

Quote:
(process:10492): Gtk-WARNING **: Locale not supported by C library.
Using the fallback 'C' locale.
  File "/usr/bin/calibre-parallel", line 20, in <module>
    sys.exit(main())
  File "/usr/lib/calibre/calibre/utils/ipc/worker.py", line 190, in main
    result = func(*args, **kwargs)
  File "/usr/lib/calibre/calibre/gui_launch.py", line 76, in gui_ebook_edit
    gui_main(path, notify)
  File "/usr/lib/calibre/calibre/gui2/tweak_book/main.py", line 37, in gui_main
    _run(['ebook-edit', path], notify=notify)
  File "/usr/lib/calibre/calibre/gui2/tweak_book/main.py", line 71, in _run
    main = Main(opts, notify=notify)
  File "/usr/lib/calibre/calibre/gui2/tweak_book/ui.py", line 259, in __init__
    self.create_actions()
  File "/usr/lib/calibre/calibre/gui2/tweak_book/ui.py", line 495, in create_actions
    create_plugin_actions(actions, toolbar_actions, self.plugin_menu_actions)
  File "/usr/lib/calibre/calibre/gui2/tweak_book/plugin.py", line 166, in create_plugin_actions
    for tool in load_plugin_tools(plugin):
  File "/usr/lib/calibre/calibre/gui2/tweak_book/plugin.py", line 118, in load_plugin_tools
    traceback.print_stack()
Old 01-23-2017, 10:14 AM   #13
chihchun
The API changed; the patch [1] works for me. :-)

[1] https://gist.github.com/chihchun/eee...9c1f62e8a84f39
Old 01-23-2017, 07:53 PM   #14
Hopkins
I see what happened. It looks like the API was changed in version 2.59 back in June 2016. I added the TOC stuff in the last plugin update in November 2016 but never tested it on earlier versions. I'll incorporate your change and update the minimum version information to 2.55 (though it may very well work for earlier versions).

I was looking for an excuse to pull in the changes made to the OpenCC conversion dictionaries since my November update:

https://github.com/BYVoid/OpenCC/com...ata/dictionary

I'll also add some text to my original post explaining how users can update their conversion dictionaries (they are just UTF-8 text files) in case I'm not around in the future...
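
To give a sense of what editing such a file involves, here is a small sketch that parses entries, assuming the plain-text layout used by OpenCC's dictionary files: one entry per line, the source string, a tab, then one or more space-separated candidate targets. The function name is hypothetical and the sample lines are illustrative; the layout should be checked against the files actually shipped in the plugin zip before relying on it.

```python
import io


def load_dictionary(stream) -> dict:
    """Parse 'source<TAB>target1 target2 ...' lines into a dict.

    Assumed layout; only the first candidate target is kept.
    """
    table = {}
    for line in stream:
        line = line.rstrip("\n")
        if not line or "\t" not in line:
            continue                      # skip blank or malformed lines
        source, targets = line.split("\t", 1)
        candidates = targets.split()
        if candidates:
            table[source] = candidates[0]
    return table


# Illustrative sample in the assumed layout (simplified -> traditional).
sample = io.StringIO("发\t發 髮\n后\t後 后\n")
print(load_dictionary(sample))
```

For a real file, the stream would come from open(path, encoding="utf-8"), since the dictionaries are UTF-8 text.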
Old 01-24-2017, 02:03 AM   #15
chihchun
Thank you, it would be great if you could host your code on github.com or somewhere that I can propose a pull request. :-)
All times are GMT -4.