01-08-2012, 04:42 PM | #1 |
Junior Member
Posts: 4
Karma: 20
Join Date: Jan 2012
Device: Kindle
|
Preprocess txt and txtz
Hi, I am working on a plugin and I need help.
My plugin should modify markdown-formatted txt files before they are converted to other formats. Currently I'm using `FileTypePlugin` with on_preprocess = True. Inside the run method I read the current txt file, process it, and save the result as a new temp file. Is this the right way to do it, or is there a better or easier way? I would like to do the same with txtz files; any advice? Do I need to extract the archive and re-zip its contents manually after processing? Thanks |
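For readers following along, the approach described above might look roughly like the sketch below. It is only an illustration, not the poster's actual plugin: the class name, the metadata values and `transform_markdown` are hypothetical placeholders, and the `except ImportError` branch is a stand-in added so the sketch can also be run outside calibre.

```python
try:
    from calibre.customize import FileTypePlugin
except ImportError:
    # Stand-in base class (an assumption for illustration) so the sketch runs
    # outside calibre. Inside calibre the real base class is used instead.
    import tempfile

    class FileTypePlugin(object):
        def temporary_file(self, suffix):
            return tempfile.NamedTemporaryFile(suffix=suffix, delete=False)


def transform_markdown(text):
    # Hypothetical placeholder for the real processing step:
    # strip trailing whitespace from every line.
    return "\n".join(line.rstrip() for line in text.split("\n"))


class MarkdownPreprocess(FileTypePlugin):
    name = 'Markdown Preprocess (sketch)'
    description = 'Rewrite markdown txt files before conversion'
    supported_platforms = ['windows', 'osx', 'linux']
    author = 'Example Author'
    version = (0, 0, 1)
    file_types = {'txt'}
    on_preprocess = True  # run before the file is converted

    def run(self, path_to_ebook):
        # Read the current txt file.
        with open(path_to_ebook, 'rb') as src:
            text = src.read().decode('utf-8')
        # Write the processed text to a new temporary file and hand its
        # path back, leaving the original file untouched.
        out = self.temporary_file('.txt')
        out.write(transform_markdown(text).encode('utf-8'))
        out.close()
        return out.name
```

The sketch assumes the input is UTF-8; a real plugin would need to decide how to handle other encodings.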
01-09-2012, 04:08 AM | #2 |
creator of calibre
Posts: 43,866
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
That is the right way, and yes for txtz you will have to rezip the file into a new txtz file.
|
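As a rough illustration of the re-zipping step, the sketch below copies a txtz archive member by member using only Python's standard `zipfile` module and rewrites the text member on the way. The function name `rewrite_txtz` and the `transform` callback are hypothetical, and the member names are an assumption taken from the reply later in this thread (index.txt, or index.text for textile).

```python
import zipfile

# Assumed names of the text member inside a txtz archive.
TEXT_MEMBERS = ('index.txt', 'index.text')


def rewrite_txtz(src_path, dst_path, transform):
    """Copy src_path to dst_path, applying transform() to the text member."""
    with zipfile.ZipFile(src_path, 'r') as src, \
         zipfile.ZipFile(dst_path, 'w', zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename in TEXT_MEMBERS:
                # Only the text member is decoded and processed; images and
                # other members are copied through unchanged.
                data = transform(data.decode('utf-8')).encode('utf-8')
            dst.writestr(info.filename, data)
```

Inside a plugin, `dst_path` would typically be the name of a new temporary file that `run` then returns.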
01-09-2012, 07:12 AM | #3 | |
Guru
Posts: 776
Karma: 2751519
Join Date: Jul 2010
Location: UK
Device: PW2, Nexus7
|
Quote:
Your plugin sounds very interesting. I pre-process markdown txt files prior to conversion but have resorted to using VBScripts because I already know VB and not Python. This means a lack of integration with calibre which is a bit of a downside. Do you have any plans to make your plugin publicly available? I may be able to use it as a starting point for my own plugin!
Last edited by Agama; 01-09-2012 at 03:34 PM. Reason: typo! |
|
01-09-2012, 02:25 PM | #4 |
Junior Member
Posts: 4
Karma: 20
Join Date: Jan 2012
Device: Kindle
|
Thanks for the answers, guys.
@Agama, I will publish my plugin soon. I will PM you when I finish it so you can take a peek at the source code. |
01-10-2012, 09:18 AM | #5 |
Junior Member
Posts: 4
Karma: 20
Join Date: Jan 2012
Device: Kindle
|
I have made some progress; my plugin now supports both txt and txtz files.
I still have some follow-up questions:
Thanks. (Just quick info if anyone is interested: this plugin is used to manage footnotes and endnotes in markdown-formatted text files. I will post about it when the plugin is finished.)
Last edited by draganHR; 01-28-2012 at 03:58 PM. |
01-10-2012, 09:55 AM | #6 |
creator of calibre
Posts: 43,866
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
IIRC it's index.text for textile and index.txt for others. Use
import traceback
traceback.print_stack()
to see why it is being called twice. |
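A small standalone sketch of that debugging tip follows. `run`, `first_caller` and `second_caller` are hypothetical stand-ins, not calibre functions; the point is only that `traceback.print_stack()` prints the chain of callers at the moment it executes, so two calls reached by different routes produce two different stacks.

```python
import traceback


def run(path_to_ebook, out=None):
    # Stand-in for a plugin's run() method. print_stack() writes the chain of
    # callers that led here; out=None means the output goes to sys.stderr.
    traceback.print_stack(file=out)
    return path_to_ebook


def first_caller(path, out=None):
    return run(path, out)


def second_caller(path, out=None):
    return run(path, out)


if __name__ == "__main__":
    # Each call prints its own stack, naming the function it came through.
    first_caller("book.txt")
    second_caller("book.txt")
```

Comparing the two printed stacks shows which code path triggered each call.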
01-10-2012, 05:44 PM | #7 | |
Guru
Posts: 776
Karma: 2751519
Join Date: Jul 2010
Location: UK
Device: PW2, Nexus7
|
Quote:
|
|
01-19-2012, 02:11 PM | #8 |
Guru
Posts: 776
Karma: 2751519
Join Date: Jul 2010
Location: UK
Device: PW2, Nexus7
|
I've had a play with your plugin using your sample file and it seems to work nicely. I'll test it further with my own text file. Looking at the source code, there are various print statements. Where can I see the output from these when the plugin runs?
One minor suggestion: How about allowing the special character to be user-defined and saved as part of the plugin, rather than hard-coding the dagger character? |
01-26-2012, 02:15 PM | #9 |
Guru
Posts: 776
Karma: 2751519
Join Date: Jul 2010
Location: UK
Device: PW2, Nexus7
|
I'm having trouble getting your plugin to work with my own simple test file (attached, with the resultant ePub). Perhaps I am missing something obvious. It works with your test file, so I know that the plugin is getting called during conversion, and it does make footnoting easy. Any ideas what I'm doing wrong?
|
01-28-2012, 04:02 PM | #10 | ||
Junior Member
Posts: 4
Karma: 20
Join Date: Jan 2012
Device: Kindle
|
Quote:
Quote:
|
||
01-30-2012, 08:29 AM | #11 |
Guru
Posts: 776
Karma: 2751519
Join Date: Jul 2010
Location: UK
Device: PW2, Nexus7
|
The updated version worked, so I've now changed some of the links to inline references, but these didn't work. Files attached as before.
By the way, what does utf-8 mean, and do I need to save my files in this format? I simply opened Notepad++ on a Windows 7 platform and typed the document. I've never known calibre to have any problems with any other documents that I have typed.
The Job Details button worked just as you pointed out and looks like a very handy option for debugging plugins.
(I'm learning lots from your code and reckon I will be able to tackle my plugin soon. It's a bit obscure but will take guitar tablature exported as plain text from TablEdit and apply Markdown to enable clean conversion to ePub in calibre. Then I can use calibre's library to catalogue all my Tabs. Just a shame that my Sony reader cannot play MIDI files so that I can hear the Tab whilst reading it!) |
02-04-2012, 10:02 AM | #12 | |
Sigil & calibre developer
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
Quote:
UTF-8 is a widely used, compact encoding that can represent every Unicode character. It is the recommended encoding for TXT files, and calibre defaults to utf-8 for TXT files. TXT input will try to auto-detect the encoding, but because the same bytes can be valid in more than one encoding, detection is far from perfect. This is also why a common fix for conversion problems is to specify the encoding beforehand. |
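To make the detection problem concrete, here is a small plain-Python sketch (it does not use calibre's own detection code, and `decode_text` is a hypothetical helper). The same bytes decode without error under two encodings, but only one of them gives the intended text, which is why stating the encoding up front is more reliable than guessing.

```python
def decode_text(data, encoding="utf-8"):
    # Decode raw file bytes using an explicitly stated encoding.
    return data.decode(encoding)


# "café" saved as UTF-8: the accented letter takes two bytes.
raw = "café".encode("utf-8")

as_utf8 = decode_text(raw)               # the intended text
as_cp1252 = decode_text(raw, "cp1252")   # no error, but the wrong text

print(repr(as_utf8))
print(repr(as_cp1252))
```

Both decodes succeed, so nothing in the bytes alone says which reading is the right one.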
|
02-04-2012, 11:45 AM | #13 |
Guru
Posts: 776
Karma: 2751519
Join Date: Jul 2010
Location: UK
Device: PW2, Nexus7
|
I have found that Notepad++ will allow me to encode in both "UTF-8" and "UTF-8 without BOM". Is one of these encodings preferred over the other?
|
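For context on what the two Notepad++ options actually produce, the sketch below shows the difference at the byte level using only Python's standard library. It illustrates the file formats themselves and is not a statement of which one calibre prefers.

```python
import codecs

text = "Footnotes"

# "UTF-8 without BOM": just the encoded characters.
without_bom = text.encode("utf-8")

# "UTF-8" with BOM: the same bytes preceded by the three-byte marker EF BB BF.
with_bom = codecs.BOM_UTF8 + without_bom

# A plain utf-8 decode keeps the marker as an invisible U+FEFF character at
# the start of the text.
plain = with_bom.decode("utf-8")

# The utf-8-sig codec strips the marker if present and otherwise behaves like
# utf-8, so it reads both variants to the same text.
stripped = with_bom.decode("utf-8-sig")
also_fine = without_bom.decode("utf-8-sig")
```

A reader that decodes with plain utf-8 and does not strip the marker will see one extra character at the start of a file saved with a BOM.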
02-04-2012, 11:47 AM | #14 |
Sigil & calibre developer
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
|
|