Old 01-08-2012, 04:42 PM   #1
draganHR
Junior Member
draganHR began at the beginning.
 
Posts: 4
Karma: 20
Join Date: Jan 2012
Device: Kindle
Preprocess txt and txtz

Hi, I am working on a plugin and I need help.

My plugin should modify Markdown-formatted txt files before they are converted to other formats. Currently I'm using `FileTypePlugin` with on_preprocess = True. Inside the run method I read the current txt file, process it, and save the result as a new temp file.

Is this the right way to do it, or is there a better/easier/more correct way?

I would like to do the same with txtz files. Any advice? Do I need to extract the archive and zip the content again manually after processing?

Thanks
Old 01-09-2012, 04:08 AM   #2
kovidgoyal
creator of calibre
Posts: 24,762
Karma: 4369667
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
That is the right way, and yes for txtz you will have to rezip the file into a new txtz file.
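
A minimal sketch of the two steps discussed above, using only the Python standard library. It assumes a TXTZ file is an ordinary zip archive whose text members end in .txt or .text; `transform`, `preprocess_txt` and `preprocess_txtz` are hypothetical names, not part of calibre. In a plugin, functions like these would presumably be called from `run()`, which would then return the path of the new file.

```python
import os
import tempfile
import zipfile

# Assumption: the text members of interest end with one of these suffixes
TEXT_SUFFIXES = (".txt", ".text")


def transform(text):
    """Placeholder for the real Markdown rewriting; it only appends a marker line."""
    return text + "\n<!-- preprocessed -->\n"


def preprocess_txt(path):
    """Read a TXT file, transform it, and return the path of a new temporary file."""
    with open(path, encoding="utf-8") as src:
        text = src.read()
    fd, out_path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w", encoding="utf-8") as dst:
        dst.write(transform(text))
    return out_path


def preprocess_txtz(path):
    """Copy a TXTZ archive into a new temporary archive, transforming its text members."""
    fd, out_path = tempfile.mkstemp(suffix=".txtz")
    os.close(fd)
    with zipfile.ZipFile(path) as zin, \
            zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zout:
        for info in zin.infolist():
            data = zin.read(info.filename)
            if info.filename.lower().endswith(TEXT_SUFFIXES):
                data = transform(data.decode("utf-8")).encode("utf-8")
            # Every other member (images, metadata) is copied unchanged
            zout.writestr(info.filename, data)
    return out_path
```

The original file is never modified in either case; the caller gets back the path of a fresh temporary file.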
Old 01-09-2012, 07:12 AM   #3
Agama
Guru
Posts: 637
Karma: 436517
Join Date: Jul 2010
Location: UK
Device: PRS-300, PW2
Quote:
Originally Posted by draganHR View Post
Hi, I am working on a plugin and I need help.
My plugin should modify Markdown-formatted txt files before they are converted to other formats.
Welcome to Mobile Read!

Your plugin sounds very interesting. I pre-process markdown txt files prior to conversion but have resorted to using VBScripts because I already know VB and not Python. This means a lack of integration with calibre which is a bit of a downside. Do you have any plans to make your plugin publicly available? I may be able to use it as a starting point for my own plugin!

Last edited by Agama; 01-09-2012 at 03:34 PM. Reason: typo!
Old 01-09-2012, 02:25 PM   #4
draganHR
Thanks for the answers, guys.

@Agama, I will publish my plugin soon. I will PM you when I finish it so you can take a peek at the source code.
Old 01-10-2012, 09:18 AM   #5
draganHR
I have made some progress: my plugin now supports both txt and txtz files.
I still have some follow-up questions:
  • Is the name of the text file inside a txtz archive always index.txt?
  • Whenever I convert a book, the `run()` method is called twice. Any ideas why? (Only `on_preprocess` is `True`.)
The current version is attached.
Thanks.

(Just some quick info in case anyone is interested: this plugin is used to manage footnotes and endnotes in Markdown-formatted text files. I will make a post about it when the plugin is finished.)

Last edited by draganHR; 01-28-2012 at 03:58 PM.
Old 01-10-2012, 09:55 AM   #6
kovidgoyal
IIRC it's index.text for textile and index.txt for others. Use

import traceback
traceback.print_stack()

to see why it is being called twice
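
A small standalone sketch of how that suggestion plays out. The `run` function below is only a stand-in for the plugin method, and `first_caller` and `second_caller` are hypothetical names for whatever code in the conversion pipeline ends up invoking it. The stack is captured into a string here so it can be inspected; with no arguments, `traceback.print_stack()` writes to stderr instead.

```python
import io
import traceback


def run(path_to_ebook):
    """Stand-in for the plugin's run(); returns the current call stack as text."""
    buf = io.StringIO()
    # Each frame is reported as: File "...", line N, in <function name>
    traceback.print_stack(file=buf)
    return buf.getvalue()


def first_caller(path):
    # Hypothetical first place that triggers the plugin
    return run(path)


def second_caller(path):
    # Hypothetical second place that triggers the plugin
    return run(path)


print(first_caller("book.txt"))
print(second_caller("book.txt"))
```

Comparing the two dumps shows which caller was responsible for each invocation, which is the information needed to explain a double call.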
Old 01-10-2012, 05:44 PM   #7
Agama
Quote:
Just some quick info in case anyone is interested: this plugin is used to manage footnotes and endnotes in Markdown-formatted text files. I will make a post about it when the plugin is finished.
I simply use Markdown to manage my footnotes (including reverse links from the footnote back to the main text), so it will be interesting to see what you are doing and how it enhances basic Markdown.
Old 01-19-2012, 02:11 PM   #8
Agama
I've had a play with your plugin using your sample file and it seems to work nicely. I'll test it further with my own text file. Looking at the source code, there are various print statements. Where can I see the output from these when the plugin runs?

One minor suggestion: How about allowing the special character to be user-defined and saved as part of the plugin, rather than hard-coding the dagger character?
Old 01-26-2012, 02:15 PM   #9
Agama
I'm having trouble getting your plugin to work with my own simple test file (attached with the resultant ePub). Perhaps I am missing something obvious. It works with your test file, so I know that the plugin is getting called during conversion, and it does make footnoting easy. Any ideas what I'm doing wrong?
Attached Files
File Type: txt FootNotes .txt (335 Bytes, 38 views)
File Type: epub FootNotes.epub (2.4 KB, 41 views)
Old 01-28-2012, 04:02 PM   #10
draganHR
Quote:
Originally Posted by Agama View Post
Looking at the source code, there are various print statements. Where can I see the output from these when the plugin runs?
Use the "Show job details" button in the Jobs list.

Quote:
Originally Posted by Agama View Post
I'm having trouble getting your plugin to work with my own simple test file (attached with the resultant ePub). Perhaps I am missing something obvious. It works with your test file, so I know that the plugin is getting called during conversion, and it does make footnoting easy. Any ideas what I'm doing wrong?
There was an encoding problem: my file was UTF-8 and yours wasn't. Please try the updated version.
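
One way a plugin might guard against this kind of problem is to try UTF-8 first and fall back to another encoding. This is only a sketch and not the fix used in the attached plugin: `decode_text` is a hypothetical helper, and cp1252 as the fallback is an assumption based on the file having been typed in a Windows editor.

```python
def decode_text(raw, fallback="cp1252"):
    """Decode bytes as UTF-8 if they are valid UTF-8, otherwise use a fallback encoding."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: non-UTF-8 input was saved by a Windows editor using cp1252.
        # errors="replace" keeps going even if a byte is undefined in the fallback.
        return raw.decode(fallback, errors="replace")


print(decode_text("caf\u00e9".encode("utf-8")) == decode_text("caf\u00e9".encode("cp1252")))
```

The same text comes back whichever of the two encodings the file was saved in.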
Attached Files
File Type: zip Markdown Notes Plugin.zip (2.5 KB, 47 views)
Old 01-30-2012, 08:29 AM   #11
Agama
The updated version worked, so I've now changed some of the links to inline references, but these didn't work. Files attached as before.

By the way, what does utf-8 mean, and do I need to make files in this format? I simply opened Notepad++ on a Windows 7 platform and typed the document. I've never known calibre to have any problems with any other documents that I have typed.

The Job Details button worked just as you pointed out and looks a very handy option for debugging plugins.

(I'm learning lots from your code and reckon I will be able to tackle my plugin soon. It's a bit obscure but will take guitar tablature exported as plain text from TablEdit and apply Markdown to enable clean conversion to ePub in calibre. Then I can use calibre's library to catalogue all my Tabs. Just a shame that my Sony reader cannot play MIDI files so that I can hear the Tab whilst reading it!)
Attached Files
File Type: txt FootNotes.txt (367 Bytes, 38 views)
File Type: epub FootNotes.epub (4.0 KB, 31 views)
Old 02-04-2012, 10:02 AM   #12
user_none
Sigil & calibre developer
Posts: 2,427
Karma: 950001
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
Quote:
Originally Posted by Agama View Post
By the way what does utf-8 mean and do I need to make files in this format?
UTF-8 is a way to encode text in a file. There are many different encodings. Basically, a letter like A can be represented by different binary sequences depending on the encoding. This is why, after conversion, some documents have their "s show up as ?. The bytes that represent " in one encoding represent a different character in another encoding.

UTF-8 is a widely used and lightweight way to handle a large number of characters, including Unicode characters. It is recommended to use UTF-8 for TXT files. Calibre defaults to UTF-8 for TXT files. TXT input will try to auto-detect the encoding, but due to the way encodings work, detection is far from perfect. This is also why a common fix for conversion problems is to specify the encoding beforehand.
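
To make the mismatch concrete, here is a short illustration using a curly opening quote and two encodings chosen for the example, UTF-8 and Windows cp1252. The variable names are just for this sketch.

```python
quote = "\u201c"  # left double quotation mark

as_utf8 = quote.encode("utf-8")      # three bytes: e2 80 9c
as_cp1252 = quote.encode("cp1252")   # one byte: 93

# cp1252 bytes read as UTF-8: the byte is invalid, so it becomes U+FFFD,
# the replacement character that many programs display as "?"
misread_as_utf8 = as_cp1252.decode("utf-8", errors="replace")

# UTF-8 bytes read as cp1252: each byte is shown as its own, unrelated character
misread_as_cp1252 = as_utf8.decode("cp1252")

print(as_utf8, as_cp1252)
print(ascii(misread_as_utf8), ascii(misread_as_cp1252))
```

Neither misreading raises an error, which is part of why auto-detection is hard: the wrong guess still produces text, just not the intended text.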
Old 02-04-2012, 11:45 AM   #13
Agama
I have found that Notepad++ will allow me to encode in both "UTF-8" and "UTF-8 without BOM". Is one of these encodings preferred over the other?
Old 02-04-2012, 11:47 AM   #14
user_none
Quote:
Originally Posted by Agama View Post
I have found that Notepad++ will allow me to encode in both "UTF-8" and "UTF-8 without BOM". Is one of these encodings preferred over the other?
Without BOM. Typically, editors default to UTF-8 and offer the BOM as an option.
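
If a file does arrive with a BOM, Python's `utf-8-sig` codec is one way to read it without carrying the marker into the text. A brief sketch, where `read_text_without_bom` is a hypothetical helper name:

```python
import codecs


def read_text_without_bom(raw):
    """Decode UTF-8 bytes, dropping a leading BOM if one is present."""
    return raw.decode("utf-8-sig")


with_bom = codecs.BOM_UTF8 + "Title".encode("utf-8")
without_bom = "Title".encode("utf-8")

# Plain "utf-8" keeps the BOM as an invisible U+FEFF at the start of the text
print(ascii(with_bom.decode("utf-8")))
# "utf-8-sig" gives the same result for both inputs
print(read_text_without_bom(with_bom) == read_text_without_bom(without_bom))
```

A stray U+FEFF at the start of a file can stop a first-line pattern from matching, which is one practical reason to prefer files saved without it.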