#1 |
Enthusiast
Posts: 26
Karma: 168
Join Date: May 2005
Location: Wuhan, China
Device: Kindle DXG
|
Profile of txt -> mobi conversion
Text-to-MOBI conversion appears rather slow. Below is a profile of converting a 5000-line txt file of Chinese text (UTF-8); the file size is 960902 bytes.
ebook-convert seems to spend a lot of time detecting the character encoding: about 14.6 s of the roughly 37 s total is spent under chardet's detect. There must be something that can be done to speed up the conversion.
$ file sample.txt
sample.txt: UTF-8 Unicode text
Code:
Sort by tottime
       ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
           15  10.6280   0.7090  12.2590   0.8170  sbcharsetprober.py:63(feed)
574218/574217   3.5850   0.0000   4.0200   0.0000  {built-in method sub}
          268   2.7980   0.0100   2.7980   0.0100  {cPalmdoc.compress}
          751   2.5250   0.0030   2.5250   0.0030  stylizer.py:126(__call__)
       960902   1.0610   0.0000   1.1240   0.0000  codingstatemachine.py:40(next_state)
     16097860   0.9990   0.0000   0.9990   0.0000  {ord}
            1   0.6140   0.6140   1.7810   1.7810  utf8prober.py:50(feed)
       145029   0.5670   0.0000   0.8200   0.0000  __init__.py:194(unit_convert)
245208/145027   0.4080   0.0000   0.5700   0.0000  stylizer.py:564(_get)
145029/145027   0.3840   0.0000   1.4220   0.0000  stylizer.py:577(_unit_convert)
         5033   0.3510   0.0000   0.3510   0.0000  {built-in method findall}
       5001/1   0.3430   0.0000   3.6550   3.6550  mobiml.py:292(mobimlize_elem)
347846/327027   0.3390   0.0000   1.0280   0.0000  {hasattr}
            1   0.3180   0.3180   0.3800   0.3800  hebrewprober.py:188(feed)
       361570   0.3180   0.0000   1.3790   0.0000  re.py:229(_compile)
        20565   0.2880   0.0000   0.4750   0.0000  cssstyledeclaration.py:397(getProperty)
        41234   0.2840   0.0000   0.6710   0.0000  serialize.py:1001(do_css_Value)
       642375   0.2640   0.0000   0.2640   0.0000  {isinstance}
       150028   0.2630   0.0000   3.3070   0.0000  stylizer.py:558(__getitem__)
       215733   0.2470   0.0000   0.2470   0.0000  {built-in method match}
           40   0.2460   0.0060   0.2460   0.0060  {method 'xpath' of 'lxml.etree._Element' objects}
            1   0.2410   0.2410   0.2490   0.2490  page_margin.py:127(find_levels)

Sort by cumtime
       ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
            1   0.0140   0.0140  36.9980  36.9980  plumber.py:934(run_me)
            1   0.0000   0.0000  19.8110  19.8110  conversion.py:193(__call__)
            1   0.0100   0.0100  19.8100  19.8100  txt_input.py:54(convert)
            1   0.0010   0.0010  14.5750  14.5750  __init__.py:20(detect)
            1   0.0000   0.0000  14.5750  14.5750  chardet.py:36(detect)
            2   0.0000   0.0000  14.4200   7.2100  charsetgroupprober.py:55(feed)
            1   0.0000   0.0000  14.4200  14.4200  universaldetector.py:61(feed)
           15  10.6280   0.7090  12.2590   0.8170  sbcharsetprober.py:63(feed)
            1   0.0010   0.0010   9.7800   9.7800  mobi_output.py:167(convert)
            1   0.0110   0.0110   9.7680   9.7680  mobi_output.py:204(write_mobi)
            4   0.0590   0.0150   6.5140   1.6290  stylizer.py:176(__init__)
            1   0.0030   0.0030   4.7810   4.7810  html_input.py:57(convert)
            1   0.0170   0.0170   4.6750   4.6750  html_input.py:94(create_oebbook)
            1   0.0000   0.0000   4.4910   4.4910  flatcss.py:122(__call__)
            1   0.0000   0.0000   4.4740   4.4740  mobiml.py:104(__call__)
            1   0.0000   0.0000   4.4740   4.4740  mobiml.py:114(mobimlize_spine)
           52   0.0000   0.0000   4.3990   0.0850  base.py:903(fget)
            1   0.0000   0.0000   4.3960   4.3960  base.py:830(_parse_xhtml)
            1   0.0030   0.0030   4.3960   4.3960  parse_utils.py:201(parse_html)
            1   0.0010   0.0010   4.3420   4.3420  preprocess.py:495(__call__)
574218/574217   3.5850   0.0000   4.0200   0.0000  {built-in method sub}
            1   0.0000   0.0000   3.9730   3.9730  flatcss.py:150(stylize_spine)

The patch used to collect the profile:
diff --git a/src/calibre/ebooks/conversion/plumber.py b/src/calibre/ebooks/conversion/plumber.py
index 78821fa..9d8b4a6 100644
@@ -926,8 +926,13 @@ OptionRecommendation(name='search_replace',
         self.log.info('Input debug saved to:', out_dir)
-
     def run(self):
+        '''debug profile '''
+        import cProfile
+        cProfile.runctx('self.run_me()', globals(), locals())
+
+    def run_me(self):
+        #def run(self):
         '''
         Run the conversion pipeline
         '''
|
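For anyone who wants to produce listings like the two above without patching plumber.py, here is a minimal sketch using only the standard-library cProfile and pstats modules. The `slow_step` and `pipeline` functions are hypothetical stand-ins for the conversion pipeline, not calibre code, and the `profile_call` helper is likewise an assumption made for illustration.

```python
import cProfile
import io
import pstats


def slow_step(n):
    # Hypothetical stand-in for an expensive stage such as encoding detection.
    return sum(ord(c) for c in "x" * n)


def pipeline():
    # Hypothetical stand-in for the whole conversion run.
    return slow_step(200_000)


def profile_call(func, sort_key, limit=20):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    # "tottime" ranks by time spent inside each function itself;
    # "cumtime" ranks by time including everything the function calls.
    stats.strip_dirs().sort_stats(sort_key).print_stats(limit)
    return result, buf.getvalue()


if __name__ == "__main__":
    for key in ("tottime", "cumtime"):
        _, report = profile_call(pipeline, key)
        print("Sort by", key)
        print(report)
```

Sorting by tottime points at the functions that burn the time themselves (here sbcharsetprober.py's feed), while cumtime shows which high-level stage they belong to (here the chardet detect call).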
#2 |
creator of calibre
Posts: 45,525
Karma: 28548962
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
--input-encoding
|
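The `--input-encoding` option of ebook-convert declares the encoding of the input up front, so the converter presumably no longer has to guess it. A minimal sketch of such an invocation driven from Python follows; the file names are placeholders, the `build_convert_command` helper is hypothetical, and the command is only run when `ebook-convert` is on the PATH and the input file exists.

```python
import os
import shutil
import subprocess


def build_convert_command(src, dst, encoding):
    """Return an argv list for ebook-convert with an explicit input encoding."""
    return ["ebook-convert", src, dst, "--input-encoding", encoding]


if __name__ == "__main__":
    cmd = build_convert_command("sample.txt", "sample.mobi", "utf-8")
    print(" ".join(cmd))
    # Only attempt the conversion when calibre's CLI tool and the input exist.
    if shutil.which("ebook-convert") and os.path.exists("sample.txt"):
        subprocess.run(cmd, check=True)
```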
#3 |
Enthusiast
Posts: 26
Karma: 168
Join Date: May 2005
Location: Wuhan, China
Device: Kindle DXG
|
Thanks, Kovid, for the great work! Specifying --input-encoding certainly helps:
Code:
                    encoding specified    encoding not specified
----------------------------------------------------------------
time                22s                   36s
total func calls    14M                   30M
top1 func call      ord (0.7M)            ord (16M)
|
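These numbers are consistent with encoding detection being the bottleneck. One way a caller could avoid it even when no encoding is given is to try a strict UTF-8 decode first and only fall back to a detector when that fails. The following is a sketch of that idea only, not how calibre handles it; `decode_text` and `slow_detect` are hypothetical names, and `slow_detect` merely stands in for a real detector such as chardet.

```python
def slow_detect(raw):
    # Hypothetical placeholder for an expensive detector (e.g. chardet).
    # It simply guesses GB18030, a common encoding for Chinese text.
    return "gb18030"


def decode_text(raw, encoding=None, detect=slow_detect):
    """Decode bytes, consulting the detector only as a last resort."""
    if encoding:
        # The caller declared the encoding, as --input-encoding does.
        return raw.decode(encoding), encoding
    try:
        # A strict UTF-8 decode rejects most non-UTF-8 data quickly.
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        guessed = detect(raw)
        return raw.decode(guessed, errors="replace"), guessed


if __name__ == "__main__":
    utf8_bytes = "武汉".encode("utf-8")
    gb_bytes = "武汉".encode("gb18030")
    print(decode_text(utf8_bytes))
    print(decode_text(gb_bytes))
    print(decode_text(gb_bytes, encoding="gb18030"))
```

The trade-off is that a file which happens to be valid UTF-8 is always treated as UTF-8, which is usually the right call but remains an assumption.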