Old 03-05-2011, 11:43 PM   #1
getajob
Junior Member
Posts: 7
Karma: 10
Join Date: Oct 2010
Location: Australia
Device: Kindle 3, iPhone 3G, iPad 2 (on order)
Trying to use Textile processing

I have spent hours trying to convert TXT to EPUB by marking it up with TEXTILE tags.

The resulting EPUB file shows no signs of any TEXTILE processing whatsoever. No headings, no linking, no italics or bold. Nothing.

I am now admitting defeat.

I am posting here in the hope that someone can tell me what is wrong.

I am running calibre 0.7.48 on Windows XP SP3.

Basically I have set the TXT input processing to
Paragraph style: off
Formatting style: textile

Here is my input file (I stole this from Perkin):
Spoiler:

h1. Header 1

p(#fn1r). Here’s a link[1] which should jump to the end footnote.

h2. Header 2

The first *Robin Hobb* trilogy, the _Farseer Trilogy,_ took place in the ??Six Duchies??. It is the tale of +FitzChivalry+ Farseer.

p=. !E:/_BOOKS/Images/00004.jpg!

The first Robin Hobb trilogy, the Farseer Trilogy, took place in the Six Duchies.

h3. Header 3

Now some ^superscript^ followed by ~subscript~ and back to normal.

@This should be in Code format.@
@To see what mono font looks like.@

pre.
There was a man from hither,
Who, when he began to shiver,
He gave a cough,
His leg dropped off,
And floated down the river.

* Bullet 1
* Bullet 2
* Bullet 3

# Numbered 1
# Numbered 2
# Numbered 3

And now here follows a horizontal rule

<hr>

fn1. A footnote is here, which should jump back to first paragraph link.

When selecting here. "RETURN":#fn1r

p<. Left ??justified??

p=. Center *justified*

p>. Right _justified_

p<>. _*This should be fully justified and in bold and italics. This should be fully justified and in bold and italics. This should be fully justified and in bold and italics.*_

Here is my log file from Calibre
Spoiler:

Convert book 1 of 1 (Textile sample conversion)
Resolved conversion options
calibre version: 0.7.48
{'asciiize': False,
'author_sort': None,
'authors': None,
'base_font_size': 0.0,
'book_producer': None,
'change_justification': u'original',
'chapter': u"//*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part\\s+', 'i')) or @class = 'chapter']",
'chapter_mark': u'pagebreak',
'comments': None,
'cover': None,
'debug_pipeline': u'D:/Calibre Library/debug',
'dehyphenate': True,
'delete_blank_paragraphs': True,
'disable_font_rescaling': False,
'dont_split_on_page_breaks': False,
'enable_heuristics': False,
'epub_flatten': False,
'extra_css': None,
'extract_to': None,
'fix_indents': True,
'flow_size': 260,
'font_size_mapping': None,
'format_scene_breaks': True,
'formatting_type': u'textile',
'html_unwrap_factor': 0.4,
'input_encoding': None,
'input_profile': <calibre.customize.profiles.InputProfile object at 0x046DB8F0>,
'insert_blank_line': False,
'insert_metadata': False,
'isbn': None,
'italicize_common_cases': True,
'keep_ligatures': False,
'language': None,
'level1_toc': None,
'level2_toc': None,
'level3_toc': None,
'line_height': 0.0,
'linearize_tables': False,
'margin_bottom': 5.0,
'margin_left': 5.0,
'margin_right': 5.0,
'margin_top': 5.0,
'markdown_disable_toc': False,
'markup_chapter_headings': True,
'max_toc_links': 50,
'minimum_line_height': 120.0,
'no_chapters_in_toc': False,
'no_default_epub_cover': False,
'no_inline_navbars': False,
'no_svg_cover': False,
'output_profile': <calibre.customize.profiles.OutputProfile object at 0x046DBAD0>,
'page_breaks_before': u"//*[name()='h1' or name()='h2']",
'paragraph_type': u'off',
'prefer_metadata_cover': False,
'preserve_cover_aspect_ratio': False,
'preserve_spaces': False,
'pretty_print': True,
'pubdate': None,
'publisher': None,
'rating': None,
'read_metadata_from_opf': 'c:\\docume~1\\johnbr~1\\locals~1\\temp\\calibre_0.7.48_tmp_oqc5pb\\calibre_0.7.48_pe3ac_.opf',
'remove_first_image': False,
'remove_paragraph_spacing': True,
'remove_paragraph_spacing_indent_size': 1.5,
'renumber_headings': True,
'replace_scene_breaks': u'',
'series': None,
'series_index': None,
'smarten_punctuation': False,
'sr1_replace': None,
'sr1_search': None,
'sr2_replace': None,
'sr2_search': None,
'sr3_replace': None,
'sr3_search': None,
'tags': None,
'timestamp': None,
'title': None,
'title_sort': None,
'toc_filter': None,
'toc_threshold': 6,
'txt_in_remove_indents': False,
'unwrap_lines': True,
'use_auto_toc': False,
'verbose': 2}
InputFormatPlugin: TXT Input running
on D:\Calibre Library\Unknown\Textile sample conversion (155)\Textile sample conversion - Unknown.txt
Reading text from file...
Detected input encoding as windows-1252 with a confidence of 50.0%
Running text through textile conversion...
Language not specified
Creator not specified
Building file list...
Found files...
HTMLFile:0:a:D:\Calibre Library\Unknown\Textile sample conversion (155)\index.html
Normalizing filename cases
Rewriting HTML links
Parsing index.html ...
Initial parse failed:
Traceback (most recent call last):
File "site-packages\calibre\ebooks\oeb\base.py", line 881, in first_pass
File "lxml.etree.pyx", line 2532, in lxml.etree.fromstring (src/lxml/lxml.etree.c:48634)
File "parser.pxi", line 1545, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:72245)
File "parser.pxi", line 1417, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:71041)
File "parser.pxi", line 898, in lxml.etree._BaseParser._parseUnicodeDoc (src/lxml/lxml.etree.c:67581)
File "parser.pxi", line 539, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:64257)
File "parser.pxi", line 625, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:65178)
File "parser.pxi", line 565, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:64521)
XMLSyntaxError: PCDATA invalid Char value 25, line 4, column 53

Parsing file 'index.html' as HTML
Forcing index.html into XHTML namespace
Input debug saved to: D:\Calibre Library\debug\input
Parsed HTML written to: D:\Calibre Library\debug\parsed
Merging user specified metadata...
Detecting structure...
Auto generated TOC with 0 entries.
Structured HTML written to: D:\Calibre Library\debug\structure
Flattening CSS and remapping font sizes...
Source base font size is 12.00000pt
Cleaning up manifest...
Trimming unused files from manifest...
Parsing stylesheet.css ...
Processed HTML written to: D:\Calibre Library\debug\processed
Creating EPUB Output...
Looking for large trees in index.html...
No large trees found
Generating default cover
This EPUB file has no Table of Contents. Creating a default TOC
EPUB output written to c:\docume~1\johnbr~1\locals~1\temp\calibre_0.7.48_tmp_oqc5pb\calibre_0.7.48_wlo7ms.epub

I find the "initial parse failed" error message worrying but I cannot see a cause for this.

I have processed txt to epub in the past using MARKDOWN with acceptable results, but since I upgraded to 0.7.48, markdown is not working either.

Any ideas? John
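
A side note on the "Initial parse failed" message in the log above: the reported character value 25 is consistent with a UTF-16 little-endian file being decoded as windows-1252, which is the encoding the log says was detected. The following is a minimal sketch in plain Python, not calibre's own code. The helper name misread_utf16_as_cp1252 is made up for illustration, and whether the apostrophe in "Here’s" is the exact character the parser tripped over is an assumption.

```python
# Sketch only: shows how a value of 25 can appear after a wrong decode.
# Assumption: the source file is UTF-16 LE but is read as windows-1252 (cp1252).

APOSTROPHE = "\u2019"  # the typographic apostrophe used in "Here’s"


def misread_utf16_as_cp1252(text: str) -> str:
    """Encode text as UTF-16 LE, then decode those bytes as cp1252."""
    return text.encode("utf-16-le").decode("cp1252")


# U+2019 is stored as the bytes 0x19 0x20. Read as cp1252 they become a
# control character (value 25) followed by a space (value 32).
char_values = [ord(ch) for ch in misread_utf16_as_cp1252(APOSTROPHE)]

# Plain ASCII characters each pick up a NUL (value 0), which would plausibly
# stop markup such as "h1." from being recognised.
header_values = [ord(ch) for ch in misread_utf16_as_cp1252("h1.")]

if __name__ == "__main__":
    print(char_values)
    print(header_values)
```

A control character with value 25 is not allowed in XML, which would explain why the strict first parse fails and the more forgiving HTML parse takes over.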
Old 03-06-2011, 06:55 AM   #2
Perkin
Guru
Posts: 655
Karma: 64171
Join Date: Sep 2010
Location: Kent, England, Sol 3, ZZ9 plural Z Alpha
Device: Sony PRS-300, Kobo Aura HD, iPad (Marvin)
I just updated to 0.7.48 (from 0.7.46), and the conversion works fine here.
Have you tried restarting the machine?

Edit:
Win 7

Last edited by Perkin; 03-06-2011 at 07:05 AM.
Old 03-06-2011, 08:16 AM   #3
user_none
Sigil & calibre developer
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
The conversion works correctly for me too. I would try: reboot your computer, uninstall calibre, reboot, reinstall calibre, reboot, try converting.

Quote:
Originally Posted by getajob
I find the "initial parse failed" error message worrying but I cannot see a cause for this.
Replace <hr> with <hr />. The error can be ignored as calibre will do the replacement later on during the conversion. But you can make that change to prevent it entirely.
Old 03-06-2011, 09:29 PM   #4
getajob
Junior Member
Posts: 7
Karma: 10
Join Date: Oct 2010
Location: Australia
Device: Kindle 3, iPhone 3G, iPad 2 (on order)
Quote:
Originally Posted by user_none View Post
The conversion works correctly for me too. I would try: reboot your computer, uninstall calibre, reboot, reinstall calibre, reboot, try converting.

Replace <hr> with <hr />. The error can be ignored as calibre will do the replacement later on during the conversion. But you can make that change to prevent it entirely.
Thanks for your suggestions. I am glad to hear that 0.7.48 is working for you.

I had already re-installed Calibre and re-booted but just to be sure I did this again (Uninstall Calibre, re-boot, install Calibre, re-boot).

I have changed the <hr> to <hr /> but I still get no joy whatsoever.

Here is my new source file:
Spoiler:

h1. Header 1

p(#fn1r). Here’s a link[1] which should jump to the end footnote.

h2. Header 2

The first *Robin Hobb* trilogy, the _Farseer Trilogy,_ took place in the ??Six Duchies??. It is the tale of +FitzChivalry+ Farseer.

p=. !E:/_BOOKS/Images/00004.jpg!

The first Robin Hobb trilogy, the Farseer Trilogy, took place in the Six Duchies.

h3. Header 3

Ebañy had the book appliquéd with a dragon, which was a façadè.

Now some ^superscript^ followed by ~subscript~ and back to normal.

@This should be in Code format.@
@To see what mono font looks like.@

pre.
There was a man from hither,
Who, when he began to shiver,
He gave a cough,
His leg dropped off,
And floated down the river.


* Bullet 1
* Bullet 2
* Bullet 3

# Numbered 1
# Numbered 2
# Numbered 3

And now here follows a horizontal rule

<hr />

fn1. A footnote is here, which should jump back to first paragraph link.

When selecting here. "RETURN":#fn1r

p<. Left ??justified??

p=. Center *justified*

p>. Right _justified_

p<>. _*This should be fully justified and in bold and italics. This should be fully justified and in bold and italics. This should be fully justified and in bold and italics.*_

Here is my Calibre job log:
Spoiler:

Convert book 1 of 1 (Textile sample conversion)
Resolved conversion options
calibre version: 0.7.48
{'asciiize': False,
'author_sort': None,
'authors': None,
'base_font_size': 0.0,
'book_producer': None,
'change_justification': u'original',
'chapter': u"//*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part\\s+', 'i')) or @class = 'chapter']",
'chapter_mark': u'pagebreak',
'comments': None,
'cover': None,
'debug_pipeline': u'D:/Calibre Library/debug',
'dehyphenate': True,
'delete_blank_paragraphs': True,
'disable_font_rescaling': False,
'dont_split_on_page_breaks': False,
'enable_heuristics': False,
'epub_flatten': False,
'extra_css': None,
'extract_to': None,
'fix_indents': True,
'flow_size': 260,
'font_size_mapping': None,
'format_scene_breaks': True,
'formatting_type': u'textile',
'html_unwrap_factor': 0.4,
'input_encoding': None,
'input_profile': <calibre.customize.profiles.InputProfile object at 0x045DC910>,
'insert_blank_line': False,
'insert_metadata': False,
'isbn': None,
'italicize_common_cases': True,
'keep_ligatures': False,
'language': None,
'level1_toc': None,
'level2_toc': None,
'level3_toc': None,
'line_height': 0.0,
'linearize_tables': False,
'margin_bottom': 5.0,
'margin_left': 5.0,
'margin_right': 5.0,
'margin_top': 5.0,
'markdown_disable_toc': False,
'markup_chapter_headings': True,
'max_toc_links': 50,
'minimum_line_height': 120.0,
'no_chapters_in_toc': False,
'no_default_epub_cover': False,
'no_inline_navbars': False,
'no_svg_cover': False,
'output_profile': <calibre.customize.profiles.OutputProfile object at 0x045DCAF0>,
'page_breaks_before': u"//*[name()='h1' or name()='h2']",
'paragraph_type': u'off',
'prefer_metadata_cover': False,
'preserve_cover_aspect_ratio': False,
'preserve_spaces': False,
'pretty_print': True,
'pubdate': None,
'publisher': None,
'rating': None,
'read_metadata_from_opf': 'c:\\docume~1\\johnbr~1\\locals~1\\temp\\calibre_0.7.48_tmp_vxqywt\\calibre_0.7.48_8cixhz.opf',
'remove_first_image': False,
'remove_paragraph_spacing': True,
'remove_paragraph_spacing_indent_size': 1.5,
'renumber_headings': True,
'replace_scene_breaks': u'',
'series': None,
'series_index': None,
'smarten_punctuation': False,
'sr1_replace': None,
'sr1_search': None,
'sr2_replace': None,
'sr2_search': None,
'sr3_replace': None,
'sr3_search': None,
'tags': None,
'timestamp': None,
'title': None,
'title_sort': None,
'toc_filter': None,
'toc_threshold': 6,
'txt_in_remove_indents': False,
'unwrap_lines': True,
'use_auto_toc': False,
'verbose': 2}
InputFormatPlugin: TXT Input running
on D:\Calibre Library\Unknown\Textile sample conversion (161)\Textile sample conversion - Unknown.txt
Reading text from file...
Detected input encoding as windows-1252 with a confidence of 50.0%
Running text through textile conversion...
Language not specified
Creator not specified
Building file list...
Found files...
HTMLFile:0:a:D:\Calibre Library\Unknown\Textile sample conversion (161)\index.html
Normalizing filename cases
Rewriting HTML links
Parsing index.html ...
Initial parse failed:
Traceback (most recent call last):
File "site-packages\calibre\ebooks\oeb\base.py", line 881, in first_pass
File "lxml.etree.pyx", line 2532, in lxml.etree.fromstring (src/lxml/lxml.etree.c:48634)
File "parser.pxi", line 1545, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:72245)
File "parser.pxi", line 1417, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:71041)
File "parser.pxi", line 898, in lxml.etree._BaseParser._parseUnicodeDoc (src/lxml/lxml.etree.c:67581)
File "parser.pxi", line 539, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:64257)
File "parser.pxi", line 625, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:65178)
File "parser.pxi", line 565, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:64521)
XMLSyntaxError: PCDATA invalid Char value 25, line 4, column 53

Parsing file 'index.html' as HTML
Forcing index.html into XHTML namespace
Input debug saved to: D:\Calibre Library\debug\input
Parsed HTML written to: D:\Calibre Library\debug\parsed
Merging user specified metadata...
Detecting structure...
Auto generated TOC with 0 entries.
Structured HTML written to: D:\Calibre Library\debug\structure
Flattening CSS and remapping font sizes...
Source base font size is 12.00000pt
Cleaning up manifest...
Trimming unused files from manifest...
Parsing stylesheet.css ...
Processed HTML written to: D:\Calibre Library\debug\processed
Creating EPUB Output...
Looking for large trees in index.html...
No large trees found
Generating default cover
This EPUB file has no Table of Contents. Creating a default TOC
EPUB output written to c:\docume~1\johnbr~1\locals~1\temp\calibre_0.7.48_tmp_vxqywt\calibre_0.7.48_tbjctt.epub

Here is a link to the epub file that gets output http://dl.dropbox.com/u/18750031/Tex...20Unknown.epub

Although the job log says
Code:
Running text through textile conversion...
there is no evidence that this actually happened.

There are two .pyo files in C:\Program Files\Calibre2\Lib\site-packages\calibre\ebooks\textile so I seem to have the textile python executables installed.

Do you have any other ideas of what else I can check?

As you can see, I ran this with debug on. John
Old 03-06-2011, 11:30 PM   #5
DoctorOhh
US Navy, Retired
Posts: 9,864
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
Quote:
Originally Posted by getajob View Post
Thanks for your suggestions. I am glad to hear that 0.7.48 is working for you.

Do you have any other ideas of what else I can check?
I have one idea. Quit calibre, rename the configuration directory, restart calibre. This will force calibre to create a brand new configuration folder. If something in your configuration folder is corrupt this might fix it.

The epub you attached looked very close to what I got when I left both TXT input settings on auto.

My experiment:

I have never used textile. I took your source and added it to calibre. I converted to ePub using paragraph style - Off, Formatting style - Textile.

I attached the resultant epub, which looks great except I didn't have the image. I also attached the txt file I used for the source. I left all of my default settings alone, hopefully they haven't skewed it too much.

In case it might help, here is my job details info.

Spoiler:

Code:
Convert book 1 of 1 (textile)
 Resolved conversion options
 calibre version: 0.7.48
 {'asciiize': False,
  'author_sort': None,
  'authors': None,
  'base_font_size': 16.0,
  'book_producer': None,
  'change_justification': u'original',
  'chapter': u"//*[((name()='h1' or name()='h2') and re:test(., 'introduction|prologue|epilogue|chapter|book|section|conclusion|part\\s+', 'i')) or @class = 'chapter']",
  'chapter_mark': u'none',
  'comments': None,
  'cover': None,
  'debug_pipeline': None,
  'dehyphenate': False,
  'delete_blank_paragraphs': False,
  'disable_font_rescaling': False,
  'dont_split_on_page_breaks': False,
  'enable_heuristics': False,
  'epub_flatten': False,
  'extra_css': u'body { margin: 0 0; padding: 0em 0em; }\n\np {margin-top:0.5em; margin-bottom:0.5em; text-indent:1.1em}\n\nh1+p, h2+p, h3+p, p.whitespace+p, p.softbreak+p {margin-top:0.1em; margin-bottom:0.3em; text-indent:0%}',
  'extract_to': None,
  'fix_indents': False,
  'flow_size': 260,
  'font_size_mapping': u'16,16,16,16,17.5,17.5,18,18',
  'format_scene_breaks': True,
  'formatting_type': u'textile',
  'html_unwrap_factor': 0.4,
  'input_encoding': None,
  'input_profile': <calibre.customize.profiles.SonyReaderInput object at 0x0444AC90>,
  'insert_blank_line': True,
  'insert_metadata': False,
  'isbn': None,
  'italicize_common_cases': False,
  'keep_ligatures': True,
  'language': None,
  'level1_toc': None,
  'level2_toc': None,
  'level3_toc': None,
  'line_height': 0.0,
  'linearize_tables': False,
  'margin_bottom': 5.0,
  'margin_left': 5.0,
  'margin_right': 5.0,
  'margin_top': 5.0,
  'markdown_disable_toc': False,
  'markup_chapter_headings': False,
  'max_toc_links': 50,
  'minimum_line_height': 120.0,
  'no_chapters_in_toc': False,
  'no_default_epub_cover': False,
  'no_inline_navbars': False,
  'no_svg_cover': False,
  'output_profile': <calibre.customize.profiles.SonyReaderOutput object at 0x0444AF90>,
  'page_breaks_before': u'//h:h1',
  'paragraph_type': u'off',
  'prefer_metadata_cover': False,
  'preserve_cover_aspect_ratio': False,
  'preserve_spaces': False,
  'pretty_print': True,
  'pubdate': None,
  'publisher': None,
  'rating': None,
  'read_metadata_from_opf': 'C:\\Calibre_temp\\calibre_0.7.48_tmp_oghngz\\calibre_0.7.48_ckbwpv.opf',
  'remove_first_image': False,
  'remove_paragraph_spacing': True,
  'remove_paragraph_spacing_indent_size': 1.1,
  'renumber_headings': False,
  'replace_scene_breaks': u'',
  'series': None,
  'series_index': None,
  'smarten_punctuation': True,
  'sr1_replace': None,
  'sr1_search': None,
  'sr2_replace': None,
  'sr2_search': None,
  'sr3_replace': None,
  'sr3_search': None,
  'tags': None,
  'timestamp': None,
  'title': None,
  'title_sort': None,
  'toc_filter': None,
  'toc_threshold': 6,
  'txt_in_remove_indents': False,
  'unwrap_lines': False,
  'use_auto_toc': False,
  'verbose': 2}
 InputFormatPlugin: TXT Input running
 on C:\My Dropbox\CalibreLibrary\textile\textile (8108)\textile - textile.txt
 Reading text from file...
 Detected input encoding as ISO-8859-2 with a confidence of 83.5262045228%
 Running text through textile conversion...
 Language not specified
 Building file list...
     Found files...
          HTMLFile:0:a:C:\My Dropbox\CalibreLibrary\textile\textile (8108)\index.html
 Normalizing filename cases
 Rewriting HTML links
 Parsing index.html ...
 Forcing index.html into XHTML namespace
 Merging user specified metadata...
 Detecting structure...
 Auto generated TOC with 2 entries.
 Flattening CSS and remapping font sizes...
 Source base font size is 12.00000pt
 Cleaning up manifest...
 Trimming unused files from manifest...
 Parsing stylesheet.css ...
 Creating EPUB Output...
         Splitting on page-break
     Looking for large trees in index.html...
     No large trees found
 Generating default cover
 EPUB output written to C:\Calibre_temp\calibre_0.7.48_tmp_oghngz\calibre_0.7.48_brpjix.epub


Good Luck.
Attached Files
File Type: epub textile.epub (88.8 KB, 315 views)
File Type: txt textile.txt (1.2 KB, 207 views)
Old 03-07-2011, 05:44 AM   #6
Perkin
Guru
Posts: 655
Karma: 64171
Join Date: Sep 2010
Location: Kent, England, Sol 3, ZZ9 plural Z Alpha
Device: Sony PRS-300, Kobo Aura HD, iPad (Marvin)
dwanthny, that comes out nearly all correct. The only problem is the pre section, which is missing spaces (they are also lost in the posts above). It should be:
Code:
pre. 
There   was   a   man   from   hither,
  Who,  when  he  began  to  shiver,
    He     gave     a     cough,
      His   leg  dropped  off,
   And  floated  down  the  river.
There's a space after "pre." (its absence is why the pre tag isn't converted), and several extra spaces in the text (to give a centred limerick).

Another problem is some of the accented characters, which is just down to the encoding being different.

@getajob
Have you tried setting the 'Input character encoding' under Look & Feel in the conversion dialog? Try converting with utf-8, and if that doesn't work, try again with cp1252.

I had a similar problem with a version just after Textile was introduced, but it was the character encoding that was causing it.

Edit:
If I remember, I was converting with cp1252 but the file was utf-8.
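
The mismatch described here (file saved as utf-8, converted as cp1252) can be reproduced outside calibre. This is a rough sketch in plain Python; the function name misdecode is made up for illustration and nothing here is calibre code.

```python
# Sketch only: what accented text looks like when UTF-8 bytes are
# decoded with the wrong codec (cp1252 here).


def misdecode(text: str, actual: str = "utf-8", assumed: str = "cp1252") -> str:
    """Encode text with its real encoding, then decode with the assumed one."""
    return text.encode(actual).decode(assumed, errors="replace")


# Each accented letter is two bytes in UTF-8, so it shows up as two
# unrelated characters when those bytes are read as cp1252.
garbled = misdecode("appliqu\u00e9")

if __name__ == "__main__":
    print(garbled)
```

Plain ASCII text survives this kind of mismatch untouched, which is why only the accented characters look wrong.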

Last edited by Perkin; 03-07-2011 at 05:47 AM.
Old 03-07-2011, 07:15 AM   #7
getajob
Junior Member
Posts: 7
Karma: 10
Join Date: Oct 2010
Location: Australia
Device: Kindle 3, iPhone 3G, iPad 2 (on order)
Quote:
Originally Posted by dwanthny View Post
I have one idea. Quit calibre, rename the configuration directory, restart calibre. This will force calibre to create a brand new configuration folder. If something in your configuration folder is corrupt this might fix it.
dwanthny,

Good idea. I renamed my configuration folder and also deleted and reinstalled Calibre. I changed (only) the TXT input processing to
Paragraph style: off
Formatting style: textile

Otherwise it is a vanilla Calibre installation now.

Disappointingly I get exactly the same non-textile-processed result.

I tried a fresh install on a Windows XP desktop that has never had Calibre on it before. I get the same non-textile-processed result yet again.

I then tried installing on a new Windows 7 Pro laptop. Still no go - and I was confident that this would work for sure.
Quote:
Originally Posted by Perkin View Post
@getajob
Have you tried setting the 'Input character encoding' under Look & Feel in the conversion dialog? Try converting with utf-8, and if that doesn't work, try again with cp1252.

I had a similar problem with a version just after Textile was introduced, but it was the character encoding that was causing it.

Edit:
If I remember, I was converting with cp1252 but the file was utf-8.
Perkin,

Good idea. I changed Input character encoding to 'utf-8' in Look & Feel, but all this seemed to do was put two black-diamond question marks before the 'h1. Header 1', which was still unprocessed... (see attached result below)

Is there any easy way to determine what your input character encoding actually is?

The only markup that is working is the '<hr />' - maybe this is a clue...

Thanks for your help so far. Any more things I can try?
Attached Files
File Type: epub Textile sample conversion - Unknown.epub (93.0 KB, 256 views)
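
On the question above about finding out what encoding a text file really uses: one quick check is to look for a byte order mark (BOM) at the start of the file. This is a minimal sketch in plain Python, not part of calibre. The names BOMS, sniff_bom and sniff_file are made up for illustration, and a file without a BOM simply returns None, so this is a hint rather than a full detector.

```python
import codecs

# Sketch only: guess an encoding from the byte order mark, if there is one.
# UTF-32 LE must be tested before UTF-16 LE because its BOM starts with
# the same two bytes.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(data: bytes):
    """Return a codec name if data starts with a known BOM, else None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def sniff_file(path: str):
    """Read the first few bytes of a file and check them for a BOM."""
    with open(path, "rb") as handle:
        return sniff_bom(handle.read(4))


# The UTF-16 LE BOM is two bytes that are both invalid in UTF-8, so a
# UTF-8 decode with replacement yields two U+FFFD replacement characters.
two_marks = codecs.BOM_UTF16_LE.decode("utf-8", errors="replace")

if __name__ == "__main__":
    print(sniff_bom("h1. Header 1".encode("utf-16")))
    print(len(two_marks))
```

The last part of the sketch would account for the two black-diamond question marks in front of 'h1. Header 1' when utf-8 is forced on a file that actually starts with a UTF-16 BOM.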
Old 03-07-2011, 07:23 AM   #8
DoctorOhh
US Navy, Retired
Posts: 9,864
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
Quote:
Originally Posted by getajob View Post
Good idea. I renamed my configuration folder and also deleted and reinstalled Calibre. I changed (only) the TXT input processing to
Paragraph style: off
Formatting style: textile
How about attaching the exact text file you are adding to calibre.
Old 03-07-2011, 07:31 AM   #9
user_none
Sigil & calibre developer
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
Also, can you turn on debug output, then zip up and attach the debug output folder? I'm not sure how to do that with the GUI... On the command line you would use the --debug switch.
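
For anyone who wants to try the command-line route, here is a rough sketch that builds an ebook-convert command from Python. It assumes the command-line option names mirror the keys shown in the job logs in this thread (input_encoding, formatting_type, paragraph_type, debug_pipeline) with dashes instead of underscores; check ebook-convert --help on your own version before relying on it. The function names and the file names in the example are made up for illustration.

```python
import subprocess

# Sketch only: option names are assumed from the job-log keys in this thread.


def build_convert_command(src, dst, debug_dir, encoding="utf-8"):
    """Return the argument list for a Textile TXT conversion with debug output."""
    return [
        "ebook-convert", src, dst,
        "--input-encoding", encoding,
        "--formatting-type", "textile",
        "--paragraph-type", "off",
        "--debug-pipeline", debug_dir,
    ]


def run_conversion(src, dst, debug_dir, encoding="utf-8"):
    """Run the conversion; needs calibre's ebook-convert on the PATH."""
    return subprocess.run(
        build_convert_command(src, dst, debug_dir, encoding), check=True
    )


if __name__ == "__main__":
    # Only print the command here, so the sketch runs without calibre installed.
    print(" ".join(build_convert_command("sample.txt", "sample.epub", "debug")))
```

Passing the arguments as a list avoids quoting problems with paths that contain spaces, such as "D:\Calibre Library".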
Old 03-07-2011, 08:19 AM   #10
getajob
Junior Member
Posts: 7
Karma: 10
Join Date: Oct 2010
Location: Australia
Device: Kindle 3, iPhone 3G, iPad 2 (on order)
OK. It's after midnight here so I'll give you my input file, the last job log & the debug directory. I am calling it a night... Thanks for your help.
Attached Files
File Type: txt Textile sample conversion - Unknown.txt (2.4 KB, 245 views)
File Type: txt Job Log.txt (4.9 KB, 284 views)
File Type: zip debug.zip (9.2 KB, 233 views)
Old 03-07-2011, 09:15 AM   #11
Perkin
Guru
Posts: 655
Karma: 64171
Join Date: Sep 2010
Location: Kent, England, Sol 3, ZZ9 plural Z Alpha
Device: Sony PRS-300, Kobo Aura HD, iPad (Marvin)
Your file didn't convert here.
My text editor (EditPad Pro) is saying that the encoding is Unicode-UTF-16 Little Endian, perhaps that's something to do with it?
Try copying the whole text and pasting it into Notepad, resave it, then add that file to calibre and try again.
I did that and it then converted properly.

What text editor are you using?

Edit:
If you convert your file with 'UTF-16' as the input character encoding, it also converts properly here.

Last edited by Perkin; 03-07-2011 at 09:27 AM.
Old 03-07-2011, 09:15 AM   #12
DoctorOhh
US Navy, Retired
Posts: 9,864
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
Quote:
Originally Posted by getajob View Post
OK. It's after midnight here so I'll give you my input file, the last job log & the debug directory. I am calling it a night... Thanks for your help.
The good news is when I use your supplied txt file I get the exact same results you do. Notepad++ says it is encoded UCS-2 Little Endian. After I save it as UTF-8 encoded I get the expected conversion.
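
The re-save step can also be done with a few lines of Python instead of a text editor. This is a minimal sketch that assumes the file really is UTF-16 with a BOM, as the editors report here. The function names and the output file name in the example are made up for illustration.

```python
# Sketch only: re-encode a UTF-16 text file as UTF-8.


def transcode_to_utf8(data: bytes, src_encoding: str = "utf-16") -> bytes:
    """Decode bytes from src_encoding and re-encode them as UTF-8.

    The "utf-16" codec reads the BOM to pick the byte order and drops it,
    so the result has no BOM.
    """
    return data.decode(src_encoding).encode("utf-8")


def resave_as_utf8(src_path: str, dst_path: str, src_encoding: str = "utf-16") -> None:
    """Read src_path, transcode it, and write the result to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(transcode_to_utf8(data, src_encoding))


if __name__ == "__main__":
    sample = "h1. Header 1\r\n\r\nHere\u2019s a link".encode("utf-16")
    print(transcode_to_utf8(sample))
```

Working on raw bytes keeps the original line endings intact, so the only thing that changes is the encoding.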
Old 03-07-2011, 05:43 PM   #13
getajob
Junior Member
Posts: 7
Karma: 10
Join Date: Oct 2010
Location: Australia
Device: Kindle 3, iPhone 3G, iPad 2 (on order)
Problem solved.

Thanks, people, you have solved my problem. (Thank God, it was driving me nuts...)
Quote:
Originally Posted by Perkin View Post
What text editor are you using?
I am using WordPad but at some stage of the journey I decided to 'Save As' in what WordPad calls Unicode.
Quote:
Originally Posted by Perkin View Post
If you convert your file with 'UTF-16' as the input character encoding, it also converts properly here.
You are absolutely correct. I tried setting the Input Character Encoding to 'Unicode' but that did not work. 'UTF-16' does not appear in the Input character encoding dropdown list and it never occurred to me to set it to 'UTF-16'.

WordPad has four 'Save As' options: ANSI, Unicode, Unicode big endian and UTF-8. Notepad is the same.

Using ANSI was giving me the annoying black-diamond-with-question-marks for the odd character so I changed to Unicode encoding.

SOLUTION:
Save your text in UTF-8 format using the 'Save As' dialog of WordPad or Notepad.
In Look & Feel, set Input character encoding to UTF-8.
In TXT Input, set Paragraph style to off and set Formatting style to textile.

If you have to use Unicode, then set Input character encoding to UTF-16

Using ANSI is not recommended since it will give you black-diamond-with-question-marks for the odd character (don't ask me why...)
Old 03-07-2011, 08:38 PM   #14
ldolse
Wizard
ldolse is an accomplished Snipe hunter.
 
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
You're mostly correct except for one part of your solution:

Quote:
Originally Posted by getajob View Post
If you have to use Unicode, then set Input character encoding to UTF-16

UTF-8 is Unicode, so there is no need to use UTF-16 ever. UTF-8 is basically the web and ebook standard for Unicode and is always the best file encoding to use. Just make sure your original file is saved as UTF-8.

Regarding your statement on ANSI, 'ANSI' shouldn't even really be called an encoding - ANSI really means 'encode this based on what country I live in, but make sure only people from the same country as me can read it'. Why Microsoft persists in defaulting all their products to ANSI I'll never understand, but it's the root cause of most people's encoding problems.


It probably wouldn't be terribly difficult to add support for reading the Unicode BOM at the beginning of the file so that Calibre can figure out UTF-8/16/32/LE/BE on its own...
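
To show what reading the BOM amounts to, here is a minimal Python sketch. It is not calibre's code; the function names and the UTF-8 default are assumptions made for illustration. It relies only on the BOM constants in the standard codecs module.

```python
import codecs

# Check longer BOMs first: the UTF-32 LE BOM begins with the same two
# bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def sniff_bom(raw, default="utf-8"):
    """Return (encoding, bom_length); bom_length is 0 when no BOM is found."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name, len(bom)
    return default, 0

def decode_with_bom(raw, default="utf-8"):
    """Decode raw bytes using the BOM if present, else the default encoding."""
    encoding, skip = sniff_bom(raw, default)
    return raw[skip:].decode(encoding)
```

A file with no BOM at all (which is how many editors save UTF-8) falls through to the default, so a BOM check can only ever be a first guess, not a complete detector.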

Last edited by ldolse; 03-07-2011 at 11:03 PM.
Old 03-07-2011, 09:00 PM   #15
user_none
Sigil & calibre developer
user_none ought to be getting tired of karma fortunes by now.

Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
Quote:
Originally Posted by ldolse View Post
It probably wouldn't be terribly difficult to add support for reading the Unicode BOM at the beginning of the file so that Calibre can figure out UTF-8/16/32/LE/BE on its own...
It's already supposed to...

All times are GMT -4. The time now is 12:14 AM.

