Old 09-16-2010, 12:53 PM   #796
arijon
Junior Member
arijon began at the beginning.
 
Posts: 6
Karma: 10
Join Date: Sep 2010
Device: iPad
They are OCRed PDFs. It took over 2 hours but it finally worked - the progress bar didn't appear to move but I think it's all set. Thanks!
Old 09-16-2010, 01:33 PM   #797
itimpi
Wizard
itimpi ought to be getting tired of karma fortunes by now.
 
Posts: 4,553
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
If they are OCR'ed PDF files it will be interesting to see if the converted file is any better than using the PDF directly.
Old 09-16-2010, 01:41 PM   #798
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by itimpi
If they are OCR'ed PDF files it will be interesting to see if the converted file is any better than using the PDF directly.
I have never seen a big pdf with decent OCR text. If it has OCR text and images of pages, the OCR has never been proofed and is lousy. If the OCR has been proofed, the page images have always been either removed, or cut down to show only the true graphics.
Old 09-17-2010, 11:48 AM   #799
moriakaice
Memento Mori
moriakaice began at the beginning.
 
Posts: 36
Karma: 10
Join Date: Apr 2007
Device: eClicto, iPad WiFi, Kindle 3 WiFi
Ok, I'll admit I haven't read through this topic, but... Can we have calibre output a correct <date> element in the content.opf in ePUB output? The specs expect it to be a 4-digit year, then an optional 2-digit month and then an optional 2-digit day (AKA YYYY[-MM[-DD]]).

So, instead of something like:
Code:
<dc:date>2010-09-15T22:00:00+00:00</dc:date>
Can we get:
Code:
<dc:date>2010-09-15</dc:date>
It should be even simpler and would make calibre ePUBs conform to the specs (at least some part of them).

The other thing is that epubcheck doesn't like the <br> tag just before the </body> that calibre adds. Maybe this could be fixed as well? Putting it in a <p> tag solves the problem.
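
For illustration, the date-only form requested above could be produced by post-processing the value. This is a minimal Python sketch, not calibre's own code: the function name `to_date_only` is hypothetical, and it keeps the calendar date exactly as stamped, without converting between time zones.

```python
from datetime import datetime


def to_date_only(value: str) -> str:
    """Reduce an ISO 8601 datetime string to its YYYY-MM-DD part.

    Hypothetical helper for illustration; the date is taken as written
    in the input, so no time-zone conversion is applied.
    """
    return datetime.fromisoformat(value).date().isoformat()


print(to_date_only("2010-09-15T22:00:00+00:00"))
```

Whether dropping the time is actually desirable depends on how the spec is read, so treat this as a sketch of the transformation only.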
Old 09-17-2010, 11:52 AM   #800
pdurrant
The Grand Mouse 高貴的老鼠
pdurrant ought to be getting tired of karma fortunes by now.
 
Posts: 74,009
Karma: 315160596
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Oasis
I think that there's some disagreement about whether epubcheck is correct in this interpretation of the specification.

Quote:
Originally Posted by moriakaice
Ok, I'll admit I haven't read through this topic, but... Can we have calibre output a correct <date> element in the content.opf in ePUB output? The specs expect it to be a 4-digit year, then an optional 2-digit month and then an optional 2-digit day (AKA YYYY[-MM[-DD]]).
Old 09-17-2010, 12:00 PM   #801
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
 
Posts: 45,380
Karma: 27230406
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
epubcheck is wrong, and calibre doesn't add <br> tags. If you have a <br> tag in your output, it was there in your input.
Old 09-17-2010, 12:02 PM   #802
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by moriakaice
Can we have calibre output a correct <date> element in the content.opf in ePUB output? The specs expect it to be a 4-digit year, then an optional 2-digit month and then an optional 2-digit day (AKA YYYY[-MM[-DD]]).
I'm no expert, but I've seen Kovid quote the exact specs, and AFAICT, the above is wrong. Calibre is setting the date in accordance with the EPUB specs. Can you quote the specs that you think support your comment?
Old 09-17-2010, 12:15 PM   #803
moriakaice
Memento Mori
moriakaice began at the beginning.
 
Posts: 36
Karma: 10
Join Date: Apr 2007
Device: eClicto, iPad WiFi, Kindle 3 WiFi
Quote:
Originally Posted by kovidgoyal
epubcheck is wrong, and calibre doesn't add <br> tags. If you have a <br> tag in your output, it was there in your input.
So, calibre adds <br> tags when converting from RTF (as that was the source of my ePUB file)? That's strange!

As for the date:
http://www.idpf.org/doc_library/epub...m#Section2.2.7
Quote:
2.2.7: <date> </date>

Date of publication, in the format defined by "Date and Time Formats" at http://www.w3.org/TR/NOTE-datetime and by ISO 8601 on which it is based. In particular, dates without times are represented in the form YYYY[-MM[-DD]]: a required 4-digit year, an optional 2-digit month, and if the month is given, an optional 2-digit day of month.

The date element has one optional OPF event attribute. The set of values for event are not defined by this specification; possible values may include: creation, publication, and modification.
EDIT: Oh, I see my mistake about it supporting the full datetime. Sorry.
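
As the quoted section says, the value may follow any of the W3C "Date and Time Formats" profiles, so both the date-only and the full datetime forms are acceptable. A rough Python sketch of a syntax check for those forms follows; the pattern and the name `is_w3c_datetime` are my own illustration and are not taken from calibre or epubcheck. It checks shape only, not whether the month, day or hour values are in range.

```python
import re

# Accepted shapes: YYYY, YYYY-MM, YYYY-MM-DD, and YYYY-MM-DDThh:mm[:ss[.s]]TZD,
# where TZD is "Z" or a +hh:mm / -hh:mm offset.
W3C_DATETIME = re.compile(
    r"^\d{4}"
    r"(?:-\d{2}"
    r"(?:-\d{2}"
    r"(?:T\d{2}:\d{2}(?::\d{2}(?:\.\d+)?)?(?:Z|[+-]\d{2}:\d{2}))?"
    r")?)?$"
)


def is_w3c_datetime(value: str) -> bool:
    """Return True if value has the shape of a W3C NOTE-datetime string."""
    return W3C_DATETIME.match(value) is not None


for sample in ("2010", "2010-09-15", "2010-09-15T22:00:00+00:00", "15/09/2010"):
    print(sample, is_w3c_datetime(sample))
```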
Old 09-17-2010, 01:21 PM   #804
arijon
Junior Member
arijon began at the beginning.
 
Posts: 6
Karma: 10
Join Date: Sep 2010
Device: iPad
So I can read the epubs on my iPad but the OCR is gone as well as my bookmarks. Any simple way to make sure these are preserved from the PDF?
Old 09-19-2010, 02:33 PM   #805
meketrefi
Junior Member
meketrefi began at the beginning.
 
Posts: 2
Karma: 10
Join Date: Sep 2010
Device: none
Thanks for the excellent software, Goyal.

I am using Calibre to convert .MOBI books to .EPUB in order to extract the internal HTML, as there is no direct option for HTML as output.

The problem is, I am trying to avoid splitting, as I would like to get a single HTML file. I checked "do not split on page breaks" on Preferences>Output Options, and even configured "Split files larger than" to 20480KB to avoid splitting, with no success.

What would you recommend?

Thanks,

Eduardo
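
For anyone following the same MOBI-to-EPUB route: an EPUB is a ZIP archive, so the HTML files inside it can be listed with Python's standard `zipfile` module. This is only a sketch; the function name `html_members` and the file names in the sample are hypothetical, and it does not address the splitting question itself.

```python
import io
import zipfile

HTML_SUFFIXES = (".html", ".htm", ".xhtml")


def html_members(epub):
    """List the HTML/XHTML entries of an EPUB (a path or a file-like object)."""
    with zipfile.ZipFile(epub) as archive:
        return [name for name in archive.namelist()
                if name.lower().endswith(HTML_SUFFIXES)]


# Build a tiny stand-in archive in memory so the sketch runs without a real book.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as archive:
    archive.writestr("mimetype", "application/epub+zip")
    archive.writestr("content.opf", "<package/>")
    archive.writestr("index_split_000.html", "<html/>")
sample.seek(0)
print(html_members(sample))
```

With a real book, passing the path of the .epub file and then calling `ZipFile.extract` on each returned name would write the HTML files to disk.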
Old 09-19-2010, 02:42 PM   #806
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
 
Posts: 45,380
Karma: 27230406
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Read http://calibre-ebook.com/user_manual...l#introduction
Old 09-19-2010, 03:00 PM   #807
meketrefi
Junior Member
meketrefi began at the beginning.
 
Posts: 2
Karma: 10
Join Date: Sep 2010
Device: none
Goyal, you are my digital hero.

Thanks!

Eduardo
Old 09-19-2010, 04:31 PM   #808
JSWolf
Resident Curmudgeon
JSWolf ought to be getting tired of karma fortunes by now.
 
Posts: 79,796
Karma: 146391129
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
But the problem is that even if ePubCheck is wrong in this case, if ePubCheck spits out an error, some publishers won't accept the ePub as they think it is incorrect.
Old 09-30-2010, 01:13 PM   #809
Metamorphosis
Junior Member
Metamorphosis began at the beginning.
 
Posts: 4
Karma: 10
Join Date: Apr 2009
Device: Sony
I'm having a problem converting my pdf files to epub. Each time I try it fails. I'm not a programmer so I don't understand the code that is returned. I have version 7.2

I apologize if I have posted this in the wrong thread.

ERROR: Conversion Error: <b>Failed</b>: Convert book 1 of 1 (Dalton's Awakening)

Convert book 1 of 1 (Dalton's Awakening)
Resolved conversion options
calibre version: 0.7.20
{'asciiize': False,
'author_sort': None,
'authors': None,
'base_font_size': 0.0,
'book_producer': None,
'change_justification': u'original',
'chapter': u"//*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part\\s+', 'i')) or @class = 'chapter']",
'chapter_mark': u'pagebreak',
'comments': None,
'cover': 'c:\\users\\lori\\appdata\\local\\temp\\calibre_0.7.20_tmp_n3o853\\calibre_0.7.20_dkzjq_.jpeg',
'debug_pipeline': None,
'disable_font_rescaling': False,
'dont_split_on_page_breaks': False,
'extra_css': None,
'extract_to': None,
'flow_size': 260,
'font_size_mapping': None,
'footer_regex': u'(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)',
'header_regex': u'(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)',
'html_unwrap_factor': 0.40000000000000002,
'input_encoding': None,
'input_profile': <calibre.customize.profiles.InputProfile object at 0x046D85F0>,
'insert_blank_line': False,
'insert_metadata': False,
'isbn': None,
'keep_ligatures': False,
'language': None,
'level1_toc': None,
'level2_toc': None,
'level3_toc': None,
'line_height': 0.0,
'linearize_tables': False,
'margin_bottom': 5.0,
'margin_left': 5.0,
'margin_right': 5.0,
'margin_top': 5.0,
'max_toc_links': 50,
'new_pdf_engine': False,
'no_chapters_in_toc': False,
'no_default_epub_cover': False,
'no_images': False,
'no_inline_navbars': False,
'no_svg_cover': False,
'output_profile': <calibre.customize.profiles.SonyReaderOutput object at 0x046D8990>,
'page_breaks_before': u"//*[name()='h1' or name()='h2']",
'prefer_metadata_cover': False,
'preprocess_html': False,
'preserve_cover_aspect_ratio': False,
'pretty_print': True,
'pubdate': None,
'publisher': None,
'rating': None,
'read_metadata_from_opf': 'c:\\users\\lori\\appdata\\local\\temp\\calibre_0.7.20_tmp_n3o853\\calibre_0.7.20_lby4pu.opf',
'remove_first_image': False,
'remove_footer': False,
'remove_header': False,
'remove_paragraph_spacing': False,
'remove_paragraph_spacing_indent_size': 1.5,
'series': None,
'series_index': None,
'smarten_punctuation': False,
'tags': None,
'timestamp': None,
'title': None,
'title_sort': None,
'toc_filter': None,
'toc_threshold': 6,
'unwrap_factor': 0.0,
'use_auto_toc': False,
'verbose': 2}
InputFormatPlugin: PDF Input running
on C:\Users\Lori\Calibre\Carol Lynne\Dalton's Awakening (2724)\Dalton's Awakening - Carol Lynne.pdf
Converting file to html...
pdftohtml log:

Retrieving document metadata...
Error (42): Unknown filter 'Crypt'
Generating manifest...
Rendering manifest...
Parsing all content...
Parsing index.html ...
Failed to parse content in index.html
Traceback (most recent call last):
File "site-packages\calibre\ebooks\oeb\reader.py", line 159, in _manifest_prune_invalid
File "site-packages\calibre\ebooks\oeb\base.py", line 1060, in fget
File "site-packages\calibre\ebooks\oeb\base.py", line 789, in _parse_xhtml
File "site-packages\calibre\ebooks\conversion\preprocess.py", line 431, in __call__
UnboundLocalError: local variable 'length' referenced before assignment

Spine item 'id1' not found
Python function terminated unexpectedly
Spine is empty (Error Code: 1)
Traceback (most recent call last):
File "site.py", line 103, in main
File "site.py", line 85, in run_entry_point
File "site-packages\calibre\utils\ipc\worker.py", line 99, in main
File "site-packages\calibre\gui2\convert\gui_conversion.py", line 24, in gui_convert
File "site-packages\calibre\ebooks\conversion\plumber.py", line 841, in run
File "site-packages\calibre\ebooks\conversion\plumber.py", line 968, in create_oebbook
File "site-packages\calibre\ebooks\oeb\reader.py", line 72, in __call__
File "site-packages\calibre\ebooks\oeb\reader.py", line 594, in _all_from_opf
File "site-packages\calibre\ebooks\oeb\reader.py", line 289, in _spine_from_opf
calibre.ebooks.oeb.base.OEBError: Spine is empty
Old 09-30-2010, 01:17 PM   #810
Perkin
Guru
Perkin calls his or her ebook reader Vera.
 
Posts: 657
Karma: 64171
Join Date: Sep 2010
Location: Kent, England, Sol 3, ZZ9 plural Z Alpha
Device: Sony PRS-300, Kobo Aura HD, iPad (Marvin)
There's a problem with 7.20; either go back to 7.19 or wait for the updated 7.21.
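
For readers who are not programmers: the `UnboundLocalError` in the log above is what Python raises when a function reads a local variable that was never assigned on the path actually taken. The snippet below is a generic illustration of that class of bug, with made-up function names; it is not calibre's code and says nothing about what the real fix looks like.

```python
def last_line_length(lines):
    # 'length' is assigned only inside the loop, so an empty input never sets it.
    for line in lines:
        length = len(line)
    return length  # raises UnboundLocalError when 'lines' is empty


def safe_last_line_length(lines):
    # Giving the variable a default before the loop avoids the error.
    length = 0
    for line in lines:
        length = len(line)
    return length


try:
    last_line_length([])
except UnboundLocalError as err:
    print("failed:", err)

print(safe_last_line_length([]))
```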