
MobileRead Forums > E-Book Software > Calibre

04-27-2010, 06:11 AM   #1
mike_bike_kite
Digitally confused
Posts: 469
Karma: 1468286
Join Date: Mar 2010
Location: London, UK
Device: KPW, K2i, Nexus 7 32gb, Kobo Mini
learning to convert docs

I'm a bit new at this and have been trying to convert a few PDF books to EPUB to read in FBReader on my old Nokia N800. The PDFs look fine on my computer and on my N800, but I wanted to learn to convert using Calibre. I know regexps etc., but I don't understand these XPath lines and can't see how they apply to non-HTML files.

Problems I'm having:
  • Partial lines seem to get treated as new paragraphs, so I get this:
    Quote:

    the passengers rushed to view the cosmic visitor that had fallen from

    the sky. But it was impossible to examine the burning hot meteorite

    in any detail. Later, when the meteorite cooled, it was trenched
  • Chapters aren't recognised. Obviously I need to enter some kind of format, but I'm not sure what. Chapters look like this:
    Quote:
    7. This is the new chapter
  • I tried copying an image from the web and pasting it into Calibre on the meta info page. It showed the image there but didn't seem to display it later. Perhaps I'm missing something?
  • I'm using the defaults for both input and output devices. Is this a good idea for a non-ereader device like an N800? It has a wide 800*600 screen (I think).
Thanks for any advice or useful links

Mike
04-29-2010, 07:52 PM   #2
mike_bike_kite
Digitally confused
Interestingly, I still get all sorts of issues when converting from PDF to TXT. My aim was to just grab the text and then do the formatting with an editor like vi. Strangely, the TXT has many odd artefacts, like double L's appearing as one L followed by a few strange graphic characters.

I do understand that PDFs are very poor as a container of text, but I thought I might be able to convert my PDF files to EPUB (or even just TXT) with the intention of picking a suitable ereader. I guess I'm stuck with getting one that can display the PDFs well.
04-29-2010, 08:42 PM   #3
kovidgoyal
creator of calibre
Posts: 25,787
Karma: 4998511
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
http://calibre-ebook.com/user_manual...ture-detection

http://calibre-ebook.com/user_manual...-pdf-documents

As for the double-l glyph, that's a bug, which won't be fixed until calibre's new PDF engine is done.
04-30-2010, 04:09 AM   #4
mike_bike_kite
Digitally confused
Yep - I'd read those pages. I also understand HTML and, to a lesser extent, XML. Problem is, I'm trying to write small bits of code in Calibre using a language I don't know (XPath) to process the contents of a file I can't see the contents of (PDF), and for some strange reason I seem to be having problems.

If I could just view the text then I could write a little program to stitch things back together. Are there converters that perhaps perform OCR on the PDF and just output the text?

Mike
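
For anyone wanting to try the "little program to stitch things back together" idea once the text is visible, here is a rough Python sketch. It is only a heuristic under stated assumptions: chapter headings are assumed to look like the sample above ("7. This is the new chapter"), and a block is assumed to be a paragraph fragment whenever the previous block does not end in sentence punctuation. The names `stitch_paragraphs`, `CHAPTER_RE` and `SENTENCE_END_RE` are made up for this example and are not part of calibre.

```python
import re

# Assumed heading shape, based on the sample "7. This is the new chapter".
CHAPTER_RE = re.compile(r"^\d+\.\s+\S")
# Punctuation that plausibly ends a sentence, optionally followed by a
# closing quote or bracket.
SENTENCE_END_RE = re.compile(r"[.!?:][\"')\]]*$")


def stitch_paragraphs(text):
    """Merge blank-line-separated blocks that look like paragraph fragments."""
    # Split on blank lines and collapse whitespace inside each block.
    blocks = [" ".join(b.split()) for b in re.split(r"\n\s*\n", text) if b.strip()]
    merged = []
    for block in blocks:
        can_join = (
            merged
            and not CHAPTER_RE.match(block)        # headings start a new block
            and not CHAPTER_RE.match(merged[-1])   # never glue text onto a heading
            and not SENTENCE_END_RE.search(merged[-1])
        )
        if can_join:
            merged[-1] = merged[-1] + " " + block
        else:
            merged.append(block)
    return merged


SAMPLE = """7. This is the new chapter

the passengers rushed to view the cosmic visitor that had fallen from

the sky. But it was impossible to examine the burning hot meteorite

in any detail. Later, when the meteorite cooled, it was trenched
"""

for paragraph in stitch_paragraphs(SAMPLE):
    print(paragraph)
    print()
```

On the sample this keeps the heading as its own block and joins the three fragments into one paragraph. It will misfire on a fragment that happens to end a sentence exactly at a line break, or on body text that starts with something like "1999. ", so the patterns would need tuning per book.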
04-30-2010, 06:44 AM   #5
kovidgoyal
creator of calibre
Read this: http://calibre-ebook.com/user_manual...rsion.html#id7

In particular, see the section on the debug option, which will allow you access to the text in the intermediate stages of conversion.
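
As a rough illustration of the debug option from the command line, the sketch below builds an `ebook-convert` command that uses calibre's `--debug-pipeline` option to dump the intermediate stages into a folder, plus the `--chapter` option, which takes an XPath expression for chapter detection. The helper name `build_convert_command`, the file names, and the sample XPath are assumptions made for this example. The XPath in particular is an untested guess at matching paragraphs that start like "7. This is the new chapter" and would need checking against the intermediate HTML.

```python
def build_convert_command(src, dst, debug_dir=None, chapter_xpath=None):
    """Return an argv list for calibre's ebook-convert command-line tool."""
    cmd = ["ebook-convert", src, dst]
    if debug_dir is not None:
        # Save the output of the intermediate conversion stages in debug_dir.
        cmd += ["--debug-pipeline", debug_dir]
    if chapter_xpath is not None:
        # XPath expression used to detect chapter titles.
        cmd += ["--chapter", chapter_xpath]
    return cmd


# Untested guess: paragraphs whose text starts with a number, a dot and a
# title. Check it against the HTML that appears in the debug folder.
GUESSED_CHAPTER_XPATH = r"//h:p[re:test(., '^\s*\d+\.\s+\S')]"

cmd = build_convert_command(
    "book.pdf",   # hypothetical input file
    "book.epub",  # hypothetical output file
    debug_dir="debug",
    chapter_xpath=GUESSED_CHAPTER_XPATH,
)
print(" ".join(cmd))
```

With calibre installed, the list can be passed to `subprocess.run(cmd, check=True)`. The folder given to `--debug-pipeline` should then hold the intermediate output, which is where the text becomes visible for writing and testing expressions.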
All times are GMT -4.

