Old 11-09-2009, 05:09 PM   #1
mediax
Readaholic
Posts: 249
Karma: 1058454
Join Date: Jul 2009
Location: Swindon, UK
Device: Sony PRS-T2 (previously 505 and 650)
pdf to epub conversion

I have a book scanned in pdf format that I'm looking to tidy up and convert to epub format. The pdf is text only and will require a fair amount of editing to correct errors.

This is my first attempt at this sort of thing, so I'd appreciate advice as to the best way of tackling the project. Should I convert to epub and do all the work in Sigil, or should I convert to text (or maybe HTML), do the donkey work of text editing and correction in OpenOffice and then convert to epub and use Sigil to sort out the formatting issues? Or is there a better way that I haven't thought of?

I'm running Linux, in case anyone has any suggestions for utilities.

Thanks in advance.
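
For the text-correction step described above, here is a minimal sketch in Python of one common cleanup on text extracted from a scanned PDF: rejoining words hyphenated across line breaks and merging hard-wrapped lines into paragraphs. The function name `unwrap_pdf_text` is made up for illustration, and the sketch assumes blank lines mark paragraph breaks. It will also wrongly join genuine compounds that happen to break at a line end (e.g. "two-" / "step"), so the result still needs proofreading.

```python
import re


def unwrap_pdf_text(text):
    """Join hard-wrapped lines into paragraphs (illustrative helper, not from any tool)."""
    # Rejoin words hyphenated across a line break: "conver-\nsion" -> "conversion".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Assumption: one or more blank lines separate paragraphs.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = []
    for paragraph in paragraphs:
        lines = [line.strip() for line in paragraph.splitlines() if line.strip()]
        if lines:
            cleaned.append(" ".join(lines))
    return "\n\n".join(cleaned)


if __name__ == "__main__":
    sample = "This is a scan-\nned page with\nhard line breaks.\n\nSecond para-\ngraph here.\n"
    print(unwrap_pdf_text(sample))
```

The cleaned text can then be pasted into OpenOffice for spellchecking before going to epub.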
Old 11-10-2009, 06:39 AM   #2
weedfreak
Addict
Posts: 302
Karma: 185297
Join Date: Sep 2009
Location: Ankh Morpork
Device: calibre
I have just completed a couple of similar projects. I converted to epub using Calibre, then did all the editing and formatting in Sigil.
The main difficulty was that Calibre does not create 'perfect' code. In one case there were 10k+ lines of CSS, because every paragraph was given a separate ID and needed its own CSS entry even though every entry was the same; other books were more 'normal'. Calibre also seems to scatter chapter breaks at random, and does not necessarily recognize chapter headings, but you will be editing in any case, so these are not major drawbacks.
Sigil is not, yet, an ideal editor due to things like the missing find / replace tools (being worked on though) and sometimes it is easy to get lost when switching between code view and book view. But the whole process was fairly painless apart from the one book with all the IDs. With that one I cut all the code from Sigil, pasted it into OpenOffice for mass find / replace editing, then cut and pasted it back into Sigil. Afterwards I decided that just cutting and pasting the text may have been simpler.
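
The per-paragraph CSS problem described above can also be handled with a small script instead of find / replace. The following is only a sketch in Python, assuming a flat stylesheet with no comments, `@media` blocks or other nesting; the function names are hypothetical. It groups selectors whose declarations are identical into one rule. Note that it sorts the declarations to compare them, which changes their order inside each rule.

```python
import re
from collections import OrderedDict

# Matches "selector { declarations }" in a flat stylesheet (no nested blocks).
RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")


def normalize(declarations):
    """Return declarations in a canonical form so equal rules compare equal."""
    parts = [p.strip() for p in declarations.split(";") if p.strip()]
    return "; ".join(sorted(re.sub(r"\s+", " ", p) for p in parts))


def merge_duplicate_rules(css):
    """Group selectors that share exactly the same declarations."""
    groups = OrderedDict()
    for selector, body in RULE_RE.findall(css):
        groups.setdefault(normalize(body), []).append(selector.strip())
    return "\n".join(
        f"{', '.join(selectors)} {{ {body} }}" for body, selectors in groups.items()
    )


if __name__ == "__main__":
    css = (
        "#p1 { margin: 0; text-indent: 1em }\n"
        "#p2 { text-indent: 1em; margin: 0; }\n"
        "h1 { font-weight: bold }"
    )
    print(merge_duplicate_rules(css))
```

This only shrinks the stylesheet; the IDs stay in the markup, so replacing them with a shared class would still be a separate find / replace job.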
Old 11-10-2009, 08:28 AM   #3
JohnnyD
Connoisseur
Posts: 94
Karma: 221
Join Date: Oct 2006
Location: Drenthe, The Netherlands
Device: Cybook Gen3 (cracked screen)/Bebook/Nokia E60/Nokia 5800
Probably the best free tool for converting PDFs is Mobipocket Creator (in my opinion). It is meant to create Mobipocket files, but it will give you a set of intermediate files (HTML files) which are just perfect to use as base material for editing.

It will not only convert your pdf text to html, but will also preserve text properties like italics or bold. I'm not really sure if it will preserve font types, however.

In addition, it will preserve the images, if there are any, in their proper place in the text.
Old 11-10-2009, 08:29 AM   #4
mediax
Quote:
Originally Posted by weedfreak View Post
Sigil is not, yet, an ideal editor due to things like the missing find / replace tools (being worked on though)
That and the ability to spellcheck in OpenOffice are the main issues that are making me consider going the two-step route. The idea of cutting and pasting between Sigil and OpenOffice as necessary could swing me to going straight to Sigil, though.

Thanks for the input
Old 11-10-2009, 08:39 AM   #5
mediax
Quote:
Originally Posted by JohnnyD View Post
Probably the best free tool for converting PDFs is Mobipocket Creator (in my opinion).
Is there a Linux version of Mobipocket Creator? The system requirements I found on the Mobipocket site were only for Windows.

I already use Calibre for library management and file conversion, so I was planning on using that. That said, alternatives are always good!
Old 11-10-2009, 09:07 AM   #6
Valloric
Created Sigil, FlightCrew
Posts: 1,978
Karma: 350515
Join Date: Feb 2008
Device: Sony Reader PRS 505
Quote:
Originally Posted by mediax View Post
That and the ability to spellcheck in OpenOffice are the main issues that are making me consider going the two-step route.
I'm actually thinking of bumping the spellcheck feature to just after the 0.2.0 redesign and RTF import.
Old 11-10-2009, 09:27 AM   #7
mediax
Quote:
Originally Posted by Valloric View Post
I'm actually thinking of bumping the spellcheck feature to just after the 0.2.0 redesign and RTF import.
As a totally new user, I'm not yet in any position to comment on the relative priorities of features, but Search & Replace and Spellcheck are two features I would imagine get used in virtually every editing (as opposed to format conversion) exercise.
Old 11-10-2009, 09:51 AM   #8
Valloric
Quote:
Originally Posted by mediax View Post
As a totally new user, I'm not yet in any position to comment on the relative priorities of features, but Search & Replace and Spellcheck are two features I would imagine get used in virtually every editing (as opposed to format conversion) exercise.
I would certainly agree. Search & Replace, for instance, is being worked on right now.

Spellchecking is a new feature. It's absolutely necessary, as you've noticed. But Sigil has some things broken, and bug fixes are almost always more important than new features.

The redesign is there to fix performance problems and problems with per-flow CSS. The RTF import is important, but not as important as what will come with it: importing functions will now be loadable plugins with a (hopefully) consistent interface. Other people will then be able to write their own plugins. Designing new importers will also be easier, so future work on importing Mobi, LIT etc will benefit from this.

But we will see. If spellchecking turns out to be the next killer feature everyone wants (after S&R), then it could even come before the RTF import and general importing rewrite.
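
To make the "loadable plugins with a consistent interface" idea above concrete, here is a rough sketch in Python. It is not Sigil's actual design or API (Sigil itself is not written in Python); every name here (`Importer`, `PlainTextImporter`, `REGISTRY`, `register`, `import_file`) is hypothetical. The assumption is that each importer declares the file extensions it handles and returns the book content as an XHTML string, so the editor never needs to know about individual formats.

```python
import html
from abc import ABC, abstractmethod
from pathlib import Path


class Importer(ABC):
    """Hypothetical plugin interface: one class per input format."""

    extensions: tuple = ()  # file suffixes this importer handles, e.g. (".rtf",)

    @abstractmethod
    def load(self, path: Path) -> str:
        """Return the book content as an XHTML fragment."""


class PlainTextImporter(Importer):
    """Example plugin: turns blank-line-separated text into <p> elements."""

    extensions = (".txt",)

    def load(self, path: Path) -> str:
        text = path.read_text(encoding="utf-8")
        paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
        return "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)


REGISTRY = {}  # maps a lowercase suffix to the importer instance that handles it


def register(importer: Importer) -> None:
    """Make an importer available for each extension it declares."""
    for ext in importer.extensions:
        REGISTRY[ext.lower()] = importer


def import_file(path) -> str:
    """Dispatch to whichever registered importer claims the file's suffix."""
    path = Path(path)
    try:
        importer = REGISTRY[path.suffix.lower()]
    except KeyError:
        raise ValueError(f"no importer registered for {path.suffix!r}") from None
    return importer.load(path)


register(PlainTextImporter())
```

Under this kind of scheme, a third-party Mobi or LIT importer would only need to subclass `Importer` and call `register`, without touching the editing code.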
Old 11-10-2009, 04:12 PM   #9
kjk
.
Posts: 3,408
Karma: 5647231
Join Date: Oct 2008
Device: never enough
Quote:
Originally Posted by Valloric View Post
But we will see. If spellchecking turns out to be the next killer feature everyone wants (after S&R), then it could even come before the RTF import and general importing rewrite.
Count me in as a vote for spell check/S&R/ any editing type enhancements/performance enhancements/etc... before any importing stuff, including RTF.

For me, getting stuff to ePub is the easy part... it's everything else that I need Sigil for.
Old 11-11-2009, 04:26 AM   #10
Slite
Icanhasdonuts?
Posts: 2,835
Karma: 532407
Join Date: Aug 2008
Location: Bålsta, Sweden
Device: BeBook x 2 (Hanlin V3 rebrand), Samsung Omnia
Quote:
Originally Posted by kjk View Post
Count me in as a vote for spell check/S&R/ any editing type enhancements/performance enhancements/etc... before any importing stuff, including RTF.

For me, getting stuff to ePub is the easy part... it's everything else that I need Sigil for.
I disagree, strongly

Not on S&R, that should be implemented asap, but I think it would be more productive to get an api for import plugins in place before a spellchecker.
Old 11-15-2009, 01:37 AM   #11
darkmonk
Connoisseur
Posts: 58
Karma: 12
Join Date: Jan 2009
Device: none
Quote:
Originally Posted by Slite View Post
I disagree, strongly

Not on S&R, that should be implemented asap, but I think it would be more productive to get an api for import plugins in place before a spellchecker.
I don't. Calibre does an insanely good job with a million formats, and asking to reinvent the wheel seems silly... Remember the UNIX philosophy!

Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface
Old 11-15-2009, 04:37 PM   #12
Valloric
Quote:
Originally Posted by darkmonk View Post
I don't. Calibre does an insanely good job with a million formats, and asking to reinvent the wheel seems silly... Remember the UNIX philosophy!

Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface
To tell you the truth, I don't necessarily subscribe to the UNIX philosophy.

It works OK for UNIX, which ships with all these different userland tools that you can chain together in interesting and useful ways, but that's hardly the case for Windows... which at last count represents ~92% of the market and the vast majority of Sigil's users, too.

Requiring users to have calibre installed is not something I'm aiming for. It's a wonderful application, to be sure, but Sigil should be able to stand on its own as an ebook editor, which means being able to import various ebook formats.

It's just something you'd expect of an ebook editor, now wouldn't you?
Old 11-16-2009, 05:26 PM   #13
mediax
Quote:
Originally Posted by Valloric View Post
It's just something you'd expect of an ebook editor, now wouldn't you?
I personally feel that depends on whether your intention is to create a multi-format editor, or an epub editor.

In the former case, you'd expect the editor not only to read, but also to write, multiple formats; in the latter case, there's a valid argument to the effect that the application's task is to edit a specified format.

Mind you, if you are planning to build a multi-format editor, it would be one hell of a valuable tool.
Old 11-19-2009, 01:19 PM   #14
Kivgaen
Enthusiast
Posts: 38
Karma: 608
Join Date: Aug 2009
Location: Toronto
Device: all devices
speaking as a publisher...

Quote:
Originally Posted by Valloric View Post

Requiring users to have calibre installed is not something I'm aiming for. It's a wonderful application, to be sure, but Sigil should be able to stand on its own as an ebook editor, which means being able to import various ebook formats.

It's just something you'd expect of an ebook editor, now wouldn't you?
I do not disagree with anything that you have said: being able to import various ebook formats IS important, and should most definitely be one of the high-priority items on your development list. But I too would not put it at the top of the list above S/R.

But that may be because my goals might be different from yours. If your ultimate goal is for the public to be able to load up their ebook from whatever format it is that they are using, fix problems with it, and then spit it back out so that it looks prettier, then I would agree with you.

But if your goal is more geared towards publishers -- if you want your epub file editor to be more useful RIGHT NOW to publishers who are struggling to get all of their files ready for e-Readers, then those publishers will use Calibre as a secondary tool if they need to.

Publishers don't want what we can already get elsewhere (features in Calibre), we want what doesn't exist yet -- and that's a GOOD way to edit .epub files!
Old 11-19-2009, 04:09 PM   #15
Valloric
Quote:
Originally Posted by Kivgaen View Post
I too would not put it at the top of the list above S/R.
Search&Replace is (FINALLY) nearing completion and a new version of Sigil with it should hopefully be released in a few days.

Quote:
Originally Posted by Kivgaen View Post
But if your goal is more geared towards publishers -- if you want your epub file editor to be more useful RIGHT NOW to publishers who are struggling to get all of their files ready for e-Readers, then those publishers will use Calibre as a secondary tool if they need to.

Publishers don't want what we can already get elsewhere (features in Calibre), we want what doesn't exist yet -- and that's a GOOD way to edit .epub files!
Point taken. While I do care a lot about the people doing professional work with Sigil (or at least trying to), I have to consider the people who edit ebooks for personal use.

But you're right. Improving the WYSIWYG/general editing experience should come first. And it does. The redesign is all about improved editing. I even got sidetracked with the current release (<sigh>) and implemented more than a few performance enhancements.

This is also one of the reasons why I'll be spending time on creating a plugin framework for the importers after v0.2.0: I'll be able to decouple the importing functionality from the editing, and hopefully others will want to contribute their own plugins, even independently of Sigil's main development branch.