#1
Readaholic
Posts: 255
Karma: 1058454
Join Date: Jul 2009
Location: Swindon, UK
Device: Sony PRS-T2 (previously 505 and 650)
|
pdf to epub conversion
I have a book scanned in pdf format that I'm looking to tidy up and convert to epub format. The pdf is text only and will require a fair amount of editing to correct errors.
This is my first attempt at this sort of thing, so I'd appreciate advice as to the best way of tackling the project. Should I convert to epub and do all the work in Sigil, or should I convert to text (or maybe HTML), do the donkey work of text editing and correction in OpenOffice and then convert to epub and use Sigil to sort out the formatting issues? Or is there a better way that I haven't thought of? I'm running Linux, in case anyone has any suggestions for utilities. Thanks in advance.
#2
Addict
Posts: 302
Karma: 185297
Join Date: Sep 2009
Location: Ankh Morpork
Device: calibre
|
I have just completed a couple of similar projects: I converted to epub using calibre, then did all the editing and formatting in Sigil.
The main difficulty was that calibre does not create 'perfect' code. In one case there were 10k+ lines of CSS, because every paragraph was given a separate ID and needed its own CSS entry even though every entry was the same; other books were more 'normal'. Calibre also seems to scatter chapter breaks at random and does not necessarily recognize chapter headings, but you will be editing in any case, so these are not major drawbacks. Sigil is not yet an ideal editor, due to things like the missing find/replace tools (being worked on, though), and it is sometimes easy to get lost when switching between code view and book view. But the whole process was fairly painless apart from the one book with all the IDs. With that one I cut all the code from Sigil, pasted it into OpenOffice for mass find/replace editing, then cut and pasted it back into Sigil. Afterwards I decided that just cutting and pasting the text might have been simpler.
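For anyone facing the same per-paragraph ID problem, the mass find/replace can also be scripted. The following is only a rough sketch, not what was actually done above: it assumes the stylesheet holds simple one-rule-per-line entries of the form `#someid { ... }`, that the affected paragraphs carry an `id` attribute and no `class` attribute, and that nothing links to those IDs. The function name `collapse_id_rules` and the `para` class prefix are made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: each ID rule starts on its own line and looks like "#p1 { ... }".
RULE_RE = re.compile(r'^\s*#([\w-]+)\s*\{([^}]*)\}', re.MULTILINE)


def normalize(declarations):
    """Normalize spacing so 'margin:0' and 'margin: 0' compare equal."""
    parts = []
    for decl in declarations.split(';'):
        decl = decl.strip()
        if not decl:
            continue
        name, sep, value = decl.partition(':')
        parts.append(f'{name.strip().lower()}: {value.strip()}' if sep else decl)
    return '; '.join(parts)


def collapse_id_rules(css, html, class_prefix='para'):
    """Replace identical '#id { ... }' rules with one shared class rule.

    Returns (new_css, new_html). Hypothetical helper: IDs that are link
    targets would be lost, and moving from ID to class selectors lowers
    specificity, so check the result before relying on it.
    """
    # Group IDs by their (normalized) declaration block.
    groups = defaultdict(list)
    for ident, body in RULE_RE.findall(css):
        groups[normalize(body)].append(ident)

    # One class per distinct declaration block.
    id_to_class = {}
    new_rules = []
    for n, (body, idents) in enumerate(groups.items(), start=1):
        cls = f'{class_prefix}{n}'
        new_rules.append(f'.{cls} {{ {body} }}')
        for ident in idents:
            id_to_class[ident] = cls

    # Drop the old ID rules, keep everything else, append the class rules.
    remaining = RULE_RE.sub('', css)
    kept_lines = [line for line in remaining.splitlines() if line.strip()]
    new_css = '\n'.join(kept_lines + new_rules)

    # Swap id="..." for class="..." only where the ID had a CSS rule.
    def swap(match):
        cls = id_to_class.get(match.group(1))
        return f'class="{cls}"' if cls else match.group(0)

    new_html = re.sub(r'id="([\w-]+)"', swap, html)
    return new_css, new_html
```

Calling `collapse_id_rules(css_text, xhtml_text)` on the stylesheet and one content file returns the rewritten pair; for a book split over several files, the same ID-to-class mapping would need to be applied to each of them.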
#3
Connoisseur
Posts: 95
Karma: 72819
Join Date: Oct 2006
Location: Drenthe, The Netherlands
Device: Cybook Gen3 (cracked screen)/Bebook/Nokia E60/Nokia 5800/Kobo Aura HD
|
Probably the best free tool for converting PDFs is Mobipocket Creator (in my opinion). It is meant to create Mobipocket files, but it will give you a set of intermediate files (HTML files) which are just perfect to use as base material for editing.
It will not only convert your PDF text to HTML, but will also preserve text properties like italics or bold. I'm not really sure if it will preserve font types, however. Besides that, it will preserve the images, if there are any, in their proper place in the text.
#4
Readaholic
Posts: 255
Karma: 1058454
Join Date: Jul 2009
Location: Swindon, UK
Device: Sony PRS-T2 (previously 505 and 650)
|
Quote:
Thanks for the input.
#5
Readaholic
Posts: 255
Karma: 1058454
Join Date: Jul 2009
Location: Swindon, UK
Device: Sony PRS-T2 (previously 505 and 650)
|
Quote:
I already use Calibre for library management and file conversion, so I was planning on using that. That said, alternatives are always good!
#6
Created Sigil, FlightCrew
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
|
#7
Readaholic
Posts: 255
Karma: 1058454
Join Date: Jul 2009
Location: Swindon, UK
Device: Sony PRS-T2 (previously 505 and 650)
|
As a totally new user, I'm not yet in any position to comment on the relative priorities of features, but Search & Replace and Spellcheck are two features I would imagine get used in virtually every editing (as opposed to format conversion) exercise.
#8
Created Sigil, FlightCrew
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
|
Quote:
Spellchecking is a new feature. It's absolutely necessary, as you've noticed. But Sigil has some things broken, and bug fixes are almost always more important than new features. The redesign is there to fix performance problems and problems with per-flow CSS. The RTF import is important, but not as important as what will come with it: importing functions will now be loadable plugins with a (hopefully) consistent interface. Other people will then be able to write their own plugins. Designing new importers will also be easier, so future work on importing Mobi, LIT, etc. will benefit from this. But we will see. If spellchecking turns out to be the next killer feature everyone wants (after S&R), then it could even come before the RTF import and general importing rewrite.
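To make the "loadable plugins with a consistent interface" idea concrete, here is a minimal sketch of how such an arrangement can look in general. It is purely illustrative and says nothing about Sigil's actual design (Sigil is not written in Python): the `Importer` base class, the `register` decorator, `import_file`, and the plain-text importer are all hypothetical names invented for this example.

```python
import html
from abc import ABC, abstractmethod


class Importer(ABC):
    """Hypothetical common interface that every import plugin implements."""

    # File extensions (without the dot) this importer claims to handle.
    extensions = ()

    @abstractmethod
    def load(self, data: bytes) -> str:
        """Turn the raw file contents into HTML text for the editor."""


# Maps a lowercase file extension to the importer class that handles it.
_REGISTRY = {}


def register(importer_cls):
    """Class decorator: make an importer discoverable by file extension."""
    for ext in importer_cls.extensions:
        _REGISTRY[ext.lower()] = importer_cls
    return importer_cls


def import_file(name: str, data: bytes) -> str:
    """Pick an importer by extension and run it; the editor only sees HTML."""
    ext = name.rsplit('.', 1)[-1].lower() if '.' in name else ''
    if ext not in _REGISTRY:
        raise ValueError(f'no importer registered for .{ext}')
    return _REGISTRY[ext]().load(data)


@register
class PlainTextImporter(Importer):
    """Example plugin: blank-line-separated text becomes <p> elements."""

    extensions = ('txt',)

    def load(self, data: bytes) -> str:
        text = data.decode('utf-8')
        paragraphs = [p.strip() for p in text.split('\n\n') if p.strip()]
        return '\n'.join(f'<p>{html.escape(p)}</p>' for p in paragraphs)
```

The point of the pattern is that the editing code only ever calls `import_file`; adding a new format means writing one more class with `@register`, without touching the editor itself.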
#9
.
Posts: 3,408
Karma: 5647231
Join Date: Oct 2008
Device: never enough
|
Quote:
For me, getting stuff to ePub is the easy part... it's everything else that I need Sigil for.
#10
Icanhasdonuts?
Posts: 2,837
Karma: 532407
Join Date: Aug 2008
Location: Mölnbo, Sweden
Device: Kobo Aura 2nd edition, Kobo Clara HD
|
Quote:
Not on S&R; that should be implemented ASAP. But I think it would be more productive to get an API for import plugins in place before a spellchecker.
#11
Connoisseur
Posts: 58
Karma: 12
Join Date: Jan 2009
Device: none
|
Quote:
Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.
#12
Created Sigil, FlightCrew
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
|
Quote:
It works OK for UNIX, which ships with all these different userland tools that you can chain together in interesting and useful ways, but that's hardly the case for Windows... which at last count represents ~92% of the market and the vast majority of Sigil's users, too. Requiring users to have calibre installed is not something I'm aiming for. It's a wonderful application, to be sure, but Sigil should be able to stand on its own as an ebook editor, which means being able to import various ebook formats. It's just something you'd expect of an ebook editor, now wouldn't you?
#13
Readaholic
Posts: 255
Karma: 1058454
Join Date: Jul 2009
Location: Swindon, UK
Device: Sony PRS-T2 (previously 505 and 650)
|
Quote:
In the former case, you'd expect the editor not only to read, but also to write, multiple formats; in the latter case, there's a valid argument to the effect that the application's task is to edit a specified format. Mind you, if you are planning to build a multi-format editor, it would be one hell of a valuable tool.
#14
Enthusiast
Posts: 38
Karma: 608
Join Date: Aug 2009
Location: Toronto
Device: all devices
|
speaking as a publisher...
Quote:
But that may be because my goals might be different from yours. If your ultimate goal is for the public to be able to load up their ebook from whatever format it is that they are using, fix problems with it, and then spit it back out so that it looks prettier, then I would agree with you. But if your goal is more geared towards publishers -- if you want your epub file editor to be more useful RIGHT NOW to publishers who are struggling to get all of their files ready for e-Readers, then those publishers will use Calibre as a secondary tool if they need to. Publishers don't want what we can already get elsewhere (features in Calibre), we want what doesn't exist yet -- and that's a GOOD way to edit .epub files!
#15
Created Sigil, FlightCrew
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
|
Search&Replace is (FINALLY) nearing completion and a new version of Sigil with it should hopefully be released in a few days.
Quote:
But you're right. Improving the WYSIWYG/general editing experience should come first. And it does. The redesign is all about improved editing. I even got sidetracked with the current release (<sigh>) and implemented more than a few performance enhancements. This is also one of the reasons why I'll be spending time on creating a plugin framework for the importers after v0.2.0: I'll be able to decouple the importing functionality from the editing, and hopefully others will want to contribute their own plugins, even independently of Sigil's main development branch.