#1
Junior Member
Posts: 4
Karma: 10
Join Date: Apr 2012
Device: PRS-T1
Saving old magazines in a useful way
Hello everyone,
I have a complete collection of what was probably the best French-language magazine ever. Various sites have it as .jpg scans, which is not very useful, in particular if you want to copy and paste. OCR won't work well enough as far as I'm concerned, at least on my platform (multi-column text, images, warped or rotated text, ...), so I'm aware I will have to put in a lot of hard work to create usable documents, but my motivation is high (for now).
I do believe in open formats, so a rough first search led me to believe that Sigil as a tool and EPUB as a format would be suited to what I have in mind: a usable copy of my magazine collection (in particular, the text parts must be copy/pastable) that looks as close as possible to the original presentation of the magazine.
Would you concur? And if you don't, could you suggest something more useful? An ideal tool would take a .jpg scan, let me select text zones and OCR/edit those zones, then produce an open format such as EPUB.
Best regards,
Sebastian
#2
Junior Member
Posts: 4
Karma: 501112
Join Date: Mar 2012
Device: Kindle
Wow, good luck. I'm really curious to see what others say about this. I have the same issue with some old magazines too.
#3
Evangelist
Posts: 450
Karma: 343115
Join Date: Nov 2009
Location: Romania
Device: PW2 2014
ABBYY FineReader is currently considered the best OCR software around, but it's neither free nor open. It does, however, let you export to ePub and, in older versions at least, to HTML (which you can then import into Sigil). But ePub is not very good with the complex layouts found in magazines... It's great for novels and such, but anything beyond a single column is just asking for trouble.
PDF is better suited for complex layouts. Unless you want to spend a lot of time proofreading the articles and go "vanilla" (no foreground image with background text), I suggest you apply the standard "good enough" OCR with FineReader and be done with it. It's still better than no text at all.
Otherwise you'll have to track down the fonts (or fonts that look similar), learn how to vectorize graphics, perform a lot of micro-retouches and proofread the final product one last time. The quality will be amazing, and it's always a pleasure to read something done right. But the time spent will be significant. You'll need to set aside a couple of hours each day to learn this stuff, and in about a month, maybe two, the first magazine will be done. Sometimes it will seem like a chore, a repetitive, life-sucking chore, but you'll eventually get better at it and work faster. It's also very easy to get discouraged. Very few people stick to it.
Considered to be the best at layout and vectorizing: Adobe InDesign and Illustrator.
Open source and multi-platform: Scribus and Inkscape.
Have fun!
#4
eBook Enthusiast
Posts: 85,544
Karma: 93383099
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
Personally I'd go for PDF page scans, with a searchable text layer. With a magazine, you generally want to preserve the appearance of the page, not merely the text.
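For anyone who goes the EPUB route anyway, here is a minimal sketch (Python, standard library only) of how already proofread text could be packed into an EPUB 3 container once the OCR and editing are done. The function name `build_epub`, the file layout under `OEBPS/`, and the sample title and text are all assumptions made for illustration. The output has not been run through a validator such as EPUBCheck, and the sketch does nothing about page layout, which is the hard part discussed in this thread.

```python
import io
import uuid
import zipfile
from datetime import datetime, timezone
from xml.sax.saxutils import escape

# Points reading systems at the package file inside the archive.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

XHTML_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<head><title>{title}</title></head>
<body>
{body}
</body>
</html>
"""


def build_epub(target, title, language, articles):
    """Write a minimal EPUB 3 archive to `target` (a path or binary file object).

    `articles` is a list of (heading, [paragraph, ...]) tuples holding
    text that has already been OCR'd and proofread (hypothetical input).
    """
    names = [f"article{i}.xhtml" for i in range(1, len(articles) + 1)]
    manifest = "\n".join(
        f'    <item id="a{i}" href="{n}" media-type="application/xhtml+xml"/>'
        for i, n in enumerate(names, 1)
    )
    spine = "\n".join(f'    <itemref idref="a{i}"/>' for i in range(1, len(names) + 1))
    toc = "\n".join(
        f'<li><a href="{n}">{escape(h)}</a></li>' for n, (h, _) in zip(names, articles)
    )
    modified = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    opf = f"""<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="bookid">urn:uuid:{uuid.uuid4()}</dc:identifier>
    <dc:title>{escape(title)}</dc:title>
    <dc:language>{escape(language)}</dc:language>
    <meta property="dcterms:modified">{modified}</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
{manifest}
  </manifest>
  <spine>
{spine}
  </spine>
</package>
"""
    nav = XHTML_TEMPLATE.format(
        title="Contents",
        body=f'<nav epub:type="toc"><ol>\n{toc}\n</ol></nav>',
    )
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zf:
        # The mimetype entry has to be the first one and stored uncompressed.
        zf.writestr("mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML)
        zf.writestr("OEBPS/content.opf", opf)
        zf.writestr("OEBPS/nav.xhtml", nav)
        for name, (heading, paragraphs) in zip(names, articles):
            body = f"<h1>{escape(heading)}</h1>\n" + "\n".join(
                f"<p>{escape(p)}</p>" for p in paragraphs
            )
            page = XHTML_TEMPLATE.format(title=escape(heading), body=body)
            zf.writestr("OEBPS/" + name, page)


# Usage with placeholder content, written to memory instead of a file.
buffer = io.BytesIO()
build_epub(
    buffer,
    "Example Magazine, issue 1",
    "fr",
    [("Sample article", ["First paragraph, text & all.", "Second paragraph."])],
)
```

Passing a file name such as `issue1.epub` instead of `buffer` writes the archive to disk, and the result could then be opened in Sigil for further editing. Each article becomes one reflowable XHTML file, so the text is copy/pastable but the original multi-column look is lost.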