Old 07-07-2011, 04:07 AM   #1
Stratogirl
Member
Posts: 13
Karma: 10
Join Date: Apr 2009
Device: Amazon Kindle 3
Help converting DjVu to mobi

I've been looking for a book that I can't find in my country.
A friend of mine got me a scanned djvu version of the book and I'm trying to convert it to read it on the Kindle.
I understand that there is software that can convert it to PDF, but I would prefer to convert it to mobi because the scanned DjVu is really bad and I would prefer to have it in a reflowable format.

I was able to save the djvu into txt via WinDjView then loaded the txt into Word.
The book has illustrations so I will have to do some screen captures, edit the pictures and load them into Word.
Plus I have to clean all the page numbers and headers.

But the biggest problem I'm having now is that the txt file has a line break at the end of every line (because of the limited page size of the scanned document).
Is there a way that I can eliminate these breaks without having to delete them manually one by one?
Old 07-07-2011, 05:40 AM   #2
RobbieRobot
Junior Member
Posts: 5
Karma: 10
Join Date: May 2011
Location: Queensland, Australia
Device: Kindle 3
HTML

You could use some very simple HTML tags to mark the beginnings of the paragraphs. All the lines in the paragraphs will then flow because of the nature of HTML.

Here is a little filter program written in Perl, which expects to read plain text from STDIN and prints simple HTML to STDOUT:

#!/usr/bin/perl
#
# Convert plain text with a blank line between paragraphs into HTML.
# Reads plain text from STDIN and prints simple HTML to STDOUT.
#
use strict;
use warnings;

my ($rope, @html);

while (<STDIN>) {
    s/\r//g;       # make all text look like Unix text
    s/\x0c//g;     # drop form feeds (page breaks)
    s/\n/\xff/;    # mark the end of each line with a placeholder byte
    push(@html, $_);
}

$rope = join("", @html);         # make one huge string

$rope =~ s/\xff{2,}/\n<p>/g;     # blank line(s) between text become a paragraph break
$rope =~ s/\xff\s+/\n<p>/g;      # line break followed by whitespace (an indent) also starts a paragraph
$rope =~ s/\xff/ /g;             # remaining line breaks become spaces

$rope =~ s/\[\d+\]//g;           # strip [32] etc. tags from .PDF saved as .txt

print "<HTML><HEAD><TITLE>From text2HTML</TITLE></HEAD><BODY>\n\n";
print "<p>", $rope;              # open the first paragraph explicitly
print "\n\n</BODY></HTML>\n";
#EOF
Old 07-07-2011, 07:35 AM   #3
sourcejedi
Groupie
Posts: 155
Karma: 200000
Join Date: Dec 2009
Location: Britania
Device: Android
That only helps if it has double-line breaks in the first place.
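
When the text has no blank lines at all, one possible workaround is to guess where paragraphs end. The Python sketch below is only a heuristic under stated assumptions, not something proposed in this thread: it treats a line as the end of a paragraph when it ends with sentence-final punctuation and is clearly shorter than the longest line. The function name `guess_paragraphs` and the 0.8 ratio are hypothetical choices that would need tuning against the actual text.

```python
from typing import List


def guess_paragraphs(text: str, short_ratio: float = 0.8) -> List[str]:
    """Heuristically rebuild paragraphs from text that has no blank lines."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    # Assumption: the longest line approximates the full page width.
    width = max(len(ln) for ln in lines)
    paragraphs: List[str] = []
    current: List[str] = []
    for ln in lines:
        current.append(ln)
        ends_sentence = ln.endswith((".", "!", "?", '"'))
        is_short = len(ln) < short_ratio * width
        # A short line that ends a sentence probably ends a paragraph.
        if ends_sentence and is_short:
            paragraphs.append(" ".join(current))
            current = []
    if current:  # flush whatever is left at the end of the text
        paragraphs.append(" ".join(current))
    return paragraphs
```

A heuristic like this will misjudge some lines (a paragraph whose last line happens to fill the page width, for instance), so the result still needs a read-through.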

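(And if you're using Word, you may as well just use Search and Replace All. Lessee... "^l" means a manual line break and "^p" a paragraph mark. So replace ^l^l with ^p, and then replace all the remaining ^l with a single space; replacing them with the empty string would run the last word of one line into the first word of the next.)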
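
For anyone working outside Word, the same two-step replacement can be sketched in a few lines of Python. This is a minimal sketch, assuming the text uses blank lines to separate paragraphs; the function name `unwrap_paragraphs` is hypothetical, and remaining line breaks are turned into single spaces so that words on adjacent lines do not run together.

```python
import re


def unwrap_paragraphs(text: str) -> str:
    """Join hard-wrapped lines, keeping blank lines as paragraph breaks."""
    # Normalise Windows and old Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Step 1: split on runs of blank (or whitespace-only) lines.
    paragraphs = re.split(r"\n(?:[ \t]*\n)+", text.strip())
    # Step 2: inside each paragraph, collapse every line break and any
    # surrounding spaces or tabs into a single space.
    joined = [re.sub(r"[ \t]*\n[ \t]*", " ", p.strip()) for p in paragraphs]
    return "\n\n".join(joined)


if __name__ == "__main__":
    sample = "It was a dark\nand stormy night.\n\nThe rain fell\nin torrents.\n"
    print(unwrap_paragraphs(sample))
```

The output keeps one blank line between paragraphs, so it can be pasted into Word or fed to a text-to-HTML filter.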
Old 07-07-2011, 09:46 AM   #4
WT Sharpe
Bah, humbug!
Posts: 39,073
Karma: 157049943
Join Date: Jun 2009
Location: Chesapeake, VA, USA
Device: Kindle Oasis, iPad Pro, & a Samsung Galaxy S9.
Moderator Notice
Thread closed. MR is firmly opposed to ebook piracy.
