02-20-2008, 09:20 AM   #303
nrapallo
GuteBook/Mobi2IMP Creator
Quote:
Originally Posted by nrapallo
tompe:

I recently processed a .pdb (TEXt/REAd) and got a long series of words with no line breaks.

In 'mobi2html' I tried using '--rawhtml' and saw that there were <CR><LF> line endings in the text, but they seem to disappear when processed.

I couldn't find where the line endings were being stripped and replaced with spaces. Since the text fed to HTML::TreeBuilder had no HTML tags, could that be the culprit?

I tried using substitutions on the raw text to produce basic HTML code, but it didn't work.
Code:
my $book = $text;
$book =~ s/\cM//g;                   # Unix line endings (note: "=~", no space)
$book =~ s/\n/\x01/g;                # Collapse lines
$book =~ s/\x01\x01/<\/p>\n\n<p>/g;  # Separate paragraphs
$book =~ s/\x01/ /g;                 # Insert whitespace

$text = "<html><body><p>" . $book . "</p></body></html>";
-Nick
Here is a test file to see what can be done to convert a .pdb properly (text to HTML code internally). All I get is one long line of words in the resulting .html, with no line breaks or paragraph boundaries.
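
For reference, here is a minimal standalone sketch of the text-to-HTML step I'm after. It is not code from mobi2html: the helper name text_to_html and the sample string are made up for illustration, and I'm assuming that a blank line marks a paragraph boundary in the .pdb text. One thing worth checking in any version of this: the binding operator has to be written `=~` with no space. Written as `= ~`, Perl runs the substitution on $_ and assigns the bitwise complement of its result to the variable, so the text itself is never modified.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of mobi2html): wrap plain text in minimal HTML.
# Assumption: paragraphs in the source text are separated by blank lines.
sub text_to_html {
    my ($text) = @_;
    my $book = $text;

    $book =~ s/\r\n?/\n/g;    # normalize CRLF (and bare CR) to LF
    $book =~ s/&/&amp;/g;     # escape HTML-special characters, "&" first
    $book =~ s/</&lt;/g;
    $book =~ s/>/&gt;/g;
    $book =~ s/\A\s+//;       # trim leading whitespace
    $book =~ s/\s+\z//;       # trim trailing whitespace

    # Split into paragraphs on one or more blank lines.
    my @paragraphs = split /\n(?:[ \t]*\n)+/, $book;

    # Inside a paragraph, join the remaining lines with a single space.
    for my $p (@paragraphs) {
        $p =~ s/[ \t]*\n[ \t]*/ /g;
    }

    return "<html><body>\n<p>"
         . join("</p>\n\n<p>", @paragraphs)
         . "</p>\n</body></html>\n";
}

# Made-up sample with CRLF line endings, only to exercise the helper.
my $sample = "What has hands\r\nbut cannot clap?\r\n\r\nA clock.\r\n";
print text_to_html($sample);
```

If the line breaks inside a paragraph need to survive (for verse or riddles, say), the join step could insert `<br />` instead of a space. Whether the result then gets through HTML::TreeBuilder intact is a separate question that I haven't verified.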

-Nick
Attached Files
File Type: pdb Riddles_for_Kid.pdb (1.5 KB, 330 views)