01-03-2008, 10:34 AM | #1 |
Addict
Posts: 223
Karma: 356
Join Date: Aug 2007
Device: Rocket; Hiebook; N700; Sony 505; Kindle DX ...
|
Collating broken lines
One of the worst problems I have with many .txt files is rejoining lines that have been artificially broken to stay under some maximum length.
This is the first fix I make to Project Gutenberg files, for example, before converting them to RTF in Word and then to LRF. I'm not sure whether somebody has already posted a better way to do it, but I've always been happy with the following script (it should work with practically any Perl version). Just save it as takeaway_breaklines.pl and run it as:
takeaway_breaklines.pl infile.txt outfile.txt
Hope it helps! Alessandro
Code:
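#!/usr/bin/perl
use strict;
use warnings;

die "USAGE\n$0 filein fileout\n\n" if @ARGV != 2;

# Read the whole input first, then open the output file.
open(my $in, '<', $ARGV[0]) or die "Cannot read $ARGV[0]: $!\n";
my @lines = <$in>;
close($in);

open(my $out, '>', $ARGV[1]) or die "Cannot write $ARGV[1]: $!\n";
foreach my $l (@lines) {
    $l =~ s/\n\z//;   # drop the trailing newline, if there is one
    $l =~ s/\r//g;    # if the file was in DOS mode
    # A line that does not end in punctuation was probably broken
    # artificially: join it to the next line with a space.
    if ($l !~ /[.:,;"!?')-]$/) { print $out "$l " }
    else                       { print $out "$l\n" }
}
close($out) or die "Cannot close $ARGV[1]: $!\n";
|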
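One limitation of the heuristic above is that a blank line does not end in punctuation either, so paragraph separators get merged away along with the artificial breaks. A minimal sketch of a variant that keeps blank lines as paragraph breaks follows. The helper name join_broken_lines and the blank-line rule are assumptions for illustration and are not part of the posted script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the posted script): rejoin hard-wrapped
# lines, but treat a blank line as a paragraph break instead of merging it.
sub join_broken_lines {
    my ($text) = @_;
    $text =~ s/\r//g;                       # normalise DOS line endings
    my $out  = '';
    my $open = 0;    # true while the current output line is unfinished
    for my $line (split /\n/, $text) {
        if ($line =~ /^\s*$/) {             # blank line: paragraph break
            $out =~ s/ \z/\n/ if $open;     # finish the pending line first
            $out .= "\n";
            $open = 0;
        }
        elsif ($line =~ /[.:,;"!?')-]$/) {  # ends in punctuation: keep break
            $out .= "$line\n";
            $open = 0;
        }
        else {                              # probably wrapped: join with space
            $out .= "$line ";
            $open = 1;
        }
    }
    $out =~ s/ \z/\n/ if $open;             # finish a dangling last line
    return $out;
}

my $sample = "It was a dark and\nstormy night.\n\nThe rain fell\nin torrents\n";
print join_broken_lines($sample);
```

With the sample text, the first two lines are joined into one sentence, the blank line survives as a paragraph separator, and the last two lines are joined into a single line. Working on a string rather than on filehandles is only a convenience for trying the rule out on small samples.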
01-03-2008, 10:47 AM | #2 |
Gizmologist
Posts: 11,615
Karma: 929550
Join Date: Jan 2006
Location: Republic of Texas Embassy at Jackson, TN
Device: Pocketbook Touch HD3
|
There's a macro posted on the site (searching for "stingo's macro" usually brings it up) that works in Word, and I think somebody had a version of it working for Open Office, but I don't think I've seen a version in Perl. Thanks for sharing it, alexxxm!
|
01-03-2008, 03:57 PM | #3 |
Addict
Posts: 223
Karma: 356
Join Date: Aug 2007
Device: Rocket; Hiebook; N700; Sony 505; Kindle DX ...
|
Yes, maybe most people prefer to do this kind of thing from inside BD or Word, but I've been unable to make BD work (I don't know why, it crashes when outputting LRF).
Anyway, I really prefer to use very small scripts for specific tasks. And for all the admiration I have for Kovid's work, damn, I cannot update Python to the latest version, so I can use it neither on my Linux machine nor on my OS X Panther one. I'll try to produce more small scripts with time, trying to make them work under plain Perl... Alessandro |
01-03-2008, 05:40 PM | #4 |
creator of calibre
Posts: 44,325
Karma: 23661992
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
The OS X version doesn't depend on the system Python; it has its own custom Python interpreter.
|
01-04-2008, 03:35 AM | #5 | |
Addict
Posts: 223
Karma: 356
Join Date: Aug 2007
Device: Rocket; Hiebook; N700; Sony 505; Kindle DX ...
|
Quote:
I'm just not sure: are you talking about the whole libprs500 framework or just the isolated utilities? How can I install it on Panther? Alessandro |
|
01-04-2008, 11:30 AM | #6 |
creator of calibre
Posts: 44,325
Karma: 23661992
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
There's a dmg available. Just download it. All of libprs500 works on OS X. You may have problems using USB devices while libprs500 is running (that's a bug I have to find a solution for), but other than that it's all good.
https://libprs500.kovidgoyal.net/download_osx |
01-04-2008, 02:02 PM | #7 | |
Addict
Posts: 223
Karma: 356
Join Date: Aug 2007
Device: Rocket; Hiebook; N700; Sony 505; Kindle DX ...
|
Quote:
The best thing would be for the individual apps to be available standalone, but I don't know if libprs500 is too integrated to allow it. Thanks anyway... Alessandro |
|
01-04-2008, 02:32 PM | #8 |
creator of calibre
Posts: 44,325
Karma: 23661992
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Oh, I'm sorry, I didn't see "Panther". Yeah, libprs500 is way too integrated for separate apps. The separate apps are really just calls to different functions, which makes maintenance of the app much easier.
|
01-05-2008, 05:15 PM | #9 | |
Addict
Posts: 223
Karma: 356
Join Date: Aug 2007
Device: Rocket; Hiebook; N700; Sony 505; Kindle DX ...
|
Quote:
I was thinking about the possibility of writing texts with very basic formatting info to LRF, be it simple .txt or at most text with font-size indications. Is there any way to hook into your libraries, maybe pylrf.py or pylrs.py, and call them from an outside program (thinking Perl here) to make use of their translation capabilities? Alessandro |
|
01-06-2008, 12:39 AM | #10 |
creator of calibre
Posts: 44,325
Karma: 23661992
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Well, sooner or later I'll release lrs2lrf. LRS is an XML format, so you can output LRS and use it to create LRF.
|
01-06-2008, 05:13 AM | #11 |
Addict
Posts: 223
Karma: 356
Join Date: Aug 2007
Device: Rocket; Hiebook; N700; Sony 505; Kindle DX ...
|
But I see an lrs2lrf.cpp file on libprs500.kovidgoyal.net, so you've released it already!
Is that, at least, standalone? Because if LRS is XML-like, I'd have no problem producing output in it... Alessandro |
01-06-2008, 11:04 AM | #12 |
creator of calibre
Posts: 44,325
Karma: 23661992
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
No, that's legacy code. And yes, LRS is pure XML.
|