txt2lrf - New and Improved


kovidgoyal
06-19-2007, 12:11 PM
txt2lrf now supports a lightweight markup language. It has two main design goals:

To allow you to create good-looking TXT files that can be read directly as TXT files on a computer screen.
To enable easy conversion of TXT into nicely formatted LRF.


These design goals make it a particularly useful tool for those of you that find creating HTML files by hand too tedious.
Some of the features that the markup supports:

Lists
Tables
Images
Automatic TOC creation


Automatic TOC creation makes this an excellent replacement for BookDesigner for those of us who don't use Windows, especially since it supports inline images and tables.

Look at the attached demo to understand the basic syntax. For full details see: http://daringfireball.net/projects/markdown/syntax

The improved txt2lrf is part of libprs500 (http://libprs500.kovidgoyal.net) v0.3.53 and higher. It can be run from the commandline as
txt2lrf myfile.txt

tsgreer
06-20-2007, 01:19 PM
I am running the GUI libprs500 v0.3.54 on my Mac, but can I access txt2lrf from within the GUI or is it commandline only? I can't seem to find any reference to the txt2lrf function in the GUI.

Sorry for the silly question! :) Thanks for all of your hard work...

JSWolf
06-20-2007, 01:21 PM
txt2lrf and html2lrf and prs500 are all command line. No GUI.

kovidgoyal
06-20-2007, 01:25 PM
They'll be added to the GUI in v0.4.0...towards which libprs500 is inching every day ;-)
Indeed, I'm planning to add a GUI editor, rather like the editor box used in the mobileread forums, that will allow easy creation of txt files marked up in the markdown format for conversion to LRF.

tsgreer
06-20-2007, 03:10 PM
Excellent! I'm looking forward to it. I have put off adding more books to our "uploads" section because I want to be able to have a little more control over the formatting....

JSWolf
06-20-2007, 04:33 PM
Excellent! I'm looking forward to it. I have put off adding more books to our "uploads" section because I want to be able to have a little more control over the formatting....
You still have that control. You can use html2lrf, txt2lrf, pielrf, or Book Designer. Yes, all but BD are run from the command line, but is it really so hard to use? To me, it's trivial. And if it is hard, just ask and we'll be glad to try to help.

tsgreer
06-20-2007, 05:13 PM
Thanks JSWolf. I'm on a Mac with no access to Windows, so Book Designer is out. I'm also one of the fellows that just can't get the command line stuff to work. I've tried plenty, but it just doesn't seem to work out for me. But once it's GUI, I'm all over it. Thanks again!

JSWolf
06-20-2007, 05:27 PM
To convert an HTML file to an LRF file, one of the commands is...

html2lrf --header --font-delta=.5 --bottom-margin=27 -a "Cary Rockwell" -t "Danger in Deep Space" "Rockwell Carey_Danger in Deep Space.htm"

Again, that's using Windows. But it is not hard to do. Honest!

kovidgoyal
06-20-2007, 06:56 PM
I'm surprised you still need --bottom-margin. I thought I fixed that bug.

JSWolf
06-20-2007, 06:58 PM
That's a command from a bat file before you fixed the bottom margin bug.

DaveNB
06-20-2007, 08:25 PM
Hi,

I dragged the libprs500.app to my Desktop, then executed it. It asked for my admin login/password, which I entered. The GUI program seems to work OK, but none of the command line utilities seem to work. Upon re-running the GUI application, I am asked for the admin login/password again, yet the utilities still don't seem to work. I tried it with the current and the previous versions. No dice: I get errors when trying to execute the utilities from within the .app package in the Resources directory. Any ideas?

I'd like to use the command line utilities, fairly comfortable with that as the GUI still has a few minor issues (in updating the library list portion of the display).

Thanks,

Dave

kovidgoyal
06-20-2007, 09:55 PM
Hmm what errors do you get exactly? Run the GUI and look for any messages in the OSX console (the console should be in the Utilities sub-folder of Applications).
Also post the output of

ls -l /usr/bin/prs500

DaveNB
06-23-2007, 04:30 AM
Kovid,

Thanks for looking into these problems. I installed Python 2.5.1 successfully. Then I opened the libprs500 .dmg and dragged the application to a folder on my Desktop (instead of my Applications folder) and ran it from there. It does prompt for my admin password, which I do enter.

For the GUI library list refresh issue:
- no errors are showing up in the console log
- the problem is that if I add, let's say, a .pdf and a .lrf file, the file selection dialogues come up but the library list remains blank (not even the leftmost column of numbers shows up). If I quit the libprs500 GUI and then restart, the files then show up. If I add any additional files, nothing is added; it only shows the 2 previously added files. Quitting and restarting again then shows all the files. Seems to be a refresh problem.

As for the command line applications not showing up:
I think that on startup the libprs500 GUI is not reliably setting up the symlinks from the /usr/bin folder to the executables in its libprs500.app/Contents/Resources folder.

The symlinks are not set up when I try to run the latest version of libprs500

When I set up the symlinks manually and run it AND when I try to run from the libprs500.app/Contents/Resources folder directly, I get the following error (in both situations, the same error):

%>html2lrf --header --font-delta=.5 --bottom-margin=27 -a "test author" -t "test title" "test.html"
Traceback (most recent call last):
File "/usr/bin/html2lrf", line 25, in <module>
os.execv(loader_path, sys.argv)
OSError: [Errno 2] No such file or directory

Anyone else on OS X having trouble with the command line utilities? Is the libprs500 application self contained or do I have to install some other python libraries/apps too?

Maybe I'm missing some of the Python libraries?

Does the libprs500 check to see if the symlinks are setup properly before running or does it re-install the symlinks each and every time? Does it delete the old symlinks first? Does libprs500 need to be installed in a particular directory?

Thanks, sorry I'm a bit confused.

Dave

kovidgoyal
06-23-2007, 01:52 PM
The GUI problems will be fixed in v0.4.0 as the GUI (and its database backend) are being completely rewritten and this time I will test it on OSX as well.

As far as I know it should be possible to install libprs500 in any normal folder of your filesystem. Each time the GUI is run it checks if the symlinks exist and point to the correct location, and if they don't it asks for the password.
The reason I chose /usr/bin is that /usr/local/bin is not in the PATH in a default installation of OS X.
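The check described above can be reproduced by hand to see which symlinks are missing or stale. This is only a rough sketch, not the code the GUI actually runs: the tool list is taken from names mentioned in this thread, and the function name and both folder arguments are made up for illustration.

```python
import os

# Tool names taken from this thread; the real list inside libprs500 may differ.
TOOLS = ["html2lrf", "txt2lrf", "lit2lrf", "prs500"]


def broken_links(bin_dir, resources_dir, tools=TOOLS):
    """Return the tools whose symlink in bin_dir is missing or points elsewhere."""
    bad = []
    for name in tools:
        link = os.path.join(bin_dir, name)
        target = os.path.join(resources_dir, name)
        if not os.path.islink(link):
            # No symlink at all (missing, or a regular file in its place).
            bad.append(name)
        elif os.path.realpath(link) != os.path.realpath(target):
            # A symlink exists but resolves to some other location.
            bad.append(name)
    return bad
```

For example, `broken_links("/usr/bin", "/path/to/libprs500.app/Contents/Resources")` would list the names that still need attention; the second path has to be whatever folder the .app was actually dragged to.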

libprs500.app is fully self-contained and should not require any external dependencies.
I suspect what's happening is that your installation of python 2.5.1 is interfering, though I don't see why that should happen.

Is the file libprs500.app/Contents/Resources/html2lrf.py being created when you run html2lrf?

EDIT: Also try cleaning everything out and re-installing libprs500. Remove /usr/bin/html2lrf, /usr/bin/prs500, /usr/bin/txt2lrf, /usr/bin/lit2lrf, /usr/bin/libprs500, /usr/bin/rtf-meta, /usr/bin/lrf-meta.
Then delete libprs500.app and re-download it. If it still doesn't work please post the output of

ls -l /usr/bin/html2lrf

DailyPlanet
07-09-2007, 06:11 AM
Are you also going to be adding in the lit2lrf functions into the gui in version 4?

ns66
07-11-2007, 09:41 PM
Hi,

I need a simple utility to convert lots of txt or html files (some without an extension, even though they are txt or html) to lrf, rtf, or whatever format can be easily read on the Sony Reader. No border and no special layout are needed, just quick batch conversion with Unicode support.

I can code java very well, but I don't know anything about lrf or rtf format, if there's such module or code samples on that side that will help, I can easily build the rest.

Or if your txt2lrf or html2lrf can handle this (you could even merge them to make it seamless), that's even better.

thanks

JSWolf
07-11-2007, 09:50 PM
I do suggest using html2lrf to convert your html files. Give them a go and see how it works. Might be just what you want.

As for the text files, use Word to format them and then import them into Book Designer after that. That's the easiest way to do it.

kovidgoyal
07-11-2007, 10:30 PM
@DailyPlanet Yeah lit2lrf will be in the gui

@ns66 Your best bet is to write a script that runs through the files, detects if they are txt or html, and calls txt2lrf/html2lrf as appropriate.
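A minimal sketch of such a script in Python 3, under a few assumptions: the HTML detection is a crude heuristic (it looks for common tags in the first 4 KB), both tools are assumed to be on the PATH, and they are called with only the input file as argument, as in the `txt2lrf myfile.txt` form from the first post. Whether either tool copes with files that have no extension is not confirmed here. The function names are illustrative.

```python
import os
import subprocess
import sys

# Tags whose presence near the top of a file suggests HTML (a heuristic only).
HTML_HINTS = (b"<html", b"<!doctype", b"<body", b"<head", b"<p>", b"<div")


def detect_kind(data):
    """Guess 'html' or 'txt' from the first bytes of a file."""
    head = data[:4096].lower()
    return "html" if any(hint in head for hint in HTML_HINTS) else "txt"


def build_command(path, kind):
    """Pick the converter and pass it the input file as its only argument."""
    tool = "html2lrf" if kind == "html" else "txt2lrf"
    return [tool, path]


def convert_all(folder):
    """Run the matching converter on every regular file in folder."""
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path) or name.lower().endswith(".lrf"):
            continue  # skip sub-folders and already converted books
        with open(path, "rb") as fh:
            kind = detect_kind(fh.read(4096))
        # check=False: keep going when one file fails to convert.
        subprocess.run(build_command(path, kind), check=False)


if __name__ == "__main__":
    for folder in sys.argv[1:]:
        convert_all(folder)
```

Saved under any name, it would be run with the folder of books as its argument. Options such as `-a` and `-t` could be appended to the list returned by `build_command` if titles and authors are known.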

DailyPlanet
07-12-2007, 01:15 AM
I don't want to seem like a pest, but what is your estimated timeframe for releasing 4.0?

kovidgoyal
07-12-2007, 01:24 AM
Somewhere between 1 and 3 months

Bokeh
07-25-2007, 05:22 PM
I have been having a problem with many txt files I try to run through this, where somewhere in the txt file is a unicode character the program cannot process.

I usually get a message like this:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xc6' in position 25014: ordinal not in range(128)

which I guess is a python related error? (I know very little about programming)

I can sometimes track down the character if I can understand the unicode for that character, I use Charactermap to find and replace it in Word. But often with an error like this I don't know how to locate the character.

Does anyone have any tips for me on either a macro I can run in Word to remove characters txt2lrf cannot process, or how to use the error message to locate the offending character?

Are there any plans to "pre-screen" text files in txt2lrf to help remove those characters?

So far I have been finding them manually by deleting half the text, seeing if the error remains, then keep deleting half until I find it. Which is super slow.

Any help would be appreciated!
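For the manual search described above, a few lines of Python 3 can list every non-ASCII character with its offset, line and column, which is much faster than halving the file. This is only a sketch: the function names are made up, cp1252 is an assumed input encoding (typical for Windows text files), and the "position" in the error message may or may not line up with these offsets, depending on what string the program was encoding when it failed.

```python
def find_non_ascii(text):
    """Return (offset, line_no, column, char) for every non-ASCII character."""
    line_no, line_start = 1, 0
    hits = []
    for offset, ch in enumerate(text):
        if ord(ch) > 127:
            hits.append((offset, line_no, offset - line_start + 1, ch))
        if ch == "\n":
            line_no += 1
            line_start = offset + 1
    return hits


def report(path, encoding="cp1252"):
    """Print every non-ASCII character found in the file at path."""
    # newline="" keeps line endings untouched so offsets match the raw text;
    # errors="replace" avoids stopping on bytes the assumed encoding lacks.
    with open(path, encoding=encoding, errors="replace", newline="") as fh:
        text = fh.read()
    for offset, line_no, col, ch in find_non_ascii(text):
        print(f"offset {offset}, line {line_no}, column {col}: {ch!r} (U+{ord(ch):04X})")
```

Calling `report("mybook.txt")` prints one line per suspicious character, and the line and column numbers can then be used to jump to the spot in any text editor.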

kovidgoyal
07-25-2007, 10:13 PM
You shouldn't be getting unicode errors. Can you post the full error message and the version of libprs500 you're using.

Bokeh
07-26-2007, 11:41 AM
woops, double post

Bokeh
07-26-2007, 11:43 AM
here is the error i get when trying to use txt2lrf on a gutenberg txt file from http://www.gutenberg.org/ebooks/7110


Traceback (most recent call last):
File "convert_from.py", line 89, in <module>
File "convert_from.py", line 78, in main
File "convert_from.py", line 63, in generate_html
File "encodings\cp1252.pyo", line 15, in decode
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe6' in position 5154: ordinal not in range(128)

oh and I am using the latest version of the program, 0.3.79

kovidgoyal
07-26-2007, 11:46 AM
Will be fixed in next release.

langabe
07-29-2007, 11:41 AM
When running txt2lrf on the attached file,

alex@langabe:~/shared/ebooks-cache $ txt2lrf the\ art\ of\ war.txt

I get the runtime error specified in the subject line.

Python 2.5.1 (r251:54863, May 2 2007, 16:56:35)

alex@langabe:~/shared/ebooks-cache $ libprs500 --version
0.3.75

I don't think increasing the recursion limit will help, you'll run out of stack. It should probably implement the algorithm without recursion.

Alex

kovidgoyal
07-29-2007, 02:13 PM
Nah this is probably a logic bug. Squashed in the next release. Also try upgrading to the latest version 0.3.79.

edbro
08-23-2007, 08:26 AM
I'm using ver 103 and I am getting:

Traceback (most recent call last):
File "convert_from.py", line 97, in <module>
File "convert_from.py", line 93, in main
File "convert_from.py", line 83, in process_file
File "libprs500\ebooks\lrf\html\convert_from.pyo", line 1403, in process_file
File "libprs500\devices\prs500\driver.pyo", line 56, in <module>
File "libprs500\devices\libusb.pyo", line 32, in <module>
File "libprs500\__init__.pyo", line 53, in load_library
File "ctypes\__init__.pyo", line 423, in LoadLibrary
File "ctypes\__init__.pyo", line 340, in __init__
WindowsError: [Error 126] The specified module could not be found

kovidgoyal
08-23-2007, 09:54 AM
I can't duplicate this bug and the line numbers in the error message are wrong. Can you try uninstalling and re-installing libprs500?

JSWolf
08-23-2007, 03:11 PM
I'm using ver 103 and I am getting:

Traceback (most recent call last):
File "convert_from.py", line 97, in <module>
File "convert_from.py", line 93, in main
File "convert_from.py", line 83, in process_file
File "libprs500\ebooks\lrf\html\convert_from.pyo", line 1403, in process_file
File "libprs500\devices\prs500\driver.pyo", line 56, in <module>
File "libprs500\devices\libusb.pyo", line 32, in <module>
File "libprs500\__init__.pyo", line 53, in load_library
File "ctypes\__init__.pyo", line 423, in LoadLibrary
File "ctypes\__init__.pyo", line 340, in __init__
WindowsError: [Error 126] The specified module could not be found
Create a ticket and post the command line you used and also the text file.

edbro
08-23-2007, 07:16 PM
I can't duplicate this bug and the line numbers in the error message are wrong. Can you try uninstalling and re-installing libprs500?

Okay, I uninstalled/reinstalled/rebooted. Here is a copy/paste of my latest attempt, followed by the version number:
D:\Work\JCarroll-BonesOfTheMoon>txt2lrf -a "John Carroll" -t "Bones of the Moon"
--cover=bones.jpg bones.txt
Traceback (most recent call last):
File "convert_from.py", line 97, in <module>
File "convert_from.py", line 93, in main
File "convert_from.py", line 83, in process_file
File "libprs500\ebooks\lrf\html\convert_from.pyo", line 1433, in process_file
File "libprs500\devices\prs500\driver.pyo", line 56, in <module>
File "libprs500\devices\libusb.pyo", line 32, in <module>
File "libprs500\__init__.pyo", line 53, in load_library
File "ctypes\__init__.pyo", line 423, in LoadLibrary
File "ctypes\__init__.pyo", line 340, in __init__
WindowsError: [Error 126] The specified module could not be found

D:\Work\JCarroll-BonesOfTheMoon>txt2lrf --version
libprs500 0.3.103

kovidgoyal
08-23-2007, 07:43 PM
Ah ok now that makes sense. Will be fixed in the next version. In the meantime if you remove the --cover option, you should be ok.

StDo
09-03-2007, 03:22 PM
Hi Kovidgoyal,

using txt2lrf I got some text-encoding problems.

German "Umlaute" aren't converted as they should be.

umlaute.txt in ISO-8859-1 was converted with following line
txt2lrf -a "Tester" -t "Umlautetest" -e "ISO-8859-1" umlaute.txt to umlaute.lrf.

Maybe I made a mistake?
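One way to narrow this down is to rewrite the file as UTF-8 first and convert that copy, which at least shows whether the source file itself decodes cleanly as ISO-8859-1. This is a sketch in Python 3 with a made-up function name. Whether it sidesteps the problem is not confirmed, and passing `-e utf-8` to txt2lrf is an assumption, since the thread only shows `-e "ISO-8859-1"`.

```python
def transcode(src, dst, src_encoding="iso-8859-1", dst_encoding="utf-8"):
    """Rewrite src in dst_encoding and return the decoded text."""
    # newline="" leaves line endings exactly as they are in the source file.
    with open(src, encoding=src_encoding, newline="") as fh:
        text = fh.read()
    with open(dst, "w", encoding=dst_encoding, newline="") as fh:
        fh.write(text)
    return text
```

After `transcode("umlaute.txt", "umlaute-utf8.txt")`, opening the new file in an editor set to UTF-8 should show the umlauts correctly. If it does, the input is fine and the problem lies in the conversion step.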

kovidgoyal
09-04-2007, 02:46 AM
Sounds like the new encoding handling code is still buggy. I'll fix it when I return.

StDo
09-04-2007, 04:58 AM
Sounds like the new encoding handling code is still buggy. I'll fix it when I return.

You got a ticket. No: 192. ;)

Have nice holidays. Or is it a work trip?

mcortez
11-24-2007, 03:18 PM
I have a few layout questions...

Is there a way, when using lists to get the wrapped lines to indent and line up with the text from the previous line, like this:

* Here is some text that happens to wrap to the
  second and third line and I would like it to
  indent and line up


Is there an equivalent to <DL> / <DT> for txt2lrf?

Is there a way to force a page break from the text file?

Are there any plans to support descendant selectors when using the --override-css option?


/* select paragraph tags inside list item tags */
li p
{
font-style: italic;
}


Thanks for all the hard work!!!

kovidgoyal
11-24-2007, 04:04 PM
You can just embed HTML for your more sophisticated markup needs. There are no plans to support descendant selectors. I've forgotten the markdown syntax for embedded HTML, but google should allow you to find it in a jiffy.

mcortez
11-24-2007, 04:16 PM
Alrighty, I'll take a closer look at html2lrf so I can see what html is supported.

kovidgoyal
11-25-2007, 12:01 PM
The demo file at the start of this thread may prove useful.

stilliremain
01-29-2009, 08:35 AM
Can I stop it from using Markdown at all? I basically have lots of text files which I want to read on the reader as they are in sans-serif font with no jiggling. So plain text, convert it to sans-serif, don't do anything clever with the text.

kovidgoyal
01-29-2009, 12:43 PM
It shouldn't do anything fancy with your txt if your txt doesn't have any Markdown markup

murraypaul
02-10-2009, 06:06 AM
It shouldn't do anything fancy with your txt if your txt doesn't have any Markdown markup

Unfortunately your txt may have something that looks like markdown markup but isn't, and the tool will make a real mess of it.

An option to disable markup processing would be very useful.

llasram
02-10-2009, 07:41 AM
An option to disable markup processing would be very useful.

Unfortunately, that isn't really possible without a re-write. The way txt2lrf works is that it uses Markdown to convert plain text into HTML, then converts the HTML into an LRF. Not processing Markdown markup means no HTML, which means nothing to convert into an LRF.
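Given that pipeline, one possible workaround is to skip the Markdown step yourself: turn the plain text into very simple HTML, escaping everything, and hand the result to html2lrf instead. The sketch below (Python 3) is not how txt2lrf works internally; the function name and the paragraph rule (blank lines separate paragraphs, single line breaks are kept) are assumptions made for illustration.

```python
import html
import re


def text_to_html(text, title="Untitled"):
    """Wrap blank-line-separated paragraphs in <p> tags, escaping everything."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    body = "\n".join(
        # html.escape neutralises <, > and &; line breaks inside a paragraph are kept.
        "<p>" + html.escape(p).replace("\n", "<br />\n") + "</p>"
        for p in paragraphs
    )
    return (
        "<html><head><title>" + html.escape(title) + "</title></head>\n"
        "<body>\n" + body + "\n</body></html>\n"
    )
```

Writing the returned string to a .html file and running html2lrf on it means characters such as `*`, `#` or leading numbers are never interpreted as markup, because no Markdown processing takes place.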

You may want to check out pielrf (http://www.mobileread.com/forums/showthread.php?t=10752). I'm not sure that it's still maintained, but the last time I looked at it it worked, and seems to do what you want -- directly convert plain text to LRF with minimal special markup interpretation.