Old 02-12-2009, 04:33 AM   #1
zupapa
Junior Member
zupapa began at the beginning.
 
Posts: 5
Karma: 10
Join Date: Feb 2009
Device: eeePC 901
PDF formatting scripts for eee PC

Hi,

I'd like to contribute a couple of scripts that I use on my eee PC 901. They convert text files (plain text, compressed text, RTF or HTML) into pleasant-to-read PDF files.

You can see the resulting PDF in the attached image. The scripts (written in bash and Python) are in the attached archive text_formatter.tar.gz.

The scripts have been tested only under Ubuntu 8.04. You must have latex, dvipdfm, python, evince, bunzip2, unzip, unrtf, html2text and the standard GNU command line tools (gawk, sed, etc.) installed; they are all packaged in the Ubuntu repository. You must copy the attached scripts somewhere on your PATH, for example /usr/local/bin. The scripts currently use the global /tmp directory, which is a poor choice, and are basically just a quick hack to get the job done.
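
A quick way to see which of the tools listed above are missing is sketched below. This is not part of the attached archive: it assumes Python 3 (for shutil.which), and the executable names are assumptions based on the names in the post; missing_tools is a hypothetical helper.

```python
import shutil

# Tool names are taken from the post above. That each one is also the
# name of an executable on PATH is an assumption.
REQUIRED_TOOLS = ["latex", "dvipdfm", "evince", "bunzip2", "unzip",
                  "unrtf", "html2text", "gawk", "sed"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required tools were found.")
```

The which parameter exists only so the lookup can be replaced in a test; in normal use the default is enough.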

I'm hoping that someone finds these scripts useful, and can modify them for their own use. I don't have the time to give any kind of support for these scripts. However, if someone has a cool improvement or important bug fix, I'd like to hear about it.

Best regards
Attached image: DSCN8196.jpg (396.9 KB)
Attached Files
File Type: gz text_formatter.tar.gz (2.3 KB, 158 views)
Old 02-12-2009, 06:04 AM   #2
Randy11
Fanatic
Randy11 is out to avenge the death of his or her father, Domingo Montoya.
 
Posts: 551
Karma: 34368
Join Date: Oct 2008
Device: Samsung EB60, Onyx M92
Thanks a lot for the script, good idea!

I've tried it and I get somewhat different results. The first line begins with "begindocument", the text on the right seems to be cut off, and the last line at the bottom seems to be cut off too.

I tried it with 'xterm_fax.txt'.

I'm working on Debian Etch 4r. and the Python version is 2.4.4.
Old 02-12-2009, 06:43 AM   #3
zupapa
Hi,

I'm not sure I understood the problem. Do you have texlive-latex-extra installed? The scripts are meant for ebooks. tidy_text_mode.py tries (heuristically) to merge paragraph text into long lines, remove any hyphens from the ends of lines and keep dialog text on separate lines.
format_book.sh then inserts double line breaks so that latex will not merge everything into one giant paragraph. In all other respects, format_book.sh lets latex worry about the text formatting (which is something latex does very well).
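
To make the description above concrete, here is a minimal sketch of that kind of heuristic. It is not the code from tidy_text_mode.py: the function name tidy_text, the dialog markers and the exact rules are assumptions chosen for illustration.

```python
# Characters assumed to mark the start of a dialog line (hypothetical choice).
DIALOG_STARTS = ('"', "'", "-")


def tidy_text(text):
    """Merge hard-wrapped lines into paragraphs separated by blank lines."""
    paragraphs = []
    current = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line closes the current paragraph.
            if current:
                paragraphs.append(current)
                current = ""
            continue
        if line.startswith(DIALOG_STARTS) and current:
            # Start dialog on a line of its own instead of merging it.
            paragraphs.append(current)
            current = ""
        if current.endswith("-"):
            # Undo end-of-line hyphenation: "exam-" + "ple" gives "example".
            current = current[:-1] + line
        elif current:
            current += " " + line
        else:
            current = line
    if current:
        paragraphs.append(current)
    # Blank lines between paragraphs are what LaTeX treats as paragraph breaks.
    return "\n\n".join(paragraphs)
```

A heuristic this simple will also join genuinely hyphenated compounds that happen to break at a line end, so the real script may well be more careful.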
Old 02-12-2009, 07:28 AM   #4
Randy11
I have the "texlive-latex-extra" package, and I tested a simpler file, the A2PS FAQ.

I found the following.

1 - At the beginning of the tex file:
Quote:
\documentclass[landscape,twocolumn,12pt]{article}
\usepackage[left=1mm,top=1mm,right=1mm,bottom=3mm,nohead,nofoot]{geometry}
\usepackage[dvips]{color}
\usepackage[latin1]{inputenc}
\usepackage{newcent}
\geometry{papersize={205mm,120mm}}
\definecolor{darkgray}{rgb}{0.25,0.25,0.25}
\definecolor{whitish}{rgb}{0.86,0.86,0.8}
\definecolor{bluish}{rgb}{0.54,0.83,0.55}
\special{pdf: pagesize width 205mm height 120mm}
\\begin{document}
+======================================================================+ | | | The following information is part of the Texinfo documentation. | | It is provided only to help you solve problems you might have | | while installing a2ps. You need not keep this file. | | | +======================================================================+

Frequently asked questions
**************************
=> The file contains "\\begin{document}" and not "\begin{document}".

2 - The parser can't deal with some classic characters like CR and TAB when they are written as "\n" or "\t". In the tex file they are interpreted as "n" and "t".

I removed some control sequences used in the A2PS configuration to get a GOOD display. With a text file (a story), I think there is no problem.

Happy to have your script
Old 02-12-2009, 08:05 AM   #5
zupapa
Hi,

thanks for your report. It seems that the echo command behaves a bit differently on our systems. On my system, "\b" was interpreted as a backspace, and I had to use a double backslash, "\\b", to prevent this.

The following is probably a more portable way to express the same thing. The -E option suppresses the parsing of escape sequences.

/bin/echo -E "\begin{document}" >> "/tmp/$name.tex"
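
The same trap exists on the Python side of the scripts: in an ordinary string literal "\b" is also a backspace. A minimal sketch, assuming Python 3; append_line and the file name book.tex are hypothetical and not taken from the archive:

```python
import os
import tempfile

# In a normal literal "\b" is the backspace character (0x08); a raw
# literal keeps the backslash and the letter b as two characters.
escaped = "\begin{document}"
literal = r"\begin{document}"


def append_line(path, line):
    """Append one line to `path`; nothing in `line` is reinterpreted."""
    with open(path, "a") as handle:
        handle.write(line + "\n")


if __name__ == "__main__":
    # A private temporary directory avoids clashes in the shared /tmp.
    with tempfile.TemporaryDirectory() as workdir:
        tex_path = os.path.join(workdir, "book.tex")  # hypothetical name
        append_line(tex_path, literal)
        with open(tex_path) as handle:
            print(repr(handle.read()))
```

Using a raw string plays the role that -E plays for echo: the text is written exactly as typed.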

best regards
Old 02-12-2009, 08:42 AM   #6
Randy11
Hi,

That solves the problem with \begin{document}.

The remaining problem is the backslashes in the file. In the A2PS FAQ:
Quote:
DefaultPrinter: | //c/gstools/gs5.10/Gswin32c.exe \ -Ic:\gstools\gs5.10;c:\gstools\gs5.10\fonts \ -sDEVICE=ljet4 -sPAPERSIZE=a4 -dNOPAUSE -r300 -dSAFER \ -sOutputFile="\\spool\HP LaserJet 5L (PCL)" \ -q - -c quit
This produces the following with LaTeX, when I remove "-interaction batchmode":
Quote:
! Undefined control sequence.
l.664 ...win32c.exe \ -Ic:\gstools
\gs5.10;c:\gstools\gs5.10\...
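
The error above arises because LaTeX reads each backslash in the input as the start of a control sequence. One common remedy is to escape the special characters before the text reaches LaTeX. The following is only a sketch of that idea, not the fix used in any version of the scripts; escape_latex and the replacement table are assumptions made for illustration.

```python
# Replacements for characters that are special in LaTeX text mode.
LATEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "{": r"\{",
    "}": r"\}",
    "$": r"\$",
    "&": r"\&",
    "#": r"\#",
    "%": r"\%",
    "_": r"\_",
    "^": r"\textasciicircum{}",
    "~": r"\textasciitilde{}",
}


def escape_latex(text):
    """Escape every special character in a single pass over the input."""
    return "".join(LATEX_ESCAPES.get(char, char) for char in text)
```

Working character by character in one pass matters: replacing backslashes first and braces afterwards would escape the braces that the backslash replacement itself introduces.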
Old 02-12-2009, 11:03 AM   #7
zupapa
OK, the attached version should handle the special characters better. I forgot to mention that by editing the format_book.sh script, you can get portrait mode or change the colors to white text on a dark background. Unfortunately, the eeePC display is a bit harder to read in sideways orientation. It probably has a TN panel.

If anybody else wants to try these scripts out, please use the latest version (currently 0.01).
Attached Files
File Type: gz text_formatter_0.01.tar.gz (2.4 KB, 152 views)
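
For readers who want to experiment with the layout, here is a sketch that assembles a header like the one quoted earlier in this thread, with switches for single-column output and for light text on a dark page. It is not taken from format_book.sh: build_header and its parameters are hypothetical, portrait mode is not covered, and the dark variant has not been tested with dvipdfm.

```python
def build_header(two_columns=True, dark=False):
    """Return LaTeX header lines modelled on the preamble quoted in this thread."""
    options = ["landscape"]
    if two_columns:
        options.append("twocolumn")
    options.append("12pt")
    lines = [
        r"\documentclass[%s]{article}" % ",".join(options),
        r"\usepackage[left=1mm,top=1mm,right=1mm,bottom=3mm,nohead,nofoot]{geometry}",
        r"\usepackage[dvips]{color}",
        r"\usepackage[latin1]{inputenc}",
        r"\usepackage{newcent}",
        r"\geometry{papersize={205mm,120mm}}",
        r"\definecolor{darkgray}{rgb}{0.25,0.25,0.25}",
        r"\definecolor{whitish}{rgb}{0.86,0.86,0.8}",
        r"\definecolor{bluish}{rgb}{0.54,0.83,0.55}",
        r"\special{pdf: pagesize width 205mm height 120mm}",
        r"\begin{document}",
    ]
    if dark:
        # Standard commands from the color package, using the colors
        # defined above: dark page background, light text.
        lines.append(r"\pagecolor{darkgray}")
        lines.append(r"\color{whitish}")
    return "\n".join(lines)
```

Calling build_header(two_columns=False, dark=True) gives a single-column header with the dark color scheme.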
Old 02-12-2009, 11:50 AM   #8
Randy11
Sorry, it's me again. It works perfectly for standard text.

I tried it with an HTML file from /usr/share/doc, the BeanShell documentation. There is no output.

Maybe I picked a bad kind of HTML file...?

The characters are interpreted by MobileRead, so they do not look the same here as in my local view.

The beginning of the file:
Quote:
<?xml version="1.0" encoding="UTF-8"?>
<html><head><title>Basic Syntax</title></head><body bgcolor="ffffff"><table cellspacing="10"><tr><td align="center"><a href="http://www.beanshell.org/"><img src="../images/homebutton.gif"/><br/>Home</a></td><td><a href="quickstart.html#Quick_Start"><img src="../images/backbutton.gif"/><br/>Back
</a></td><td align="center"><a href="contents.html"><img src="../images/upbutton.gif"/><br/>Contents</a></td><td align="center"><a href="methods.html#Scripted_Methods"><img src="../images/forwardbutton.gif"/><br/>Next
</a></td></tr></table><h1>Basic Syntax</h1>


BeanShell is, foremost, a Java interpreter. So you probably already know
most of what you need to start scripting with BeanShell. This
section describes specifically what portion of the Java language BeanShell
interprets and how BeanShell extends it or "loosens" it to be more
scripting-language-like.
<p CLEAR="ALL"/>
Conversion with "html2text":
Quote:
<?xml version="1.0" encoding="UTF-8"?>
_[_._._/_i_m_a_g_e_s_/ _[_._._/_i_m_a_g_e_s_/ _[_._._/_i_m_a_g_e_s_/_u_p_b_u_t_t_o_n_._g_i_f_] _[_._._/_i_m_a_g_e_s_/
_h_o_m_e_b_u_t_t_o_n_._g_i_f_] _b_a_c_k_b_u_t_t_o_n_._g_i_f_] _C_o_n_t_e_n_t_s _f_o_r_w_a_r_d_b_u_t_t_o_n_._g_i_f_]
_H_o_m_e _B_a_c_k _N_e_x_t
************ BBaassiicc SSyynnttaaxx ************
BeanShell is, foremost, a Java interpreter. So you probably already know most
of what you need to start scripting with BeanShell. This section describes
specifically what portion of the Java language BeanShell interprets and how
BeanShell extends it or "loosens" it to be more scripting-language-like.
The strange character is (in Vim): "^H".

And the file /tmp/syntax.tex :
Quote:
\begin{document}
<?xml version="1.0" encoding="UTF-8"?>



\_[\_.\_.\_/\_i\_m\_a\_g\_e\_s\_/ \_[\_.\_.\_/\_i\_m\_a\_g\_e\_s\_/ \_[\_.\_.\_/\_i\_m\_a\_g\_e\_s\_/\_u\_p\_b\_u\_t\_t\_o\_n\_.\_g\_i\_f\_] \_[\_.\_.\_/\_i\_m\_a\_g\_e\_s\_/ \_h\_o\_m\_e\_b\_u\_t\_t\_o\_n\_.\_g\_i\_f\_] \_b\_a\_c\_k\_b\_u\_t\_t\_o\_n\_.\_g\_i\_f\_] \_C\_o\_n\_t\_e\_n\_t\_s \_f\_o\_r\_w\_a\_r\_d\_b\_u\_t\_t\_o\_n\_.\_g\_i\_f\_] \_H\_o\_m\_e \_B\_a\_c\_k \_N\_e\_x\_t ************ BBaassiicc SSyynnttaaxx ************ BeanShell is, foremost, a Java interpreter. So you probably already know most of what you need to start scripting with BeanShell. This section describes specifically what portion of the Java language BeanShell interprets and how BeanShell extends it or "loosens" it to be more scripting-language-like.
With LaTeX :
Quote:
! Package inputenc Error: Keyboard character used is undefined
(inputenc) in inputencoding `latin1'.

See the inputenc package documentation for explanation.
Type H <return> for immediate help.
...

l.16 \_^^H
[\_^^H.\_^^H.\_^^H/\_^^Hi\_^^Hm\_^^Ha\_^^Hg\_^^He\_^^Hs\_^^H/ ...
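
The ^H characters in the output above are backspaces: html2text appears to mark underlined and bold text with typewriter-style overstrikes ("_", backspace, letter, or a letter struck twice). A minimal sketch of removing them before the text goes to LaTeX follows; strip_overstrike is a hypothetical helper and not part of the scripts.

```python
import re

# One character followed by a backspace: the first half of an overstrike pair.
OVERSTRIKE = re.compile(".\x08")


def strip_overstrike(text):
    """Remove overstrike pairs such as "_", backspace, "H" and keep the letter."""
    text = OVERSTRIKE.sub("", text)
    # Drop any stray backspace that had no character in front of it.
    return text.replace("\x08", "")
```

The effect is similar to piping the html2text output through `col -b`.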

Last edited by Randy11; 02-12-2009 at 11:53 AM.
Old 02-12-2009, 12:43 PM   #9
zupapa
OK, the fastest way forward would be for you to send me the offending HTML file at perttu ddoott haimi aatt gmail ddoott com.
Old 02-12-2009, 12:53 PM   #10
Randy11
Thanks. It's done.
Old 02-19-2009, 07:19 PM   #11
tali3sin
Connoisseur
tali3sin began at the beginning.
 
Posts: 54
Karma: 38
Join Date: Feb 2009
Device: HTC Magic
This is fantastic news. I run Ubuntu on a daily basis and actually have an Eee 901, plenty of txt books, and no eReader.

As long as I can figure out how to tweak the colours, and possibly format the text to be single column (not for me, for my girlfriend who seems to prefer it that way), you may have just saved me a whole lot of trouble.

I'll report in when I've given it a go, and if I have any thoughts or troubles.
Thanks!
Old 02-19-2009, 08:09 PM   #12
tali3sin
SOLVED!

Alright, so I was missing a font: pncr7t.tfm.
For future reference, for anyone who needs to know: it's in the texlive-fonts-recommended package in the Ubuntu repositories.
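
A quick way to check for this before running the scripts is to ask TeX's own file lookup tool, kpsewhich, whether it can locate the font metric file. This is a sketch assuming Python 3; find_tex_file is a hypothetical helper and not part of the scripts.

```python
import subprocess


def find_tex_file(name, kpsewhich="kpsewhich"):
    """Return the path TeX would use for `name`, or None if it is not found."""
    try:
        result = subprocess.run([kpsewhich, name],
                                stdout=subprocess.PIPE,
                                stderr=subprocess.DEVNULL,
                                universal_newlines=True)
    except OSError:
        # The kpsewhich program itself is missing or cannot be run.
        return None
    path = result.stdout.strip()
    return path if result.returncode == 0 and path else None


if __name__ == "__main__":
    if find_tex_file("pncr7t.tfm") is None:
        print("pncr7t.tfm not found; try installing texlive-fonts-recommended")
    else:
        print("pncr7t.tfm is available")
```

The kpsewhich parameter only exists so that a different executable name can be substituted, for example in a test.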

Script now appears to work perfectly, thank you again


The now solved problem, below:

Hurrum, so, I'm still a bit of a linux newbie when it comes to the inner workings.

But, I managed to get the script from "Not working at all." to "Generating a completely blank pdf"

Have installed latex, gawk, and the other things the script said were missing. And, I assume it's all there.

For simplicity's sake, I have my desired txt file in the same directory as the script.

However... am now running into this:

Quote:
blah@blah-laptop:/usr/local/bin$ sh format_book.sh book.txt
book.txt
"book"
This is pdfTeXk, Version 3.141592-1.40.3 (Web2C 7.5.6)
%&-line parsing enabled.
entering extended mode
kpathsea: Running mktextfm pncr7t
mktextfm: Running mf-nowin -progname=mf \mode:=ljfour; mag:=1; nonstopmode; input pncr7t
This is METAFONT, Version 2.71828 (Web2C 7.5.6)

kpathsea: Running mktexmf pncr7t
! I can't find file `pncr7t'.
<*> ...:=ljfour; mag:=1; nonstopmode; input pncr7t

Please type another input file name
! Emergency stop.
<*> ...:=ljfour; mag:=1; nonstopmode; input pncr7t

Transcript written on mfput.log.
grep: pncr7t.log: No such file or directory
mktextfm: `mf-nowin -progname=mf \mode:=ljfour; mag:=1; nonstopmode; input pncr7t' failed to make pncr7t.tfm.

/tmp/book.dvi -> /tmp/book.pdf
[1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26][27][28][29][30][31][32][33][34][35][36][37][38][39][40][41][42][43][44][45][46][47][48][49][50][51][52][53][54][55][56][57][58][59][60][61][62][63][64][65][66][67][68][69][70][71][72][73][74][75][76][77][78][79][80][81][82][83][84][85][86][87][88][89][90][91][92][93][94][95][96][97][98][99][100][101][102][103][104][105][106][107][108][109][110][111][112][113][114][115][116][117][118][119][120][121][122][123][124][125][126][127][128][129][130][131][132][133][134][135][136][137][138][139][140][141][142][143][144][145][146][147][148][149][150][151][152][153][154][155][156][157][158][159][160][161][162][163][164][165][166][167][168][169][170][171][172][173][174][175][176][177][178][179][180][181][182][183][184][185][186][187][188][189][190][191][192][193][194][195][196][197][198][199][200][201][202][203][204][205][206][207][208][209][210][211][212][213][214][215][216]
80387 bytes written
My guess is that this line is the problem, but I'm not sure what it means: ! I can't find file `pncr7t'.
pncr7t appears to be a font. I installed a TeX package that contains a font of the same name, but to no avail, so maybe I'm wrong.

Tips? ;-)

Last edited by tali3sin; 02-19-2009 at 08:19 PM. Reason: Solved the problem
Tags
eee pc, pdf conversion
