04-22-2008, 07:52 AM   #1
caritas
Hi,

I have been interested in ebook readers for quite a while. But after trying a 6-inch e-ink reader (Hanlin V3), I found it almost useless for reading normal PDF files on these machines. The font is too small, while the page is too wide.

So I thought up a method to render PDF for these small devices and prototyped it. The details are as follows:

1. Convert the PDF to images. I use pdftoppm from xpdf, for example:
pdftoppm -r 180 -f 245 -l 245 -gray -aa yes a.pdf a

2. Analyse the generated images and break each page into lines.

3. Divide each line that is long enough into two segments.

4. Rearrange the segments into a new page with half the original width.
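
Steps 2 to 4 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the attached source: the page is a plain list of rows of grayscale values (255 is white), a text line is any run of rows that contains ink, and a long line is simply cut at its midpoint. The names find_lines, ink_extent and reflow_half_width, the threshold and the gap are all made up for this example.

```python
WHITE = 255


def find_lines(page, threshold=128):
    """Step 2: return (top, bottom) row ranges, bottom exclusive, that contain ink."""
    lines = []
    start = None
    for y, row in enumerate(page):
        has_ink = any(p < threshold for p in row)
        if has_ink and start is None:
            start = y                      # a text line begins
        elif not has_ink and start is not None:
            lines.append((start, y))       # a text line ends
            start = None
    if start is not None:
        lines.append((start, len(page)))
    return lines


def ink_extent(page, top, bottom, threshold=128):
    """Return (left, right) columns, right exclusive, covered by ink in the rows."""
    cols = [x for y in range(top, bottom)
            for x, p in enumerate(page[y]) if p < threshold]
    return min(cols), max(cols) + 1


def reflow_half_width(page, threshold=128, gap=1):
    """Steps 3 and 4: cut long lines in two and stack the pieces half as wide."""
    half = (len(page[0]) + 1) // 2
    out = []
    for top, bottom in find_lines(page, threshold):
        left, right = ink_extent(page, top, bottom, threshold)
        if right - left <= half:
            segments = [(left, right)]     # short line: keep it in one piece
        else:
            mid = (left + right) // 2      # naive cut, ignores word boundaries
            segments = [(left, mid), (mid, right)]
        for x0, x1 in segments:
            for y in range(top, bottom):
                row = page[y][x0:x1]
                # Pieces are left-aligned and padded with white to the new width.
                out.append(row + [WHITE] * (half - len(row)))
            # Blank rows between the stacked pieces.
            out.extend([WHITE] * half for _ in range(gap))
    return out
```

Cutting at the midpoint can slice through a character, and left-aligning every piece throws away indentation, so real pages need more care than this.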

Example images from before and after conversion are attached to this post. I think the result is acceptable.

The source code is attached to this post too. It is released under the GPL v2/v3.

Best Regards,
Huang Ying

Basic Usage for version 0.4:

tar -xjf pi_0.4.tar.bz2
cd pi
. env.sh
cd test
pi_format.py chap.conf
# output goes in the out directory
img_dir_to_pdf.sh out chap-rf.pdf
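
The pdftoppm call from step 1 can also be driven from a script when a range of pages has to be rendered. A minimal sketch, assuming pdftoppm is on the PATH and accepts the same flags as the command in step 1; the function names pdftoppm_command and render_pages are only illustrative and are not part of the attached package.

```python
import subprocess


def pdftoppm_command(pdf_path, out_prefix, first_page, last_page, dpi=180):
    """Build the argument list for the pdftoppm call shown in step 1."""
    return [
        "pdftoppm",
        "-r", str(dpi),          # resolution in DPI
        "-f", str(first_page),   # first page to render
        "-l", str(last_page),    # last page to render
        "-gray",                 # grayscale output
        "-aa", "yes",            # anti-aliasing on
        pdf_path,
        out_prefix,
    ]


def render_pages(pdf_path, out_prefix, first_page, last_page, dpi=180):
    """Run pdftoppm and raise CalledProcessError if it fails."""
    subprocess.run(
        pdftoppm_command(pdf_path, out_prefix, first_page, last_page, dpi),
        check=True,
    )
```

Calling render_pages("a.pdf", "a", 245, 245) runs the same command as the one in step 1.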


2008-09-20 Huang Ying <huang.ying.caritas@gmail.com>

* Version: 0.8

* overall: Reorganize program in a more modular way.

* pi.image: Add unpaper support for scanned books

* pi.image: Add column compress support for scanned books

* pi.divide: Add simple divider for divide = 1

2008-08-30 Huang Ying <huang.ying.caritas@gmail.com>

* Version: 0.7

* pi.py: Add LRF output support.

* pi.py: Add TOC support for LRF output format

* pi.py: Add output rotate support.

* pdfminfo: Add pdfminfo to extract PDF information such as TOC,
title, author, etc.

* overall: Add initial Windows support, thanks to ashkulz of the
MobileRead forum.

2008-08-11 Huang Ying <huang.ying.caritas@gmail.com>

* Version: 0.6

* pi.py: Initial implementation of embolden.

* pi.py: Use normalized coordinates in class Page and Line.

* pi.py: Add edge trimming support.

* pi.py: Add run pages mode.

* pi.py: Add page range support.

* pi.py: Re-work ImageOutput, split multi-page image.

* pi.py: Rotate during scale if appropriate.

* img_dir_to_pdf.sh: Add color reduction support.

2008-05-17 Huang Ying <huang.ying.caritas@gmail.com>

* Version: 0.5

* pi.py: Detect word, and break lines at word end when possible.

* pi.py: Re-align the 'split line segment' (second half of line)
to align with the next line's indenting when appropriate. This
will make the first line indent and bullet items line up better.

* img_dir_to_pdf.sh: Added to convert from images to pdf.

2008-05-10 Huang Ying <huang.ying.caritas@gmail.com>

* Version: 0.4

* Some algorithms are configurable

* For text that may cause problems, present both the merged and
the divided version.


2008-05-03 Huang Ying <huang.ying.caritas@gmail.com>

* Version: 0.3

* Rewrite most algorithms in Python except the image parsing (break
image into lines and characters). This will make it easier to
add new algorithms (hacks).

* pi.py: Add some hacks to deal with equations and figures.


2008-04-29 Huang Ying <huang.ying.caritas@gmail.com>

* Version: 0.2

* Split lines into two equal halves, or optionally into equal
thirds or equal quarters

* Separate output image into customizable page size

* Flex can be designated by user configuration

* Calculate DPI for each page

* Figure detection and special processing. Figures are scaled
to the page width and output twice: once scaled and once split.


2008-04-23 Huang Ying <huang.ying.caritas@gmail.com>

* Version: 0.1

Attached Thumbnails
chap6-04-0.png (112.2 KB)
chap6-04-1.png (16.8 KB)
chap6-04-2.png (112.1 KB)
chap6-04-3.png (147.2 KB)
chap6-04-4.png (88.9 KB)
pipeline.png (91.0 KB)

Attached Files
pi.tar.gz (23.1 KB)
pi_0.2.tar.bz2 (300.2 KB)
pi_0.3.tar.bz2 (283.0 KB)
pi_0.4.tar.bz2 (294.9 KB)
pi_0.5.tar.bz2 (296.9 KB)
pi_0.6.tar.bz2 (336.3 KB)
pi_0.7.tar.bz2 (527.6 KB)
pi_0.8.tar.bz2 (627.5 KB)

Last edited by caritas; 09-20-2008 at 08:14 AM. Reason: Version update