Old 04-11-2009, 03:29 PM   #1
Nate the great
Sir Penguin of Edinburgh
Posts: 10,604
Karma: 3586209
Join Date: Apr 2007
Location: DC Metro area
Device: Shake a stick plus 1
Free Ebook: World Fact eBook

The CIA maintains a reference manual called the World Factbook. They used to release a new edition each year; recently they decided to maintain only the online edition. I found it to be an excellent source of information and wanted to make an offline copy.

The result is what I call the World Fact eBook. It is currently available only in Mobipocket format. I decided to focus on Mobipocket because the format has certain specialized HTML tags. This ebook has a search index for article title, keyword, country name, and flag. It can also be used as a dictionary by most versions of Mobipocket Reader. This means that if you are reading news on, for example, the Kindle, you can look up a country name to learn more about it.

The current version, 0.7, can be downloaded here. Epub, IMP, and Sony LRF will be available soon.

P.S. This was my first large project. The source material consisted of over 500 HTML files and close to 800 pictures. Most of the content on the web pages had to be removed. I wrote a fair amount of code to automate the cleanup. I am looking for a new project where I can repeat the process. If you would like some other website converted into an ebook, please let me know. (Please consider the copyright situation before you ask.)
Old 04-11-2009, 03:55 PM   #2
nrapallo
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530531
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
Great, this Nate the Great!

The conversions of your v0.6 .prc ebook are now available in the EPUB, IMP and LRF E-Book Uploads sections.

You do know that the 2009 CIA World Factbook is due out this spring...

Last edited by nrapallo; 04-11-2009 at 03:58 PM. Reason: added links
Old 04-11-2009, 04:11 PM   #3
Nate the great
Quote:
Originally Posted by nrapallo View Post
Great, this Nate the Great!

The conversions of your v0.6 .prc ebook are now available in the EPUB, IMP and LRF E-Book Uploads sections.

You do know that the 2009 CIA World Factbook is due out this spring...
Yes and no. The Factbook is updated every 2 weeks. The source material is current as of 24 February 2009. But they did say they will release a major update soon.
Old 04-11-2009, 04:24 PM   #4
nrapallo
I know what you mean about updates.

I just converted an ereader .pdb of the 2008 CIA World Factbook (updated as of March 19/09) that is available here.

This CIA World Factbook 2008 includes Rank Order Pages and uses smaller images (640x400 max.).

There's a point where an update is no longer an update...
Old 04-12-2009, 09:22 AM   #5
Manichean
Wizard
Posts: 3,130
Karma: 80520
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
First of all, thanks for the conversion. I had a (relatively quick) look at it, and this is really a nicely done book.

I do have a few technical questions though. I have two books with (I assume) similar source material; they are reference books consisting of many HTML pages with images and some active content (lookup etc.). I know that I'll have to remove the active content, but beyond that I'm really pretty clueless as to what tools I should use for the conversion. I'd like to get a TOC like yours, where you first select the character and then get a list of topics starting with that character, but I don't know how to do that (apart from manually writing the HTML page, but that would be a major pain in the ass).

So, my questions are:
- What did you use to parse the HTML files? I'm assuming some scripting language?
- What program did you use to build the Mobi file from the multiple HTML files?

Thanks in advance for your answers.
Old 04-12-2009, 10:16 AM   #6
Nate the great
Quote:
Originally Posted by Manichean View Post
First of all, thanks for the conversion. I had a (relatively quick) look at it, and this is really a nicely done book.

I do have a few technical questions though. I have two books with (I assume) similar source material; they are reference books consisting of many HTML pages with images and some active content (lookup etc.). I know that I'll have to remove the active content, but beyond that I'm really pretty clueless as to what tools I should use for the conversion. I'd like to get a TOC like yours, where you first select the character and then get a list of topics starting with that character, but I don't know how to do that (apart from manually writing the HTML page, but that would be a major pain in the ass).

So, my questions are:
- What did you use to parse the HTML files? I'm assuming some scripting language?
- What program did you use to build the Mobi file from the multiple HTML files?

Thanks in advance for your answers.
For the base conversion I wrote scripts for JFlex, which then generated Java code. The scripts were basically a list of regular expressions, each with some Java code to execute when the regular expression matched (in the source HTML file). If you know Java and regular expressions, you can use JFlex (or C and flex, for that matter).
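
To make the idea concrete, here is a minimal Python sketch of the same rule-list approach. It is not the actual JFlex scripts, which were not posted in this thread. Every pattern, the RULES name, and the clean_html function are made up for illustration. One difference to keep in mind: a JFlex scanner makes a single pass over the input, while this sketch simply applies each rule in turn.

```python
import re

# Hypothetical cleanup rules: (compiled pattern, replacement string or function).
# The patterns are illustrative only; real rules depend on the source pages.
RULES = [
    # Drop script blocks entirely (the "active content").
    (re.compile(r"<script\b.*?</script>", re.IGNORECASE | re.DOTALL), ""),
    # Drop HTML comments.
    (re.compile(r"<!--.*?-->", re.DOTALL), ""),
    # Strip all attributes from paragraph tags.
    (re.compile(r"<p\b[^>]*>", re.IGNORECASE), "<p>"),
    # Turn one kind of styled span into bold text via an action function.
    (re.compile(r'<span class="category">(.*?)</span>', re.DOTALL),
     lambda m: "<b>" + m.group(1).strip() + "</b>"),
]


def clean_html(text):
    """Apply every rule in order and return the cleaned text."""
    for pattern, action in RULES:
        text = pattern.sub(action, text)
    return text


if __name__ == "__main__":
    sample = ('<p class="x"><span class="category"> Capital </span>'
              '<script>lookup()</script><!-- nav -->Paris</p>')
    print(clean_html(sample))
```

On the sample string above, this prints `<p><b>Capital</b>Paris</p>`.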

For the finishing touches I used TextPad. Its search functions support regular expressions, and it can work on several hundred open files at once.
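
If you don't have TextPad, the same kind of bulk search-and-replace can be sketched in a few lines of Python. This is only an approximation of that workflow: the replace_in_files name, the "*.html" glob, and the UTF-8 encoding are all assumptions, and the source pages may well use a different encoding.

```python
import re
from pathlib import Path


def replace_in_files(folder, pattern, replacement, glob="*.html"):
    """Run one regex substitution over every matching file in folder.

    Files are rewritten in place; returns the number of files changed.
    UTF-8 is assumed here and may need adjusting for the real pages.
    """
    regex = re.compile(pattern)
    changed = 0
    for path in sorted(Path(folder).glob(glob)):
        original = path.read_text(encoding="utf-8")
        updated = regex.sub(replacement, original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed += 1
    return changed
```

For example, `replace_in_files("countries", r"</?font[^>]*>", "")` would strip font tags from every HTML file in a (hypothetical) countries folder. Work on a copy of the files, since the edit is made in place.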

The TOCs didn't quite have to be done by hand. One of the appendices already had one. After changing it to a form I prefer, I copied it to the other files. The anchor tags did have to be put in by hand, though.
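
For a source that has no ready-made TOC, a letter-first TOC of the kind asked about can also be generated. The following Python sketch is one possible way, not how the World Fact eBook was built; build_alpha_toc, the toc_ anchor prefix, and the (title, href) input format are assumptions, and titles are assumed to be non-empty.

```python
from collections import defaultdict
from html import escape


def build_alpha_toc(entries):
    """Build a letter-first TOC from a list of (title, href) pairs.

    Returns an HTML fragment: a row of letter links, then one section per
    letter listing the titles that start with it.
    """
    groups = defaultdict(list)
    for title, href in entries:
        groups[title[0].upper()].append((title, href))

    letters = sorted(groups)
    # Row of letter links at the top.
    lines = ["<p>" + " ".join(
        '<a href="#toc_{0}">{0}</a>'.format(letter) for letter in letters
    ) + "</p>"]
    # One section per letter, titles in alphabetical order.
    for letter in letters:
        lines.append('<h2><a name="toc_{0}"></a>{0}</h2>'.format(letter))
        for title, href in sorted(groups[letter]):
            lines.append('<p><a href="{}">{}</a></p>'.format(
                escape(href, quote=True), escape(title)))
    return "\n".join(lines)
```

Feeding it the page titles and file names collected from the source files gives a starting point that can then be restyled by hand.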

I then used Mobipocket Creator to make the ebook. The user interface leaves something to be desired, but given that it saves you the effort of manually creating the OPF file, it's not bad.
Old 04-12-2009, 12:03 PM   #7
nrapallo
Quote:
Originally Posted by Nate the great View Post
The TOCs didn't quite have to be done by hand. One of the appendices already had one. After changing it to a form I prefer, I copied it to the other files. The anchor tags did have to be put in by hand, though.

I then used Mobipocket Creator to make the ebook. The user interface leaves something to be desired, but given that it saves you the effort of manually creating the OPF file, it's not bad.
I have used Mobipocket Creator to add hyperlinks using its Table of Contents section. The external TOC file it creates can then be merged in and edited so that only the headings remain in the TOC. It's a very powerful resource to add to one's toolset. The exact syntax I used is in the screenshot attached below.

The side effect is that all the <a name>'s inserted can then be referenced from within the ebook. That may have to be done by hand (I semi-automated this), but half the task is already done: the insertion of the <a name> (or <a id>) tags and the assignment of unique id labels.
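
Outside Mobipocket Creator, the anchor-insertion half of that job can be approximated with a short script. The Python sketch below is not what Mobipocket Creator does internally; add_anchors, the "sec" prefix, and the choice to anchor every h1 to h6 heading are assumptions for illustration.

```python
import re

# Matches the opening tag of any heading from <h1> to <h6>.
HEADING = re.compile(r"<h([1-6])\b[^>]*>", re.IGNORECASE)


def add_anchors(html_text, prefix="sec"):
    """Insert <a name="..."></a> before each heading.

    Returns (new_html, ids), where ids lists the generated labels in
    document order so they can be linked to later.
    """
    ids = []

    def insert(match):
        # Number the anchors sequentially so every label is unique.
        anchor_id = "{}{:04d}".format(prefix, len(ids) + 1)
        ids.append(anchor_id)
        return '<a name="{}"></a>{}'.format(anchor_id, match.group(0))

    return HEADING.sub(insert, html_text), ids
```

The returned list of ids is what a second pass would use to write the links that point at the new anchors.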

This was the technique I used to add all those new hyperlinks to the Webster's Dictionary 1913 v2.0. (A version 2.1 with minor improvements will be uploaded soon.)
Attached image: MPC-TOC creation tags.jpg

Last edited by nrapallo; 04-12-2009 at 07:09 PM. Reason: Typo
Old 04-12-2009, 01:46 PM   #8
Manichean
Thanks for the help. I'll give it a go and see how it turns out.
Old 04-14-2009, 03:58 PM   #9
ProDigit
Karmaniac
Posts: 2,157
Karma: 9023682
Join Date: Oct 2008
Location: Miami FL
Device: PRS-505, Jetbook, Jetbook Mini, Jetbook Color, Astak Ez Reader Pro
First off, that's a great idea!
Perhaps I'll work on an LRF version of this book myself, for fun, and share it.

I just wanted to remind you of the line in the copyright which states:
Code:
"...The official seal of the CIA, however, 
may NOT be copied without permission as required by the CIA Act of 1949 (50 U.S.C. section 403m).
Misuse of the official seal of the CIA could result in civil and criminal penalties...."
The rest of the book is in the public domain.

So be sure you don't put the seal in your book!
Otherwise thanks for the effort! Looks like an interesting book to assemble!
Old 04-14-2009, 04:02 PM   #10
ProDigit
PS: Is the 2008 book edition an update to the 2007, or do they keep the originals of 2007, 2006, 2005, etc. somewhere?

It would be interesting to have one of their first books (e.g. '92) and compare it to a current release!
Old 04-14-2009, 04:12 PM   #11
Nate the great
Quote:
Originally Posted by ProDigit View Post
First off, that's a great idea!
Perhaps I'll work on an LRF version of this book myself, for fun, and share it.
If you promise to put real effort into making the ebook look nice, I'll give you my working files.
Quote:
Originally Posted by ProDigit View Post
PS: Is the 2008 book edition an update to the 2007, or do they keep the originals of 2007, 2006, 2005, etc. somewhere?

It would be interesting to have one of their first books (e.g. '92) and compare it to a current release!
They just have the one current copy.
Old 04-14-2009, 08:39 PM   #12
igorsk
Wizard
Posts: 3,443
Karma: 52235
Join Date: Sep 2006
Location: Belgium
Device: PRS-500/505/700, Kindle, Cybook Gen3, Words Gear
You can probably pull some older copies from archive.org.
Old 04-15-2009, 12:03 PM   #13
ProDigit
Quote:
Originally Posted by Nate the great View Post
If you promise to put real effort into making the ebook look nice, I'll give you my working files.
No, really, I just do this for the fun of HTML encoding. I'm learning it.
But tell me: did you take the printed version, just create one big HTML file, and add the flag + flag info to every country?

Unfortunately in LRF I can't keep the original formatting. I was thinking along the lines of creating one chapter per country, with subchapters for the flag, flag info, and map, followed by all the other information.

What approach did you use? (I'd be interested to see how you did it).
I was basically merging all the info into one big file, cleaning it up a bit with Notepad++'s advanced search and replace, and then inserting all the flag files manually (basically some copy-and-paste work).

I used the print version, because it's cleaner to work with than the web version.
(Oh, I also need to remove the tables; that's a bit of a pain, and I'm still looking into it. It's easy to remove them with search and replace, but I don't want to delete any valuable info or end up with broken HTML code.)
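
One way around the broken-HTML risk is to use a real parser instead of search and replace. Here is a rough Python sketch using the standard library's html.parser; it drops the table markup but keeps the cell contents, turning each row into a paragraph. TableUnwrapper and unwrap_tables are made-up names, and the sketch is naive: comments and the doctype are dropped, and nested tables or rows inside an existing paragraph are not handled specially.

```python
from html.parser import HTMLParser

# Tags that are removed; everything else is passed through unchanged.
TABLE_TAGS = {"table", "thead", "tbody", "tfoot", "tr", "td", "th"}


class TableUnwrapper(HTMLParser):
    """Drop table markup but keep cell contents, one row per paragraph."""

    def __init__(self):
        # Keep entities such as &amp; exactly as written in the source.
        super().__init__(convert_charrefs=False)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.out.append("<p>")
        elif tag not in TABLE_TAGS:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "tr":
            self.out.append("</p>")
        elif tag in ("td", "th"):
            self.out.append(" ")  # separate neighbouring cells with a space
        elif tag not in TABLE_TAGS:
            self.out.append("</{}>".format(tag))

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied as they are.
        if tag not in TABLE_TAGS:
            self.out.append(self.get_starttag_text())

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&{};".format(name))

    def handle_charref(self, name):
        self.out.append("&#{};".format(name))


def unwrap_tables(html_text):
    """Return html_text with table tags removed and cell text preserved."""
    parser = TableUnwrapper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Because every non-table tag is copied through as it was parsed, the text inside the cells survives and the remaining tags stay paired the way they were in the source.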

Then at the end I still need to add the appendixes and the rankorder directory (2001rank.html to 2211rank.html)... I'm still figuring out how to do that; analyzing the content thereof...

Last edited by ProDigit; 04-15-2009 at 12:09 PM.
Old 04-15-2009, 12:36 PM   #14
Nate the great
Quote:
Originally Posted by ProDigit View Post
No, really, I just do this for the fun of HTML encoding. I'm learning it.
But tell me: did you take the printed version, just create one big HTML file, and add the flag + flag info to every country?

Unfortunately in LRF I can't keep the original formatting. I was thinking along the lines of creating one chapter per country, with subchapters for the flag, flag info, and map, followed by all the other information.

What approach did you use? (I'd be interested to see how you did it).
I was basically merging all the info into one big file, cleaning it up a bit, and then inserting all the flag files.
I used the print version, because it's cleaner to work with than the web version.
I found it better to keep the files separate. There are about 9 single files, and 2 groups of files (260 country pages and 250 flag pages). The single files have to be edited one at a time, but each group of files can be edited at once.

I started with the web version, but kept none of the original formatting. Instead, I replaced it with some very basic HTML tags.

The formatting of each group is internally consistent. When you figure out what looks best on the Sony Reader, you can change it all at once. If you instead decide to copy everything into one file, you will need to edit it in a linear fashion.

It's going to take you at least 20 hours of work to get the source material to where I have it. Editing it one line at a time is really boring.
Old 04-15-2009, 06:39 PM   #15
igorsk
Maybe you could upload your simplified HTML too?