Old 07-06-2008, 10:02 AM   #1
erayd
Zealot
Posts: 134
Karma: 146
Join Date: Apr 2008
Device: Onyx Boox Poke 2
FLAG (Fanfiction.net Lightweight Automated Grabber)

2011-04-21: FLAG has been completely rewritten as a web service, and is available here. The CLI version is not currently supported, but I may add support for it again at some point in the future.

Currently supported formats:
  • EPUB
  • MobiPocket (Kindle)
  • PDF
  • HTML

Currently supported source websites:
-----= Original Post =-----
Quote:
For those of you who like reading fanfiction.net, I've just finished writing the first release (well, the first release I'm happy to show people anyway) of an automated grabber, with the output formatting targeted at ebook readers.

If you wish to try it out without needing to install it first, it's also available as a web service here (note that the web version doesn't make all the features available, but is more up-to-date than the version attached to this post).

Currently supported output formats:
  • HTML
  • BBeB/LRF (requires Calibre's html2lrf script)
  • ePub (requires Calibre's html2lrf script)
  • RTF (requires pandoc)
  • PDF (requires htmldoc)

FLAG is attached to this post as a tarball. As of this post, the latest version is r29; however, there have been various improvements to it from both myself and ilovejedd. The next release is currently waiting on ilovejedd to organise his changes and send me a diff.

Note that this was written for Linux systems, with a Sony PRS-505 in mind. It requires that calibre, php5-cli, php5-curl and php5-tidy be installed to work properly, plus additional dependencies if you wish to use output formats such as RTF or PDF.
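A quick way to see which of the external tools are present before running it is a small check like the one below. This is only a sketch and not part of FLAG itself: the executable names ("php" for php5-cli, "html2lrf" from calibre, "pandoc" and "htmldoc" for the optional outputs) are assumptions about how those packages install on a typical Linux box, and it does not check PHP modules such as curl or tidy.

```python
import shutil

# Executable names are assumptions: "php" for php5-cli, "html2lrf" from
# calibre, and "pandoc" / "htmldoc" for the optional RTF and PDF outputs.
REQUIRED = ["php", "html2lrf"]
OPTIONAL = ["pandoc", "htmldoc"]


def missing_tools(names, which=shutil.which):
    """Return the names that cannot be found on the PATH."""
    return [name for name in names if which(name) is None]


def report(which=shutil.which):
    """Build a human-readable summary of what is missing."""
    lines = []
    for name in missing_tools(REQUIRED, which):
        lines.append(f"missing (required): {name}")
    for name in missing_tools(OPTIONAL, which):
        lines.append(f"missing (optional): {name}")
    return lines or ["all tools found"]


if __name__ == "__main__":
    print("\n".join(report()))
```

The PHP modules can be checked separately by looking for curl and tidy in the output of "php -m".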

Instructions can be found by running "./fflag --help".

All bug reports / requests / suggestions etc are welcome, although I can't promise I'll implement them.

Last edited by erayd; 09-13-2011 at 03:05 AM. Reason: Updated list of source websites.
Old 07-06-2008, 10:15 AM   #2
HarryT
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
Do you plan to support MobiPocket? It's by far the most popular eBook format!

Easy to do automated conversion of HTML to Mobi using Tommy's excellent "MobiPerl" package.
Old 07-06-2008, 10:40 AM   #3
erayd
I am more than happy to add support for MobiPocket; however, I will need a conversion script (I'll take a look at MobiPerl and see if it'll do the trick) and a Linux viewer to test the resulting output - do you know of any?
Old 07-06-2008, 10:52 AM   #4
HarryT
FBReader can be used to view Mobi files.
Old 07-06-2008, 10:55 AM   #5
erayd
Lovely - assuming these utilities can do what I need, expect to see mobi support within the next few days.
Old 07-06-2008, 10:58 AM   #6
HarryT
Sounds good! I suppose the obvious question is "any plans for a Windows version"? FanFiction isn't my "thing", but I'm sure that many people would like to know.
Old 07-06-2008, 11:03 AM   #7
erayd
None whatsoever (I don't have a Windows box here), but I have no problem with somebody else porting it, and I would be happy to host the resulting files / accept patches etc. Porting shouldn't be hard - it's just PHP.
Old 07-06-2008, 11:06 AM   #8
erayd
Another thought though - what would be the legal ramifications of me hosting this with a web frontend to save people the hassle of installing / using CLI utilities? My main server is currently located in Vancouver, Canada.
Old 07-06-2008, 12:13 PM   #9
SeaWolf
Connoisseur
Posts: 63
Karma: 65091
Join Date: Jul 2008
Location: Sydney, Australia
Device: Kindle Paperwhite WiFi
I dunno, the legalities of it are quite a grey area. They may be reluctant about the whole thing as it may draw unwanted attention from copyright holders. Authors and archives of fanfiction and transformative works live in constant fear that they're going to be the target of a lawsuit from the owner of whatever fandom they're working in. While all your script is doing is presenting the same information in a different form, it's still the kind of stupid distinction that can lead to trouble. The argument the copyright holder might put is that this is more like a book, while the HTML on the website is more like a forum post, or some other such nonsense argument. It doesn't really have to make sense, it only has to result in a legal threat for it to be a major problem. Fanfic sites tend to be jumpy about this sort of thing because they're in no position to put up a fight, so they'd rather not tempt fate, no matter how entrenched the concept of fanfic may have become. Having said that, they may not be bothered about it at all, so I think you should ask them and see what they have to say.

The major technical issue I can see them having is the potential for thousands of users to bombard your site to get eBooks and your grabber then making thousands of requests to their server to retrieve the text for them. They'd be worried about getting flooded.

I'm pretty positive about this project in general though, I think this is a great idea. Even though I don't use Fanfiction.net much myself, I do love fanfic (no matter how much rough I have to wade through to get to the diamond). I actually had the thought the other day that it would be great if FeedBooks made their engine code available so that fanfic archive software like eFiction could incorporate it and provide users with a variety of eBook options to download their fic in. But if we can't have that, then grabbers are the next best thing.
Old 07-06-2008, 12:55 PM   #10
Hadrien
Feedbooks.com Co-Founder
Posts: 2,263
Karma: 145123
Join Date: Nov 2006
Location: Paris, France
Device: Sony PRS-t-1/350/300/500/505/600/700, Nexus S, iPad
Quote:
Originally Posted by SeaWolf View Post
I'm pretty positive about this project in general though, I think this is a great idea. Even though I don't use Fanfiction.net much myself, I do love fanfic (no matter how much rough I have to wade through to get to the diamond). I actually had the thought the other day that it would be great if FeedBooks made their engine code available so that fanfic archive software like eFiction could incorporate it and provide users with a variety of eBook options to download their fic in. But if we can't have that, then grabbers are the next best thing.
We'll provide an API to directly upload books to Feedbooks. Would be fairly easy for FanFiction.net to add support for this on their website, all new fics would be automatically uploaded to Feedbooks and available in all the formats that we support.
Old 07-06-2008, 10:21 PM   #11
erayd
Quote:
Originally Posted by SeaWolf View Post
...Having said that, they may not be bothered about it at all, so I think you should ask them and see what they have to say.
I'll flick them an email and ask, although noting the scale of the site I doubt they would have a problem with it - they're already way above the radar.

Quote:
The major technical issue I can see them having is the potential for thousands of users to bombard your site to get eBooks and your grabber then making thousands of requests to their server to retrieve the text for them. They'd be worried about getting flooded.
Trust me, this won't be an issue - fanfiction.net is massive (hundreds of thousands of users, millions of stories). My server would choke and give up long before ff.net would even start to worry. And my server should handle thousands of requests without breaking a sweat, it's pretty lightly loaded at the moment.
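If upstream load ever did turn out to matter, the usual mitigation is to cache pages that have already been fetched and to space out requests to the source site. The sketch below illustrates that general idea only; it is not how FLAG works, and the class name, the one-second default and the "fetch" callable are all hypothetical.

```python
import time


class PoliteFetcher:
    """Caches pages and enforces a minimum delay between upstream requests."""

    def __init__(self, fetch, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self._fetch = fetch            # hypothetical callable: url -> page text
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._cache = {}
        self._last_request = None

    def get(self, url):
        # Serve repeat requests from the cache without touching upstream.
        if url in self._cache:
            return self._cache[url]
        # Wait until at least min_interval has passed since the last request.
        if self._last_request is not None:
            wait = self._min_interval - (self._clock() - self._last_request)
            if wait > 0:
                self._sleep(wait)
        self._last_request = self._clock()
        page = self._fetch(url)
        self._cache[url] = page
        return page
```

With something like this in front of the grabber, a popular story requested by many users would hit the source site once rather than once per download.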

Quote:
Originally Posted by Hadrien View Post
We'll provide an API to directly upload books to Feedbooks. Would be fairly easy for FanFiction.net to add support for this on their website, all new fics would be automatically uploaded to Feedbooks and available in all the formats that we support.
I think this is a great idea, although I'm not sure if they would go along with it - they like to maintain control of the content they host, in case someone asks them to take it down. Maybe if your API also had a 'takedown' feature so the submitter could also remove their content?

It may also be worth considering the scale of the service... fanfiction.net has a LOT of content!

Last edited by erayd; 07-06-2008 at 10:24 PM.
Old 07-06-2008, 11:09 PM   #12
SeaWolf
Quote:
Originally Posted by Hadrien View Post
We'll provide an API to directly upload books to Feedbooks. Would be fairly easy for FanFiction.net to add support for this on their website, all new fics would be automatically uploaded to Feedbooks and available in all the formats that we support.
Is that really something Feedbooks would want though? I mean, most fanfic is well short of being book length, some of it wouldn't even qualify as short story length, and the impression I have of Feedbooks is that it's trying to become an archive of good, quality, proper books, rather than a fanfic archive.

On the other hand, did you actually mean to have Feedbooks store the content? Or is it a case of the user would click a link on Fanfiction.net and be taken to Feedbooks which would then process the piece into the format the user wanted and then hand the book to the user without Feedbooks actually storing the content?
Old 07-07-2008, 01:00 AM   #13
mateo
Enthusiast
Posts: 45
Karma: 10
Join Date: May 2005
Device: Palm Zire71
There are actually more dependencies:

php5-mysql: doesn't prevent the program from completing but does produce ugly error messages.
php5-gd: same as above
php5-tidy
Old 07-07-2008, 01:03 AM   #14
mateo
I tested the program and the idea is absolutely fantastic. I have tried converting fanfiction.net books myself by hand and this is much easier.

One problem that I noticed is similar to what I have experienced when doing it by hand. For some reason fanfiction.net doesn't format some of their special characters correctly. ', ", ` are some of the characters that do not display correctly. I have to find and replace these characters when I do it manually. Your script will probably want to fix these, because the resulting HTML file (haven't tried the other formats) doesn't display as intended.
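For what it's worth, the manual find-and-replace could be automated along these lines. This is a sketch under an assumption that has not been confirmed in this thread: that the broken characters are Windows-1252 "smart" quotes being read as something else. The function names are made up for illustration.

```python
# Map typographic punctuation to plain ASCII equivalents.
PUNCTUATION = str.maketrans({
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2026": "...",  # horizontal ellipsis
})


def decode_page(raw: bytes) -> str:
    """Decode as UTF-8, falling back to Windows-1252 (the assumed cause)."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("cp1252", errors="replace")


def normalise(raw: bytes) -> str:
    """Decode the page bytes and flatten typographic punctuation to ASCII."""
    return decode_page(raw).translate(PUNCTUATION)
```

If the real cause is something else (for example double-encoded UTF-8), the mapping table would need to change, so a story ID that shows the problem is still the best starting point.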

Last edited by mateo; 07-07-2008 at 01:18 AM.
Old 07-07-2008, 01:29 AM   #15
erayd
Quote:
Originally Posted by mateo View Post
There are actually more dependencies:

php5-mysql: doesn't prevent the program from completing but does produce ugly error messages.
php5-gd: same as above
php5-tidy
php5-mysql shouldn't produce any errors, as my script doesn't use MySQL - this is more likely an error with your setup. Same goes for php5-gd - this is an image manipulation module, which I don't use.

php5-tidy isn't required for r15, but is required for the latest subversion build.

Quote:
Originally Posted by mateo View Post
One problem that I noticed is similar to what I have experienced when doing it by hand. For some reason fanfiction.net doesn't format some of their special characters correctly. ', ", ` are some of the characters that do not display correctly. I have to find and replace these characters when I do it manually. Your script will probably want to fix these, because the resulting HTML file (haven't tried the other formats) doesn't display as intended.
Thanks for this - can you supply a story ID that doesn't render correctly? Which version are you using?
Tags
converter, fanfiction, fanfiction.net, grabber, lrf

All times are GMT -4.