07-08-2010, 04:06 PM | #196 | |
Zealot
Posts: 134
Karma: 146
Join Date: Apr 2008
Device: Onyx Boox Poke 2
|
Quote:
The problem was caused by the size of those two chapters: they're so large that they exhausted the regex backtrack limit on my server, so the chapter content wasn't captured. I've implemented an alternative capture expression (not quite as good, but not dependent on the backtrack limit) that is used if the first one fails.
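For anyone curious how such a fallback might be structured, here is a rough sketch. It is Python rather than the service's actual PHP, and the `<div id="storytext">` markers and all function names are assumptions made for illustration, not the real capture expressions. The primary extractor is passed in as a parameter so that a failure (such as hitting a backtrack limit) can be simulated.

```python
import re

# Hypothetical markers; the real page structure is not shown in this thread.
START = '<div id="storytext">'
END = '</div>'

PRIMARY = re.compile(r'<div id="storytext">(.*?)</div>', re.DOTALL)


def capture_primary(html):
    """Regex capture; returns None when the pattern does not match."""
    match = PRIMARY.search(html)
    return match.group(1) if match else None


def capture_fallback(html):
    """Plain string search: no backtracking involved, but cruder,
    since it simply runs to the last closing marker in the page."""
    start = html.find(START)
    if start == -1:
        return None
    start += len(START)
    end = html.rfind(END)
    if end < start:
        return None
    return html[start:end]


def capture_chapter(html, primary=capture_primary):
    """Try the primary capture first and fall back only if it fails."""
    text = primary(html)
    if text is None:
        text = capture_fallback(html)
    return text
```

The fallback is "not quite as good" in the same sense as described above: it cannot tell nested closing tags apart, so it trades precision for robustness on very large chapters.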
|
07-09-2010, 07:33 AM | #197 |
Member
Posts: 12
Karma: 10
Join Date: Jul 2010
Device: none
|
Works like a charm now. Thank you for the fix!
|
09-24-2010, 02:01 PM | #198 |
Addict
Posts: 245
Karma: 20386
Join Date: Sep 2010
Location: France
Device: Cybook Diva
|
Greasemonkey script?
Hello all,
I've found this tool to be incredibly useful. However, being basically a lazy guy, I decided that copy/pasting the story ID into the webservice's input box was too much work. So I put together a Greasemonkey script that inserts a link into the story's header, which directly gets me an ePub and the service's output page.
This, however, bypasses the webservice's front page (which is the point), but also the Donate button, and that's just rude. If this button is also added to the webservice's output page, would it be OK for me to share the GM script? It's not exactly a work of art, but maybe it'd be useful to some.
N.
09-26-2010, 05:53 AM | #199 | ||
Zealot
Posts: 134
Karma: 146
Join Date: Apr 2008
Device: Onyx Boox Poke 2
|
Quote:
Quote:
|
||
09-28-2010, 05:28 PM | #200 | |
Addict
Posts: 245
Karma: 20386
Join Date: Sep 2010
Location: France
Device: Cybook Diva
|
Quote:
I didn't put much thought into the CSS styling beyond the menu and layout stuff; anyone can change it as they please anyway. It still works as of a minute ago; I've had a PM warning me that a recent FFNet change broke some downloaders. (I'd also like to apologize for not answering the PM sooner, but I wanted to hear from Erayd first.)
N.
|
09-29-2010, 11:52 AM | #201 |
Member
Posts: 14
Karma: 10
Join Date: Sep 2010
Device: psp, htc g1
|
Hiya, it seems like the commandline version hasn't worked for a while, so I fixed it. I've also made other changes:
Supports multiple stories on commandline.
Supports story urls on commandline (instead of -i).
Supports fetching multiple stories from community page and author page urls.
The include directories are autodetected: if fflag is put in /usr/local/bin, for example, it will look for its files in /usr/local/share/flag, /usr/local/lib/flag, or /usr/local/bin.
Default output format is now HTML. I have not tested the others but they -should- work.
Misc changes (no more chdir, etc).
Example:
fflag -i 65535 http://www.fanfiction.net/s/232323/1/whatever 424242 http://fanfiction.net/u/11/somewriter -o /tmp/whatever -i 123456
Downloads storyid 65535, 232323, 424242, all of somewriter's stories, and story 123456. All stories are saved with automatic filenames to the /tmp/whatever folder, which is created if it does not exist. If -o is '-' or unspecified, the current directory is used.
fflag -s ffnet -f html -i 321123 -o whatever.html
Old style syntax still works.
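To illustrate how bare IDs and URLs can be mixed on one command line, here is a small sketch of the argument classification. It is Python, not the PHP that fflag is written in, and the patterns and the `classify` name are assumptions based only on the example URLs above. Community page URLs are left out because their shape isn't shown in this thread.

```python
import re

# Patterns inferred from the example URLs in the post; assumptions, not fflag's own.
STORY_URL = re.compile(r'^https?://(?:www\.)?fanfiction\.net/s/(\d+)')
AUTHOR_URL = re.compile(r'^https?://(?:www\.)?fanfiction\.net/u/(\d+)')


def classify(arg):
    """Return a (kind, value) tuple for one command-line argument."""
    if arg.isascii() and arg.isdigit():
        return ('story', int(arg))          # bare story ID
    match = STORY_URL.match(arg)
    if match:
        return ('story', int(match.group(1)))
    match = AUTHOR_URL.match(arg)
    if match:
        return ('author', int(match.group(1)))
    return ('unknown', arg)                 # e.g. a filename or option


if __name__ == '__main__':
    for arg in ['65535',
                'http://www.fanfiction.net/s/232323/1/whatever',
                'http://fanfiction.net/u/11/somewriter']:
        print(arg, '->', classify(arg))
```

An author result would then be expanded into that author's story IDs by a separate fetch step, which is not sketched here.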
09-29-2010, 02:25 PM | #202 | |
Addict
Posts: 245
Karma: 20386
Join Date: Sep 2010
Location: France
Device: Cybook Diva
|
Hello,
Quote:
More generally, I think this project looks like a perfect candidate for OOP refactoring. I mean, pluggable codecs and sources? Available from CLI or web page? This is what classes are for. If erayd'll allow it, I'll volunteer; I'm really itching to do it...
N.
|
09-29-2010, 10:07 PM | #203 |
Member
Posts: 14
Karma: 10
Join Date: Sep 2010
Device: psp, htc g1
|
Output files go to cwd by default, though I've considered adding an author directory option. ID files are supported already via your friendly neighborhood unix shell: fflag -f epub `cat mylistofurls.txt`
I've implemented a couple of conf variables at the top of the script; expanding this is a good idea. A built-in throttle is probably a good idea too, not sure about OO. I'm mostly concerned with the cruddy state of the various regexps used; I've only just looked into php and its quirky regex functionality (compared to perl). I'm thinking that a true HTML parser should be used for a lot of this stuff, as using css selectors is probably more reliable.
I'm curious to hear from erayd about the possibility of throwing this up on code.google.com svn, and collaborative coding in general.
Last edited by AtomicDryad; 09-29-2010 at 10:13 PM.
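As a rough illustration of the parser-over-regex idea, the sketch below pulls the text out of one element by its id using a real HTML parser. It is Python with only the standard library, not PHP, and the `storytext` id and the class and function names are assumptions for the example. It tracks nesting depth, so it copes with nested tags, but it assumes non-void tags are properly closed.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID = {'area', 'base', 'br', 'col', 'embed', 'hr', 'img',
        'input', 'link', 'meta', 'source', 'track', 'wbr'}


class StoryTextParser(HTMLParser):
    """Collect the text inside the first element with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0        # > 0 while inside the target element
        self.done = False     # set once the target element has closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get('id') == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def extract_text(html, target_id='storytext'):
    """Return the concatenated text found inside the element with target_id."""
    parser = StoryTextParser(target_id)
    parser.feed(html)
    parser.close()
    return ''.join(parser.parts)
```

Unlike a capture regex, this has no backtrack limit to exhaust, whatever the chapter size.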
09-30-2010, 03:38 AM | #204 | |||
Addict
Posts: 245
Karma: 20386
Join Date: Sep 2010
Location: France
Device: Cybook Diva
|
Quote:
Quote:
PHP Code:
Quote:
N. |
|||
10-01-2010, 04:10 AM | #205 |
Zealot
Posts: 134
Karma: 146
Join Date: Apr 2008
Device: Onyx Boox Poke 2
|
I agree with most of the above suggestions - I'll set something up tonight that will hopefully work as a useful central point of collaboration, and will post the details here.
If anyone disagrees with anything, sing out.
10-01-2010, 05:56 AM | #206 | |
Member
Posts: 14
Karma: 10
Join Date: Sep 2010
Device: psp, htc g1
|
Quote:
|
|
10-01-2010, 06:00 AM | #207 | |
Zealot
Posts: 134
Karma: 146
Join Date: Apr 2008
Device: Onyx Boox Poke 2
|
Quote:
|
|
10-01-2010, 06:05 AM | #208 |
Member
Posts: 14
Karma: 10
Join Date: Sep 2010
Device: psp, htc g1
|
Update to update: fixed a regex breaking -U, fixed output display.
Update: r29mod2
Warning: After one too many flooded ssh sessions I've changed the -o option to match the unix non-standard: -o - outputs to screen, and with no -o the output file is named automatically. Will make this optional later.
New option: -U, --update: If the output file already exists and has the same number of chapters as the copy on fanfiction.net, fflag won't overwrite or download any extra chapters. This will only work for .html files written with r29mod2, which include a 'TotalPages' meta tag.
fflag now takes filenames as arguments. If the file is an .html written with this version or above, fflag will find the ID via a 'StoryId' meta tag.
One can batch update a story collection with: fflag -U /home/user/myfanfics/*.html, which will produce less traffic than redoing the entire collection.
The story summary is now included at the top of the output file, under title and author.
The output display shows current/total stories while processing, also current/total chapters.
Now searches for the story text via XPath, which should be more reliable. Requires php with dom and simplexml; change USEDOM=>1 to USEDOM=>0 if you get a php error.
Debug option for coders: -D will output debug text, and write all web GETs to debug-hostname-filename. If debug-hostname-filename exists, it will load that instead of downloading. Good if you need to test without hammering fanfiction.net's server.
A lot of configuration options are stored in the $CONFIG hasharray at the top of the 'fflag' file, runtime options in the $opt hasharray.
TODO:
Feature freeze until merge with erayd's branch.
Abstracted version of ffnet.source.php that can be told to use a config hash array which includes url specs, xpaths, regexp strings, etc.
Last edited by AtomicDryad; 10-01-2010 at 06:48 AM.
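The -U check described above can be sketched as follows. This is an illustration in Python, not the PHP from fflag; only the 'StoryId' and 'TotalPages' tag names come from the post. The exact markup (`<meta name="..." content="...">`) and the function names are assumptions.

```python
from html.parser import HTMLParser


class MetaParser(HTMLParser):
    """Collect <meta name="..." content="..."> pairs from a saved story file."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == 'meta':
            attributes = dict(attrs)
            if 'name' in attributes and 'content' in attributes:
                self.meta[attributes['name']] = attributes['content']


def read_meta(html):
    """Return a dict of the meta tags found in the given HTML text."""
    parser = MetaParser()
    parser.feed(html)
    parser.close()
    return parser.meta


def needs_update(saved_html, remote_chapters):
    """True unless the saved file records the same chapter count as the site."""
    meta = read_meta(saved_html)
    try:
        saved_chapters = int(meta['TotalPages'])
    except (KeyError, TypeError, ValueError):
        return True   # no usable tag (older file): treat as needing a download
    return saved_chapters != remote_chapters
```

The same `read_meta` result would also supply the story ID when a filename, rather than an ID or URL, is given on the command line.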
10-01-2010, 06:34 AM | #209 |
Member
Posts: 14
Karma: 10
Join Date: Sep 2010
Device: psp, htc g1
|
|
10-01-2010, 10:58 AM | #210 | |
Addict
Posts: 245
Karma: 20386
Join Date: Sep 2010
Location: France
Device: Cybook Diva
|
Quote:
On another subject: in recent versions, Calibre discontinued html2epub and replaced it with ebook-convert, which supports a large number of formats:
http://calibre-ebook.com/user_manual...k-convert.html
I found this out when I updated the epub codec to add the summary with the --comments option; my Ubuntu Maverick didn't have html2epub. It seems it was retired in favor of ebook-convert late last year.
N.
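For reference, a call to ebook-convert with the summary passed through --comments might be assembled like this. It is a Python sketch, not the codec's PHP; the function name and file names are made up for the example, and only the input file, output file and --comments arguments are used.

```python
import subprocess


def build_convert_command(html_path, epub_path, summary=None):
    """Build the ebook-convert argument list; the summary goes in --comments."""
    command = ['ebook-convert', html_path, epub_path]
    if summary:
        command += ['--comments', summary]
    return command


def convert(html_path, epub_path, summary=None):
    """Run the conversion; requires Calibre's ebook-convert on the PATH."""
    subprocess.run(build_convert_command(html_path, epub_path, summary),
                   check=True)
```

Passing the arguments as a list avoids shell quoting problems when the summary contains quotes or other special characters.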
|
Tags |
converter, fanfiction, fanfiction.net, grabber, lrf |
|