Old 07-08-2010, 04:06 PM   #196
erayd
Zealot
erayd doesn't litter
 
Posts: 134
Karma: 146
Join Date: Apr 2008
Device: Onyx Boox Poke 2
Quote:
Originally Posted by octarineblues
I've been having some trouble with story ID 2609602 when using the online version of FLAG. For some reason, chapters 9 and 10 (the two epilogues) are generated with only the title and two separators.
Thanks for letting me know - I've fixed it, and you should now be able to fetch that story ID with no issues.

The problem was triggered by the size of those two chapters - they're so huge that they were exhausting the regex backtrack limit on my server, and as a result the chapter content wasn't captured.

I've implemented an alternative capture expression (not quite as good, but not dependent on the backtrack limit) that is used if the first one fails.
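
For anyone hitting the same limit, here is a minimal sketch of that kind of fallback. It is not the actual FLAG code: the expression, the id="storytext" container and the helper name capture_chapter() are all assumptions made for illustration, and the fallback here is a plain string search rather than a second expression. It relies on preg_match() returning false (rather than 0) when PCRE gives up, for example when pcre.backtrack_limit is exhausted.

```php
<?php
// Hypothetical helper: returns the chapter body, or null if no chapter was found.
function capture_chapter(string $html): ?string
{
    // Assumed primary expression: a lazy match that can exhaust
    // pcre.backtrack_limit on very large chapters.
    $primary = '~<div id="storytext">((?:.|\n)*?)</div>~';

    $result = preg_match($primary, $html, $m);
    if ($result === 1) {
        return $m[1];
    }
    if ($result === 0) {
        return null; // the expression ran to completion and found nothing
    }

    // $result === false: PCRE aborted (preg_last_error() tells you why,
    // e.g. PREG_BACKTRACK_LIMIT_ERROR). Fall back to a cruder capture
    // that does not involve the regex engine at all.
    $open = '<div id="storytext">';
    $start = strpos($html, $open);
    if ($start === false) {
        return null;
    }
    $start += strlen($open);
    $end = strpos($html, '</div>', $start);

    return $end === false
        ? substr($html, $start)
        : substr($html, $start, $end - $start);
}
```

The string search is "not quite as good" in the same sense: it cannot cope with variations in the opening tag, but its cost stays linear no matter how long the chapter is.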
Old 07-09-2010, 07:33 AM   #197
octarineblues
Member
octarineblues began at the beginning.
 
Posts: 12
Karma: 10
Join Date: Jul 2010
Device: none
Works like a charm now. Thank you for the fix!
Old 09-24-2010, 02:01 PM   #198
aleyx
Addict
aleyx can self-interpret dreams as they happen.
 
Posts: 245
Karma: 20386
Join Date: Sep 2010
Location: France
Device: Cybook Diva
Greasemonkey script?

Hello all,

I've found this tool to be incredibly useful. However, being basically a lazy guy, I decided that copy/pasting the story ID into the webservice's input box was too much work. So I put together a Greasemonkey script that inserts a link into the story's header, which directly gets me an ePub and the service's output page.

This, however, bypasses not only the webservice's front page (which is the point) but also the Donate button, and that's just rude. If this button is also added to the webservice's output page, would it be OK for me to share the GM script? It's not exactly a work of art, but maybe it'd be useful to some.

N.
Old 09-26-2010, 05:53 AM   #199
erayd
Quote:
Originally Posted by aleyx
...I put together a Greasemonkey script to insert a link into the story's header, that directly gets me an ePub and the service's output page.
Nice! That sounds like a rather useful addition to FF.net's story pages.

Quote:
...would it be OK for me to share the GM script? It's not exactly a work of art, but maybe it'd be useful to some.
Feel free.
Old 09-28-2010, 05:28 PM   #200
aleyx
Quote:
Originally Posted by erayd
Nice! That sounds like a rather useful addition to FF.net's story pages.

Feel free.
Thanks! Here it is. It adds the 'Export' menu just before the font size links. Hovering the menu lets you choose between ePub, PDF and LRF; I didn't put HTML because... well, I forget why. But it'd be trivial to add it back.

I didn't put much thought into the CSS styling beyond the menu and layout stuff; anyone can change it as they please anyway.

It still works as of a minute ago; I've had a PM warning me that a recent FFNet change broke some downloaders. (I'd also like to apologize for not answering the PM sooner, but I wanted to hear from erayd first.)

N.
Attached Files
File Type: zip flag_insert.user.js.zip (1.8 KB, 160 views)
Old 09-29-2010, 11:52 AM   #201
AtomicDryad
Member
AtomicDryad began at the beginning.
 
Posts: 14
Karma: 10
Join Date: Sep 2010
Device: psp, htc g1
Hiya, it seems like the command-line version hasn't worked for a while, so I fixed it. I've also made other changes:
  • Supports multiple stories on the command line.
  • Supports story URLs on the command line (instead of -i).
  • Supports fetching multiple stories from community page and author page URLs.
  • The include directories are autodetected: if fflag is put in /usr/local/bin, for example, it will look for its files in /usr/local/share/flag, /usr/local/lib/flag, or /usr/local/bin.
  • Default output format is now HTML. I have not tested the others but they -should- work.
  • Misc changes (no more chdir, etc).
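
The include-directory lookup described in the list can be sketched roughly as follows. This is an illustration under assumptions, not the code in the attached tarball: the helper names include_candidates() and find_include_dir() are made up, and using a marker file such as ffnet.source.php to recognise the right directory is a guess.

```php
<?php
// Hypothetical helper: directories to search, in order, for a script living in $binDir.
function include_candidates(string $binDir, string $name = 'flag'): array
{
    $prefix = dirname($binDir); // e.g. /usr/local/bin -> /usr/local

    return [
        $prefix . '/share/' . $name,
        $prefix . '/lib/' . $name,
        $binDir,
    ];
}

// Hypothetical helper: first candidate directory that contains $marker, or null.
function find_include_dir(string $binDir, string $marker): ?string
{
    foreach (include_candidates($binDir) as $dir) {
        if (is_file($dir . '/' . $marker)) {
            return $dir;
        }
    }
    return null;
}
```

A script could call it as find_include_dir(dirname(realpath($argv[0])), 'ffnet.source.php') and bail out with an error if the result is null.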
Example:
fflag -i 65535 http://www.fanfiction.net/s/232323/1/whatever 424242 http://fanfiction.net/u/11/somewriter -o /tmp/whatever -i 123456
Downloads story IDs 65535, 232323 and 424242, all of somewriter's stories, and story 123456. All stories are saved with automatic filenames to the /tmp/whatever folder, which is created if it does not exist. If -o is '-' or unspecified, the current directory is used.
fflag -s ffnet -f html -i 321123 -o whatever.html
The old-style syntax still works.
Attached Files
File Type: bz2 flag-r29mod1.tar.bz2 (6.5 KB, 167 views)
Old 09-29-2010, 02:25 PM   #202
aleyx
Hello,

Quote:
Originally Posted by AtomicDryad
Hiya, it seems like the command-line version hasn't worked for a while, so I fixed it. I've also made other changes:
Some suggestions (I didn't read the code in detail, so maybe some don't apply):
  • A standard --version option; if multiple branches start getting used, it might be useful.
  • Output files in ~/tmp/flag/$FILENAME by default rather than the system-wide /tmp; maybe even ~/tmp/flag/$AUTHOR/$FILENAME for multiple inputs.
  • If you're adding multiple inputs, maybe also a standard @<filename> argument, where <filename> has one input per line (e.g. "fflag -f epub @ids" would read the 'ids' file in the current directory and pass its contents as the inputs). This would allow a user to window-shop stories, copy/pasting IDs or URLs into a text file as they go, and when they're done they can convert everything in one short command.
  • A .conf file for overriding defaults. If we want to be thorough, we'll want a $PREFIX/etc/flag.conf for system-wide and a ~/.config/flag/flag.conf for user-specific.
  • Maybe putting a limit on the number of inputs. FFNet's TOS are pretty loose, but they do ask bots to only download a reasonable amount at once...
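
The @<filename> suggestion from the list could look something like this. It is only a sketch: expand_args() is a made-up name, and nothing like it is known to exist in fflag.

```php
<?php
// Hypothetical helper: replaces each "@file" argument with the non-empty
// lines of that file; all other arguments pass through unchanged.
function expand_args(array $args): array
{
    $out = [];
    foreach ($args as $arg) {
        if (strlen($arg) > 1 && $arg[0] === '@') {
            $path = substr($arg, 1);
            if (!is_readable($path)) {
                throw new RuntimeException("Cannot read input list: $path");
            }
            foreach (file($path, FILE_IGNORE_NEW_LINES) as $line) {
                $line = trim($line);
                if ($line !== '') {
                    $out[] = $line; // one story ID or URL per line
                }
            }
        } else {
            $out[] = $arg;
        }
    }
    return $out;
}
```

Running the argument list through this before option parsing would make "fflag -f epub @ids" behave as if every line of 'ids' had been typed on the command line.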

More generally, I think this project looks like a perfect candidate for OOP refactoring. I mean, pluggable codecs and sources? Available from CLI or web page? This is what classes are for. If erayd'll allow it, I'll volunteer; I'm really itching to do it...

N.
Old 09-29-2010, 10:07 PM   #203
AtomicDryad
Output files go to cwd by default, though I've considered adding an author directory option. ID files are supported already via your friendly neighborhood Unix shell: fflag -f epub `cat mylistofurls.txt`

I've implemented a couple of conf variables at the top of the script; expanding this is a good idea.

A built-in throttle is probably a good idea; I'm not sure about OO.

I'm mostly concerned with the cruddy state of the various regexps used; I've only just looked into PHP and its quirky regex functionality (compared to Perl). I'm thinking that a true HTML parser should be used for a lot of this stuff, as using CSS selectors is probably more reliable.

I'm curious to hear from erayd about the possibility of throwing this up on code.google.com svn, and collaborative coding in general.

Last edited by AtomicDryad; 09-29-2010 at 10:13 PM.
Old 09-30-2010, 03:38 AM   #204
aleyx
Quote:
Originally Posted by AtomicDryad
ID files are supported already via your friendly neighborhood Unix shell: fflag -f epub `cat mylistofurls.txt`
I'm ashamed that I forgot about that one. Then again, my main use for that syntax is: cat `which braindead-script.sh` on a particularly frustrating AIX server.

Quote:
Originally Posted by AtomicDryad
I'm mostly concerned with the cruddy state of the various regexps used; I've only just looked into PHP and its quirky regex functionality (compared to Perl). I'm thinking that a true HTML parser should be used for a lot of this stuff, as using CSS selectors is probably more reliable.
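Agreed. Something like:
PHP Code:
$doc = new DOMDocument();
// Real-world pages are rarely valid HTML; @ silences the parser warnings.
@$doc->loadHTML($chapter_contents);
// Assumes the chapter text sits in the element with id="storytext".
$story_text = $doc->getElementById('storytext');
or something using the XPath functions.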
Quote:
Originally Posted by AtomicDryad
I'm curious to hear from erayd about the possibility of throwing this up on code.google.com svn, and collaborative coding in general.
Seconded.

N.
Old 10-01-2010, 04:10 AM   #205
erayd
I agree with most of the above suggestions - I'll set something up tonight that will hopefully work as a useful central point of collaboration, and will post the details here.

If anyone disagrees with anything, sing out.
Old 10-01-2010, 05:56 AM   #206
AtomicDryad
Quote:
Originally Posted by erayd
I agree with most of the above suggestions - I'll set something up tonight that will hopefully work as a useful central point of collaboration, and will post the details here.

If anyone disagrees with anything, sing out.
Excellent. I'd suggest http://code.google.com unless you want to roll your own.
Old 10-01-2010, 06:00 AM   #207
erayd
Quote:
Originally Posted by AtomicDryad
Excellent. I'd suggest http://code.google.com unless you want to roll your own.
At this point I'm thinking GitHub - I can't stand Subversion, and I personally feel that Google Code lacks a lot of functionality - but I'm happy to go with GC if more people would prefer it. Thoughts?
Old 10-01-2010, 06:05 AM   #208
AtomicDryad
Update to update: fixed a regex breaking -U, fixed output display
Update: r29mod2
Warning: After one too many flooded ssh sessions I've changed the -o option to match the Unix non-standard: -o - outputs to the screen, and omitting -o names the output file automatically. I'll make this optional later.

New option: -U, --update: If the output file already exists and has the same number of chapters as the copy on fanfiction.net, fflag won't overwrite it or download any extra chapters. This will only work for .html files written with r29mod2, which include a 'TotalPages' meta tag.

fflag now takes filenames as arguments. If the file is an .html written with this version or above, fflag will find the ID via a 'StoryId' meta tag. One can batch update a story collection with: fflag -U /home/user/myfanfics/*.html, which will produce less traffic than redoing the entire collection.
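
The -U check and the StoryId lookup can be sketched like this. It is a guess at the approach, not the r29mod2 code: it assumes the tags are written as <meta name="TotalPages" content="..."> and <meta name="StoryId" content="..."> inside <head>, and the helper names are made up. PHP's built-in get_meta_tags() lowercases the tag names, hence the lowercase keys.

```php
<?php
// Hypothetical helper: true if the saved file is missing, was written without
// a TotalPages tag, or has a different chapter count than the remote copy.
function needs_update(string $htmlFile, int $remoteChapters): bool
{
    if (!is_readable($htmlFile)) {
        return true; // nothing saved yet
    }
    $meta = get_meta_tags($htmlFile);
    if ($meta === false || !isset($meta['totalpages'])) {
        return true; // no chapter count recorded, so it cannot be compared
    }
    return (int) $meta['totalpages'] !== $remoteChapters;
}

// Hypothetical helper: the story ID recorded in a saved file, or null.
function story_id_from_file(string $htmlFile): ?string
{
    if (!is_readable($htmlFile)) {
        return null;
    }
    $meta = get_meta_tags($htmlFile);
    return ($meta !== false && isset($meta['storyid'])) ? $meta['storyid'] : null;
}
```

A batch update would then read the ID from each file, ask the site for the current chapter count, and skip the story when needs_update() returns false.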

The story summary is now included at the top of the output file, under title and author.

The output display shows current/total stories while processing, also current/total chapters.

Now searches for the story text via XPath, which should be more reliable. Requires PHP with the dom and simplexml extensions; change USEDOM=>1 to USEDOM=>0 if you get a PHP error.
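
A minimal sketch of an XPath lookup of this kind, using only the dom extension. The query and the helper name extract_story_text() are assumptions (the 'storytext' id is borrowed from aleyx's snippet earlier in the thread); the expression fflag actually uses may differ.

```php
<?php
// Hypothetical helper: inner HTML of the first node matching $query, or null.
function extract_story_text(string $html, string $query = "//*[@id='storytext']"): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects empty input
    }

    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate sloppy markup quietly
    $loaded = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded) {
        return null;
    }

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    // Serialise the children so the wrapper element itself is left out.
    $inner = '';
    foreach ($nodes->item(0)->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
    return $inner;
}
```

Unlike a regex capture, the cost here does not depend on a backtrack limit, so very long chapters are handled the same way as short ones.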

Debug option for coders: -D will output debug text and write all web GETs to debug-hostname-filename. If debug-hostname-filename exists, it will load that instead of downloading. Good if you need to test without hammering fanfiction.net's server.
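
The debug cache boils down to something like the sketch below. Only the debug-hostname-filename naming comes from the description above; the helper name debug_fetch(), its parameters, and the use of the last URL path segment as the filename are assumptions.

```php
<?php
// Hypothetical helper: returns the body for $url, reusing a saved
// debug-hostname-filename copy in $dir when one exists.
function debug_fetch(string $url, string $dir = '.', ?callable $fetch = null): string
{
    $fetch = $fetch ?? 'file_get_contents'; // injectable so it can be tested offline

    $host = parse_url($url, PHP_URL_HOST) ?: 'unknown';
    $path = parse_url($url, PHP_URL_PATH) ?: '';
    $name = basename($path) ?: 'index';
    $cache = $dir . '/debug-' . $host . '-' . $name;

    if (is_file($cache)) {
        return file_get_contents($cache); // no network traffic at all
    }

    $body = $fetch($url);
    if ($body === false) {
        throw new RuntimeException("Download failed: $url");
    }
    file_put_contents($cache, $body);
    return $body;
}
```

Note that two URLs ending in the same path segment would share a cache file under this naming, which is fine for debugging one story at a time but not for bulk runs.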

A lot of configuration options are stored in the $CONFIG hash array at the top of the 'fflag' file; runtime options are in the $opt hash array.

TODO:
Feature freeze until merge with erayd's branch.
An abstracted version of ffnet.source.php that can be driven by a config hash array containing URL specs, XPaths, regexp strings, etc.
Attached Files
File Type: bz2 flag-r29mod2.tar.bz2 (8.0 KB, 171 views)

Last edited by AtomicDryad; 10-01-2010 at 06:48 AM.
Old 10-01-2010, 06:34 AM   #209
AtomicDryad
Quote:
Originally Posted by erayd
At this point I'm thinking GitHub - I can't stand Subversion, and I personally feel that Google Code lacks a lot of functionality - but I'm happy to go with GC if more people would prefer it. Thoughts?
I'm not picky either way.
Old 10-01-2010, 10:58 AM   #210
aleyx
Quote:
Originally Posted by erayd
At this point I'm thinking GitHub - I can't stand Subversion, and I personally feel that Google Code lacks a lot of functionality - but I'm happy to go with GC if more people would prefer it. Thoughts?
Quote:
Originally Posted by AtomicDryad
I'm not picky either way.
Fine by me. I'm used to Subversion, but it's a pain in the behind to work with multiple branches. If I'm allowed to go with my OO experiment, GitHub would be better, I think.

On another subject:

In recent versions, Calibre discontinued html2epub and replaced it with ebook-convert, which supports a large number of formats:

http://calibre-ebook.com/user_manual...k-convert.html

I found this out when I updated the epub codec to add the summary with the --comments option; my Ubuntu Maverick didn't have html2epub. It seems it was retired in favor of ebook-convert late last year.

N.

Tags
converter, fanfiction, fanfiction.net, grabber, lrf

