01-26-2008, 05:12 AM   #29
Tommy
New features

Hi all,

Here is a new version of getfeed, which incorporates the ideas that thetechnobear and fodiator proposed above.

Some (sort of) documentation:
Code:

getfeed V0.9e (c) by T.Berndt
This program comes with ABSOLUTELY NO WARRANTY.

usage: getfeed [...] [-o <outfile>] [-f] <feed> [<feed_1> ...]
  -f <feed>[;<start>;<stop>;<filter>;<server>;<srcURL>;<toURLa>;<toURLb>]
               : <feed> is a URL or a filename.
  -d <directory> : saves output into <directory>
  -o <outfile> : saves output into <outfile>
  -t <title>   : Title of this news' edition
  -r           : Retrieve and append linked articles. Default: no
  -R <file>    : Reads <file> instead of .getfeedrc
  -e <charset> : Use <charset> for encoding. Default: utf-8
  -F <format>  : Output format: html(obvious) or tex(LaTeX) Default: html
  -S <style>   : Reads <style> and adds its content as style-information.
  -P <package> : Adds a \usepackage{<package>} into the LaTeX-file
  -C <cmd>     : Execute <cmd>
  -m           : format text in two columns
  -a           : Auto-name the output as news_YYYYMMDD.<format> Default: no
  -v           : Print debugging info to STDERR/<log>.
  -s           : Suppress all output. Default: no (i.e. not silent)
  -l <log>     : Writes debugging information to <log>

Run getfeed -v -h for more information!

getfeed reads news feeds and converts them into either an HTML or a LaTeX file.
The feeds currently understood are RSS, ATOM and RDF {0.91, 1.0, 2.0}.
And some more explanation:
  • config file:
    If the home directory contains a config file named .getfeedrc
    this will be parsed for the above flags and those will be used
    as default settings. Settings/input from the command line
    override the default settings/values.
    By means of -R <file> a different file can be specified.
  • On the -f switch:
    If the feed starts with HTTP:// the respective file will be
    downloaded, otherwise it will be read from the filesystem.
  • On the <start> and <stop> tags:
    The <start> and <stop> tags are used as markers to cut the
    interesting part out of the downloaded article. They need to
    be given only if -r is given.
    If <start> is provided but <stop> is not, <start> is interpreted
    as a program to receive and process the downloaded page:
    This program must accept the name of the file into which getfeed saves
    the current page. After processing the page, the program must write
    its results back to this file.
  • On the <filter> tag:
    If this tag is given, it will be interpreted as the name of a
    file containing key words to check each item against.
    The format of this file is <word> <weight>. When the checks
    are made, the feed's header and description are parsed, and if
    a word is found, a counter is increased by the weighting
    factor associated with that word. If this sum exceeds the
    threshold, the item is rejected. The threshold is given
    as #! n anywhere in the file.
  • On the <server> tag:
    If specified, images will be downloaded from this server and
    included in the output.
  • On the tags <srcURL> <toURLa> and <toURLb>:
    These tags can be specified to "redirect" the URL of the current
    feed to point to a different page, e.g. the print edition of the
    current page.
    Credit for this goes to 'thetechnobear', who proposed this
    feature and provided a prototype. See
    https://www.mobileread.com/forums/sho...?t=7796&page=2
  • On the -r switch:
    The linked page is downloaded only if <start> and <stop> tags
    are given for this feed.
  • On the -a switch:
    This allows the news to be kept in sorted order for later look-up.
    If <outfile> is given as well, it overrides -a.
  • On debugging/logging:
    The -v switch is stack-able i.e. -v -v will produce more output.
  • On the behaviour of "binary" switches:
    -a, -r, -s toggle, i.e. -a -a effectively turns autoname off.
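
To make the <filter> description above a bit more concrete, here is a minimal Python sketch of the scoring idea. It is not taken from getfeed (which is Perl); the names parse_filter and is_rejected are made up, and several details are assumptions: matching is case-insensitive, each filter word counts once per item no matter how often it occurs, and "exceeds" means strictly greater than the threshold.

```python
import re


def parse_filter(text):
    """Parse filter-file text into (weights, threshold).

    Assumed layout: one '<word> <weight>' pair per line, plus a
    '#! n' line anywhere in the file that sets the threshold.
    """
    weights = {}
    threshold = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#!"):
            threshold = float(line[2:].strip())
            continue
        parts = line.split()
        if len(parts) != 2:
            continue  # skip lines that are not '<word> <weight>'
        word, weight = parts
        weights[word.lower()] = float(weight)
    return weights, threshold


def is_rejected(header, description, weights, threshold):
    """Return True if the summed weights of the words found exceed the threshold."""
    found = set(re.findall(r"\w+", (header + " " + description).lower()))
    # Assumption: every filter word contributes its weight at most once.
    score = sum(weight for word, weight in weights.items() if word in found)
    return threshold is not None and score > threshold
```

With a filter file containing "#! 5", "football 3" and "celebrity 4", an item mentioning both words scores 7 and would be rejected, while an item mentioning only "football" scores 3 and would be kept.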

WARNING:
As can be read above, fodiator's idea of supporting plugins has been realised by simply calling an external program to "massage" the current item's page and return its result to getfeed for inclusion. Of course, this opens the door to malicious code wreaking havoc on your computer, so it's up to you to check that program carefully beforehand.

I chose this approach because
(i) it lets users write and share their own logic in any language they like,
(ii) it doesn't impose any artificial restrictions such as interfaces or APIs, and
(iii) it is the simplest approach to implement.

Second WARNING:
I haven't checked this feature myself! I only wrote two sample programs - caller.pl and callee.pl - as proof of principle.
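
For illustration only, the round trip from the calling side might look roughly like the Python sketch below. It is not a port of caller.pl or of getfeed's own code; the names cut_between, run_plugin and process_page are hypothetical, and the details (markers are plain substrings and are excluded from the result, the page is handed over in a temporary file) are assumptions.

```python
import os
import shlex
import subprocess
import tempfile


def cut_between(page, start, stop):
    """Return the text between the first <start> and the following <stop>.

    Assumption: markers are plain substrings and are not part of the result;
    if either marker is missing, the page is returned unchanged.
    """
    begin = page.find(start)
    if begin == -1:
        return page
    begin += len(start)
    end = page.find(stop, begin)
    if end == -1:
        return page
    return page[begin:end]


def run_plugin(page, argv):
    """Save the page to a file, run 'argv <file>', and read the file back."""
    fd, path = tempfile.mkstemp(suffix=".html")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(page)
        # The plugin is expected to overwrite the file with its result.
        subprocess.run(list(argv) + [path], check=True)
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    finally:
        os.remove(path)


def process_page(page, start, stop=None):
    """Cut between markers if <stop> is given, else treat <start> as a program."""
    if stop:
        return cut_between(page, start, stop)
    return run_plugin(page, shlex.split(start))
```

Note that run_plugin executes whatever program it is given, so the warning above applies in full.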

Hoping you find it useful...
Regards,
Tommy

---
Please note: the "plugin" mechanism doesn't work yet :-( I just checked it.
---
UPDATE
The "plugin" mechanism has been fixed and is working now!
I uploaded the latest version (0.9e) of getfeed along with an example "plugin" (callee.pl). It does nothing but turn the text into upper case, to illustrate how this feature is used.
However, it may also serve as a template or starting point for your own "plugins".
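
Since plugins can be written in any language, a plugin along the same lines could also be sketched in Python. This is not a port of callee.pl; the function names are made up, and UTF-8 is assumed because it is getfeed's default encoding.

```python
import sys


def transform(text):
    """Placeholder for your own logic; here it just upper-cases the text."""
    return text.upper()


def main(path):
    """Read the page from <path>, transform it, and write it back in place."""
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(transform(text))


# Only act when called with exactly one argument: the file saved by the caller.
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

To try something more useful, replace the body of transform with your own processing; the rest only handles reading the file and writing the result back.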
Attached Files
File Type: pl getfeed.pl (32.0 KB, 332 views)
File Type: pl callee.pl (377 Bytes, 271 views)

Last edited by Tommy; 02-02-2008 at 06:07 AM. Reason: bug-report