Hi all,
here comes a new version of getfeed, which incorporates both of the ideas that thetechnobear and fodiator proposed above.
Some (sort of) documentation:
Code:
getfeed V0.9e (c) by T.Berndt
This program comes with ABSOLUTELY NO WARRANTY.
usage: getfeed [...] [-o <outfile>] [-f] <feed> [<feed_1> ...]
-f <feed>[;<start>;<stop>;<filter>;<server>;<srcURL>;<toURLa>;<toURLb>]
: <feed> is a URL or a filename.
-d <directory> : saves output into <directory>
-o <outfile> : saves output into <outfile>
-t <title> : Title of this news' edition
-r : Retrieve and append linked articles. Default: no
-R <file> : Reads <file> instead of .getfeedrc
-e <charset> : Use <charset> for encoding. Default: utf-8
-F <format> : Output format: html(obvious) or tex(LaTeX) Default: html
-S <style> : Reads <style> and adds its content as style-information.
-P <package> : Adds a \usepackage{<package>} into the LaTeX-file
-C <cmd> : Execute <cmd>
-m : format text in two columns
-a : Auto-name the output as news_YYYYMMDD.<format> Default: no
-v : Print debugging info to STDERR/<log>.
-s : Suppress all output. Default: no (i.e. not silent)
-l <log> : Writes debugging information to <log>
Run getfeed -v -h for more information!
getfeed reads news feeds and converts them into either an HTML or a LaTeX file.
The feeds currently understood are RSS, ATOM and RDF {0.91, 1.0, 2.0}.
And some more explanation:
- config file:
If the home directory contains a config file named .getfeedrc,
it is parsed for the above flags, which are then used
as the default settings. Settings/input from the command line
override these defaults.
A different file can be specified with -R <file>.
- On the -f switch:
If the feed starts with HTTP:// the respective file will be
downloaded, otherwise it will be read from the filesystem.
- On the <start> and <stop> tags:
The <start> and <stop> tags are used as markers to cut the
interesting part out of the downloaded article. They need to
be given only if -r is given.
If <start> is provided but <stop> is not, <start> is interpreted
as a program that receives and processes the downloaded page.
This program must accept the name of the file into which getfeed saves
the current page. After processing the page, the program must write
its results back to this file.
- On the <filter> tag:
If this tag is given, it is interpreted as the name of a file
containing key words that each item is checked against.
The format of this file is <word> <weight>. When the checks
are made, the feed's header and description are parsed, and if
a word is found, a counter is increased by the weighting
factor associated with that word. If this sum exceeds the
threshold, the item is rejected. The threshold is given
as #! n anywhere in the file.
- On the <server> tag:
If specified, images are downloaded from this server and
included in the output.
- On the tags <srcURL> <toURLa> and <toURLb>:
These tags can be specified to "redirect" the URL of the current
feed to point to a different page, e.g. the print edition of the
current page.
Credit for this needs to go to 'thetechnobear' as he proposed this
feature and provided a prototype. Check
https://www.mobileread.com/forums/sho...?t=7796&page=2
- On the -r switch:
The linked page is downloaded only if the <start> and <stop> tags
are given for this feed.
- On the -a switch:
This allows the news to be kept in sorted order for later look-up.
If <outfile> is given as well, it overrides -a.
- On debugging/logging:
The -v switch is stackable, i.e. -v -v will produce more output.
- On the behaviour of "binary" switches:
-a, -r, -s toggle, i.e. -a -a effectively turns autoname off.
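To make the <filter> mechanism described above more concrete, here is a minimal sketch in Python of how such a weighted keyword check could work. It is not getfeed's actual code. The function names load_filter and is_rejected are made up for illustration, and the sketch assumes case-insensitive whole-word matching, that every occurrence of a word counts, and that a line containing #! n holds nothing but the threshold.

```python
import re

def load_filter(text):
    """Parse filter-file text into (weights, threshold).

    Assumed format: one "<word> <weight>" pair per line, plus a
    "#! n" line giving the threshold somewhere in the file.
    """
    weights = {}
    threshold = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Assumption: a line containing "#! n" only carries the threshold.
        m = re.search(r"#!\s*(\d+(?:\.\d+)?)", line)
        if m:
            threshold = float(m.group(1))
            continue
        parts = line.split()
        if len(parts) == 2:
            word, weight = parts
            try:
                weights[word.lower()] = float(weight)
            except ValueError:
                continue  # skip lines whose weight is not a number
    return weights, threshold

def is_rejected(title, description, weights, threshold):
    """Return True if the summed weights of found words exceed the threshold."""
    score = 0.0
    # Assumption: words are matched case-insensitively, once per occurrence.
    for token in re.findall(r"\w+", f"{title} {description}".lower()):
        score += weights.get(token, 0.0)
    return threshold is not None and score > threshold
```

With a filter file containing "sport 2", "celebrity 2" and "#! 3", an item mentioning both words would be rejected under these assumptions, while an item mentioning only one of them once would pass.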
WARNING:
As can be read above, fodiator's idea to facilitate plugins has been realised by just calling an external program to "massage" the current item's page and return its result to getfeed for inclusion. Of course, this opens every door for malicious code to wreak havoc on your computer, so it's up to you to check that program carefully beforehand.
I chose this approach because
(i) it lets users write and share their own logic in any language they like,
(ii) it doesn't impose any artificial restrictions like interfaces or APIs, and
(iii) it is the simplest approach to realise.
second WARNING:
I haven't checked this feature myself! I only wrote two sample programs - caller.pl and callee.pl - as proof of principle.
Hoping you find it useful...
Regards,
Tommy
---
Please note: the "plugin" mechanism doesn't work yet :-( I just checked it.
---
UPDATE
The "plugin" mechanism has been fixed and is working now!
I uploaded the latest version (0.9e) of getfeed along with an example "plugin" (callee.pl). To illustrate the usage of this feature, this program does nothing but turn the text into upper case.
However, it might also serve as a template or a starting point for your own "plugins".
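For those who would rather not start from Perl, here is a rough Python sketch of a "plugin" that follows the same contract as described above: it takes the name of the file holding the current page, processes the content, and writes the result back to the same file. This is not callee.pl itself. The script name upcase_plugin.py is made up, and the sketch assumes that getfeed passes the file name as the first command-line argument and that the page is encoded as utf-8 (the -e default).

```python
import sys

def process_file(path):
    """Read the page saved by getfeed, upper-case it, write it back in place."""
    # Assumption: the page is utf-8 encoded (getfeed's default for -e).
    with open(path, "r", encoding="utf-8") as fh:
        text = fh.read()
    result = text.upper()  # replace this line with your own processing
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(result)

if __name__ == "__main__":
    # Assumption: the file name arrives as the first command-line argument.
    if len(sys.argv) == 2:
        process_file(sys.argv[1])
    else:
        print("usage: upcase_plugin.py <file>", file=sys.stderr)
```

Only the line marked for replacement needs to change for a different kind of processing. Everything else is just the read/write round trip that the mechanism requires.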