The following applies to netcomics version 0.5 & higher. The 'func' field is new for version 0.6. Although not recommended, you may still use the old way (versions 0.1 to 0.4) because netcomics is backwards compatible.
The following table describes the RLI hash structure.
Field | Format | Example | Description |
---|---|---|---|
name | Full_Name-yymmdd.filetype | strftime("My_Comic-%y%m%d.gif",@ltime) | The name of the file the comic is to be saved as. There isn't much flexibility here, because this field is used to obtain several key pieces of information. When the -l option is given to netcomics, it takes this field from every RLI, strips away the date & file extension, and removes any underscores to create the name of the comic. The same thing is done for webpage creation. |
base | URL | http://www.comp.com/ | This should be the part of the URL that is common between the page's URL and the comic's URL. Typically, this is just the main URL for the website. |
page | latter half of a URL | strftime("archives/mycomic-%y%m%d.html", @ltime) | When appended to base, this field completes the URL of either the comic itself or an HTML page that contains a link to another URL used to get the comic. When it points to the comic itself, the exprs field is set to undef or is not specified at all. When it points to a document containing some other reference that can be used to locate the comic, the exprs field is used. |
exprs | regular expressions | "(comics/mycomic-\\d+.gif)" | This field is an array of regular expressions used to find the last part of the URL for the comic. Once a starting webpage has been downloaded using the concatenation of base and page, the elements of the array are applied, one by one, to match text in the page just downloaded. The last regular expression in the array is the one that produces the URL of the comic itself. Each regular expression must have a pair of parentheses around the part of the matched text that completes the next URL, which is either the comic strip or another webpage containing another reference to be matched on. Only one set of parentheses is needed in each regular expression. If more than one pair exists, only the one that causes $1 to be defined in the perl code will be used (IOW, nothing fancy here: netcomics doesn't have any special logic, and is coded to deal only with $1). |
func | subroutine reference | sub { return ("${date}a.gif", "${date}b.gif"); } | This field provides the ability for multiple files to be downloaded from a website for a single comic. It is the last field of the RLI to be used: the function pointed to by the subroutine reference is run and is given the text of the last webpage downloaded, if there was one. The function has to return a list of strings, each string being a relative URL that is finally used to download a comic file. Most often, the function will be created by a module as a closure. A closure is an anonymous subroutine that is created with some specific data. For example, if you're writing a closure that is to return two relative URLs, each containing text derived from the date, the data you need to embed in the subroutine when you create it is the date given to the RLI function, in some form. See the module files madam-n-eve, sluggy_freelance, and goats for some examples. This was introduced in version 0.6. |
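
To make the exprs description more concrete, here is a minimal sketch of how such a chain of regular expressions could be walked. It is not netcomics' own code: the helper name resolve_comic_url, the %fake_web hash that stands in for real downloads, and the intermediate view/ page are all assumptions made for illustration, as is the choice to append each $1 to base.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the network: maps a URL to the text of that page.
my %fake_web = (
    'http://www.comp.com/archives/mycomic-990412.html'
        => '<a href="view/990412.html">Today</a>',
    'http://www.comp.com/view/990412.html'
        => '<img src="comics/mycomic-0412.gif">',
);

# Hypothetical helper: start at base . page, then for each expression fetch
# the current URL, match the expression against the text, and append $1 to
# base to form the next URL. With no exprs, base . page is the comic itself.
sub resolve_comic_url {
    my ($rli, $fetch) = @_;
    my $url = $rli->{base} . $rli->{page};
    foreach my $expr (@{ $rli->{exprs} || [] }) {
        my $text = $fetch->($url);
        # Give up if the page is missing or the expression does not match.
        return unless defined $text && $text =~ /$expr/;
        $url = $rli->{base} . $1;    # only $1 is ever consulted
    }
    return $url;
}

my $rli = {
    name  => 'My_Comic-990412.gif',
    base  => 'http://www.comp.com/',
    page  => 'archives/mycomic-990412.html',
    exprs => [ '(view/\d+\.html)', '(comics/mycomic-\d+\.gif)' ],
};

my $comic_url = resolve_comic_url($rli, sub { $fake_web{ $_[0] } });
print "$comic_url\n";
```

The first expression leads to a second page, and the last one yields the URL of the image. An RLI without exprs would simply resolve to base followed by page.
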
The global %hof hash (associative array) contains the names of all of the functions that return RLI hashes, and the number of days in the past the comic is available using the RLI returned. The keys are the names of the functions, and the values are the numbers of days.
Add your function's name to this hash with code like this:

    $hof{"myfunction"} = 0;

That example is for a comic whose latest strip is available on the same day. If you have many functions in one module file to add to %hof, use code like this:

    {
        my $f;
        foreach $f ("f1","f2","f3","f4") {
            $hof{$f} = 7;
        }
    }

In that example, every function returns RLIs for comics that become available a week after they're published.
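
As a rough illustration of what the day counts in %hof mean, the sketch below turns an offset into the time of the newest strip a function could be asked for. The helper newest_time_for is hypothetical, and the way netcomics actually applies the offset internally is an assumption here, not something taken from its source.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

our %hof;            # stands in for netcomics' global hash
$hof{"f1"} = 0;      # strip is available on the day it is published
$hof{"f2"} = 7;      # strip becomes available a week later

# Hypothetical helper: the most recent time worth passing to the named
# RLI function, assuming the offset is simply subtracted in whole days.
sub newest_time_for {
    my ($fname, $now) = @_;
    return $now - $hof{$fname} * 24 * 60 * 60;
}

my $now = time();
foreach my $f (sort keys %hof) {
    my @ltime = localtime(newest_time_for($f, $now));
    print "$f: newest strip is dated ", strftime("%y%m%d", @ltime), "\n";
}
```
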
An RLI function takes one argument: the time as returned by time(). Your function should expect this to be the exact time that it is supposed to use to determine the date of the comic to retrieve. It is easiest to use the POSIX time functions to create strings using the time given. Here's an example:

    #-*-Perl-*-

    #Add the name of the subroutines to the hash of functions
    #with the number of days from today the comic is available
    $hof{"uf"} = 0;

    #UserFriendly http://www.userfriendly.org/cartoons/
    sub uf {
        my @ltime = localtime(shift(@_));
        my $rec = {
            'name' => strftime("User_Friendly-%y%m%d.gif",@ltime),
            'base' => "http://www.userfriendly.org",
            'page' => lc strftime("/cartoons/archives/%y%b/%Y%m%d.html", @ltime),
            'exprs' => [lc strftime("(\/cartoons\/archives\/%y%b\/uf\\d+\.gif)", @ltime)]
        };
        return $rec;
    }

    1;

Note that the Perl function lc() (lower case) is also used to help produce the proper string. See the manpage for strftime() for a definition of what each of the "%" directives does.
Your function should return the RLI (as a reference, not as the hash itself), and it will be used by netcomics. Put the function in a file that is readable by all users in /usr/lib/netcomics, and netcomics will find it and use it without you needing to modify its code. Finally, the last line in the file must be:

    1;

If it is missing, the module will not load properly.
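
For the func field introduced in version 0.6, a module function might look roughly like the sketch below. The comic, the function name two_part, the www.example.com URLs, and the comics/ path are all invented for illustration. The "our %hof" and "use" lines are only there so the sketch runs on its own outside netcomics.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

our %hof;                    # provided by netcomics in a real module
$hof{"two_part"} = 0;

# Hypothetical comic that is published as two image files per day.
sub two_part {
    my @ltime = localtime(shift(@_));
    my $date  = strftime("%y%m%d", @ltime);   # captured by the closure below
    my $rec = {
        'name' => strftime("Two_Part-%y%m%d.gif", @ltime),
        'base' => "http://www.example.com/",
        'page' => strftime("archives/%y%m%d.html", @ltime),
        # The closure is handed the text of the last page downloaded
        # (ignored here) and returns one relative URL per file.
        'func' => sub {
            my ($page_text) = @_;
            return ("comics/${date}a.gif", "comics/${date}b.gif");
        },
    };
    return $rec;
}

# Stand-alone demonstration of calling the closure.
my $rli  = two_part(time());
my @urls = $rli->{func}->("");
print "$rli->{base}$_\n" foreach @urls;

1;
```

Because $date is computed when two_part runs, each RLI carries its own copy of the date inside the closure, which is the point of creating the function as a closure rather than as a named subroutine.
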